fcl-xml

From Free Pascal wiki

Latest revision as of 15:43, 6 August 2022

The package FCL-XML contains units that parse XML and HTML files to DOM, and can render the DOM tree to HTML, XHTML and XML.

The HTML part in particular still needs some work.

Units

{| class="wikitable"
! Unit !! Unit group !! Comment
|----
| dom || - || Implements the DOM level 2 Core specification and some of the DOM level 3 Core properties/methods.
|----
| dom_html || - || DOM extensions for HTML, including THTMLDocument.
|----
| dtdmodel || - || Classes representing the DTD (document type definition), used by xmlread and dom.
|----
| htmldefs || - || Contains basic HTML declarations.
|----
| htmlelements || - || Implements a DOM for HTML content. Contains a TDOMElement descendant for all valid HTML 4.01 tags.
|----
| htmlwriter || - || Implements a verified HTML producer.
|----
| htmwrite || - || Writes a DOM structure as UTF-8 encoded HTML data into a file or stream.
|----
| sax || - || Base classes for a parser following the SAX model.
|----
| sax_html || - || HTML plugin (implementation) for the SAX parser.
|----
| sax_xml || - || XML plugin for the SAX parser.
|----
| xhtml || - || XHTML helper classes (purpose not confirmed).
|----
| xmlcfg || - || Implements the TXMLConfig class, which enables applications to store their configuration data in XML files.
|----
| xmlconf || - || An improved version of xmlcfg, based on DOMString instead of AnsiString. Provides better Unicode support, faster and smaller code due to the absence of conversions, and TRegistry-like OpenKey/CloseKey methods.
|----
| xmliconv || - || Registers an any-to-UTF-16 decoder based on libiconv (iconvenc package).
|----
| xmliconv_windows || - || Windows version: registers an any-to-UTF-16 decoder based on libiconv, using a Windows-dependent iconv header. It still uses iconv.
|----
| xmlread || - || Provides routines and classes to read XML data from a file or stream into DOM.
|----
| xmlstreaming || - || (Not functional) An initial attempt to support standard component streaming in XML format. The working variant of this unit is available in Lazarus (components/codetools/laz_xmlstreaming.pas).
|----
| xmlutils || - || Implements utility functions and classes that are used by other units in the package.
|----
| xmlwrite || - || Writes a DOM structure as XML data into a file or stream. It can deal with both XML files and XML fragments.
|----
| xpath || - || An XPath implementation. It should be fairly complete, but there has been no further development recently.
|----
| xmlreader || - || Contains TXMLReader, an abstract base class for .NET-style streamed XML reading.
|----
| xmltextreader || - || Contains TXMLTextReader, a TXMLReader descendant which parses text. This is the core of the XML reading functionality.
|}
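
To illustrate how the dom, xmlwrite, xmlread and xpath units listed above fit together, here is a minimal sketch. It assumes a reasonably recent FPC release; the element and attribute names (config, item, name) are made up for the example, and the program is only one possible way to combine these units.

```pascal
program FclXmlRoundTrip;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils, DOM, XMLRead, XMLWrite, XPath;

var
  Doc, Loaded: TXMLDocument;
  Root, Item: TDOMElement;
  Stream: TMemoryStream;
  Res: TXPathVariable;
begin
  Stream := TMemoryStream.Create;
  Doc := TXMLDocument.Create;
  Loaded := nil;
  try
    // dom: build <config><item name="first">hello</item></config>
    // (hypothetical element names, chosen for this example only)
    Root := Doc.CreateElement('config');
    Doc.AppendChild(Root);
    Item := Doc.CreateElement('item');
    Item.SetAttribute('name', 'first');
    Item.AppendChild(Doc.CreateTextNode('hello'));
    Root.AppendChild(Item);

    // xmlwrite: serialize the DOM tree into a stream
    WriteXMLFile(Doc, Stream);

    // xmlread: parse the same bytes back into a second DOM tree
    Stream.Position := 0;
    ReadXMLFile(Loaded, Stream);

    // xpath: evaluate an expression against the reloaded document
    Res := EvaluateXPathExpression('/config/item[@name="first"]',
      Loaded.DocumentElement);
    try
      WriteLn('Matching nodes: ', Res.AsNodeSet.Count);
    finally
      Res.Free; // the caller owns the returned TXPathVariable
    end;

    // Read the text of the first child element of the root
    WriteLn('Text: ',
      UTF8Encode(Loaded.DocumentElement.FirstChild.TextContent));
  finally
    Loaded.Free;
    Doc.Free;
    Stream.Free;
  end;
end.
```

A memory stream is used here so the sketch does not touch the file system; ReadXMLFile and WriteXMLFile also have overloads that take a file name.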

Include files

{| class="wikitable"
! Include file !! Comment
|----
| names.inc || Included by xmlutils. Contains character tables for XML names.
|----
| tagsimpl.inc ||
|----
| tagsintf.inc || Contains all possible tags for htmlelements.
|----
| wtagsimpl.inc ||
|----
| wtagsintf.inc || Contains all possible tags for htmlwriter.
|----
| xpathkw.inc || Contains the perfect hash function for XPath keywords.
|}

Notes

  • Beware: both dom_html and htmlelements seem to define a THTMLDocument class.
  • The node manager of DOM is not compatible with Delphi (2009 in my case), which can lead to strange crashes when objects are freed. This might be related to a different minimum object size.
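
If the THTMLDocument name clash mentioned above becomes a problem, one general Pascal technique is to qualify the type with its unit name. The sketch below assumes the dom_html variant is the one wanted; it only demonstrates the qualification syntax and does not exercise any HTML-specific behavior.

```pascal
program QualifiedHtmlDocument;
{$mode objfpc}{$H+}

uses
  DOM, dom_html;

var
  // The unit prefix makes it explicit which THTMLDocument is meant,
  // even if another unit in the uses clause declares the same name.
  Doc: dom_html.THTMLDocument;
begin
  Doc := dom_html.THTMLDocument.Create;
  try
    WriteLn(Doc.ClassName, ' is a TXMLDocument: ', Doc is TXMLDocument);
  finally
    Doc.Free;
  end;
end.
```

Without the prefix, Pascal resolves an ambiguous identifier to the unit listed last in the uses clause, so the result would silently depend on the order of the units.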

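Tutorial

  • Here: XML Tutorial

Links

  • DOM specs: http://www.w3.org/DOM/DOMTR
  • SAX-based HTML parser (Java): http://htmlparser.sourceforge.net/
  • The SAX project (Java): http://www.saxproject.org/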

Cleanup

  • Lots of "dynamic" methods. As far as I know, FPC does not implement them (they are always virtual, if I recall correctly), since they were mainly needed in 16-bit environments. Done: fixed in r13382.
  • The usedynarrays ifdef in sax.pp is a leftover from the 1.0.x era.

See also

  • Packages List