XML Tutorial/zh CN

From Free Pascal wiki
Latest revision as of 02:50, 2 March 2020


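The Extensible Markup Language (XML) is a language recommended by the World Wide Web Consortium (W3C) for exchanging information between different systems.

It is a text-based way of storing information, rather than storing it directly as binary data.

Many modern data-interchange languages, such as XHTML, and some of the most widely used web technologies are based on XML. This wiki page gives only a very brief description of XML and focuses on how to work with XML files in Pascal. If you want to learn more about XML itself, see the Wikipedia article on XML.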


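Introduction

Currently, a set of units provides XML support in Free Pascal. These units are "XMLRead", "XMLWrite" and "DOM", and they are part of the Free Component Library (FCL) shipped with the Free Pascal Compiler. The FCL is already on the default unit search path of the compiler in Lazarus, so you only need to add the units to your uses clause to get XML support. The FCL documentation was incomplete (at least as of October 2005), so this short tutorial focuses on how to use these units.

The XML DOM (Document Object Model) is a set of standardized objects that provide a similar interface for working with XML in different languages and systems. The standard specifies only the methods, properties and other interface parts of the objects, leaving the implementation to each language. The FCL fully supports XML DOM 2.0 and a subset of XML DOM 3.0, which is listed on the wiki's "dom" page.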

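Usage Examples

Below is a list of XML data manipulation examples of increasing complexity.

Units in the uses clause: Unicode or ANSI

The XML units that come with FPC use ANSI-encoded routines, so the encoding may differ from platform to platform and may not be Unicode. Lazarus comes with a separate set of XML units in the LazUtils package, which fully support UTF-8 Unicode on all platforms. The two sets are compatible, and you can switch from one to the other just by changing the uses clause.

The FPC XML units, which use strings in the system encoding, are: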

  • DOM
  • XMLRead
  • XMLWrite
  • XMLCfg
  • XMLUtils
  • XMLStreaming

The Lazarus XML units, which have full UTF-8 Unicode support, are:

  • laz2_DOM
  • laz2_XMLRead
  • laz2_XMLWrite
  • laz2_XMLCfg
  • laz2_XMLUtils
  • laz_XMLStreaming

Not all of these units are needed in every example, though. You will always need DOM, because it defines several types, including TXMLDocument.
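
Because the two unit sets expose the same type and routine names, the switch can be confined to the uses clause. The following is only a minimal sketch, not part of the original tutorial: the conditional symbol UseLazUnits and the function RootName are made up for illustration, and the LazUtils variant assumes that the LazUtils package is on the unit search path.

```pascal
program XmlUnitsDemo;
{$mode objfpc}{$H+}

// UseLazUnits is a hypothetical symbol (for example passed as -dUseLazUnits).
// When it is defined, the UTF-8 units from LazUtils are used instead.
uses
  {$IFDEF UseLazUnits}
  laz2_DOM;
  {$ELSE}
  DOM;
  {$ENDIF}

// Builds a one-element document and returns the name of its root element.
function RootName: string;
var
  Doc: TXMLDocument;
begin
  Doc := TXMLDocument.Create;
  try
    Doc.AppendChild(Doc.CreateElement('config'));
    Result := string(Doc.DocumentElement.NodeName);
  finally
    Doc.Free;
  end;
end;

begin
  WriteLn(RootName);
end.
```

The rest of the program is identical for both unit sets; only the uses clause changes.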

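Reading an XML node

For Delphi programmers:

Note that when working with TXMLDocument, the text within a node is treated as a separate TEXT node. As a result, you must access a node's text value through that separate node. Alternatively, the TextContent property can be used to retrieve the content of all text nodes beneath the given node, concatenated together.

The ReadXMLFile procedure always creates a new TXMLDocument, so you don't have to create it beforehand. However, be sure to destroy the document by calling Free when you are done.

For instance, consider the following XML: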

<?xml version="1.0"?>
<request>
  <request_type>PUT_FILE</request_type>
  <username>123</username>
  <password>abc</password>
</request>

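The following code shows both the correct and the incorrect way of getting the value of the text node (add the units laz2_XMLRead and laz2_DOM to the uses clause):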

var
  PassNode: TDOMNode;
  Doc: TXMLDocument;
begin
  try
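    // Read the XML file from disk
    ReadXMLFile(Doc, 'test.xml');
    // Retrieve the "password" node
    PassNode := Doc.DocumentElement.FindNode('password');
    // Write out the value of the selected node
    WriteLn(PassNode.NodeValue); // will be blank
    // The text of the node is actually a separate child node
    WriteLn(PassNode.FirstChild.NodeValue); // correctly prints "abc"
    // Alternatively
    WriteLn(PassNode.TextContent);
  finally
    // Finally, free the document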
    Doc.Free;
  end;
end;

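Note that ReadXMLFile(...) ignores all leading whitespace characters when parsing a document. The section on whitespace characters describes how to keep them.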
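
The example above assumes that the "password" node exists; FindNode returns nil when it does not. A defensive lookup could be sketched as follows. The helper names ChildText and Lookup and the fallback text are assumptions made for this illustration, and the document is read from a string so that the sketch needs no file on disk.

```pascal
program ReadNodeText;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils, DOM, XMLRead;

const
  // Same document as above, kept in a string for a self-contained sketch.
  SampleXml =
    '<?xml version="1.0"?>' +
    '<request>' +
    '<request_type>PUT_FILE</request_type>' +
    '<username>123</username>' +
    '<password>abc</password>' +
    '</request>';

// Returns the text of the named child of the root element,
// or Fallback when there is no such child.
function ChildText(Doc: TXMLDocument; const Name, Fallback: string): string;
var
  Node: TDOMNode;
begin
  Node := Doc.DocumentElement.FindNode(DOMString(Name));
  if Assigned(Node) then
    Result := string(Node.TextContent)
  else
    Result := Fallback;
end;

// Parses SampleXml and looks up one child of the root element.
function Lookup(const Name: string): string;
var
  Stream: TStringStream;
  Doc: TXMLDocument;
begin
  Doc := nil;
  Stream := TStringStream.Create(SampleXml);
  try
    ReadXMLFile(Doc, Stream);
    Result := ChildText(Doc, Name, '(missing)');
  finally
    Doc.Free;
    Stream.Free;
  end;
end;

begin
  WriteLn(Lookup('password'));
  WriteLn(Lookup('email'));
end.
```

Using TextContent here avoids touching FirstChild, which would itself be nil for an empty element.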

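Printing the names of nodes and attributes

To navigate the DOM tree when you need to access nodes in sequence, it is best to use the FirstChild and NextSibling properties (to iterate forward), or LastChild and PreviousSibling (to iterate backward).

For random access you can use the ChildNodes or GetElementsByTagName methods, but these create a TDOMNodeList object, which must eventually be freed. This differs from other DOM implementations such as MSXML, because the FCL implementation is object-based, not interface-based.

The following example shows how to print the names of nodes to a TMemo placed on a form.

Below is the XML file called 'test.xml':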

<?xml version="1.0"?>
<images directory="mydir">
  <imageNode URL="graphic.jpg" title="">
    <Peca DestinoX="0" DestinoY="0">Pecacastelo.jpg1.swf</Peca>
    <Peca DestinoX="0" DestinoY="86">Pecacastelo.jpg2.swf</Peca>
  </imageNode>
</images>

And here is the Pascal code that performs the task:

var
  Doc: TXMLDocument;
  Child: TDOMNode;
  j: Integer;
begin
  try
    ReadXMLFile(Doc, 'test.xml');
    Memo.Lines.Clear;
    // using the FirstChild and NextSibling properties
    Child := Doc.DocumentElement.FirstChild;
    while Assigned(Child) do
    begin
      Memo.Lines.Add(Child.NodeName + ' ' + Child.Attributes.Item[0].NodeValue);
      // using the ChildNodes method
      with Child.ChildNodes do
      try
        for j := 0 to (Count - 1) do
          Memo.Lines.Add(format('%s %s (%s=%s; %s=%s)',
                                [
                                  Item[j].NodeName,
                                  Item[j].FirstChild.NodeValue,
                                  Item[j].Attributes.Item[0].NodeName,  // 1st attribute details
                                  Item[j].Attributes.Item[0].NodeValue,
                                  Item[j].Attributes.Item[1].NodeName,  // 2nd attribute details
                                  Item[j].Attributes.Item[1].NodeValue
                                ]));
      finally
        Free;
      end;
      Child := Child.NextSibling;
    end;
  finally
    Doc.Free;
  end;
end;

The TMemo will then show:

imageNode graphic.jpg
Peca Pecacastelo.jpg1.swf (DestinoX=0; DestinoY=0)
Peca Pecacastelo.jpg2.swf (DestinoX=0; DestinoY=86)
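
The example above iterates forward. Backward iteration with LastChild and PreviousSibling follows the same pattern and needs no TDOMNodeList, so nothing has to be freed apart from the document. This is a small console sketch; the function name ChildNamesBackward and the sample document are assumptions for illustration.

```pascal
program ReverseChildren;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils, DOM, XMLRead;

const
  SampleXml = '<list><a/><b/><c/></list>';

// Returns the names of the root element's child nodes, last to first,
// separated by single spaces.
function ChildNamesBackward(const Xml: string): string;
var
  Stream: TStringStream;
  Doc: TXMLDocument;
  Child: TDOMNode;
begin
  Result := '';
  Doc := nil;
  Stream := TStringStream.Create(Xml);
  try
    ReadXMLFile(Doc, Stream);
    Child := Doc.DocumentElement.LastChild;
    while Assigned(Child) do
    begin
      if Result <> '' then
        Result := Result + ' ';
      Result := Result + string(Child.NodeName);
      Child := Child.PreviousSibling;
    end;
  finally
    Doc.Free;
    Stream.Free;
  end;
end;

begin
  WriteLn(ChildNamesBackward(SampleXml));
end.
```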

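Populating a TreeView with XML

One common use of XML files is to parse them and show their contents in a tree-like format. In Lazarus, you can find the TTreeView component on the "Common Controls" tab. The procedure below takes an XML document previously loaded from a file or generated in code and populates a TreeView with its contents. The caption of each tree node is the value of the first attribute of the corresponding XML node.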

procedure TForm1.XML2Tree(tree: TTreeView; XMLDoc: TXMLDocument);
var
  iNode: TDOMNode;

  procedure ProcessNode(Node: TDOMNode; TreeNode: TTreeNode);
  var
    cNode: TDOMNode;
    s: string;
  begin
    if Node = nil then Exit; // Stops if a leaf has been reached

    // Adds a node to the tree
    if Node.HasAttributes and (Node.Attributes.Length>0) then
      s := Node.Attributes[0].NodeValue
    else
      s := '';
    TreeNode := tree.Items.AddChild(TreeNode, s);

    // Goes to the first child node
    cNode := Node.FirstChild;

    // Processes all child nodes
    while cNode <> nil do
    begin
      ProcessNode(cNode, TreeNode);
      cNode := cNode.NextSibling;
    end;
  end;
    
begin
  iNode := XMLDoc.DocumentElement.FirstChild;
  while iNode <> nil do
  begin
    ProcessNode(iNode, nil); // Recursive
    iNode := iNode.NextSibling;
  end;
end;

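Another example displays the complete XML structure, including all attribute values (note: the long line referencing TreeView has been split so that it wraps on this wiki; in your own code you do not have to break the line unless you prefer that formatting):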

procedure XML2Tree(XMLDoc:TXMLDocument; TreeView:TTreeView);

  // Local function that returns all attributes of a node as a string
  function GetNodeAttributesAsString(pNode: TDOMNode):string;
  var i: integer;
  begin
    Result:='';
    if pNode.HasAttributes then
      for i := 0 to pNode.Attributes.Length -1 do
        with pNode.Attributes[i] do
          Result := Result + format(' %s="%s"', [NodeName, NodeValue]);

    // Remove leading and trailing spaces
    Result:=Trim(Result);
  end;

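  // Recursive procedure that processes a node and all its child nodes

  procedure ParseXML(Node:TDOMNode; TreeNode: TTreeNode);
  begin
    // Exit the procedure if there are no more nodes to process
    if Node = nil then Exit;

    // Add the node to the TreeView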
    TreeNode := TreeView.Items.AddChild(TreeNode, 
                                          Trim(Node.NodeName+' '+ 
                                           GetNodeAttributesAsString(Node)+ 
                                           Node.NodeValue)
                                        );

    // Process all child nodes
    Node := Node.FirstChild;
    while Node <> Nil do
    begin
      ParseXML(Node, TreeNode);
      Node := Node.NextSibling;
    end;
  end;

begin
  TreeView.Items.Clear;
  ParseXML(XMLDoc.DocumentElement,nil);
end;
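
The same recursive walk can be tried without a form or a TTreeView. The following console sketch collects an indented outline of the element names instead of tree nodes; the names CollectElements and Outline are hypothetical and not part of the FCL or the LCL.

```pascal
program DumpTree;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils, DOM, XMLRead;

// Adds one line per element to Lines, indented by two spaces per level.
procedure CollectElements(Node: TDOMNode; Depth: Integer; Lines: TStrings);
var
  Child: TDOMNode;
begin
  if Node = nil then Exit;
  if Node.NodeType = ELEMENT_NODE then
    Lines.Add(StringOfChar(' ', 2 * Depth) + string(Node.NodeName));
  // Recurse into all children, using FirstChild and NextSibling
  Child := Node.FirstChild;
  while Child <> nil do
  begin
    CollectElements(Child, Depth + 1, Lines);
    Child := Child.NextSibling;
  end;
end;

// Parses Xml and returns the indented outline as a single string.
function Outline(const Xml: string): string;
var
  Stream: TStringStream;
  Doc: TXMLDocument;
  Lines: TStringList;
begin
  Doc := nil;
  Lines := nil;
  Stream := TStringStream.Create(Xml);
  try
    Lines := TStringList.Create;
    ReadXMLFile(Doc, Stream);
    CollectElements(Doc.DocumentElement, 0, Lines);
    Result := TrimRight(Lines.Text);
  finally
    Lines.Free;
    Doc.Free;
    Stream.Free;
  end;
end;

begin
  WriteLn(Outline('<images><imageNode><Peca/><Peca/></imageNode></images>'));
end.
```

Text nodes are skipped by the NodeType check; dropping that check would list them under the name "#text".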

Modifying an XML document

The first thing to remember is that TDOMDocument is the "handle" to the DOM. You can get an instance of this class by creating one or by loading an XML document.

Nodes, on the other hand, cannot be created like normal objects. You *must* use the methods provided by TDOMDocument to create them, and then use other methods to put them in the correct place in the tree. This is because in the DOM every node must be "owned" by a specific document.

Below are some common methods of TDOMDocument:

function CreateElement(const tagName: DOMString): TDOMElement; virtual;
function CreateTextNode(const data: DOMString): TDOMText;
function CreateCDATASection(const data: DOMString): TDOMCDATASection; virtual;
function CreateAttribute(const name: DOMString): TDOMAttr; virtual;

CreateElement creates a new element.

CreateTextNode creates a text node.

CreateAttribute creates an attribute node.

CreateCDATASection creates a CDATA section: regular XML markup characters such as <> are not interpreted within the CDATA section. See the Wikipedia article on CDATA.
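
To make the create-then-attach pattern concrete, here is a short sketch that builds a small document with an element, a text node and a CDATA section, and serializes it with WriteXMLFile. The function name BuildNote and the element names are assumptions chosen for this illustration.

```pascal
program BuildDoc;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils, DOM, XMLWrite;

// Builds a <note> document with a <title> (text node) and a <body>
// (CDATA section) and returns the serialized XML.
function BuildNote(const Title, Body: string): string;
var
  Doc: TXMLDocument;
  Root, TitleNode, BodyNode: TDOMElement;
  Stream: TStringStream;
begin
  Stream := nil;
  Doc := TXMLDocument.Create;
  try
    // Every node is created by the document that will own it ...
    Root := Doc.CreateElement('note');
    // ... and only becomes part of the tree once it is appended.
    Doc.AppendChild(Root);

    TitleNode := Doc.CreateElement('title');
    TitleNode.AppendChild(Doc.CreateTextNode(DOMString(Title)));
    Root.AppendChild(TitleNode);

    BodyNode := Doc.CreateElement('body');
    BodyNode.AppendChild(Doc.CreateCDATASection(DOMString(Body)));
    Root.AppendChild(BodyNode);

    Stream := TStringStream.Create('');
    WriteXMLFile(Doc, Stream);
    Result := Stream.DataString;
  finally
    Stream.Free;
    Doc.Free;
  end;
end;

begin
  WriteLn(BuildNote('Hello', 'a < b'));
end.
```

Freeing the document also frees the nodes it owns, so the individual elements are not freed separately.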

A more convenient way to manipulate attributes is the TDOMElement.SetAttribute method, which is also exposed as the default property of TDOMElement:

// these two statements are equivalent
Element.SetAttribute('name', 'value');
Element['name'] := 'value';
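
The default property can be read as well as written. The following sketch sets an attribute both ways and reads it back; the function name AttrRoundTrip and the attribute names are made up for illustration.

```pascal
program AttrDemo;
{$mode objfpc}{$H+}

uses
  DOM;

// Sets the same attribute twice and returns its final value, followed by
// a slash and the value read for an attribute that was never set.
function AttrRoundTrip: string;
var
  Doc: TXMLDocument;
  Elem: TDOMElement;
begin
  Doc := TXMLDocument.Create;
  try
    Elem := Doc.CreateElement('item');
    Doc.AppendChild(Elem);
    Elem.SetAttribute('name', 'first');
    Elem['name'] := 'second';   // overwrites the value set above
    Result := string(Elem.GetAttribute('name')) + '/' + string(Elem['missing']);
  finally
    Doc.Free;
  end;
end;

begin
  WriteLn(AttrRoundTrip);
end.
```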

Here is an example method that locates the selected item in a TTreeView and then inserts a child node into the XML document it represents. The TreeView must have been filled beforehand with the contents of an XML file using the XML2Tree procedure.

procedure TForm1.actAddChildNode(Sender: TObject);
var
  position: Integer;
  NovoNo: TDomNode;
begin
  {*******************************************************************
  *  Detects the selected element
  *******************************************************************}
  if TreeView1.Selected = nil then Exit;

  if TreeView1.Selected.Level = 0 then
  begin
    position := TreeView1.Selected.Index;

    NovoNo := XMLDoc.CreateElement('item');
    TDOMElement(NovoNo).SetAttribute('nome', 'Item');
    TDOMElement(NovoNo).SetAttribute('arquivo', 'Arquivo');
    with XMLDoc.DocumentElement.ChildNodes do
    begin
      Item[position].AppendChild(NovoNo);
      Free;
    end;

    {*******************************************************************
    *  Updates the TreeView
    *******************************************************************}
    TreeView1.Items.Clear;
    XML2Tree(TreeView1, XMLDoc);
  end
  else if TreeView1.Selected.Level >= 1 then
  begin
    {*******************************************************************
    *  This function only works on the first level of the tree,
    *  but can easily be modified to work for any number of levels
    *******************************************************************}
  end;
end;

Creating a TXMLDocument from a string

Given an XML document in the string variable MyXmlString, the following code creates its DOM:

var
  S: TStringStream;
  XML: TXMLDocument;
begin
  S := TStringStream.Create(MyXMLString);
  try
    // Read a complete XML document
    ReadXMLFile(XML, S);
    // Alternatively, read only an XML fragment into an existing node
    // (use this instead of, not in addition to, the call above):
    // ReadXMLFragment(AParentNode, S);
  finally
    S.Free;
  end;
end;
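
The fragment variant needs an existing parent node to attach the parsed nodes to. A minimal sketch, assuming a freshly created document whose root element serves as that parent (the function name CountFragmentNodes and the element names are hypothetical):

```pascal
program FragmentDemo;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils, DOM, XMLRead;

// Parses Fragment, appends its nodes to a new <root> element and returns
// the number of child nodes that <root> ends up with.
function CountFragmentNodes(const Fragment: string): Integer;
var
  Doc: TXMLDocument;
  Stream: TStringStream;
  Child: TDOMNode;
begin
  Result := 0;
  Stream := nil;
  Doc := TXMLDocument.Create;
  try
    Doc.AppendChild(Doc.CreateElement('root'));
    Stream := TStringStream.Create(Fragment);
    // The fragment may contain several top-level elements
    ReadXMLFragment(Doc.DocumentElement, Stream);
    Child := Doc.DocumentElement.FirstChild;
    while Assigned(Child) do
    begin
      Inc(Result);
      Child := Child.NextSibling;
    end;
  finally
    Stream.Free;
    Doc.Free;
  end;
end;

begin
  WriteLn(CountFragmentNodes('<item>a</item><item>b</item>'));
end.
```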

Validating a document

Since March 2007, the FCL XML parser has included a DTD validation facility. Validation checks that the logical structure of the document conforms to a set of predefined rules, called a Document Type Definition (DTD).

Here is an example of an XML document with a DTD:

<?xml version='1.0'?>
<!DOCTYPE root [
<!ELEMENT root (child)+ >
<!ELEMENT child (#PCDATA)>
]>
<root>
  <child>This is a first child.</child>
  <child>And this is the second one.</child>
</root>

This DTD specifies that the 'root' element must have one or more 'child' elements, and that 'child' elements may contain only character data. If the parser detects any violation of these rules, it reports it.

Loading such a document is slightly more complicated. Let's assume we have the XML data in a TStream object:

procedure TMyObject.DOMFromStream(AStream: TStream);
var
  Parser: TDOMParser;
  Src: TXMLInputSource;
  TheDoc: TXMLDocument;
begin
  // initialize so that the finally block is safe if a constructor fails
  Parser := nil;
  Src := nil;
  try
    // create a parser object
    Parser := TDOMParser.Create;
    // and the input source
    Src := TXMLInputSource.Create(AStream);
    // we want validation
    Parser.Options.Validate := True;
    // assign an error handler which will receive notifications
    Parser.OnError := @ErrorHandler;
    // now do the job; the caller owns TheDoc and must free it when done
    Parser.Parse(Src, TheDoc);
  finally
    // ...and cleanup
    Src.Free;
    Parser.Free;
  end;
end;

procedure TMyObject.ErrorHandler(E: EXMLReadError);
begin
  if E.Severity = esError then  // we are interested in validation errors only
    writeln(E.Message);
end;
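
The two methods above assume an existing class TMyObject. A minimal console sketch of the same technique, usable outside a form or class, could look like the following. The class TErrorCounter, the function CountValidationErrors and the sample documents are hypothetical names introduced for this example; the parser calls are the ones used above.

```pascal
program ValidateDemo;

{$mode objfpc}{$H+}

uses
  Classes, SysUtils, DOM, XMLRead;

type
  // Hypothetical helper class: OnError expects a method of an object,
  // so a small class is used to receive and count the notifications.
  TErrorCounter = class
  public
    Count: Integer;
    procedure HandleError(E: EXMLReadError);
  end;

procedure TErrorCounter.HandleError(E: EXMLReadError);
begin
  if E.Severity = esError then  // validation errors only
  begin
    Inc(Count);
    WriteLn(E.Message);
  end;
end;

// Parses AXml with validation enabled and returns the number of
// validation errors that were reported to the handler.
function CountValidationErrors(const AXml: string): Integer;
var
  Parser: TDOMParser;
  Src: TXMLInputSource;
  Stream: TStringStream;
  Doc: TXMLDocument;
  Counter: TErrorCounter;
begin
  Doc := nil;
  Src := nil;
  Parser := nil;
  Stream := nil;
  Counter := TErrorCounter.Create;
  try
    Stream := TStringStream.Create(AXml);
    Parser := TDOMParser.Create;
    Src := TXMLInputSource.Create(Stream);
    Parser.Options.Validate := True;
    Parser.OnError := @Counter.HandleError;
    Parser.Parse(Src, Doc);
    Result := Counter.Count;
  finally
    Doc.Free;
    Src.Free;
    Parser.Free;
    Stream.Free;
    Counter.Free;
  end;
end;

const
  // Same rules as the DTD in the example document above
  Dtd = '<!DOCTYPE root [<!ELEMENT root (child)+><!ELEMENT child (#PCDATA)>]>';

begin
  WriteLn('valid document:   ',
    CountValidationErrors(Dtd + '<root><child>ok</child></root>'), ' error(s)');
  WriteLn('invalid document: ',
    CountValidationErrors(Dtd + '<root><other/></root>'), ' error(s)');
end.
```

The second document uses an element that the DTD does not declare, so the handler should be called at least once for it. Malformed XML is a different case: it is reported as a fatal error and ends parsing with an exception.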

Whitespace characters

If you want to preserve leading whitespace characters in node text, load your XML document with the method shown above. Leading whitespace characters are ignored by default, which is why the ReadXMLFile(...) function never returns any leading whitespace characters in node text. Before calling Parser.Parse(Src, TheDoc), insert the line

Parser.Options.PreserveWhitespace := True;

This forces the parser to return all whitespace characters, including all the newline characters that exist in an XML document only to make it more readable!
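
One observable effect of this option is the number of nodes in the resulting DOM. The following sketch counts the direct children of the root element with and without PreserveWhitespace. The function CountRootChildren and the sample document are assumptions made for this example; it illustrates the whitespace-only text nodes between elements and does not cover every whitespace rule of the parser.

```pascal
program WhitespaceDemo;

{$mode objfpc}{$H+}

uses
  Classes, SysUtils, DOM, XMLRead;

// Hypothetical helper: parses AXml and counts the direct child nodes
// of the root element, with PreserveWhitespace set to APreserve.
function CountRootChildren(const AXml: string; APreserve: Boolean): Integer;
var
  Parser: TDOMParser;
  Src: TXMLInputSource;
  Stream: TStringStream;
  Doc: TXMLDocument;
  Node: TDOMNode;
begin
  Result := 0;
  Doc := nil;
  Src := nil;
  Parser := nil;
  Stream := TStringStream.Create(AXml);
  try
    Parser := TDOMParser.Create;
    Src := TXMLInputSource.Create(Stream);
    Parser.Options.PreserveWhitespace := APreserve;
    Parser.Parse(Src, Doc);
    // Walk the children of the root element and count them
    Node := Doc.DocumentElement.FirstChild;
    while Node <> nil do
    begin
      Inc(Result);
      Node := Node.NextSibling;
    end;
  finally
    Doc.Free;
    Src.Free;
    Parser.Free;
    Stream.Free;
  end;
end;

const
  // Sample data: whitespace before and after the only child element
  Sample = '<root>  <a/>  </root>';

begin
  WriteLn('default:             ', CountRootChildren(Sample, False));
  WriteLn('preserve whitespace: ', CountRootChildren(Sample, True));
end.
```

With the option enabled, the whitespace around the child element shows up as additional text nodes, so code that walks the tree by position needs to skip them.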

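Generating an XML file

Below is the complete source code for writing an XML file (from a tutorial on the DeveLazarus blog). Remember to add the DOM and XMLWrite units to the uses clause of your unit.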

unit Unit1;

{$mode objfpc}{$H+}

interface

uses
  Classes, SysUtils, LResources, Forms, Controls, Graphics, Dialogs, StdCtrls,
  DOM, XMLWrite;

type
  { TForm1 }
  TForm1 = class(TForm)
    Button1: TButton;
    Label1: TLabel;
    Label2: TLabel;
    procedure Button1Click(Sender: TObject);
  private
    { private declarations }
  public
    { public declarations }
  end;
  
var
  Form1: TForm1;
  
implementation

{ TForm1 }

procedure TForm1.Button1Click(Sender: TObject);
var
  Doc: TXMLDocument;                        // the document
  RootNode, parentNode, nofilho: TDOMNode;  // the nodes
begin
  // Create a document (before the try, so that Doc is valid in finally)
  Doc := TXMLDocument.Create;
  try
    // Create a root node
    RootNode := Doc.CreateElement('register');
    Doc.AppendChild(RootNode);                            // save root node

    // Create a parent node
    RootNode := Doc.DocumentElement;
    parentNode := Doc.CreateElement('usuario');
    TDOMElement(parentNode).SetAttribute('id', '001');    // set an attribute on the parent node
    RootNode.AppendChild(parentNode);                     // save parent node

    // Create a child node
    parentNode := Doc.CreateElement('nome');              // create a child node
    // TDOMElement(parentNode).SetAttribute('sexo', 'M'); // set an attribute
    nofilho := Doc.CreateTextNode('Fernando');            // create the node's text value
    parentNode.AppendChild(nofilho);                      // save the text node
    RootNode.ChildNodes.Item[0].AppendChild(parentNode);  // insert child node in its parent node

    // Create a child node
    parentNode := Doc.CreateElement('idade');             // create a child node
    // TDOMElement(parentNode).SetAttribute('ano', '1976'); // set an attribute
    nofilho := Doc.CreateTextNode('32');                  // create the node's text value
    parentNode.AppendChild(nofilho);                      // save the text node
    RootNode.ChildNodes.Item[0].AppendChild(parentNode);  // insert child node in its parent node

    WriteXMLFile(Doc, 'test.xml');                        // write to XML
  finally
    Doc.Free;                                             // free memory
  end;
end;

initialization
  {$I unit1.lrs}

end.

The result is an XML file with the following content:

<?xml version="1.0"?>
<register>
  <usuario id="001">
    <nome>Fernando</nome>
    <idade>32</idade>
  </usuario>
</register>

Here is another example, in which the node items do not have to be accessed by index.

procedure TForm1.Button2Click(Sender: TObject);
var
  Doc: TXMLDocument;
  RootNode, ElementNode,ItemNode,TextNode: TDOMNode;
  i: integer;
begin
  // Create a document (before the try, so that Doc is valid in finally)
  Doc := TXMLDocument.Create;
  try
    // Create a root node
    RootNode := Doc.CreateElement('Root');
    Doc.Appendchild(RootNode);
    RootNode:= Doc.DocumentElement;
    // Create nodes
    for i := 1 to 20 do
    begin
      ElementNode:=Doc.CreateElement('Element');
      TDOMElement(ElementNode).SetAttribute('id', IntToStr(i));

      ItemNode:=Doc.CreateElement('Item1');
      TDOMElement(ItemNode).SetAttribute('Attr1', IntToStr(i));
      TDOMElement(ItemNode).SetAttribute('Attr2', IntToStr(i));
      TextNode:=Doc.CreateTextNode('Item1Value is '+IntToStr(i));
      ItemNode.AppendChild(TextNode);
      ElementNode.AppendChild(ItemNode);

      ItemNode:=Doc.CreateElement('Item2');
      TDOMElement(ItemNode).SetAttribute('Attr1', IntToStr(i));
      TDOMElement(ItemNode).SetAttribute('Attr2', IntToStr(i));
      TextNode:=Doc.CreateTextNode('Item2Value is '+IntToStr(i));
      ItemNode.AppendChild(TextNode);
      ElementNode.AppendChild(ItemNode);

      RootNode.AppendChild(ElementNode);
    end;
    // Save XML
    WriteXMLFile(Doc,'TestXML_v2.xml');
  finally
    Doc.Free;
  end;
end;

Generated XML (abridged here to the first three of the 20 Element nodes):

<?xml version="1.0"?>
<Root>
  <Element id="1">
    <Item1 Attr1="1" Attr2="1">Item1Value is 1</Item1>
    <Item2 Attr1="1" Attr2="1">Item2Value is 1</Item2>
  </Element>
  <Element id="2">
    <Item1 Attr1="2" Attr2="2">Item1Value is 2</Item1>
    <Item2 Attr1="2" Attr2="2">Item2Value is 2</Item2>
  </Element>
  <Element id="3">
    <Item1 Attr1="3" Attr2="3">Item1Value is 3</Item1>
    <Item2 Attr1="3" Attr2="3">Item2Value is 3</Item2>
  </Element>
</Root>

Encoding

Starting with FPC version 2.4 (more precisely, SVN revision 12582), the XML reader can process data in any encoding by using external decoders. See XML_Decoders for more details.

According to the XML standard, the encoding attribute in the first line of the XML document is optional if the actual encoding is UTF-8 (without a BOM, Byte Order Mark) or UTF-16 (with a BOM).

TXMLDocument has had an encoding property since FPC 2.4. It is ignored, because WriteXMLFile always writes UTF-8.

  • FPC 2.4 doesn't generate an encoding attribute in the first line of the XML file
  • FPC 2.6.0 and later explicitly write a UTF-8 encoding attribute, as this is needed for some programs that cannot handle the XML without it.
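
To check what a particular FPC version writes in the first line, the document can be serialized to a stream instead of a file and the result inspected. The sketch below is one way to do that, assuming the stream overload of WriteXMLFile; the function SerializeEmptyRoot is a hypothetical helper introduced for this example.

```pascal
program EncodingDemo;

{$mode objfpc}{$H+}

uses
  Classes, SysUtils, DOM, XMLWrite;

// Hypothetical helper: builds a document that contains only an empty
// root element named ARootName and returns the serialized XML text.
function SerializeEmptyRoot(const ARootName: string): string;
var
  Doc: TXMLDocument;
  S: TStringStream;
begin
  S := nil;
  Doc := TXMLDocument.Create;
  try
    Doc.AppendChild(Doc.CreateElement(ARootName));
    S := TStringStream.Create('');
    WriteXMLFile(Doc, S);    // same writer as for files, but into memory
    Result := S.DataString;  // the bytes written by WriteXMLFile
  finally
    S.Free;
    Doc.Free;
  end;
end;

begin
  // The first printed line is the XML declaration; whether it carries
  // an encoding attribute depends on the FPC version, as listed above.
  Write(SerializeEmptyRoot('register'));
end.
```

Writing to a stream is also convenient when the XML has to be sent over a network or stored in a database field rather than saved to disk.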

See also

External links