Working with XML Document Parts
The Document Object Model (DOM) provides a generic container, the node, that can be used to represent elements, attributes, textual content, comments, processing instructions, entities, CDATA sections, and document fragments. Because each node type has different access methods and content limitations, it is sometimes easier to work with nodes belonging to a particular type. This section provides information about how to work with the following types of nodes.
- Document
- Elements
- Attributes
- Text Nodes
- CDATA Sections
- Processing Instructions
- Comments
- Entities
- Entity References
- Document Fragments
- Document Type
- Namespaces
- Notations
Document
The document map is represented by the DOMDocument object, which contains all of the information about the document. The DOMDocument object acts as the root node for all of the other nodes in the DOM tree. It contains the root element of the document, as well as information from before the root element (the prolog) and after the end of the root element.
The DOMDocument node can contain multiple IXMLDOMProcessingInstruction and IXMLDOMComment nodes, and one each of the IXMLDOMElement and IXMLDOMDocumentType nodes. These nodes are treated as children, and will appear in the same sequence in which they appeared in the original XML document.
To create a new document object, you must create a new instance of Msxml2.DOMDocument. The following code is in Microsoft® JScript®.
var xmlDoc = new ActiveXObject("Msxml2.DOMDocument.4.0");
The following code is in Microsoft Visual Basic® Scripting Edition (VBScript).
Set xmlDoc = CreateObject("Msxml2.DOMDocument.4.0")
After you create the DOMDocument object, you can set flags on it for different kinds of parsing and processing behavior, load XML documents, create new nodes inside the document to build an XML document from within your program, or persist your DOMDocument object as an XML file.
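As a rough sketch of the loading step mentioned above, the following JScript wraps loadXML in a small helper that reports parse failures. The helper name loadOrThrow is hypothetical and not part of MSXML; it assumes the object passed in exposes the async property, the loadXML method, and the parseError object the way a DOMDocument does. The block at the end runs only where ActiveXObject is available.

```javascript
// Hypothetical helper (not an MSXML function): loads XML text and fails loudly on errors.
function loadOrThrow(xmlDoc, xmlText) {
  xmlDoc.async = false;               // request synchronous loading
  if (!xmlDoc.loadXML(xmlText)) {     // loadXML returns false when parsing fails
    throw new Error("XML parse error: " + xmlDoc.parseError.reason);
  }
  return xmlDoc.documentElement;      // root element of the loaded document
}

// Usage, guarded so that it only runs where MSXML is installed.
if (typeof ActiveXObject != "undefined") {
  var xmlDoc = new ActiveXObject("Msxml2.DOMDocument.4.0");
  var root = loadOrThrow(xmlDoc, "<memo author=\"Pat Coleman\"/>");
}
```

Keeping the error check in one place means callers do not have to inspect parseError after every load.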
For more information about properties and methods of DOMDocument objects, see IXMLDOMDocument/DOMDocument.
Elements
Elements within an XML document are represented by IXMLDOMElement objects. Just as elements can contain elements, text, comments, processing instructions, CDATA sections, and entity references within XML documents, these objects can contain IXMLDOMElement, IXMLDOMText, IXMLDOMComment, IXMLDOMProcessingInstruction, IXMLDOMCDATASection, and IXMLDOMEntityReference objects.
Elements do not store attributes directly as children, however. Attributes can be retrieved or modified by name using the getAttribute and setAttribute methods, or manipulated as an IXMLDOMNamedNodeMap through the attributes property.
Creating an element requires that you already have created a DOMDocument object, as the createElement method belongs to DOMDocument. Although you can create and manipulate elements without adding them to the node list of DOMDocument, these elements will disappear as soon as the program completes and will not be persisted in XML documents. Generally, you'll want to assign elements a place in the document tree shortly after creation.
The following JScript code creates a DOMDocument object, and then uses that DOMDocument object to create an IXMLDOMElement object, which is then appended to the DOMDocument to be the root element of the document.
var xmlDoc = new ActiveXObject("Msxml2.DOMDocument.4.0");
var rootElement=xmlDoc.createElement("memo");
xmlDoc.appendChild(rootElement);
The following code is in VBScript.
Set xmlDoc = CreateObject("Msxml2.DOMDocument.4.0")
Set rootElement=xmlDoc.createElement("memo")
xmlDoc.appendChild(rootElement)
For more information about properties and methods of IXMLDOMElement, see IXMLDOMElement.
Attributes
Although attributes belong to a particular element, they are not considered child nodes of element nodes. Instead, they behave more like properties of IXMLDOMElement.
Most of the methods for working with attributes come from IXMLDOMElement. Attributes can be manipulated in the following ways.
- Directly, through the getAttribute and setAttribute methods of IXMLDOMElement.
- As named IXMLDOMAttribute nodes, with getAttributeNode and setAttributeNode.
- As a set of nodes accessible through the attributes property and returned as an IXMLDOMNamedNodeMap.
The following JScript example creates a new document containing a <memo> element, and then creates an attribute named author with a value of "Pat Coleman".
var xmlDoc = new ActiveXObject("Msxml2.DOMDocument.4.0");
var rootElement=xmlDoc.createElement("memo");
rootElement.setAttribute("author", "Pat Coleman");
xmlDoc.appendChild(rootElement);
The following code is in VBScript.
Set xmlDoc = CreateObject("Msxml2.DOMDocument.4.0")
Set rootElement=xmlDoc.createElement("memo")
rootElement.setAttribute("author", "Pat Coleman")
xmlDoc.appendChild(rootElement)
If you prefer to work with attribute nodes, you can create the attribute, and then create a text node to store its value. Attribute nodes can only contain text nodes and entity reference nodes. (If you need to create an attribute containing an entity reference, you must use this approach.)
Working with attribute nodes requires using the DOMDocument object to create attribute and text (and entity reference, if necessary) nodes before assigning the nodes to the element. The following JScript code uses this approach to perform the same work as the preceding examples, creating a <memo> element with an author attribute holding the value "Pat Coleman".
var xmlDoc = new ActiveXObject("Msxml2.DOMDocument.4.0");
var rootElement=xmlDoc.createElement("memo");
var memoAttribute=xmlDoc.createAttribute("author");
var memoAttributeText=xmlDoc.createTextNode("Pat Coleman");
memoAttribute.appendChild(memoAttributeText);
rootElement.setAttributeNode(memoAttribute);
xmlDoc.appendChild(rootElement);
The following code is in VBScript.
Set xmlDoc = CreateObject("Msxml2.DOMDocument.4.0")
Set rootElement=xmlDoc.createElement("memo")
Set memoAttribute=xmlDoc.createAttribute("author")
Set memoAttributeText=xmlDoc.createTextNode("Pat Coleman")
memoAttribute.appendChild(memoAttributeText)
rootElement.setAttributeNode(memoAttribute)
xmlDoc.appendChild(rootElement)
Developers who want to work with attributes as a set can also use the IXMLDOMNamedNodeMap collection returned by the attributes property of IXMLDOMElement.
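A minimal sketch of working with attributes as a set might look like the following JScript. The helper name listAttributes and the date attribute are illustrative assumptions; the sketch relies only on the length property and item method of the named node map and on the name and value properties of each attribute node.

```javascript
// Hypothetical helper: returns "name=value" strings for every attribute of an element.
function listAttributes(element) {
  var attrs = element.attributes;      // named node map of attribute nodes
  var result = [];
  for (var i = 0; i < attrs.length; i++) {
    var attr = attrs.item(i);          // each item is an attribute node
    result.push(attr.name + "=" + attr.value);
  }
  return result;
}

// Usage, guarded so that it only runs where MSXML is installed.
if (typeof ActiveXObject != "undefined") {
  var xmlDoc = new ActiveXObject("Msxml2.DOMDocument.4.0");
  var rootElement = xmlDoc.createElement("memo");
  rootElement.setAttribute("author", "Pat Coleman");
  rootElement.setAttribute("date", "2000-11-01");   // illustrative second attribute
  var pairs = listAttributes(rootElement);
}
```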
For more information about properties and methods of IXMLDOMAttribute, see IXMLDOMAttribute.
Text Nodes
Text nodes store the character sequences that make up the content of XML elements and attributes. Text nodes cannot have child nodes because they represent content, not structure. Text nodes must be contained by element, attribute, document fragment, or entity reference nodes; they cannot be contained by the top-level document node, though the DOMDocument object is used to create text nodes. For more information, see Textual Content.
The createTextNode method of the DOMDocument object is used to create new text nodes, which are instances of IXMLDOMText. In the following JScript example, createTextNode is called twice: once to create a text node for an attribute value, and once to create a text node for element content.
var xmlDoc = new ActiveXObject("Msxml2.DOMDocument.4.0");
var rootElement=xmlDoc.createElement("memo");
var memoAttribute=xmlDoc.createAttribute("author");
var memoAttributeText=xmlDoc.createTextNode("Pat Coleman");
var toElement=xmlDoc.createElement("to");
var toElementText=xmlDoc.createTextNode("Carole Poland");
memoAttribute.appendChild(memoAttributeText);
xmlDoc.appendChild(rootElement);
rootElement.setAttributeNode(memoAttribute);
rootElement.appendChild(toElement);
toElement.appendChild(toElementText);
The following code is in VBScript.
Set xmlDoc = CreateObject("Msxml2.DOMDocument.4.0")
Set rootElement=xmlDoc.createElement("memo")
Set memoAttribute=xmlDoc.createAttribute("author")
Set memoAttributeText=xmlDoc.createTextNode("Pat Coleman")
Set toElement=xmlDoc.createElement("to")
Set toElementText=xmlDoc.createTextNode("Carole Poland")
memoAttribute.appendChild(memoAttributeText)
xmlDoc.appendChild(rootElement)
rootElement.setAttributeNode(memoAttribute)
rootElement.appendChild(toElement)
toElement.appendChild(toElementText)
Although you can read the textual content of a text node using its nodeValue property, it is more common to use the text property of the IXMLDOMElement object to avoid problems created by elements containing multiple text, CDATA section, and entity reference nodes. The normalize method of IXMLDOMElement can also simplify text processing.
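To illustrate why the text property is usually more convenient, the following JScript sketch gathers the content of an element by hand, reading nodeValue from each text or CDATA section child. The helper name directText is hypothetical. It only looks at direct children, so it is a simplified stand-in for the text property, not a replacement for it.

```javascript
var NODE_TEXT = 3;            // nodeType value of a text node
var NODE_CDATA_SECTION = 4;   // nodeType value of a CDATA section

// Hypothetical helper: concatenates the direct text and CDATA children of an element.
function directText(element) {
  var parts = [];
  var children = element.childNodes;
  for (var i = 0; i < children.length; i++) {
    var child = children.item(i);
    if (child.nodeType == NODE_TEXT || child.nodeType == NODE_CDATA_SECTION) {
      parts.push(child.nodeValue);
    }
  }
  return parts.join("");
}

// Usage, guarded so that it only runs where MSXML is installed.
if (typeof ActiveXObject != "undefined") {
  var xmlDoc = new ActiveXObject("Msxml2.DOMDocument.4.0");
  var toElement = xmlDoc.createElement("to");
  toElement.appendChild(xmlDoc.createTextNode("Carole "));
  toElement.appendChild(xmlDoc.createCDATASection("Poland"));
  var viaHelper = directText(toElement);   // walks the child nodes by hand
  var viaProperty = toElement.text;        // lets MSXML do the concatenation
}
```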
For more information on properties and methods of IXMLDOMText, see IXMLDOMText.
CDATA Sections
CDATA sections allow developers to include the markup characters <, >, and & within element content without using character or entity references. Scripts, style sheets, program code, and sample XML code are frequently contained in CDATA sections. The IXMLDOMCDATASection object behaves like a text node, but preserves knowledge of its special status, making it easy to preserve CDATA sections through multiple load-and-save cycles.
CDATA sections behave like text nodes, except that they cannot be used inside of attribute nodes. CDATA sections may be mixed with text nodes, entity references, and other content containers, and can appear as child nodes of elements, document fragments, entity references, and entities.
The following JScript fragment creates an element named <example> whose contents are protected by a CDATA section using the createCDATASection method of DOMDocument.
var demoElement=xmlDoc.createElement("example");
var demoContent=xmlDoc.createCDATASection("<sample>This is an element</sample>");
demoElement.appendChild(demoContent);
The following code is in VBScript.
Set demoElement=xmlDoc.createElement("example")
Set demoContent=xmlDoc.createCDATASection("<sample>This is an element</sample>")
demoElement.appendChild(demoContent)
CDATA sections also have an impact on MSXML white space handling. For more information, see Preserving Markup Characters by Using CDATA Sections.
For more information about properties and methods of IXMLDOMCDATASection, see IXMLDOMCDATASection.
Processing Instructions
Processing instructions provide a loosely structured mechanism for conveying application-specific information within a document. Processing instructions can appear within document, element, document fragment, entity reference, or entity nodes. They have only two components: a name (also called the target) and a value. The value may contain content that looks like attribute values (often called pseudo-attributes), but this content is stored as simple text.
Processing instructions can appear in the document content (within the root element or an element that is a descendant of the root element), or they can appear before or after the root element. The <?xml-stylesheet?> processing instruction, used to connect XML documents to cascading style sheets or XSL Transformations (XSLT) style sheets, usually appears in the prolog.
The following JScript example identifies which style sheet to use in a processing instruction in the prolog.
var xmlDoc = new ActiveXObject("Msxml2.DOMDocument.4.0");
var stylePI=xmlDoc.createProcessingInstruction("xml-stylesheet",
'type="text/xsl" href="show_book.xsl"');
xmlDoc.appendChild(stylePI);
The following code is in VBScript.
Set xmlDoc = CreateObject("Msxml2.DOMDocument.4.0")
Set stylePI = xmlDoc.createProcessingInstruction("xml-stylesheet", _
"type=""text/xsl"" href=""show_book.xsl""")
xmlDoc.appendChild(stylePI)
The XML document begins with the following.
<?xml-stylesheet type="text/xsl" href="show_book.xsl"?>
The target of a processing instruction can be retrieved through its nodeName property while the rest of the processing instruction is stored in the nodeValue property. MSXML treats the XML declaration as a processing instruction within its DOM representation. This can have a significant effect on character encoding.
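The split between target and value can be sketched in JScript as follows. The helper name formatPI is hypothetical; it assumes only that the node passed in exposes nodeName and nodeValue as described above, and it rebuilds the markup by hand purely for illustration.

```javascript
// Hypothetical helper: rebuilds the markup of a processing instruction from its two parts.
function formatPI(pi) {
  var target = pi.nodeName;    // for example "xml-stylesheet"
  var data = pi.nodeValue;     // everything after the target, stored as plain text
  return "<?" + target + (data ? " " + data : "") + "?>";
}

// Usage, guarded so that it only runs where MSXML is installed.
if (typeof ActiveXObject != "undefined") {
  var xmlDoc = new ActiveXObject("Msxml2.DOMDocument.4.0");
  var stylePI = xmlDoc.createProcessingInstruction("xml-stylesheet",
    'type="text/xsl" href="show_book.xsl"');
  var markup = formatPI(stylePI);
}
```

Because the pseudo-attributes are stored as one string, an application that needs the href value has to parse nodeValue itself.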
For more information about properties and methods of IXMLDOMProcessingInstruction objects, see IXMLDOMProcessingInstruction.
Comments
While processing instructions provide support for unstructured application-oriented information, comments provide a place for unstructured human-readable information. Comments do not have names, only content, and generally represent information that applications should ignore.
Comments are represented by IXMLDOMComment objects, which are created using the createComment method of DOMDocument. Comments can appear before, after, or within the root element. For example, the following JScript code creates a comment that says "catalog last updated 2000-11-01".
var xmlDoc = new ActiveXObject("Msxml2.DOMDocument.4.0");
var dateComment=xmlDoc.createComment("catalog last updated 2000-11-01");
xmlDoc.appendChild(dateComment);
The following code is in VBScript.
Set xmlDoc = CreateObject("Msxml2.DOMDocument.4.0")
Set dateComment=xmlDoc.createComment("catalog last updated 2000-11-01")
xmlDoc.appendChild(dateComment)
The contents of an IXMLDOMComment object are always stored in its nodeValue property.
For more information about properties and methods of IXMLDOMComment, see IXMLDOMComment.
Entities
In a DTD, the ENTITIES attribute type is the plural form of the ENTITY attribute type. An ENTITY attribute allows a reference to an unparsed external entity to appear within a document; an ENTITIES attribute contains a list of such entity names, delimited by white space. In the DOM, the entities property of the IXMLDOMDocumentType object is used to retrieve the list of entities declared in the DOCTYPE declaration.
For more information, see the entities property and the IXMLDOMDocumentType object/interface.
Entity References
Entity references are used to insert references to entities declared in an XML document type declaration. MSXML will expand all entity references used in a document as it parses them, but applications can insert entity references into document object models intended for export as XML documents.
The following JScript fragment creates a header element that contains the text Jay Fluegel & Mindy Martin, expressed as "Jay Fluegel ", the entity reference "amp", and " Mindy Martin".
var headerElement=xmlDoc.createElement("header");
var headerStart=xmlDoc.createTextNode("Jay Fluegel ");
var headerMiddle=xmlDoc.createEntityReference("amp");
var headerEnd=xmlDoc.createTextNode(" Mindy Martin");
headerElement.appendChild(headerStart);
headerElement.appendChild(headerMiddle);
headerElement.appendChild(headerEnd);
The following code is in VBScript.
Set headerElement=xmlDoc.createElement("header")
Set headerStart=xmlDoc.createTextNode("Jay Fluegel ")
Set headerMiddle=xmlDoc.createEntityReference("amp")
Set headerEnd=xmlDoc.createTextNode(" Mindy Martin")
headerElement.appendChild(headerStart)
headerElement.appendChild(headerMiddle)
headerElement.appendChild(headerEnd)
For more information about properties and methods of IXMLDOMEntityReference, see IXMLDOMEntityReference.
Document Fragments
Document fragments are DOM objects that do not correspond precisely to a particular construct in XML 1.0. They represent a portion, not necessarily well-formed, of an XML document. Document fragments provide programmers with a tool for storing nodes like a DOMDocument object with fewer restrictions and overhead.
IXMLDOMDocumentFragment must be created using the createDocumentFragment method of the DOMDocument object. The content stored in the document fragment remains, but the document fragment placeholder disappears when inserted into a document tree.
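The following JScript sketch shows one way a fragment might be used to insert several nodes in a single call. The helper name appendElements and the element names passed to it are illustrative assumptions; the sketch uses only createDocumentFragment, createElement, and appendChild.

```javascript
// Hypothetical helper: builds several empty elements in a fragment, then inserts them at once.
function appendElements(xmlDoc, parent, names) {
  var fragment = xmlDoc.createDocumentFragment();
  for (var i = 0; i < names.length; i++) {
    fragment.appendChild(xmlDoc.createElement(names[i]));
  }
  parent.appendChild(fragment);   // the fragment's children move into parent
  return parent;
}

// Usage, guarded so that it only runs where MSXML is installed.
if (typeof ActiveXObject != "undefined") {
  var xmlDoc = new ActiveXObject("Msxml2.DOMDocument.4.0");
  var rootElement = xmlDoc.createElement("memo");
  xmlDoc.appendChild(rootElement);
  appendElements(xmlDoc, rootElement, ["to", "from", "subject"]);
}
```

After the call, the new elements are children of the parent; the fragment itself does not appear in the tree.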
For more information about properties and methods of IXMLDOMDocumentFragment, see IXMLDOMDocumentFragment.
Document Type
The document type node represents the information provided by the DOCTYPE declaration, if one appears. The IXMLDOMDocumentType object, accessible through the doctype property of the DOMDocument object, is read-only. This object provides access to the entities and notations that have been declared in the document type definition (DTD) of the document. Its name property identifies the root element of the document.
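As a hedged illustration of reading the DTD information, the following JScript sketch collects the names of the declared entities. The helper name listEntityNames, the sig entity, and the inline DTD are assumptions made for the example; the sketch assumes that doctype is null when the document has no DOCTYPE declaration and that entities behaves as a named node map.

```javascript
// Hypothetical helper: returns the names of the entities declared in the DTD, if any.
function listEntityNames(xmlDoc) {
  var names = [];
  var doctype = xmlDoc.doctype;        // assumed null when there is no DOCTYPE declaration
  if (doctype) {
    var entities = doctype.entities;   // named node map of entity nodes
    for (var i = 0; i < entities.length; i++) {
      names.push(entities.item(i).nodeName);
    }
  }
  return names;
}

// Usage, guarded so that it only runs where MSXML is installed.
if (typeof ActiveXObject != "undefined") {
  var xmlDoc = new ActiveXObject("Msxml2.DOMDocument.4.0");
  xmlDoc.async = false;
  xmlDoc.loadXML("<!DOCTYPE memo [<!ELEMENT memo (#PCDATA)>" +
    "<!ENTITY sig \"Pat Coleman\">]><memo>&sig;</memo>");
  var entityNames = listEntityNames(xmlDoc);
}
```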
For more information about properties and methods of IXMLDOMDocumentType, see IXMLDOMDocumentType.
Namespaces
MSXML provides full support for XML namespaces. Because the World Wide Web Consortium (W3C) Namespaces in XML Recommendation appeared after DOM Level 1 was complete, namespace support in the MSXML DOM is provided through extensions to the core DOM model. Namespace support properties are provided for all nodes in the DOM, but are generally meaningful only for element and attribute nodes. For more information, see Using Namespaces in Documents.
The nodeName property of a node is its qualified name, including the namespace prefix, if any. The prefix property holds just the namespace prefix, and the baseName property holds the base name without the prefix. The namespace URI to which the prefix refers is available through the namespaceURI property.
To create namespace declarations, you must create attributes beginning with xmlns as defined by the Namespaces in XML specification.
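A minimal JScript sketch of adding a prefixed namespace declaration follows. The helper name declareNamespace, the prefix m, and the URI urn:example:memo are hypothetical. The sketch only sets the xmlns attribute on an element; it makes no claim about how the namespace properties of nodes that already exist are affected.

```javascript
// Hypothetical helper: adds a prefixed namespace declaration attribute to an element.
function declareNamespace(element, prefix, uri) {
  var attrName = "xmlns:" + prefix;   // declaration attributes begin with xmlns
  element.setAttribute(attrName, uri);
  return attrName;
}

// Usage, guarded so that it only runs where MSXML is installed.
if (typeof ActiveXObject != "undefined") {
  var xmlDoc = new ActiveXObject("Msxml2.DOMDocument.4.0");
  var rootElement = xmlDoc.createElement("memo");
  declareNamespace(rootElement, "m", "urn:example:memo");   // illustrative prefix and URI
  xmlDoc.appendChild(rootElement);
}
```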
Notations
Notation declarations associate a name with an identifier for a notation. The notation declarations themselves are not validated. They are referenced when strings that are members of NOTATION simple type definitions are validated in the XML document.
For more information, see Notation Declarations.
See Also
Attributes | CDATA Sections | Comments | Document Map | DOCTYPE Declaration | Elements | Character and Entity References | IXMLDOMNamedNodeMap | Processing Instructions | XML Declaration