XML Primer

MIRC uses Extensible Markup Language (XML) to encode configuration data, indexed information, and MIRC documents themselves. Using MIRC does not require an understanding of XML, but administrators may find it helpful. For readers who are new to XML, the following description provides enough background to read and understand the XML in the MIRC documentation.

An XML document is typically a text file, although it can also be a text stream obtained from a server over a network. Content in an XML document is contained in XML elements. An XML element is identified by a name enclosed in angle brackets (< >). For example, an element called MIRCdocument is coded as <MIRCdocument>. The notation <elementname> is called a tag. Every start tag must be paired with an end tag that marks where the element finishes. The end tag for <MIRCdocument> is coded as </MIRCdocument>. The value of an element is the content between its start tag and its end tag.

An element may have attributes. An attribute is coded within the element's start tag. An attribute consists of a name, an equals sign, and an attribute value. The attribute value is enclosed in quotes. The following is an example of an element with an attribute:

<MIRCdocument docref="http://www.somewhere.edu/mydocument.xml">
	This text is the value of the element.
</MIRCdocument>
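
As a minimal sketch of how a program sees these parts, the following reads the same element with Python's standard xml.etree.ElementTree module. The choice of Python and the variable names are assumptions made for illustration; MIRC itself does not require them.

```python
import xml.etree.ElementTree as ET

# The example element from the text, held in a string for illustration.
xml_text = """<MIRCdocument docref="http://www.somewhere.edu/mydocument.xml">
	This text is the value of the element.
</MIRCdocument>"""

element = ET.fromstring(xml_text)

name = element.tag               # the element name
docref = element.get("docref")   # the attribute value, without its quotes
value = element.text.strip()     # the element value, surrounding whitespace removed

print(name)
print(docref)
print(value)
```

The quotes around the attribute value are part of the XML syntax only; the parser returns the value without them.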

An element may be empty, meaning that it has no element value (although it may have attributes). Such an element is usually coded as:

<element attribute="attribute value"/>

(note the slash before the >), although the form:

<element attribute="attribute value"></element>

is equally acceptable.
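
The equivalence of the two forms can be checked with a short sketch, again assuming Python's standard xml.etree.ElementTree module as the parser. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

short_form = ET.fromstring('<element attribute="attribute value"/>')
long_form = ET.fromstring('<element attribute="attribute value"></element>')

# Compare name, attributes, text content, and number of child elements.
same = (
    short_form.tag == long_form.tag
    and short_form.attrib == long_form.attrib
    and (short_form.text or "") == (long_form.text or "")
    and len(short_form) == len(long_form) == 0
)
print(same)
```

To a parser, both spellings describe the same empty element; the short form simply saves typing.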

The value of an element may contain other elements, which allows elements to be nested in an XML document. A well-formed XML document must have exactly one top-level, or root, element, within which all other content is contained. In addition, every element must be closed by its end tag inside the element that contains it. For example, the following is well-formed:

<topelement>
	<p>This is a paragraph.</p>
	<p>This is another paragraph.</p>
</topelement>
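
A brief sketch, assuming Python's standard xml.etree.ElementTree module, shows the nesting as a parser reports it, and shows that a text with two top-level elements is rejected. The variable names are hypothetical.

```python
import xml.etree.ElementTree as ET

well_formed = """<topelement>
	<p>This is a paragraph.</p>
	<p>This is another paragraph.</p>
</topelement>"""

root = ET.fromstring(well_formed)
paragraphs = [child.text for child in root]   # text of each nested <p>

print(root.tag)
print(paragraphs)

# A second top-level element breaks the single-root rule.
try:
    ET.fromstring("<p>one</p><p>two</p>")
    two_roots_accepted = True
except ET.ParseError:
    two_roots_accepted = False
print(two_roots_accepted)
```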

The following is not well-formed, and is rejected by XML parsers:

<topelement>
	This text is one paragraph.
	<p>
	This is another paragraph.
</topelement>

The last example reflects a common, sloppy HTML habit: a <p> tag with no matching </p>. When including HTML within XML elements, every element, including every HTML element, must have an end tag.

The following document is also not well-formed:

<topelement>
	<element1> …
		<element2> …
	</element1>
		</element2>
</topelement>

because <element2>, which is part of <element1>, is not closed within the value of <element1>.
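
To see a parser reject both malformed examples, the following sketch feeds them to Python's standard xml.etree.ElementTree module and reports where the first error was detected. The helper name first_error is an assumption for illustration, and the "…" placeholders from the example are left out.

```python
import xml.etree.ElementTree as ET

missing_end_tag = """<topelement>
	This text is one paragraph.
	<p>
	This is another paragraph.
</topelement>"""

crossed_tags = """<topelement>
	<element1>
		<element2>
	</element1>
		</element2>
</topelement>"""

def first_error(xml_text):
    """Return the (line, column) of the first well-formedness error, or None."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return err.position
    return None

for label, text in [("missing end tag", missing_end_tag),
                    ("crossed tags", crossed_tags)]:
    print(label, first_error(text))
```

In both cases the parser stops at the first end tag that does not match the most recently opened element, so the reported position is where the mismatch is noticed, not necessarily where the mistake was made.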

Whenever an XML file is edited, it is imperative that it remain well-formed. An easy way to check that a document is well-formed is to open it in a web browser that parses XML documents, such as Microsoft's Internet Explorer.
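
Where a browser is not convenient, the same check can be scripted. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the function name file_is_well_formed is hypothetical.

```python
import xml.etree.ElementTree as ET

def file_is_well_formed(path):
    """Return True if the file at path parses as well-formed XML."""
    try:
        ET.parse(path)
    except ET.ParseError as err:
        # The message includes the line and column of the first error.
        print(f"{path}: not well-formed ({err})")
        return False
    return True
```

Calling the function on a file after each edit, for example file_is_well_formed("config.xml") with a file name of your own, catches a missing or misplaced end tag. It checks well-formedness only; it does not check that the element names are ones MIRC expects.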

Finally, a few terms:

  • Elements and text are generically called nodes.
  • A node that is part of the value of an element is called a child of the element.
  • An element is called the parent of all its child nodes.
  • Nodes with the same parent node are called siblings.
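
These terms map directly onto the objects a parser builds. The sketch below uses Python's standard xml.dom.minidom module, chosen here as an assumption because it represents text as nodes; the sample document is written without whitespace between tags so that the only text nodes are the paragraph texts.

```python
from xml.dom import minidom

doc = minidom.parseString("<topelement><p>first</p><p>second</p></topelement>")
root = doc.documentElement               # the root element, <topelement>

first_p, second_p = root.childNodes      # both <p> elements are children of the root
text_node = first_p.firstChild           # the text "first" is itself a node

is_child = first_p.parentNode is root                         # root is the parent of <p>
are_siblings = first_p.nextSibling is second_p                # same parent, so siblings
is_text = text_node.nodeType == text_node.TEXT_NODE           # text counts as a node
text_parent_is_p = text_node.parentNode is first_p            # <p> is the parent of its text

print(is_child, are_siblings, is_text, text_parent_is_p)
print(text_node.data)
```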