Wednesday 1 December 2010

Sixth Task - What is XPath and what does it do? Evaluate the strengths and weaknesses of this concept, with examples

XPath is a language for locating and processing items in Extensible Markup Language (XML) documents by using an addressing syntax based on a path through the document's logical structure or hierarchy. This makes writing expressions easier than if each expression had to understand typical XML markup and its sequence in a document. XPath also allows the programmer to deal with the document at a higher level of abstraction. XPath is used by, and specified as part of, both Extensible Stylesheet Language Transformations (XSLT) and XPointer (XML Pointer Language). It uses the information abstraction defined in the XML Information Set (Infoset). Since XPath does not use XML syntax itself, it could be used in contexts other than those of XML.

An XML document is treated this way because it can be regarded as a kind of database. Consequently, retrieving data from it requires a language that can point to specific data within the document.
<?xml version="1.0" encoding="ISO-8859-1"?>
<bookstore> <!-- root element; its parent is the document node -->
  <book>
    <title lang="en">Harry Potter</title> <!-- element with a lang attribute -->
    <author>J. K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
</bookstore>
An XML document is treated as a tree of nodes. As described below, XPath defines seven kinds of node: element, attribute, text, namespace, processing-instruction, comment, and document (root) nodes. The root of the tree is called the document node or root node.
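Because the document is a tree, a node can be addressed by the path that leads to it. The following is a minimal sketch in Python, assuming the standard-library xml.etree.ElementTree module, which supports only a limited subset of XPath. The bookstore data is a copy of the example above with the XML declaration left out.

```python
import xml.etree.ElementTree as ET

# Copy of the bookstore example, embedded so the sketch is self-contained.
XML = """<bookstore>
  <book>
    <title lang="en">Harry Potter</title>
    <author>J. K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
</bookstore>"""

root = ET.fromstring(XML)  # root is the <bookstore> element

# Follow the path bookstore -> book -> title, starting at the root element.
title = root.find("./book/title")
print(title.text)         # Harry Potter
print(title.get("lang"))  # en

# "//" searches at any depth; the predicate filters on an attribute value.
english_titles = [t.text for t in root.findall(".//title[@lang='en']")]
print(english_titles)

# A predicate can also test the text of a child element.
books_2005 = root.findall("./book[year='2005']")
print(len(books_2005))
```

A full XPath engine accepts many more expressions than these; the sketch is limited to what ElementTree understands.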
XML NODES
  1. In the example, bookstore is the root element, title is an element, and lang="en" is an attribute.
  2. Parent: Each element and attribute has one parent. In the example, the book element is the parent of the title, author, year, and price elements.
  3. Children: Element nodes may have zero, one, or more children. In the example, the title, author, year, and price elements are all children of the book element.
  4. Siblings: Nodes that have the same parent. In the example, the title, author, year, and price elements are all siblings.
  5. Ancestors: A node's parent, parent's parent, and so on. In the example, the ancestors of the title element are the book element and the bookstore element.
  6. Descendants: A node's children, children's children, and so on. In the example, the descendants of the bookstore element are the book, title, author, year, and price elements.
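The relationships in the list above can be checked against the bookstore example. This is only a sketch, again assuming Python's standard-library xml.etree.ElementTree. Its XPath subset has no ancestor axis, so the ancestors are collected with a hand-built child-to-parent map rather than with an XPath expression.

```python
import xml.etree.ElementTree as ET

# Copy of the bookstore example, embedded so the sketch is self-contained.
XML = """<bookstore>
  <book>
    <title lang="en">Harry Potter</title>
    <author>J. K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
</bookstore>"""

root = ET.fromstring(XML)
book = root.find("book")
title = book.find("title")

# Children: the element nodes directly under <book>.
children = [child.tag for child in book]

# Parent: ".." steps up one level from the matched node.
parent_of_title = root.find(".//title/..")

# Siblings: the other children of the same parent.
siblings = [c.tag for c in parent_of_title if c is not title]

# Descendants: iter() walks the whole subtree, including the start element.
descendants = [e.tag for e in root.iter() if e is not root]

# Ancestors: walk upwards through a child -> parent map.
parent_map = {child: parent for parent in root.iter() for child in parent}
ancestors = []
node = title
while node in parent_map:
    node = parent_map[node]
    ancestors.append(node.tag)

print(children)             # ['title', 'author', 'year', 'price']
print(parent_of_title.tag)  # book
print(siblings)             # ['author', 'year', 'price']
print(descendants)          # ['book', 'title', 'author', 'year', 'price']
print(ancestors)            # ['book', 'bookstore']
```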
As an example, say you receive an XML document that contains the details of a shipment, and you want to retrieve element and attribute values from it. You don't want to list the values of every node; you want to output the values of specific elements or attributes. In such a case, you would use XPath to retrieve the values of those elements and attributes. XPath views the XML document as a hierarchical structure, a tree of nodes, which is the XPath data model. The XPath data model consists of seven node types.
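Before turning to the node types, the shipment scenario can be sketched in Python with the standard-library xml.etree.ElementTree module. The shipment document is hypothetical: every element name, attribute name, and value in it is invented for illustration and does not come from any real format.

```python
import xml.etree.ElementTree as ET

# Hypothetical shipment document; all names and values are made up.
SHIPMENT = """<shipment id="S-1001">
  <destination country="GB">London</destination>
  <items>
    <item sku="A1" quantity="2">Keyboard</item>
    <item sku="B7" quantity="1">Monitor</item>
  </items>
</shipment>"""

shipment = ET.fromstring(SHIPMENT)

# Attribute value on the root element.
shipment_id = shipment.get("id")

# Text and attribute of one specific child element.
destination = shipment.findtext("destination")
country = shipment.find("destination").get("country")

# Pick a single item by attribute value instead of listing every node.
monitor_qty = shipment.find("./items/item[@sku='B7']").get("quantity")

# Collect the text of all elements that share the same path.
item_names = [item.text for item in shipment.findall("./items/item")]

print(shipment_id, destination, country, monitor_qty, item_names)
```

Attribute values come back as strings, so a quantity would still need converting with int() before any arithmetic.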

In the XPath data model, an XML document is viewed as a tree, that is, a hierarchy of nodes. The seven node types and their descriptions are listed below.

Node Type | Description
Root Node | The root node is the root of the DOM tree. The document element (the root element) is a child of the root node. The root node also has the processing instructions and comments as child nodes.
Element Node | This represents an element in an XML document. The character data, elements, processing instructions, and comments within an element are the child nodes of the element node.
Attribute Node | This represents an attribute other than the xmlns-prefixed attribute, which declares a namespace.
Text Node | The character data within an element is a text node. A text node has at least one character of data. Whitespace is also considered character data. By default, the ignorable whitespace after the end of an element and before the start of the following element is also a text node. The ignorable whitespace can be excluded from the DOM tree built by parsing an XML document. This can be done by setting the whitespace-preserving mode to false with the setPreserveWhitespace(boolean flag) method.
Comment Node | This represents a comment in an XML document, except the comments within the DOCTYPE declaration.
Processing Instruction Node | This represents a processing instruction in an XML document, except a processing instruction within the DOCTYPE declaration. The XML declaration is not considered a processing instruction node.
Namespace Node | This represents a namespace mapping, which consists of an xmlns:-prefixed attribute such as xmlns:xsd="http://www.w3.org/2001/XMLSchema". A namespace node consists of a namespace prefix (xsd in the example) and a namespace URI (http://www.w3.org/2001/XMLSchema in the example).
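To see these node types in a parsed document, the sketch below walks a small DOM tree and labels each node. It assumes Python's standard-library xml.dom.minidom, which implements the DOM rather than the XPath data model, so it is only an approximation: the DOM exposes a namespace declaration as an ordinary attribute, and the walk() helper relabels it by checking for the xmlns prefix. The sample document and the style.xsl file name are invented for illustration.

```python
from xml.dom import minidom, Node

# Invented sample containing one instance of each kind of node.
DOC = """<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="style.xsl"?>
<!-- a comment before the root element -->
<bookstore xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <book><title lang="en">Harry Potter</title></book>
</bookstore>"""

NAMES = {
    Node.DOCUMENT_NODE: "root (document)",
    Node.ELEMENT_NODE: "element",
    Node.TEXT_NODE: "text",
    Node.COMMENT_NODE: "comment",
    Node.PROCESSING_INSTRUCTION_NODE: "processing-instruction",
}

def walk(node, found):
    """Append a (kind, name) pair for the node, its attributes, and its subtree."""
    found.append((NAMES.get(node.nodeType, "other"), node.nodeName))
    if node.nodeType == Node.ELEMENT_NODE:
        for i in range(node.attributes.length):
            attr = node.attributes.item(i)
            # XPath treats xmlns declarations as namespace nodes, not attributes.
            kind = "namespace" if attr.name.startswith("xmlns") else "attribute"
            found.append((kind, attr.name))
    for child in node.childNodes:
        walk(child, found)
    return found

found = walk(minidom.parseString(DOC), [])
kinds = {kind for kind, _ in found}

for kind, name in found:
    print(kind, name)
```

The XML declaration on the first line of the sample does not show up in the output, which matches the table's note that it is not a processing instruction node.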
