Thursday 30 December 2010

Task 8: What is the value of XML in e-commerce? Give examples of its usage.

E-commerce Applications using XML

As e-commerce and network technology continue to evolve, online payment has become an essential part of the platform development process. A trading platform always has a great deal of data in motion, which raises the question of how to use a new technology to manage this important data. This post looks at using XML technology to address the problem that the parties to a transaction care most about: the safe and efficient flow of data.
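As one illustration of this kind of usage, the sketch below builds a small order message that two trading parties might exchange, then reads it back on the receiving side. The element and attribute names (order, customer, item, payment) and all values are hypothetical; they are not taken from any real e-commerce standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical order message; element and attribute names are invented for illustration.
order = ET.Element("order", id="A-1001")
ET.SubElement(order, "customer").text = "John Duffles"
item = ET.SubElement(order, "item", sku="BK-42", quantity="2")
item.text = "XML handbook"
ET.SubElement(order, "payment", currency="GBP").text = "59.98"

# Serialize the order so it can be sent to the other party.
message = ET.tostring(order, encoding="unicode")
print(message)

# The receiving side parses the same text back into structured data.
received = ET.fromstring(message)
total = float(received.findtext("payment"))
quantity = int(received.find("item").get("quantity"))
print(received.get("id"), quantity, total)
```

Because both sides agree on the element names, neither needs to know how the other stores the order internally.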

Task 8: Why is the W3C XML schema important? Give examples of its role.

I have already described the DTD in my previous post. Before elaborating further on XML Schema (XSD), I first want to give a concise indication of how decisive XML Schema is for the validation of data, which is crucial.

To reiterate, a Document Type Definition (DTD) defines the legal building blocks of an XML document. It defines the document structure with a list of legal elements and attributes. A DTD can be declared inline inside an XML document, or as an external reference.

Document Type Definitions (DTDs) and XML Schemas are key technologies in this area. Although neither is strictly required for XML development, both DTDs and XML Schemas are important parts of the XML toolbox. DTDs have been around for over twenty years as a part of SGML, while XML Schemas are relative newcomers. Though they use very different syntax and take different approaches to the task of describing document structures, both mechanisms definitely occupy the same turf. The W3C seems to be grooming XML Schemas as a replacement for DTDs, but it isn't yet clear how quickly the transition will be made. DTDs are here-and-now, while XML Schemas, in large part, are for the future.

Data Type Problems

Validation


A valid document includes a document type declaration that identifies the DTD the document satisfies. The DTD lists all the elements, attributes, and entities the document uses and the contexts in which it uses them. The DTD may list items the document does not use as well. Validity operates on the principle that everything not permitted is forbidden. Everything in the document must match a declaration in the DTD. If a document has a document type declaration and the document satisfies the DTD that the document type declaration indicates, then the document is said to be valid. If it does not, it is said to be invalid.


There are many things the DTD does not say. In particular, it does not say the following:

· What the root element of the document is
· How many instances of each kind of element appear in the document
· What the character data inside the elements looks like
· The semantic meaning of an element; for instance, whether it contains a date or a person's name

DTDs allow you to place some constraints on the form an XML document takes, but there can be quite a bit of flexibility within those limits. A DTD never says anything about the length, structure, meaning, allowed values, or other aspects of the text content of an element.

Validity is optional. A parser reading an XML document may or may not check for validity. If it does check for validity, the program receiving data from the parser may or may not care about validity errors. In some cases, such as putting records into a database, a validity error may be quite serious, indicating that a required field is missing, for example. In other cases, rendering a web page perhaps, a validity error may not be so important, and you can work around it. Well-formedness is required of all XML documents; validity is not. Your documents and your programs can use it or not as you find it beneficial.
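To make the difference concrete, here is a minimal sketch using Python's xml.etree.ElementTree, which is built on the non-validating expat parser. It accepts a document that breaks its own inline DTD, but rejects one that is not well-formed. The helper name is_well_formed and the sample documents are my own, chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Well-formed, but it breaks its own inline DTD: the required name child is missing.
invalid_but_well_formed = """<!DOCTYPE person [
  <!ELEMENT person (name, profession*)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT profession (#PCDATA)>
]>
<person><profession>mathematician</profession></person>"""

# Not well-formed: the profession element is never closed.
malformed = "<person><profession>mathematician</person>"

def is_well_formed(text):
    """Return True if the parser accepts the text, False on a syntax error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

print(is_well_formed(invalid_but_well_formed))  # True: validity is never checked
print(is_well_formed(malformed))                # False: well-formedness is required
```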

A Simple DTD Example

For example, suppose you want to describe a person. Say the person has a name and three professions. The name has a first name and a last name. The particular person can be called John Duffles. However, that's not relevant for DTDs. A DTD only describes the general type, not the specific instance. A DTD for person documents would say that a person element contains one name child element and zero or more profession child elements. It would further say that each name element contains a first_name child element and a last_name child element. Finally it would state that the first_name, last_name, and profession elements all contain text. The example below is a DTD that describes such a person element.

Example 1-1. A DTD for the person element


<!ELEMENT person     (name, profession*)>
<!ELEMENT name       (first_name, last_name)>
<!ELEMENT first_name (#PCDATA)>
<!ELEMENT last_name  (#PCDATA)>
<!ELEMENT profession (#PCDATA)>

The DTD in the example above would probably be stored in a separate file from the documents it describes. This allows it to be easily referenced from multiple XML documents. However, it can be included inside the XML document if that's convenient, using the document type declaration. If it is stored in a separate file, then that file would most likely be named person.dtd, or something similar. The .dtd extension is fairly standard though not specifically required by the XML specification. If this file were served by a web server, it would be given the MIME media type application/xml-dtd.

Each line of the example above is an element declaration. The first line declares the person element; the second line declares the name element; the third line declares the first_name element; and so on. However, the line breaks aren't relevant except for legibility. Although it's customary to put only one declaration on each line, it's not required. Long declarations can even span multiple lines.

The first element declaration in the example above states that each person element must contain exactly one name child element followed by zero or more profession elements. The asterisk after profession stands for "zero or more." Thus, every person must have a name and may or may not have a profession or multiple professions. However, the name must come before all professions. For example, this person element is valid:

<person>
  <name>
    <first_name>John </first_name>
    <last_name>Duffles</last_name>
  </name>
</person>

However, this person element is not valid because it omits the name:

<person>
  <profession>computer scientist</profession>
  <profession>mathematician</profession>
  <profession>cryptographer</profession>
</person>

This person element is not valid because a profession element comes before the name:

<person>
  <profession>computer scientist</profession>
  <name>
    <first_name>Alan</first_name>
    <last_name>Turing</last_name>
  </name>
  <profession>mathematician</profession>
  <profession>cryptographer</profession>
</person>

The person element may not contain any element except those listed in its declaration. The only extra character data it can contain is whitespace. For example, this is an invalid person element because it adds a publication element:

<person>
  <name>
    <first_name>Alan</first_name>
    <last_name>Turing</last_name>
  </name>
  <profession>mathematician</profession>
  <profession>cryptographer</profession>
  <publication>On Computable Numbers...</publication>
</person>



This is an invalid person element because it adds some text outside the allowed children:

<person>
  <name>
    <first_name>Alan</first_name>
    <last_name>Turing</last_name>
  </name>
  was a <profession>computer scientist</profession>,
  a <profession>mathematician</profession>, and a
  <profession>cryptographer</profession>
</person>

In all these examples of invalid elements, you could change the DTD to make these elements valid. All the examples are well-formed, after all. However, with the DTD in Example 1-1, they are not valid.
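The rules that make these examples valid or invalid can be imitated by hand. The sketch below is not a DTD validator, and Python's standard library does not include one. It is a hand-written stand-in that checks only the person content model (name, profession*) and the whitespace rule. The function name person_is_valid and the regular expression are my own assumptions.

```python
import re
import xml.etree.ElementTree as ET

# Hand-written stand-in for the content model (name, profession*).
PERSON_MODEL = re.compile(r"name(,profession)*")

def person_is_valid(text):
    """Check a person element against the content model by hand."""
    person = ET.fromstring(text)  # raises ParseError if not well-formed
    children = ",".join(child.tag for child in person)
    if not PERSON_MODEL.fullmatch(children):
        return False
    # Only whitespace may appear between the child elements.
    stray = [person.text] + [child.tail for child in person]
    return all(not (chunk or "").strip() for chunk in stray)

name = "<name><first_name>Alan</first_name><last_name>Turing</last_name></name>"
print(person_is_valid(f"<person>{name}</person>"))
print(person_is_valid("<person><profession>mathematician</profession></person>"))
print(person_is_valid(f"<person>{name} was a <profession>mathematician</profession></person>"))
```

The first call returns True; the second fails because the name is missing, and the third fails because of the stray text. A real validating parser would also check the name element's own children, which this sketch leaves out.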

The name declaration says that each name element must contain exactly one first_name element followed by exactly one last_name element. All other variations are forbidden.

The remaining three declarations--first_name, last_name, and profession--all say that their elements must contain #PCDATA. This is a DTD keyword standing for parsed character data --that is, raw text possibly containing entity references such as &amp; and &lt;, but not containing any tags or child elements.

Example 1-1 placed the most complicated and highest-level declaration at the top. However, that's not required. For instance, Example 1-2 below is an equivalent DTD that simply reorders the declarations. DTDs allow forward, backward, and circular references to other declarations.

Example 1-2. An alternate DTD for the person element

<!ELEMENT first_name (#PCDATA)>
<!ELEMENT last_name  (#PCDATA)>
<!ELEMENT profession (#PCDATA)>
<!ELEMENT name       (first_name, last_name)>
<!ELEMENT person     (name, profession*)>


Example 1-3. A valid person document


<?xml version="1.0" standalone="no"?>
<!DOCTYPE person SYSTEM "http://www.cafeconleche.org/dtds/person.dtd">
<person>
  <name>
    <first_name>John</first_name>
    <last_name>Duffles</last_name>
  </name>
  <profession>computer scientist</profession>
  <profession>mathematician</profession>
  <profession>cryptographer</profession>
</person>


XML Schemas

A Simple XML Schema

Below is a simple XML schema, which is made up of one complex type element with two child simple type elements.

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Author">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="FirstName" type="xs:string" />
        <xs:element name="LastName" type="xs:string" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>


As you can see, an XML schema is an XML document and must follow all the syntax rules of any other XML document; that is, it must be well formed. XML schemas also have to follow the rules defined in the "Schema of schemas," which defines, among other things, the structure of an XML schema and the element and attribute names it may use.

Although it is not required, it is common practice to use the xs qualifier to identify Schema elements and types. An instance document also has to be validated against the schema, which I will show below.


The document element of XML schemas is xs:schema. It takes the attribute xmlns:xs with the value of http://www.w3.org/2001/XMLSchema, indicating that the document should follow the rules of XML Schema. This will be clearer after you learn about namespaces.

In this XML schema, we see an xs:element element within the xs:schema element. xs:element is used to define an element. In this case it defines the element Author as a complex type element, which contains a sequence of two elements: FirstName and LastName, both of which are of the simple type, string.
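Because a schema is itself an XML document, an ordinary XML parser can read it. The sketch below loads the Author schema with Python's xml.etree.ElementTree and lists what it declares. It only inspects the schema and does not validate anything; the variable names are my own.

```python
import xml.etree.ElementTree as ET

# ElementTree writes namespaced names as {namespace-uri}local-name.
XS = "{http://www.w3.org/2001/XMLSchema}"

schema_text = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Author">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="FirstName" type="xs:string" />
        <xs:element name="LastName" type="xs:string" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

schema = ET.fromstring(schema_text)

# The top-level xs:element declares the document element of instance documents.
author = schema.find(XS + "element")
# Its xs:sequence lists the child elements in the order they must appear.
children = [
    (child.get("name"), child.get("type"))
    for child in author.findall(f"{XS}complexType/{XS}sequence/{XS}element")
]
print(schema.tag)
print(author.get("name"), children)
```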

Validating an XML Instance Document

The code above is an example of a simple XML schema, which defines the structure of an Author element. The code sample below shows a valid XML instance of this XML schema.

<?xml version="1.0"?>
<Author xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:noNamespaceSchemaLocation="Author.xsd">
    <FirstName>John</FirstName>
    <LastName>Duffles</LastName>
</Author>

The above code is a simple XML document. Its document element is Author, which contains two child elements: FirstName and LastName, just as the associated XML schema requires.

The xmlns:xsi attribute of the document element indicates that this XML document is an instance of an XML schema. The document is tied to a specific XML schema with the xsi:noNamespaceSchemaLocation attribute.
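Python's standard library has no XSD validator, so the sketch below only shows the two ideas involved: reading the schema location hint from the instance, and a hand-written comparison of the child order against the sequence the Author schema declares. The expected_children list is typed in by hand here; a schema-aware library would derive it from Author.xsd.

```python
import xml.etree.ElementTree as ET

XSI = "{http://www.w3.org/2001/XMLSchema-instance}"

instance_text = """<Author xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:noNamespaceSchemaLocation="Author.xsd">
    <FirstName>John</FirstName>
    <LastName>Duffles</LastName>
</Author>"""

author = ET.fromstring(instance_text)

# The xsi attribute is a hint telling a validator where the schema lives.
schema_location = author.get(XSI + "noNamespaceSchemaLocation")

# Stand-in for real validation: compare child order with the schema's sequence.
expected_children = ["FirstName", "LastName"]
actual_children = [child.tag for child in author]
matches = actual_children == expected_children
print(schema_location, matches)
```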


Reference: Eric van der Vlist (2002) XML Schema: The W3C's Object-Oriented Descriptions for XML (pages 5-10) [Online] Viewed at [http://oreilly.com/catalog/9780596002527/preview]

Task 8: Evaluate SAX (the XML API), giving examples of usage.

Understanding Event-Driven XML Processing

SAX (Simple API for XML) is an event-driven model for processing XML. Most XML processing models (for example: DOM and XPath) build an internal, tree-shaped representation of the XML document. The developer then uses that model's API (getElementsByTagName in the case of the DOM or findnodes using XPath, for example) to access the contents of the document tree. The SAX model is quite different. Rather than building a complete representation of the document, a SAX parser fires off a series of events as it reads the document from beginning to end. Those events are passed to event handlers, which provide access to the contents of the document.

Event Handlers

There are three classes of event handlers: DTDHandlers, for accessing the contents of XML Document-Type Definitions; ErrorHandlers, for low-level access to parsing errors; and, by far the most often used, DocumentHandlers, for accessing the contents of the document. For clarity's sake, I'll only cover DocumentHandler events.

A SAX processor will pass the following events to a DocumentHandler:

  • The start of the document.
  • A processing instruction element.
  • A comment element.
  • The beginning of an element, including that element's attributes.
  • The text contained within an element.
  • The end of an element.
  • The end of the document.
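Most of the events in this list can be watched with a few lines of Python. In this sketch the handler class is xml.sax.ContentHandler, which is what Python's SAX binding calls the document handler. EventLogger and the sample document are my own. Comments are left out because Python's ContentHandler does not receive them.

```python
import xml.sax

class EventLogger(xml.sax.ContentHandler):
    """Records each SAX event in the order the parser reports it."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startDocument(self):
        self.events.append("startDocument")

    def processingInstruction(self, target, data):
        self.events.append(f"processingInstruction {target}")

    def startElement(self, name, attrs):
        self.events.append(f"startElement {name} {dict(attrs)}")

    def characters(self, content):
        if content.strip():  # skip whitespace-only text
            self.events.append(f"characters {content.strip()!r}")

    def endElement(self, name):
        self.events.append(f"endElement {name}")

    def endDocument(self):
        self.events.append("endDocument")

logger = EventLogger()
xml.sax.parseString(b'<?demo hint?><person id="1"><name>John</name></person>', logger)
print("\n".join(logger.events))
```

Nothing is kept in memory except what the handler chooses to store, which is the point of the event-driven model.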
 
What Is SAX Exactly, and Which Parser to Use (SAX or DOM)

SAX chooses to give you access to the information in your XML document, not as a tree of nodes, but as a sequence of events! You ask, how is this useful? The answer is that SAX chooses not to create a default Java object model on top of your XML document (like DOM does). This makes SAX faster, and also necessitates the following things:

  • Creation of your own custom object model.
  • Creation of a class that listens to SAX events and properly creates your object model.


Note that these steps are not necessary with DOM, because DOM already creates an object model for you (which represents your information as a tree of nodes).

In the case of DOM, the parser does almost everything: it reads the XML document in, creates a Java object model on top of it, and then gives you a reference to this object model (a Document object) so that you can manipulate it. SAX is not called the Simple API for XML for nothing; it really is simple.
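The same division of labour can be seen outside Java. The sketch below uses Python's xml.dom.minidom as an assumption of mine, in place of the Java parsers the text describes: one call builds the whole tree and returns a Document object, and the program only queries it.

```python
from xml.dom import minidom

document = minidom.parseString(
    "<person><name><first_name>John</first_name>"
    "<last_name>Duffles</last_name></name>"
    "<profession>mathematician</profession>"
    "<profession>cryptographer</profession></person>"
)

# The parser has already built the whole tree; we only query it.
professions = [
    node.firstChild.data for node in document.getElementsByTagName("profession")
]
last_name = document.getElementsByTagName("last_name")[0].firstChild.data
print(last_name, professions)
```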

SAX doesn’t expect the parser to do much. All SAX requires is that the parser read in the XML document and fire a series of events depending on what tags it encounters in the XML document. You are responsible for interpreting these events by writing an XML document handler class, which is responsible for making sense of all the tag events and creating objects in your own object model. So you have to write:


  • Your custom object model to “hold” all the information in your XML document.
  • A document handler that listens to SAX events (which are generated by the SAX parser as it's reading your XML document) and makes sense of these events to create objects in your custom object model.


SAX can be really fast at runtime if your object model is simple. In this case, it is faster than DOM, because it bypasses the creation of a tree based object model of your information. On the other hand, you do have to write a SAX document handler to interpret all the SAX events (which can be a lot of work).

What kinds of SAX events are fired by the SAX parser? These events are really very simple. SAX will fire an event for every open tag and every close tag. It also fires events for #PCDATA and CDATA sections. Your document handler (which is a listener for these events) has to interpret these events in some meaningful way and create your custom object model based on them; the sequence in which the events are fired is very important. SAX also fires events for processing instructions, DTDs, comments, etc. But the idea is still the same: your handler has to interpret these events (and the sequence of the events) and make sense of them.
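Here is a minimal sketch of both pieces, written in Python rather than Java: a custom object model (the Person class) and a document handler that fills it in from the event sequence. Person, PersonHandler and the sample document are hypothetical names of my own, and the handler assumes the person layout used in the DTD examples earlier.

```python
import xml.sax
from dataclasses import dataclass, field

@dataclass
class Person:
    """Custom object model: just the fields this program cares about."""
    first_name: str = ""
    last_name: str = ""
    professions: list = field(default_factory=list)

class PersonHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.person = None
        self._text = []

    def startElement(self, name, attrs):
        if name == "person":
            self.person = Person()
        self._text = []  # start collecting text for the new element

    def characters(self, content):
        self._text.append(content)  # text may arrive in several chunks

    def endElement(self, name):
        text = "".join(self._text).strip()
        if name == "first_name":
            self.person.first_name = text
        elif name == "last_name":
            self.person.last_name = text
        elif name == "profession":
            self.person.professions.append(text)
        self._text = []

handler = PersonHandler()
xml.sax.parseString(
    b"<person><name><first_name>John</first_name><last_name>Duffles</last_name></name>"
    b"<profession>computer scientist</profession><profession>mathematician</profession></person>",
    handler,
)
print(handler.person)
```

The handler depends on the order of events: text is only meaningful once the matching end tag arrives, which is why it is buffered until endElement.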

When to Use DOM 

If your XML documents contain document data (e.g., FrameMaker documents stored in XML format), then DOM is a completely natural fit for your solution. If you are creating some sort of document information management system, then you will probably have to deal with a lot of document data.

An example of this is the Datachannel RIO product, which can index and organize information that comes from all kinds of document sources (like Word and Excel files). In this case, DOM is well suited to allow programs access to information stored in these documents.

However, if you are dealing mostly with structured data (the equivalent of serialized Java objects in XML), DOM is not the best choice. That is when SAX might be a better fit.


In Conclusion

The SAX document handler you write does element-to-object mapping. If your information is structured in a way that makes it easy to create this mapping, you should use the SAX API. On the other hand, if your data is much better represented as a tree, then you should use DOM.

Reference: Simple API for XML (SAX) (2010) [Online] Viewed at: http://www.ibm.com/developerworks/xml/standards/x-saxspec.html [Accessed 26/12/2010]

Tuesday 28 December 2010

Task 8: What is the Document Object Model? Describe its purposes and its use, with examples.


The Document Object Model, or DOM, is an interface that allows programs and scripts to update the content, structure, and style of documents dynamically. It is platform- and language-neutral. The DOM is neither HTML nor JavaScript; it is something like the glue that binds them together. As I have mentioned in other posts, the DOM is a W3C (World Wide Web Consortium) standard that defines a standard way of accessing documents such as XML and HTML. It was published in three levels, in 1998, 2000, and 2004 respectively, and it is separated into three parts:

Core DOM - standard model for any structured document.
XML DOM - standard model for XML documents.
HTML DOM - standard model for HTML documents.

The DOM defines the objects and properties of all document elements, and the methods (interface) to access them.

When you load a document in a browser, it creates a number of JavaScript objects with property values based on the HTML in the document and other pertinent information. These objects exist in a hierarchy that reflects the structure of the HTML page itself. The ability to change a web page dynamically with a scripting language is made possible by the Document Object Model (DOM), which can connect any element on the screen to a JavaScript function. The DOM is the road map through which you can locate any element in your HTML document and use a script, such as JavaScript, to change the element’s properties.


Nodes

Furthermore, JavaScript can be used to manipulate an XML document. Being an object-oriented language, JavaScript treats the XML document as an object whose properties and methods are its core features. In the DOM, every part of an XML document is a node. These nodes come together to form a tree structure that the interpreting language uses for navigation. The top node is the document itself, and its documentElement property refers to the root element. Within the document are smaller nodes, each related to the nodes connected to it. This creates the tree hierarchy that interpreters in internet browsers use for navigation. Nodes found within the document include the root element, child elements, attributes, comments and text.
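The node tree can be walked in any language that implements the DOM. The sketch below uses Python's xml.dom.minidom (my choice, in place of the browser) to print each node with its type. The describe helper and the sample document are my own.

```python
from xml.dom import minidom, Node

document = minidom.parseString(
    '<person id="1"><!-- a comment --><name>John</name></person>'
)

def describe(node, depth=0):
    """Return one indented line per node in document order."""
    kinds = {
        Node.DOCUMENT_NODE: "document",
        Node.ELEMENT_NODE: "element",
        Node.TEXT_NODE: "text",
        Node.COMMENT_NODE: "comment",
    }
    lines = ["  " * depth + f"{kinds.get(node.nodeType, 'other')}: {node.nodeName}"]
    for child in node.childNodes:
        lines.extend(describe(child, depth + 1))
    return lines

tree = describe(document)
print("\n".join(tree))

# documentElement is the root element, one level below the document node.
root = document.documentElement
print(root.tagName, root.getAttribute("id"))
```

Attributes do not appear in the walk because they are reached through the element that owns them, not through childNodes.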

What is the Purpose of the XML DOM?

The XML DOM provides a way for browsers with XML parsers to read and manipulate XML. By itself, XML does not actually say much: it is a collection of information stored in a file, with no inherent meaning. The DOM structure is what organizes the document so that languages like JavaScript can read and understand it. It is the standard model that allows the display of XML data on a web page.


<html>
<head>
<script type="text/javascript" src="loadxmldoc.js">

</script>
</head>
<body>
<script type="text/javascript">
xmlDoc=loadXMLDoc("hi there.xml");

</script>
</body>
</html>

The helper script is called "loadxmldoc.js" and is loaded in the head section of the HTML page. The loadXMLDoc() function it defines can then be called from a script in the page. The code above uses that function to load the XML document (hi there.xml).
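For comparison, the same load step can be sketched with Python's xml.dom.minidom, the DOM implementation covered by the reference below. The contents of "hi there.xml" are not shown in this post, so the sketch writes a small stand-in file of its own; the greeting, message and note elements are hypothetical.

```python
import os
import tempfile
from xml.dom import minidom

# Write a small stand-in for "hi there.xml" so the example is self-contained.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "hi there.xml")
with open(path, "w", encoding="utf-8") as handle:
    handle.write("<greeting><message>hi there</message></greeting>")

# Loading: the counterpart of loadXMLDoc("hi there.xml") in the page above.
xml_doc = minidom.parse(path)

# Reading and changing the tree through the DOM interface.
message = xml_doc.getElementsByTagName("message")[0]
original = message.firstChild.data
message.firstChild.data = "hello again"
note = xml_doc.createElement("note")
note.appendChild(xml_doc.createTextNode("added by script"))
xml_doc.documentElement.appendChild(note)

print(original)
print(xml_doc.documentElement.toxml())
```

Once the document is loaded, reading and updating it uses the same DOM methods (getElementsByTagName, createElement, appendChild) that a browser script would use.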

Reference: Document Object Model (DOM) (2010) [Online] Viewed at: http://docs.python.org/library/xml.dom.html [Accessed 25/12/2010]