The Importance of XML


Java™ 2 Primer Plus
By Steven Haines, Steve Potts

Chapter 25.  XML

As software programs evolve from standalone singular applications to distributed enterprise systems, developers are faced with a new set of challenges. No longer are applications running on a single machine, but on multiple machines with potentially different operating systems on different hardware architectures. Thus, the developer is challenged with the task of defining a communication mechanism between applications written in different programming languages and running on different operating systems.

If all applications were written in Java, your job would be easy. Java was designed to run in a consistent virtual machine on any supported operating system, and it is the virtual machine's responsibility to translate the Java representation of data to the underlying platform. Outside the virtual machine, that consistency disappears. Java always uses 4 bytes to represent an int, whereas some implementations of other programming languages use 2 bytes, so how do you account for the missing two bytes? The byte order of data also varies between platforms. Intel x86 processors, on which Windows typically runs, store the least significant byte first, whereas the SPARC processors used by many Unix systems store the most significant byte first. If you have a 4-byte integer, the exact same bits are interpreted differently on the two platforms, generating a different value.
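The byte-order problem can be seen without leaving Java. The following is a minimal sketch, not a description of any particular system: the class name ByteOrderDemo and the helper readInt are made up for illustration, and java.nio.ByteBuffer is used only to read the same four bytes in both byte orders.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class ByteOrderDemo {
    /** Interprets the first four bytes of raw as an int in the given byte order. */
    static int readInt(byte[] raw, ByteOrder order) {
        return ByteBuffer.wrap(raw).order(order).getInt();
    }

    public static void main(String[] args) {
        // The value 65 written most significant byte first: 00 00 00 41 (hex).
        byte[] raw = {0x00, 0x00, 0x00, 0x41};

        // Read back in the order it was written.
        System.out.println(readInt(raw, ByteOrder.BIG_ENDIAN));    // prints 65

        // The same bits read least significant byte first.
        System.out.println(readInt(raw, ByteOrder.LITTLE_ENDIAN)); // prints 1090519040
    }
}
```

The bytes never change; only the reader's assumption about their order does, and that is enough to turn 65 into 1090519040.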

As I said, if all programs were written in Java there would be no problem, but in practice not every program can be written in Java. Consider the class of applications that talk directly to hardware: these applications need to see the underlying operating system and understand the computer architecture hosting the hardware. This is just one case where Java cannot be used; another is software that runs on an operating system for which no Java virtual machine has been developed.

Now that you understand some of the challenges you face, how might you pass data unambiguously between two applications? Consider passing the variable speed with the value 65. You are trying to convey two pieces of information: the value 65 and its associated meaning, speed. As I have mentioned, passing the integer value as a raw collection of bits does not work. How about sending the data as the character 6 followed by the character 5? As long as both sides agree on a character encoding such as ASCII, the character 5 is the same regardless of operating system and programming language.
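Sending the digits as characters can be sketched in a few lines of Java. This is an illustration under stated assumptions: both sides are assumed to agree on US-ASCII, and the names TextNumberDemo, encode, and decode are hypothetical.

```java
import java.nio.charset.StandardCharsets;

final class TextNumberDemo {
    /** Encodes an int as its decimal digits in US-ASCII, one byte per character. */
    static byte[] encode(int value) {
        return Integer.toString(value).getBytes(StandardCharsets.US_ASCII);
    }

    /** Decodes US-ASCII decimal digits back into an int. */
    static int decode(byte[] wire) {
        return Integer.parseInt(new String(wire, StandardCharsets.US_ASCII));
    }

    public static void main(String[] args) {
        byte[] wire = encode(65);

        // In ASCII the character '6' is byte 54 and the character '5' is byte 53.
        System.out.println(wire[0] + " " + wire[1]); // prints 54 53

        // Each character is a single byte, so byte order no longer matters.
        System.out.println(decode(wire));            // prints 65
    }
}
```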

That works for the value. Now we need to identify that 65 represents the speed of something, so how about passing the string speed along with the value 65? We could send a single string that carries both the name and the value across operating systems and programming languages.
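One possible layout for such a string is name=value. The sketch below assumes that layout; the = separator and the names NameValueDemo, format, nameOf, and valueOf are choices made for this illustration rather than part of any standard.

```java
final class NameValueDemo {
    /** Joins a name and an integer value as name=value. */
    static String format(String name, int value) {
        return name + "=" + value;
    }

    /** Returns the part of the message before the first '='. */
    static String nameOf(String message) {
        return message.substring(0, message.indexOf('='));
    }

    /** Returns the part of the message after the first '=', parsed as an int. */
    static int valueOf(String message) {
        return Integer.parseInt(message.substring(message.indexOf('=') + 1));
    }

    public static void main(String[] args) {
        String message = format("speed", 65);
        System.out.println(message);          // prints speed=65
        System.out.println(nameOf(message));  // prints speed
        System.out.println(valueOf(message)); // prints 65
    }
}
```

The receiver needs to know only the separator to recover both the meaning and the value.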


This works to pass a primitive type, but what about sending an object (or the state of an object so that it can be re-created on the other side)? This is a little more complicated. Let's consider passing a Car from a Java program running on Windows (again the operating system does not matter to Java) to a C++ program running on Sun Solaris. Consider the following Car class:

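 import java.awt.Color;

 public class Car {
    // Assumed value: the original initializer was left empty, and the car is described as yellow later on
    private Color color = Color.YELLOW;
    private String make = "Porsche";
    private String model = "Carrera 911";
    private int gas = 15;  // in gallons
    private int speed = 0; // in mph
    private int oil = 5;   // in quarts
    private boolean running = false;

    // ... methods ...
 }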

You have learned that we can send strings between applications to solve our problems, so the question now is how to persist this Car object to a string, and specifically to one that is self-describing, so that the application on the other end can easily identify all of the car's attributes. Consider passing the following variable to another program:

 Car myCar = new Car(); 

So, we want to associate the variable name myCar with all the data held in this particular Car instance. Because all the car's attributes are owned by the car itself, it would be nice to preserve that relationship when describing the car to another application. Thus, the solution is to create a string that is hierarchical: the car object contains (or is the parent of) its attributes. Rather than reinvent the wheel, we can borrow the notation used in HTML tags for two reasons:

  1. HTML tags define a nice hierarchical model

  2. HTML works across operating systems and programming languages, so it is a good place to start

The root of an HTML document is the <html> element; its children can include a <head> element and a <body> element. The <head> can contain a <title> element or a <script> element. The <body> element can carry a bgcolor attribute that changes the background color of the page, and it can contain a <table> element, which in turn can contain <tr> elements, and so on. Each element begins with a start tag, which is the tag name enclosed between less-than and greater-than signs: <tag-name>. Each element ends with an end tag, which is the same tag name preceded by a less-than sign and a forward slash and followed by a greater-than sign: </tag-name>. For example:

 <html>
    <head>
      <title>My Page</title>
    </head>
    <body bgcolor="black">
      ... body content ...
    </body>
 </html>

The specific tags that HTML defines don't quite fit when describing a car, but the structure is nice. How about borrowing the structure (the tag notation and the notion that an element can contain attributes and other elements) but replacing the tag names with ones that are more meaningful when describing a car? Consider the following:

 <car name="myCar">
    <color>yellow</color>
    <make>Porsche</make>
    <model>Carrera 911</model>
    <gas>15</gas>
    <oil>5</oil>
    <speed>95</speed>
    <running>true</running>
 </car>

This description of a car has the following characteristics:

  • It is self-describing: you can easily see that the <car> element is named myCar and has a <speed> element with the value 95.

  • It is possible to send this representation of a car between operating systems and programming languages.

  • It preserves the hierarchical nature of object-oriented programming.
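To make these characteristics concrete, here is a minimal Java sketch that writes the car description as a string and reads it back with the DOM parser that ships with the JDK. The class CarXmlDemo and its helpers toXml, parse, and child are hypothetical names for this illustration, and toXml assumes values that contain no characters needing XML escaping.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class CarXmlDemo {
    /** Builds the car description; assumes no value needs XML escaping. */
    static String toXml(String name, String color, String make, String model,
                        int gas, int oil, int speed, boolean running) {
        return "<car name=\"" + name + "\">"
             + "<color>" + color + "</color>"
             + "<make>" + make + "</make>"
             + "<model>" + model + "</model>"
             + "<gas>" + gas + "</gas>"
             + "<oil>" + oil + "</oil>"
             + "<speed>" + speed + "</speed>"
             + "<running>" + running + "</running>"
             + "</car>";
    }

    /** Parses the string and returns its root element. */
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("not a readable XML string", e);
        }
    }

    /** Returns the text of the first descendant element with the given tag name. */
    static String child(Element parent, String tag) {
        return parent.getElementsByTagName(tag).item(0).getTextContent();
    }

    public static void main(String[] args) {
        String wire = toXml("myCar", "yellow", "Porsche", "Carrera 911",
                            15, 5, 95, true);
        Element car = parse(wire);

        System.out.println(car.getTagName());                      // prints car
        System.out.println(car.getAttribute("name"));              // prints myCar
        System.out.println(Integer.parseInt(child(car, "speed"))); // prints 95
    }
}
```

The receiving side here happens to be Java, but nothing in the string depends on that; any language with an XML parser could recover the same names and values.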

This is the essence of XML: it follows a tagging structure similar to HTML, but with tag names that describe the data they represent. It also offers a host of other features, including rules that let you define the structure of your specific data. For example, a <car> can contain a <speed> element, but a <speed> element cannot contain a <car> element; a <speed> element may be optional and assumed to be 0 if not present; and so forth.
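One way to write down such rules is a document type definition (DTD). The sketch below is an assumption-laden illustration rather than the definitive schema for a car: the DTD itself, the class CarDtdDemo, and the helpers isValid and car are all made up here. The parser calls are the standard JAXP ones included in the JDK.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

final class CarDtdDemo {
    // Hypothetical rules for a car; "speed?" marks the <speed> element as optional.
    static final String DTD =
          "<!DOCTYPE car [\n"
        + "  <!ELEMENT car (color, make, model, gas, oil, speed?, running)>\n"
        + "  <!ATTLIST car name CDATA #REQUIRED>\n"
        + "  <!ELEMENT color (#PCDATA)>\n"
        + "  <!ELEMENT make (#PCDATA)>\n"
        + "  <!ELEMENT model (#PCDATA)>\n"
        + "  <!ELEMENT gas (#PCDATA)>\n"
        + "  <!ELEMENT oil (#PCDATA)>\n"
        + "  <!ELEMENT speed (#PCDATA)>\n"
        + "  <!ELEMENT running (#PCDATA)>\n"
        + "]>\n";

    /** Returns true if carXml is well formed and obeys the DTD above. */
    static boolean isValid(String carXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // check the document against its DTD
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { /* ignored */ }
                public void error(SAXParseException e) throws SAXException { throw e; }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(DTD + carXml)));
            return true;
        } catch (SAXException e) {
            return false; // a rule was broken or the text is not well formed
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Builds a car whose speed part is supplied by the caller. */
    static String car(String speedPart) {
        return "<car name=\"myCar\"><color>yellow</color><make>Porsche</make>"
             + "<model>Carrera 911</model><gas>15</gas><oil>5</oil>"
             + speedPart + "<running>true</running></car>";
    }

    public static void main(String[] args) {
        System.out.println(isValid(car("<speed>95</speed>")));                     // prints true
        System.out.println(isValid(car("")));                                      // prints true
        System.out.println(isValid(car("<speed><car name=\"x\"></car></speed>"))); // prints false
    }
}
```

A DTD can state that <speed> is optional, but it cannot supply a default value for missing element content, so in this sketch treating an absent <speed> as 0 would be left to the receiving program.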


    Java 2 Primer Plus
    ISBN: 0672324156
    Year: 2001
    Pages: 332
