Inside XML

 
   
By Steven Holzner
   
Publisher : New Riders Publishing
Pub Date : November 14, 2000
ISBN : 0-7357-1020-1
Pages : 1152

The XML explosion hardly needs any introduction. It's everywhere, and there seems to be no end to what can be done with XML. While writing to the W3C standards and keeping pace with corporate implementation, you, the programmer or web developer, need a comprehensive guide to get you started and show you what XML and its related technologies can do. A thorough guide is imperative because you need to understand the full scope of XML from day one in order to work with it successfully. With your time constraints and impossible project schedules, you need a guide that fulfills your needs in one complete book. Inside XML is an anchor book that covers both the Microsoft and non-Microsoft approaches to XML programming. It covers in detail the hot aspects of XML, such as DTDs vs. XML Schemas, CSS, XSL, XSLT, XLinks, XPointers, XHTML, RDF, CDF, parsing XML in Perl and Java, and much more.


Copyright

About the Author

About the Technical Reviewers

Acknowledgments

Tell Us What You Think

Introduction

What's Inside?

Who Is This Book For?

At What Level Is This Book Written?

Conventions Used in This Book

Chapter 1. Essential XML

Markup Languages

What Does XML Look Like?

What Does XML Look Like in a Browser?

What's So Great About XML?

Well-Formed XML Documents

Valid XML Documents

Parsing XML Yourself

XML Resources

XML Editors

XML Browsers

XML Parsers

XML Validators

CSS and XSL

XLinks and XPointers

URLs versus URIs

ASCII, Unicode, and the Universal Character System

XML Applications

Chapter 2. Creating Well-Formed XML Documents

The World Wide Web Consortium

What Is a Well-Formed XML Document?

Markup and Character Data

The Prolog

The XML Declaration

Comments

Processing Instructions

Tags and Elements

The Root Element

Attributes

Building Well-Formed Document Structure

CDATA Sections

XML Namespaces

Infosets

Canonical XML

Chapter 3. Valid XML Documents: Creating Document Type Definitions

Creating Document Type Declarations

Creating Document Type Definitions

A DTD Example

External DTDs

Using Document Type Definitions with URLs

Public Document Type Definitions

Using Both Internal and External DTDs

Namespaces and DTDs

Validating Against a DTD

Chapter 4. DTDs: Entities and Attributes

Entities

Attributes

Creating Internal General Entities

Creating External General Entities

Building a Document from Pieces

Predefined General Entity References

Creating Internal Parameter Entities

External Parameter Entities

Using INCLUDE and IGNORE

All About Attributes

Embedding Non-XML Data in a Document

Embedding Multiple Unparsed Entities in a Document

Chapter 5. Creating XML Schemas

XML Schemas in Internet Explorer

W3C XML Schemas

Declaring Types and Elements

Specifying Attribute Constraints and Defaults

Creating Simple Types

Creating Empty Elements

Creating Mixed-Content Elements

Annotating Schemas

Creating Choices

Creating Sequences

Creating Attribute Groups

Creating all Groups

Schemas and Namespaces

Chapter 6. Understanding JavaScript

What Is JavaScript?

JavaScript Is Object-Oriented

Programming in JavaScript

Chapter 7. Handling XML Documents with JavaScript

The W3C DOM

Loading XML Documents

Getting Elements by Name

Getting Attribute Values from XML Elements

Parsing XML Documents in Code

Handling Events While Loading XML Documents

Validating XML Documents with Internet Explorer

Scripting XML Elements

Editing XML Documents with Internet Explorer

Chapter 8. XML and Data Binding

Data Binding in Internet Explorer

Using Data Source Objects

XML and Hierarchical Data

Searching XML Data

Chapter 9. Cascading Style Sheets

Attaching Style Sheets to XML Documents

Selecting Elements in Style Sheet Rules

Creating Style Rules

Formal Style Property Specifications

Chapter 10. Understanding Java

Java Resources

Writing Java Programs

Creating Java Files

Creating Variables in Java

Creating Arrays in Java

Creating Strings in Java

Java Operators

Java Conditional Statements: if, if else, switch

Java Loops: for, while, do while

Declaring and Creating Objects

Creating Methods in Java

Creating Java Classes

Chapter 11. Java and the XML DOM

Getting XML for Java

Setting CLASSPATH

Creating a Parser

Displaying an Entire Document

Filtering XML Documents

Creating a Windowed Browser

Creating a Graphical Browser

Navigating in XML Documents

Modifying XML Documents

Chapter 12. Java and SAX

Working with SAX

Displaying an Entire Document

Filtering XML Documents

Creating a Windowed Browser

Creating a Graphical Browser

Navigating in XML Documents

Modifying XML Documents

Chapter 13. XSL Transformations

Using XSLT Style Sheets in XML Documents

Creating XSLT Style Sheets

Altering Document Structure Based on Input

Generating Comments with xsl:comment

Generating Text with xsl:text

Copying Nodes

Sorting Elements

Using xsl:if

Using xsl:choose

Controlling Output Type

Chapter 14. XSL Formatting Objects

Formatting an XML Document

Creating the XSLT Style Sheet

Transforming a Document into a Formatting Object Form

Creating a Formatted Document

XSL Formatting Objects

Chapter 15. XLinks and XPointers

Overview: Linking with XLinks and XPointers

All About XLinks

All About XPointers

Chapter 16. Essential XHTML

XHTML Versions

XHTML Checklist

XHTML Programming

Chapter 17. XHTML at Work

Displaying an Image (<img>)

Creating a Hyperlink or Anchor (<a>)

Setting Link Information (<link>)

Creating Tables (<table>)

Creating Documents with Frames (<frameset>)

Using Style Sheets in XHTML

Using Script Programming (<script>)

Creating XHTML Forms (<form>)

Extending XHTML 1.0

All About XHTML 1.1 Modules

Chapter 18. Resource Description Framework and Channel Definition Format

RDF Overview

RDF Syntax

The Dublin Core

Using XML in Property Elements

Using Abbreviated RDF Syntax

RDF Containers

Creating RDF Schemas

CDF Overview

CDF Syntax

Creating a CDF File

Setting a Channel Base URI

Setting Last Modified Dates

Setting Channel Usage

Chapter 19. Vector Markup Language

Creating VML Documents

The VML Elements

The <shape> Element

Using Predefined Shapes

Coloring Shapes

Scaling Shapes

Positioning Shapes

The <shadow> Element

The <fill> Element

Using the <shapetype> Element

More Advanced VML

Chapter 20. WML, ASP, JSP, Servlets, and Perl

XML and Active Server Pages

XML and Java Servlets

Java Server Pages

XML and Perl

Wireless Markup Language

Appendix A. The XML 1.0 Specification

REC-xml-19980210

Extensible Markup Language (XML) 1.0

Abstract

Status of this document

Extensible Markup Language (XML) 1.0

1 Introduction

2 Documents

3 Logical Structures

4 Physical Structures

5 Conformance

6 Notation

Appendices

Copyright

Copyright Information

Copyright 2001 by New Riders Publishing

FIRST EDITION: November

All rights reserved. No part of this book may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage and retrieval system, without written permission from the publisher, except for the inclusion of brief quotations in a review.

Library of Congress Catalog Card Number: 00-102949

05 04 03 02 01 7 6 5 4 3

Interpretation of the printing code: The rightmost double-digit number is the year of the book's printing; the rightmost single-digit number is the number of the book's printing. For example, the printing code 01-1 shows that the first printing of the book occurred in 2001.

Composed in Bembo and MCPdigital by New Riders Publishing

Printed in the United States of America

Trademark Acknowledgements

All terms mentioned in this book that are known to be trademarks or service marks have been appropriately capitalized. New Riders Publishing cannot attest to the accuracy of this information. Use of a term in this book should not be regarded as affecting the validity of any trademark or service mark.

W3C Massachusetts Institute of Technology (MIT), Institut National de Recherche en Informatique et en Automatique (INRIA), Keio University (Keio)

Warning and Disclaimer

This book is designed to provide information about XML. Every effort has been made to make this book as complete and as accurate as possible, but no warranty or fitness is implied.

The information is provided on an as-is basis. The authors and New Riders Publishing shall have neither liability nor responsibility to any person or entity with respect to any loss or damages arising from the information contained in this book or from the use of the discs or programs that may accompany it.

Dedication

To Nancy, of course!

About the Author

Steven Holzner has been writing about XML about as long as XML has been around. He's written 63 books, all on programming topics, which have sold well over a million copies. His books have been translated into 16 languages around the world and include a good number of industry bestsellers. He's a former contributing editor of PC Magazine, graduated from MIT, and received his Ph.D. at Cornell. He's been on the faculty of both MIT and Cornell.

About the Technical Reviewers

These reviewers contributed their considerable hands-on expertise to the entire development process for Inside XML. As the book was being written, these dedicated professionals reviewed all the material for technical content, organization, and flow. Their feedback was critical to ensuring that Inside XML fits our readers' need for the highest quality technical information.

Robert J. Brunner is a Senior Post-Doctoral Scholar in Astronomy at the California Institute of Technology. He has worked for several years on the integration of novel technologies, including XML and Java, into the design of large, highly distributed, collaborative archives. He received a Ph.D. in Astrophysics from the Johns Hopkins University.

Andrew J. Indovina is currently employed in the e-commerce field in Rochester, NY. He is the co-author of the books Visual Basic 6 Interactive Course, Sams Teach Yourself Visual Basic Online in Webtime, and Visual C++ 6.0 Unleashed. He has also done technical edits for books covering Java, Perl, Visual Basic, Visual C++, game development, and project management.

Acknowledgments

A book like the one you're holding is the work of a great many people, not just the author. The people at New Riders have been great, and I'd like to thank Stephanie Wall, Executive Editor extraordinaire; Chris Zahn and Robin Drake, Development Editors, who did a super job and accepted many chapter updates and then updates of updates as we worked to fit in late-breaking news and make this the absolute best XML book anywhere; as well as Lori Lyons and Krista Hansing, Editors, who kept things moving along; and finally the Technical Reviewers, Robert Brunner and Andy Indovina, who did a great job of checking everything. Thanks, everyone, for all your much-appreciated hard work.

Tell Us What You Think

As the reader of this book, you are the most important critic and commentator. We value your opinion and want to know what we're doing right, what we could do better, what areas you'd like to see us publish in, and any other words of wisdom you're willing to pass our way.

As the Executive Editor for the Networking team at New Riders Publishing, I welcome your comments. You can fax, email, or write me directly to let me know what you did or didn't like about this book as well as what we can do to make our books stronger.

Please note that I cannot help you with technical problems related to the topic of this book, and that due to the high volume of mail I receive, I might not be able to reply to every message.

When you write, please be sure to include this book's title and author as well as your name and phone or fax number. I will carefully review your comments and share them with the author and editors who worked on the book.

Fax: 317-581-4663
Email: nrfeedback@newriders.com
Mail:
Stephanie Wall
Executive Editor
New Riders Publishing
201 West 103rd Street
Indianapolis, IN 46290 USA

Introduction

Welcome to Inside XML. This book is designed to be as comprehensive and as accessible as possible for a single book on XML. XML is a standard, not an implementation, and it has become an umbrella for a great number of topics. You'll find XML just about everywhere you look on the Internet today, and even in many places behind the scenes (such as internally in Netscape Navigator). I believe that this book provides more complete coverage of what's going on in XML than any other XML book today.

You'll find coverage of all the official XML standards here. I'll also take a look at many of the most popular and important implementations of XML that are out there, and I'll put them to work in this book.

That's just part of the story; we'll also put XML to work in depth, pushing the envelope as far as it can go. The best way to learn any topic like XML is by example, and this is an example-oriented book. You'll find hundreds of tested examples here, ready to be used.

Writing XML is not some ordinary and monotonous task: It inspires artistry, devotion, passion, exaltation, and eccentricity, not to mention exasperation and frustration. I'll try to be true to that spirit and capture as much of the excitement and power of XML in this book as I can.

What's Inside?

This book is designed to give you as much of the whole XML story as one book can hold. We'll not only see the full XML syntax, from the most basic to the most advanced, but we'll also dig into many of the ways in which XML is used.

Hundreds of real-world topics are covered in this book, including connecting XML to databases, both locally and on Web servers; styling XML for viewing in today's browsers; reading and using XML documents in browsers; creating your own graphically based browsers; and a great deal more.

Here's a sample of some of the topics in this book. Note that each of these topics has many subtopics (too many to list here):

  • The complete XML syntax

  • Well-formed XML documents

  • Valid XML documents

  • Document type definitions (DTDs)

  • Namespaces

  • The XML Document Object Model (DOM)

  • Canonical XML

  • XML schemas

  • Parsing XML with JavaScript

  • XML and data binding

  • XML and cascading style sheets (CSS)

  • XML and Java

  • DOM parsers

  • SAX parsers

  • Extensible Style Language (XSL) transformations

  • XSL formatting objects

  • XLinks

  • XPointers

  • XPath

  • XBase

  • XHTML 1.0 and 1.1

  • Resource Description Framework (RDF)

  • Channel Definition Format (CDF)

  • Vector Markup Language (VML)

  • Wireless Markup Language (WML)

  • Server-side XML with Java Server Pages (JSP), Active Server Pages (ASP), Java servlets, and Perl

This book starts with the basics. I do assume that you have some knowledge of HTML, but not necessarily much. We'll see how to create XML documents from scratch in this book, starting at the very beginning.

From there, we'll move up to see how to check the syntax of XML documents. The big attraction of XML is that you can define your own tags, such as the <DOCUMENT> and <GREETING> tags in this document, which we'll see early in Chapter 1, "Essential XML":

<?xml version="1.0" encoding="UTF-8"?>
<DOCUMENT>
    <GREETING>
        Hello From XML
    </GREETING>
    <MESSAGE>
        Welcome to the wild and woolly world of XML.
    </MESSAGE>
</DOCUMENT>

Because you can create your own tags in XML, it's also important to specify the syntax you want those tags to obey (for example, can a <MESSAGE> appear inside a <GREETING>?). XML puts a big emphasis on this, too, and there are two main ways to specify the syntax you want your XML to follow: XML document type definitions (DTDs) and XML schemas. We'll see how to create both.
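To make the <MESSAGE>-inside-<GREETING> question concrete, here is a minimal sketch of checking a document against a DTD from Java. It assumes the JAXP parser bundled with current JDKs, which is not necessarily the parser used later in the book, and the DTD itself is a hypothetical one written for this illustration: it says a <DOCUMENT> holds one <GREETING> followed by one <MESSAGE>, each containing only text. The class and method names are likewise made up for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class DtdCheck {
    // Hypothetical internal DTD (an assumption for this sketch, not taken from the book):
    // DOCUMENT = one GREETING followed by one MESSAGE, both text-only.
    static final String DTD =
          "<!DOCTYPE DOCUMENT [\n"
        + "<!ELEMENT DOCUMENT (GREETING, MESSAGE)>\n"
        + "<!ELEMENT GREETING (#PCDATA)>\n"
        + "<!ELEMENT MESSAGE (#PCDATA)>\n"
        + "]>\n";

    // Returns true if the given document body is valid against DTD, false otherwise.
    static boolean isValid(String body) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // turn on DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            // By default validity errors are only reported; rethrow them so we can detect them.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) throws SAXParseException { throw e; }
                public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            String xml = "<?xml version=\"1.0\"?>\n" + DTD + body;
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // not well-formed, or not valid against the DTD
        } catch (Exception e) {
            throw new IllegalStateException("Parser setup or I/O failed", e);
        }
    }

    public static void main(String[] args) {
        String good = "<DOCUMENT><GREETING>Hello From XML</GREETING>"
                    + "<MESSAGE>Welcome</MESSAGE></DOCUMENT>";
        String nested = "<DOCUMENT><GREETING>Hello<MESSAGE>Welcome</MESSAGE>"
                      + "</GREETING></DOCUMENT>";
        System.out.println(isValid(good));
        System.out.println(isValid(nested));
    }
}
```

Under this particular DTD, a <MESSAGE> nested inside a <GREETING> is rejected, because <GREETING> is declared to contain text only. A different DTD could just as easily allow it; the point is that the rule lives in the DTD, not in the parser.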

And because you can make up your own tags in XML, it's also up to you to specify how they should be used. Netscape Navigator won't know, for example, that a <KILLER> tag marks a favorite book in your collection. Because it's up to you to determine what a tag actually means, handling your XML documents in programming is an important part of learning XML, despite what some second-rate XML books try to claim. The two languages I'll use in this book are JavaScript and Java; before using them, I'll introduce them in special sections with plenty of examples, so even if you're not familiar with these languages, you won't have to go anywhere else to get the skills you need.

The major browsers today are becoming more XML-aware, and they use scripting languages to let you work with your XML documents. We'll be using the most popular and powerful of those scripting languages here, JavaScript. Using JavaScript, we'll be able to read XML documents directly in browsers such as Internet Explorer.

It's also important to know how to handle XML outside browsers, because there are plenty of things that JavaScript can't handle. These days, most XML development is taking place in Java, and an endless arsenal of Java resources is available for free on the Internet. In fact, the connection between Java and XML is a natural one, as we'll see. We'll use Java to read XML documents and interpret them, starting in Chapter 11, "Java and the XML DOM." That doesn't mean that you have to be a Java expert (far from it, in fact), because I'll introduce all the Java we'll need right here in this book. And because most XML development is done in Java today, we're going to find a wealth of tools here, ready for use.
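As a rough preview of what reading an XML document from Java can look like, here is a minimal sketch that parses the greeting document shown earlier and pulls out the text of one element. It assumes the JAXP DOM API bundled with current JDKs rather than any specific parser download, and the class and method names are hypothetical, chosen only for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class GreetingReader {
    // Returns the trimmed text of the first element named tagName,
    // or null if the document has no such element.
    static String firstElementText(String xml, String tagName) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Parse from a string here; a file or URL would work the same way.
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList matches = doc.getElementsByTagName(tagName);
            if (matches.getLength() == 0) {
                return null;
            }
            return matches.item(0).getTextContent().trim();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
            + "<DOCUMENT>"
            + "<GREETING> Hello From XML </GREETING>"
            + "<MESSAGE> Welcome to the wild and woolly world of XML. </MESSAGE>"
            + "</DOCUMENT>";
        System.out.println(firstElementText(xml, "GREETING"));
    }
}
```

The parser only hands back the element's text; deciding what a <GREETING> means and what to do with it is left to the program.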

You can also design your XML documents to be displayed directly in some modern browsers, and I'll take a look at doing that in two ways: with cascading style sheets (CSS) and the Extensible Style Language (XSL). Using CSS and XSL, you can indicate exactly how a tag that you make up, such as <PANIC> or <BIG_AND_BOLD> or <AZURE_UNDERLINED_TEXT>, should be displayed. I'll take a look at both parts of XSL, XSL transformations and formatting objects, in depth.
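To give a feel for the XSL transformation half of that story, here is a minimal sketch that turns a made-up <PANIC> element into bold red HTML-style markup. It assumes the javax.xml.transform API bundled with current JDKs; the style sheet, the choice of bold red, and the class and method names are all hypothetical, invented for this illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class PanicStyler {
    // Hypothetical XSLT 1.0 style sheet: each PANIC element becomes
    // a <b> element styled in red, holding the original text.
    static final String STYLE_SHEET =
          "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
        + "<xsl:template match=\"PANIC\">"
        + "<b style=\"color: red\"><xsl:value-of select=\".\"/></b>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies STYLE_SHEET to the given XML and returns the result as a string.
    static String transform(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLE_SHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform("<PANIC>Fire!</PANIC>"));
    }
}
```

The template's match attribute is where you tie your own tag name to a presentation; a CSS rule does the same job with a selector instead.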

In addition, we'll see all the XML specifications in this book, such as XLinks, XBase, and XPointers, which enable you to point to particular items in XML documents in very specific ways. The XML specifications are made by a body called the World Wide Web Consortium (W3C); we'll become very familiar with those specifications here, seeing what they say and seeing what they lack.

I'll wind up the book by taking a look at a number of the most popular uses of XML on the Internet in several chapters. XML is really a language for defining languages, and there are hundreds of such XML-based languages out there now. Some of them are gaining great popularity, and I'll cover them in some depth in this book.

An astonishing wealth of material on XML is available on the Internet today, so I'll also fill this book with the URIs of dozens of those resources (in XML, you use uniform resource identifiers, not URLs, although in practice they are the same thing for most purposes today). In nearly every chapter, you'll find lists of free online programs and other resources. (However, there's a hazard here that I should mention: URIs change frequently on the Internet, so don't be surprised if some of these URIs have changed by the time you look for them.)

Who Is This Book For?

This book is designed for just about anyone who wants to learn XML and how it's used today in the real world. The only assumption that I make is that you have some knowledge of how to create documents using Hypertext Markup Language (HTML). You don't have to be any great HTML expert, but a little knowledge of HTML will be helpful. That's really all you need.

However, it's a fact of life that most XML software these days is targeted at Windows. Among other things, that means you should have access to Windows for many of the topics covered in this book. In Chapters 7, "Handling XML Documents with JavaScript," and 8, "XML and Data Binding," we'll be taking a look at the XML support in Internet Explorer. I wish there were more support for the other operating systems that I like, such as UNIX, but right now a lot of it is Windows-only. I'll explore alternatives when I can. One hopeful note for the future is that more Java-based XML tools are appearing daily, and those tools are platform-independent.

At What Level Is This Book Written?

This book is written at several different levels, from basic to advanced, because the XML spectrum is so broad. The general rule is that this book was written to follow HTML books in level. We start at the basic level and gradually get more advanced in a slow, steady way.

I'm not going to assume that you have any programming knowledge (at least until we get to the advanced topics in Chapter 20, "WML, ASP, JSP, Servlets, and Perl," such as Java Server Pages and using Perl with XML) when you start this book. We'll be using both JavaScript and Java in this book, but all you need to know about those languages will be introduced before we use them, and it won't be hard to pick up.

Because there are so many uses of XML available today, this book involves many different software packages; all the ones I'll put to work in the text are free to download from the Internet, and I'll tell you where to get them.

Conventions Used in This Book

I use several conventions in this book that you should be aware of. The most important one is that when I add new sections of code, they'll be highlighted with shading to point out the actual lines I'm discussing so that they stand out. (This sample is written in one of the languages built on XML, the Wireless Markup Language [WML], which is targeted at "microbrowsers" in cellular phones and personal digital assistants [PDAs].)

<?xml version="1.0"?>
<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.1//EN"
    "http://www.wapforum.org/DTD/wml_1.1.xml">
<wml>
    <card id="Card1" title="First WML Example">
        <!-- This is a comment -->
        <p>
            Greetings from WML.
        </p>
    </card>
</wml>

Also, where there's something worth noting or some additional information that adds something to the discussion, I'll add a sidebar. That looks like this:

More on SOAP

With a common name like SOAP, it's hard to search the Internet for more information about the Simple Object Access Protocol unless you're really into pages on personal cleanliness and daytime television. For more information, you might check out this starter list: http://msdn.microsoft.com/xml/general/soapspec.asp, http://www.oasis-open.org/cover/soap.html, http://www.develop.com/soap/, and http://www.develop.com/soap/soapfaq.xml.

Finally, many discussions in the text contain syntax examples like this:

-config file

When using a command or switch shown in a syntax example, substitute the correct value for the characters in italic monospace. With the switch above, for example, you would substitute the correct configuration filename for file.

We're ready to go. If you have comments, I encourage you to write to me, care of New Riders. This book is designed to be the new standard in XML coverage, truly more complete and more accessible than ever before. Please do keep in touch with me about ways to improve it and keep it on the forefront. If you think the book lacks anything, let me know and I'll add it, because I want to make sure that this book stays on top.


