The XML explosion hardly needs any introduction. XML is everywhere, and there seems to be no end to what can be done with it. While writing to the W3C standards and keeping pace with corporate implementation, you, the programmer or web developer, will need a comprehensive guide to get you started and show you what XML and its related technologies can do. A thorough guide is imperative to success because you will need to know and understand the full scope of XML from day one in order to work with it successfully. With your time constraints and impossible project schedules, you need a comprehensive guide that fulfills your needs in one complete book. Inside XML is an anchor book that covers both the Microsoft and non-Microsoft approaches to XML programming. It covers in detail the hot aspects of XML, such as DTDs vs. XML Schemas, CSS, XSL, XSLT, XLinks, XPointers, XHTML, RDF, CDF, parsing XML in Perl and Java, and much more.
Copyright
About the Author
About the Technical Reviewers
Acknowledgments
Tell Us What You Think
Introduction
What's Inside?
Who Is This Book For?
At What Level Is This Book Written?
Conventions Used in This Book
Chapter 1. Essential XML
Markup Languages
What Does XML Look Like?
What Does XML Look Like in a Browser?
What's So Great About XML?
Well-Formed XML Documents
Valid XML Documents
Parsing XML Yourself
XML Resources
XML Editors
XML Browsers
XML Parsers
XML Validators
CSS and XSL
XLinks and XPointers
URLs versus URIs
ASCII, Unicode, and the Universal Character System
XML Applications
Chapter 2. Creating Well-Formed XML Documents
The World Wide Web Consortium
What Is a Well-Formed XML Document?
Markup and Character Data
The Prolog
The XML Declaration
Comments
Processing Instructions
Tags and Elements
The Root Element
Attributes
Building Well-Formed Document Structure
CDATA Sections
XML Namespaces
Infosets
Canonical XML
Chapter 3. Valid XML Documents: Creating Document Type Definitions
Creating Document Type Declarations
Creating Document Type Definitions
A DTD Example
External DTDs
Using Document Type Definitions with URLs
Public Document Type Definitions
Using Both Internal and External DTDs
Namespaces and DTDs
Validating Against a DTD
Chapter 4. DTDs: Entities and Attributes
Entities
Attributes
Creating Internal General Entities
Creating External General Entities
Building a Document from Pieces
Predefined General Entity References
Creating Internal Parameter Entities
External Parameter Entities
Using INCLUDE and IGNORE
All About Attributes
Embedding Non-XML Data in a Document
Embedding Multiple Unparsed Entities in a Document
Chapter 5. Creating XML Schemas
XML Schemas in Internet Explorer
W3C XML Schemas
Declaring Types and Elements
Specifying Attribute Constraints and Defaults
Creating Simple Types
Creating Empty Elements
Creating Mixed-Content Elements
Annotating Schemas
Creating Choices
Creating Sequences
Creating Attribute Groups
Creating all Groups
Schemas and Namespaces
Chapter 6. Understanding JavaScript
What Is JavaScript?
JavaScript Is Object-Oriented
Programming in JavaScript
Chapter 7. Handling XML Documents with JavaScript
The W3C DOM
Loading XML Documents
Getting Elements by Name
Getting Attribute Values from XML Elements
Parsing XML Documents in Code
Handling Events While Loading XML Documents
Validating XML Documents with Internet Explorer
Scripting XML Elements
Editing XML Documents with Internet Explorer
Chapter 8. XML and Data Binding
Data Binding in Internet Explorer
Using Data Source Objects
XML and Hierarchical Data
Searching XML Data
Chapter 9. Cascading Style Sheets
Attaching Style Sheets to XML Documents
Selecting Elements in Style Sheet Rules
Creating Style Rules
Formal Style Property Specifications
Chapter 10. Understanding Java
Java Resources
Writing Java Programs
Creating Java Files
Creating Variables in Java
Creating Arrays in Java
Creating Strings in Java
Java Operators
Java Conditional Statements: if, if else, switch
Java Loops: for, while, do while
Declaring and Creating Objects
Creating Methods in Java
Creating Java Classes
Chapter 11. Java and the XML DOM
Getting XML for Java
Setting CLASSPATH
Creating a Parser
Displaying an Entire Document
Filtering XML Documents
Creating a Windowed Browser
Creating a Graphical Browser
Navigating in XML Documents
Modifying XML Documents
Chapter 12. Java and SAX
Working with SAX
Displaying an Entire Document
Filtering XML Documents
Creating a Windowed Browser
Creating a Graphical Browser
Navigating in XML Documents
Modifying XML Documents
Chapter 13. XSL Transformations
Using XSLT Style Sheets in XML Documents
Creating XSLT Style Sheets
Altering Document Structure Based on Input
Generating Comments with xsl:comment
Generating Text with xsl:text
Copying Nodes
Sorting Elements
Using xsl:if
Using xsl:choose
Controlling Output Type
Chapter 14. XSL Formatting Objects
Formatting an XML Document
Creating the XSLT Style Sheet
Transforming a Document into a Formatting Object Form
Creating a Formatted Document
XSL Formatting Objects
Chapter 15. XLinks and XPointers
Overview: Linking with XLinks and XPointers
All About XLinks
All About XPointers
Chapter 16. Essential XHTML
XHTML Versions
XHTML Checklist
XHTML Programming
Chapter 17. XHTML at Work
Displaying an Image (<img>)
Creating a Hyperlink or Anchor (<a>)
Setting Link Information (<link>)
Creating Tables (<table>)
Creating Documents with Frames (<frameset>)
Using Style Sheets in XHTML
Using Script Programming (<script>)
Creating XHTML Forms (<form>)
Extending XHTML 1.0
All About XHTML 1.1 Modules
Chapter 18. Resource Description Framework and Channel Definition Format
RDF Overview
RDF Syntax
The Dublin Core
Using XML in Property Elements
Using Abbreviated RDF Syntax
RDF Containers
Creating RDF Schemas
CDF Overview
CDF Syntax
Creating a CDF File
Setting a Channel Base URI
Setting Last Modified Dates
Setting Channel Usage
Chapter 19. Vector Markup Language
Creating VML Documents
The VML Elements
The <shape> Element
Using Predefined Shapes
Coloring Shapes
Scaling Shapes
Positioning Shapes
The <shadow> Element
The <fill> Element
Using the <shapetype> Element
More Advanced VML
Chapter 20. WML, ASP, JSP, Servlets, and Perl
XML and Active Server Pages
XML and Java Servlets
Java Server Pages
XML and Perl
Wireless Markup Language
Appendix A. The XML 1.0 Specification
REC-xml-19980210
Extensible Markup Language (XML) 1.0
Abstract
Status of this document
Extensible Markup Language (XML) 1.0
1 Introduction
2 Documents
3 Logical Structures
4 Physical Structures
5 Conformance
6 Notation
Appendices
Copyright 2001 by New Riders Publishing
FIRST EDITION: November
All rights reserved. No part of this book may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage and retrieval system, without written permission from the publisher, except for the inclusion of brief quotations in a review.
Library of Congress Catalog Card Number: 00-102949
05 04 03 02 01 7 6 5 4 3
Interpretation of the printing code: The rightmost double-digit number is the year of the book's printing; the rightmost single-digit number is the number of the book's printing. For example, the printing code 01-1 shows that the first printing of the book occurred in 2001.
Composed in Bembo and MCPdigital by New Riders Publishing
Printed in the United States of America
All terms mentioned in this book that are known to be trademarks or service marks have been appropriately capitalized. New Riders Publishing cannot attest to the accuracy of this information. Use of a term in this book should not be regarded as affecting the validity of any trademark or service mark.
W3C Massachusetts Institute of Technology (MIT), Institut National de Recherche en Informatique et en Automatique (INRIA), Keio University (Keio)
This book is designed to provide information about XML. Every effort has been made to make this book as complete and as accurate as possible, but no warranty or fitness is implied.
The information is provided on an as-is basis. The authors and New Riders Publishing shall have neither liability nor responsibility to any person or entity with respect to any loss or damages arising from the information contained in this book or from the use of the discs or programs that may accompany it.
To Nancy, of course!
Steven Holzner has been writing about XML about as long as XML has been around. He's written 63 books, all on programming topics, which have sold well over a million copies. His books have been translated into 16 languages around the world and include a good number of industry bestsellers. He's a former contributing editor of PC Magazine, graduated from MIT, and received his Ph.D. at Cornell. He's been on the faculty of both MIT and Cornell.
These reviewers contributed their considerable hands-on expertise to the entire development process for Inside XML. As the book was being written, these dedicated professionals reviewed all the material for technical content, organization, and flow. Their feedback was critical to ensuring that Inside XML fits our readers' need for the highest quality technical information.
Robert J. Brunner is a Senior Post-Doctoral Scholar in Astronomy at the California Institute of Technology. He has worked for several years on the integration of novel technologies, including XML and Java, into the design of large, highly distributed, collaborative archives. He received a Ph.D. in Astrophysics from the Johns Hopkins University.
Andrew J. Indovina is currently employed in the e-commerce field in Rochester, NY. He is the co-author of the books Visual Basic 6 Interactive Course, Sams Teach Yourself Visual Basic Online in Webtime, and Visual C++ 6.0 Unleashed. He has also done technical edits for books covering Java, Perl, Visual Basic, Visual C++, game development, and project management.
A book like the one you're holding is the work of a great many people, not just the author. The people at New Riders have been great, and I'd like to thank Stephanie Wall, Executive Editor extraordinaire; Chris Zahn and Robin Drake, Development Editors, who did a super job and accepted many chapter updates and then updates of updates as we worked to fit in late-breaking news and make this the absolute best XML book anywhere; as well as Lori Lyons and Krista Hansing, Editors, who kept things moving along; and finally the Technical Reviewers, Robert Brunner and Andy Indovina, who did a great job of checking everything. Thanks, everyone, for all your much-appreciated hard work.
As the reader of this book, you are the most important critic and commentator. We value your opinion and want to know what we're doing right, what we could do better, what areas you'd like to see us publish in, and any other words of wisdom you're willing to pass our way.
As the Executive Editor for the Networking team at New Riders Publishing, I welcome your comments. You can fax, email, or write me directly to let me know what you did or didn't like about this book as well as what we can do to make our books stronger.
Please note that I cannot help you with technical problems related to the topic of this book, and that due to the high volume of mail I receive, I might not be able to reply to every message.
When you write, please be sure to include this book's title and author as well as your name and phone or fax number. I will carefully review your comments and share them with the author and editors who worked on the book.
Fax: 317-581-4663
Email: nrfeedback@newriders.com
Mail:
Welcome to Inside XML. This book is designed to be as comprehensive and as accessible as possible for a single book on XML. XML is a standard, not an implementation, and it has become an umbrella for a great number of topics. You'll find XML just about everywhere you look on the Internet today, and even in many places behind the scenes (such as internally in Netscape Navigator). I believe that this book provides more complete coverage of what's going on in XML than any other XML book today.
You'll find coverage of all the official XML standards here. I'll also take a look at many of the most popular and important implementations of XML that are out there, and I'll put them to work in this book.
That's just part of the story; we'll also put XML to work in depth, pushing the envelope as far as it can go. The best way to learn any topic like XML is by example, and this is an example-oriented book. You'll find hundreds of tested examples here, ready to be used.
Writing XML is not some ordinary and monotonous task: It inspires artistry, devotion, passion, exaltation, and eccentricity, not to mention exasperation and frustration. I'll try to be true to that spirit and capture as much of the excitement and power of XML in this book as I can.
This book is designed to give you as much of the whole XML story as one book can hold. We'll not only see the full XML syntax, from the most basic to the most advanced, but we'll also dig into many of the ways in which XML is used.
Hundreds of real-world topics are covered in this book, including connecting XML to databases, both locally and on Web servers; styling XML for viewing in today's browsers; reading and using XML documents in browsers; creating your own graphically based browsers; and a great deal more.
Here's a sample of some of the topics in this book. Note that each of these topics has many subtopics (too many to list here):
The complete XML syntax
Well-formed XML documents
Valid XML documents
Document type definitions (DTDs)
Namespaces
The XML Document Object Model (DOM)
Canonical XML
XML schemas
Parsing XML with JavaScript
XML and data binding
XML and cascading style sheets (CSS)
XML and Java
DOM parsers
SAX parsers
Extensible Style Language (XSL) transformations
XSL formatting objects
XLinks
XPointers
XPath
XBase
XHTML 1.0 and 1.1
Resource Description Framework (RDF)
Channel Definition Format (CDF)
Vector Markup Language (VML)
Wireless Markup Language (WML)
Server-side XML with Java Server Pages (JSP), Active Server Pages (ASP), Java servlets, and Perl
This book starts with the basics. I do assume that you have some knowledge of HTML, but not necessarily much. We'll see how to create XML documents from scratch in this book, starting at the very beginning.
From there, we'll move up to see how to check the syntax of XML documents. The big attraction of XML is that you can define your own tags, such as the <DOCUMENT> and <GREETING> tags in this document, which we'll see early in Chapter 1, "Essential XML":
<?xml version="1.0" encoding="UTF-8"?>
<DOCUMENT>
    <GREETING>
        Hello From XML
    </GREETING>
    <MESSAGE>
        Welcome to the wild and woolly world of XML.
    </MESSAGE>
</DOCUMENT>
Because you can create your own tags in XML, it's also important to specify the syntax you want those tags to obey (for example, can a <MESSAGE> appear inside a <GREETING>?). XML puts a big emphasis on this, too, and there are two main ways to specify the syntax you want your XML to follow: XML document type definitions (DTDs) and XML schemas. We'll see how to create both.
And because you can make up your own tags in XML, it's also up to you to specify how they should be used; Netscape Navigator won't know, for example, that a <KILLER> tag marks a favorite book in your collection. Because it's up to you to determine what a tag actually means, handling your XML documents in programming is an important part of learning XML, despite what some second-rate XML books try to claim. The two languages I'll use in this book are JavaScript and Java; before using them, I'll introduce them in special sections with plenty of examples, so even if you're not familiar with these languages, you won't have to go anywhere else to get the skills you need.
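To give a rough idea of what checking that syntax can look like in code, here is a minimal sketch in Java. It is an illustration only: the DTD is a hypothetical one made up for the greeting document above, the class name GreetingValidator is invented, and the sketch assumes the JAXP parser classes bundled with current JDKs rather than any particular parser used later in this book.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

public class GreetingValidator {

    // Hypothetical internal DTD: a DOCUMENT holds one GREETING followed by
    // one MESSAGE, and each of those may contain only text.
    static final String DTD =
        "<!DOCTYPE DOCUMENT [\n"
      + "  <!ELEMENT DOCUMENT (GREETING, MESSAGE)>\n"
      + "  <!ELEMENT GREETING (#PCDATA)>\n"
      + "  <!ELEMENT MESSAGE (#PCDATA)>\n"
      + "]>\n";

    /** Parses the element content against the DTD and returns any validity errors. */
    public static List<String> validate(String body) {
        final List<String> errors = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // turn on DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { }
                // Validity problems are reported here without stopping the parse.
                @Override public void error(SAXParseException e) { errors.add(e.getMessage()); }
                // Well-formedness problems are fatal and abort the parse.
                @Override public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            String xml = "<?xml version=\"1.0\"?>\n" + DTD + body;
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse document", e);
        }
        return errors;
    }

    public static void main(String[] args) {
        String good = "<DOCUMENT><GREETING>Hello From XML</GREETING>"
                    + "<MESSAGE>Welcome to the wild and woolly world of XML.</MESSAGE></DOCUMENT>";
        // Here a MESSAGE appears inside a GREETING, which this DTD does not allow.
        String bad = "<DOCUMENT><GREETING>Hello <MESSAGE>nested</MESSAGE></GREETING></DOCUMENT>";
        System.out.println("Errors in first document:  " + validate(good));
        System.out.println("Errors in second document: " + validate(bad));
    }
}
```

Under this particular DTD, the answer to the question above is no: the second document is well-formed, but the parser reports it as invalid because a <MESSAGE> is nested inside a <GREETING>.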
The major browsers today are becoming more XML-aware, and they use scripting languages to let you work with your XML documents. We'll be using the most popular and powerful of those scripting languages here, JavaScript. Using JavaScript, we'll be able to read XML documents directly in browsers such as Internet Explorer.
It's also important to know how to handle XML outside browsers, because there are plenty of things that JavaScript can't handle. These days, most XML development is taking place in Java, and an endless arsenal of Java resources is available for free on the Internet. In fact, the connection between Java and XML is a natural one, as we'll see. We'll use Java to read XML documents and interpret them, starting in Chapter 11, "Java and the XML DOM." That doesn't mean that you have to be a Java expert (far from it, in fact), because I'll introduce all the Java we'll need right here in this book. And because most XML development is done in Java today, we're going to find a wealth of tools here, ready for use.
You can also design your XML documents to be displayed directly in some modern browsers, and I'll take a look at doing that in two ways: with cascading style sheets (CSS) and the Extensible Style Language (XSL). Using CSS and XSL, you can indicate exactly how a tag that you make up, such as <PANIC> or <BIG_AND_BOLD> or <AZURE_UNDERLINED_TEXT>, should be displayed. I'll take a look at both parts of XSL (XSL transformations and formatting objects) in depth.
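As a small taste of reading an XML document from Java, the following sketch loads the greeting document into a DOM tree and pulls out the text of an element by tag name. Treat it as a sketch under assumptions: the class name GreetingReader and the helper firstText are invented for illustration, and it relies on the JAXP and W3C DOM classes that ship with current JDKs, which may differ from the parser packages the Java chapters use.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class GreetingReader {

    /** Returns the trimmed text of the first element with the given tag name, or null if there is none. */
    public static String firstText(String xml, String tagName) {
        try {
            // Build an in-memory DOM tree from the XML text.
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // Collect every element in the document with the requested name.
            NodeList matches = doc.getElementsByTagName(tagName);
            return matches.getLength() == 0
                    ? null
                    : matches.item(0).getTextContent().trim();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse document", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                   + "<DOCUMENT>\n"
                   + "  <GREETING>\n    Hello From XML\n  </GREETING>\n"
                   + "  <MESSAGE>\n    Welcome to the wild and woolly world of XML.\n  </MESSAGE>\n"
                   + "</DOCUMENT>";
        System.out.println(firstText(xml, "GREETING")); // prints "Hello From XML"
        System.out.println(firstText(xml, "MESSAGE"));  // prints "Welcome to the wild and woolly world of XML."
    }
}
```

The parser only hands back the tree; deciding what a <GREETING> or a <MESSAGE> actually means is left to the program that reads it.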
In addition, we'll see all the XML specifications in this book, such as XLinks, XBase, and XPointers, which enable you to point to particular items in XML documents in very specific ways. The XML specifications are made by a body called the World Wide Web Consortium (W3C); we'll become very familiar with those specifications here, seeing what they say and seeing what they lack.
I'll wind up the book by taking a look at a number of the most popular uses of XML on the Internet in several chapters. XML is really a language for defining languages, and there are hundreds of such XML-based languages out there now. Some of them are gaining great popularity, and I'll cover them in some depth in this book.
An astonishing wealth of material on XML is available on the Internet today, so I'll also fill this book with the URIs of dozens of those resources (in XML, you use uniform resource identifiers, not URLs, although in practice they are the same thing for most purposes today). In nearly every chapter, you'll find lists of free online programs and other resources. (However, there's a hazard here that I should mention: URIs change frequently on the Internet, so don't be surprised if some of these URIs have changed by the time you look for them.)
This book is designed for just about anyone who wants to learn XML and how it's used today in the real world. The only assumption that I make is that you have some knowledge of how to create documents using Hypertext Markup Language (HTML). You don't have to be any great HTML expert, but a little knowledge of HTML will be helpful. That's really all you need.
However, it's a fact of life that most XML software these days is targeted at Windows. Among other things, that means you should have access to Windows for many of the topics covered in this book. In Chapters 7, "Handling XML Documents with JavaScript," and 8, "XML and Data Binding," we'll be taking a look at the XML support in Internet Explorer. I wish there were more support for the other operating systems that I like, such as UNIX, but right now a lot of it is Windows-only. I'll explore alternatives when I can. One hopeful note for the future is that more Java-based XML tools are appearing daily, and those tools are platform-independent.
This book is written at several different levels, from basic to advanced, because the XML spectrum is so broad. The general rule is that this book is pitched at about the same level as a typical HTML book. We start at the basic level and gradually get more advanced in a slow, steady way.
When you start this book, I'm not going to assume that you have any programming knowledge, at least until we get to the advanced topics in Chapter 20, "WML, ASP, JSP, Servlets, and Perl," such as Java Server Pages and using Perl with XML. We'll be using both JavaScript and Java in this book, but all you need to know about those languages will be introduced before we use them, and it won't be hard to pick up.
Because there are so many uses of XML available today, this book involves many different software packages; all the ones I'll put to work in the text are free to download from the Internet, and I'll tell you where to get them.
I use several conventions in this book that you should be aware of. The most important one is that when I add new sections of code, they'll be highlighted with shading to point out the actual lines I'm discussing so that they stand out. (This sample is written in one of the languages built on XML, the Wireless Markup Language [WML], which is targeted at "microbrowsers" in cellular phones and personal digital assistants [PDAs].)
<?xml version="1.0"?>
<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.1//EN"
    "http://www.wapforum.org/DTD/wml_1.1.xml">
<wml>
    <card id="Card1" title="First WML Example">
        <!-- This is a comment -->
        <p>
            Greetings from WML.
        </p>
    </card>
</wml>
Also, where there's something worth noting or some additional information that adds something to the discussion, I'll add a sidebar. That looks like this:
More on SOAP
With a common name like SOAP, it's hard to search the Internet for more information about the Simple Object Access Protocol unless you're really into pages on personal cleanliness and daytime television. For more information, you might check out this starter list: http://msdn.microsoft.com/xml/general/soapspec.asp, http://www.oasis-open.org/cover/soap.html, http://www.develop.com/soap/, and http://www.develop.com/soap/soapfaq.xml.
Finally, many discussions in the text contain syntax examples like this:
-config file
When using a command or switch shown in a syntax example, substitute the correct value for the characters in italic monospace. With the switch above, for example, you would substitute the correct configuration filename for file.
We're ready to go. If you have comments, I encourage you to write to me, care of New Riders. This book is designed to be the new standard in XML coverage, truly more complete and more accessible than ever before. Please do keep in touch with me about ways to improve it and keep it on the forefront. If you think the book lacks anything, let me know; I'll add it, because I want to make sure that this book stays on top.