Validating an XML Document

Credit: Mauro Cicio

Problem

You want to check whether an XML document conforms to a certain schema or DTD.

Solution

Unfortunately, as of this writing there are no stable, pure Ruby libraries that do XML validation. You'll need to install a Ruby binding to a C library. The easiest one to use is the Ruby binding to the GNOME libxml2 toolkit. (There are actually two Ruby bindings to libxml2, so don't get confused: we're referring to the one you get when you install the libxml-ruby gem.)

To validate a document against a DTD, create a DTD object and pass it into Document#validate. To validate against an XML Schema, pass in a Schema object instead.

Consider the following DTD, for a cookbook like this one:

	require 'rubygems'
	require 'libxml'

	dtd = XML::Dtd.new(%{
	
	
	
	
	
	})

Here's an XML document that looks like it conforms to the DTD:

	open('cookbook.xml', 'w') do |f|
	 f.write %{
	
	 
	 A recipe
	 A difficult/common problem
	 A smart solution
	 A deep solution
	 Pointers
	 
	
	}
	end

But does it really? We can tell for sure with Document#validate:

	document = XML::Document.file('cookbook.xml')
	document.validate(dtd) # => true
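The markup inside the DTD and document listings above appears to have been lost from this copy. As a rough sketch only, a DTD and a matching document could look something like the following. The element names (rubycookbook, recipe, title, problem, solution, discussion, seealso) are assumptions inferred from the text that survived, not the original listing. The validation step at the end runs only if the libxml binding from the Solution has already been loaded.

```ruby
# Hypothetical DTD: every element name here is an assumption.
cookbook_dtd = <<~DTD
  <!ELEMENT rubycookbook (recipe+)>
  <!ELEMENT recipe (title, problem, solution, discussion, seealso)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT problem (#PCDATA)>
  <!ELEMENT solution (#PCDATA)>
  <!ELEMENT discussion (#PCDATA)>
  <!ELEMENT seealso (#PCDATA)>
DTD

# A document written to match the hypothetical DTD above.
cookbook_xml = <<~XML
  <?xml version="1.0"?>
  <rubycookbook>
    <recipe>
      <title>A recipe</title>
      <problem>A difficult/common problem</problem>
      <solution>A smart solution</solution>
      <discussion>A deep solution</discussion>
      <seealso>Pointers</seealso>
    </recipe>
  </rubycookbook>
XML

# Validate only when the libxml binding (with its XML::Dtd and
# XML::Document classes) is already loaded; otherwise do nothing.
if defined?(XML::Dtd) && defined?(XML::Document)
  File.write('cookbook.xml', cookbook_xml)
  document = XML::Document.file('cookbook.xml')
  puts document.validate(XML::Dtd.new(cookbook_dtd))
end
```

Each child element is declared as #PCDATA because the sample document holds only text inside them; a real cookbook DTD would likely allow richer content.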

Here's a Schema definition for the same document. We can validate the document against the schema by making it into a Schema object and passing that into Document#validate:

	schema = XML::Schema.from_string %{

	
	 

	 
	 
	 
	 
	 
	 
	 

	 
	 
	 
	 
	 
	 
	 
	 
	 
	 
	 
	 
	 
	

	
	}

	document.validate(schema) 	# => true

Discussion

Programs that use XML validation are more robust and less complicated than nonvalidating versions. Before starting work on a document, you can check whether or not it's in the format you expect. Most services that accept XML as input don't have forgiving parsers, so you must validate your document before submitting it, or it might fail without you even noticing.
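The "check first, then work" pattern can be sketched as a small guard. The helper name with_valid_document and the stand-in document class are hypothetical, introduced only for illustration. The sketch assumes validate returns true or false as in the Solution; depending on the version of the binding, a failed validation may raise an error instead.

```ruby
# Hypothetical helper: yields the document only if it validates.
# `document` is anything that responds to #validate (for example an
# XML::Document); `validator` is a DTD or Schema object.
def with_valid_document(document, validator)
  unless document.validate(validator)
    raise ArgumentError, 'document does not conform to its DTD or schema'
  end
  yield document
end

# Stand-in for a parsed document, so the sketch runs without libxml.
fake_document_class = Struct.new(:valid) do
  def validate(_validator)
    valid
  end
end

with_valid_document(fake_document_class.new(true), :some_dtd) do |_doc|
  puts 'document accepted, safe to process'
end
```

Keeping the check in one place means the processing code never has to handle a document in an unexpected shape.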

One of the most popular and complete XML libraries around is the GNOME Libxml2 library. Despite its name, it works fine outside the GNOME platform, and has been ported to many different OSes. The Ruby project libxml (http://libxml.rubyforge.org) is a Ruby wrapper around the GNOME Libxml2 library. The project is not yet in a mature state, but it's very active and the validation features are definitely usable. Not only does libxml support validation and a complete range of XML manipulation techniques, it can also improve your program's speed by an order of magnitude, since it's written in C, whereas REXML is pure Ruby.

Don't confuse the libxml project with the libxml library. The latter is part of the XML::Tools project. It binds against the GNOME Libxml2 library, but it doesn't expose that library's validation features. If you try the example code above but can't find the XML::Dtd or the XML::Schema classes, then you've got the wrong binding. If you installed the libxml-ruby package on Debian GNU/Linux, you've got the wrong one. You need the one you get by installing the libxml-ruby gem. Of course, you'll need to have the actual GNOME Libxml2 library installed as well.
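One way to find out which binding you have is to test for the two validation classes directly. This is a minimal sketch: the method name validating_binding? is hypothetical, and the check assumes only what the text states, namely that the validating binding defines Dtd and Schema inside its XML namespace.

```ruby
# Hypothetical check: does this namespace expose both validation classes?
def validating_binding?(namespace)
  %i[Dtd Schema].all? { |name| namespace.const_defined?(name, false) }
end

# Report on whichever binding is loaded; do nothing if none is.
if defined?(XML)
  puts validating_binding?(XML) ? 'validation available' : 'wrong binding'
end
```

Running a check like this once at startup gives a clearer message than a NameError raised deep inside the validation code.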

See Also

  • The Ruby libxml project page (http://www.rubyforge.org/projects/libxml)
  • The other Ruby libxml binding (the one that doesn't do validation) is part of the XML::Tools project (http://rubyforge.org/projects/xml-tools/); don't confuse the two!
  • The GNOME libxml project homepage (http://xmlsoft.org/)
  • Refer to http://www.w3.org/XML for the difference between a DTD and a Schema


Ruby Cookbook
Ruby Cookbook (Cookbooks (O'Reilly))
ISBN: 0596523696