LINQ to XML API


The LINQ to XML API is independent from LINQ to XML queries, and it allows developers to build and manage XML contents regardless of whether they will query them with LINQ and extension methods. You can use this API as a stand-alone utility or in conjunction with LINQ queries. This new API is built with World Wide Web Consortium (W3C) XML Infoset instances in mind, rather than just XML 1.0 documents. Therefore, the in-memory tree is the objective of this API, not the bare XML text file.

Note 

W3C defines XML Infoset as a set of information items that describes the structure of any well-formed XML document. You can think of an XML Infoset as the in-memory node graph description, corresponding to an XML document, aside from the physical nature of the document itself. For further details on XML Infoset, read the W3C Recommendation: http://www.w3.org/TR/xml-infoset/.

The goal of the LINQ to XML API is to provide an object-oriented approach to XML construction and management, avoiding or solving many common issues related to XML manipulation through the W3C DOM. With LINQ to XML, the approach to XML is no longer document-centric, as it is in the W3C DOM. Elements can be created and can exist detached from any document, namespace usage has been simplified, and traversing the in-memory tree is like scanning any other object graph. To make all of this possible, the API is based on a set of classes, all with names prefixed by an X (which we will often refer to as X* classes in this chapter), that correspond to the main nodes of an XML document. In Figure 6-1, you can see the object model hierarchy.

Figure 6-1: The object model hierarchy of main X* classes

To start using this API, you must reference the System.Xml.Linq assembly and use its classes. The following sections describe the main types defined in System.Xml.Linq.

XElement

This is one of the main classes of the LINQ to XML API. As you can see from Figure 6-1, it has the same hierarchical level as the XDocument class and is derived from the base XNode class, through XContainer. As its name suggests, it describes an XML element and can be used as the container of any XML fragment parented to a tag. It provides many constructors and static methods, some of which are very useful. For instance, we can load the content of an XElement from an existing XmlReader instance to reuse existing code based on System.Xml classes by using the static Load method of XElement. Constructors such as the following create XML node graphs using functional construction:

 public XElement(XName name);
 public XElement(XElement other);
 public XElement(XName name, Object content);
 public XElement(XName name, params Object[] content);

The params Object[] parameter of the last constructor accepts an optional list of child nodes, attributes, or both, for the element we are defining. For instance, an XElement named customer, with a child element named firstName, can be defined by using the code in Listing 6-3.

Listing 6-3: A sample XElement constructed using the LINQ to XML API

  XElement tag = new XElement("customer",
      new XElement("firstName", "Paolo"));

Using a standard DOM approach, we would have to define an XmlDocument instance, explicitly create the elements, and append each child node to its parent. Take a look at the code block in Listing 6-4 to compare a DOM approach with the functional construction we have just used.

Listing 6-4: Definition of an XML element using DOM

  XmlDocument doc = new XmlDocument();
  XmlElement customerElement = doc.CreateElement("customer");
  XmlElement firstNameElement = doc.CreateElement("firstName");
  firstNameElement.InnerText = "Paolo";
  customerElement.AppendChild(firstNameElement);
  doc.AppendChild(customerElement);

As you can see, the DOM approach is verbose and difficult to understand. Probably the easiest way to define this customer element is to use Visual Basic 9.0 XML literals, as demonstrated in Listing 6-5.

Listing 6-5: Definition of an XML element using Visual Basic 9.0 XML literals

  Dim customerName As String = "Paolo"
  Dim tag As XElement = _
    <customer>
      <firstName><%= customerName %></firstName>
    </customer>

As you saw in Chapter 3, this syntax will be translated by the Visual Basic 9.0 compiler into the equivalent functional construction.

XElement instances can also be saved to a file (identified by a String path), an XmlWriter, or a TextWriter. Every XElement also allows its content to be read with a direct cast, through a set of custom explicit conversion operators that return a typed version of the Value of the element. Compared to a classic System.Xml.XmlElement, this is a great improvement because we can manage XML nodes typed from a .NET point of view with a value-centric approach. To better understand this concept, consider the sample code in Listing 6-6.

Listing 6-6: Sample of explicit type casting using XElement content

  XElement order = new XElement("order",
      new XElement("quantity", 10),
      new XElement("price", 50),
      new XAttribute("idProduct", "P01"));
  Decimal orderTotalAmount =
      (Decimal)order.Element("quantity") *
      (Decimal)order.Element("price");
  Console.WriteLine("Order total amount: {0}", orderTotalAmount);

Here we use an XElement that describes an order. Imagine that we received this instance of the order from an order management system rather than constructing it explicitly by code. As you can see, we extract the elements named quantity and price and we convert them to a Decimal type. The conversion will return the inner Value of each element node, trying to cast it to Decimal. To handle the case of invalid content, we need to catch a FormatException, because the various Explicit operator overloads internally use XmlConvert from System.Xml or Parse methods of .NET types.
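As a minimal sketch of that error handling, assuming an order element shaped like the one in Listing 6-6, the casts can be wrapped so that malformed content does not escape as an exception. The OrderReader class and the TryGetTotal helper are illustrative names, not part of the API. The sketch also relies on the nullable overloads of the conversion operators, which return null instead of throwing when the element is missing:

```csharp
using System;
using System.Xml.Linq;

public static class OrderReader
{
    // Hypothetical helper: returns quantity * price, or null when either
    // element is missing or does not contain a valid decimal value.
    public static decimal? TryGetTotal(XElement order)
    {
        try
        {
            // Casting a missing (null) element to decimal? yields null
            decimal? quantity = (decimal?)order.Element("quantity");
            decimal? price = (decimal?)order.Element("price");
            return quantity * price;
        }
        catch (FormatException)
        {
            // Raised when the element text is not a valid decimal
            return null;
        }
    }

    public static void Main()
    {
        XElement valid = new XElement("order",
            new XElement("quantity", 10),
            new XElement("price", 50));
        XElement invalid = new XElement("order",
            new XElement("quantity", "ten"),
            new XElement("price", 50));

        Console.WriteLine(TryGetTotal(valid).HasValue);
        Console.WriteLine(TryGetTotal(invalid).HasValue);
    }
}
```

Returning a nullable value keeps the caller free to decide whether a missing or malformed order is an error or simply something to skip.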

Finally, note that the XElement constructor automatically handles XML encoding of text. Consider Listing 6-7.

Listing 6-7: Sample of explicit escaping of XML text

  XElement notes = new XElement("notes",
      "Some special characters like & > < <div/> etc.");

The text is escaped automatically when the element is serialized, and the result looks like the following:

 <notes>Some special characters like &amp; &gt; &lt; &lt;div/&gt; etc.</notes>

Also, node names are checked against XML naming rules, and invalid names are rejected by throwing a System.Xml.XmlException. (For further details, see XSD types Name and NMTOKEN on the W3C Web site at: http://www.w3.org.) This behavior is different from that of the old XmlWriter, where names were automatically encoded. Frankly, we think that it is better to make developers aware of syntactic rules rather than always hide them under the covers. However, if you want to define “irregular” node names with LINQ to XML, you can use the XmlConvert class, invoking its EncodeName or EncodeNmToken method to encode a name or a name token, respectively.
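The following sketch illustrates both behaviors: the rejection of an invalid name and the use of XmlConvert.EncodeName to make it acceptable. The NameChecks class and its IsAcceptedName helper are hypothetical names introduced only for this example:

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

public static class NameChecks
{
    // Hypothetical helper: true when LINQ to XML accepts the element name.
    public static bool IsAcceptedName(string name)
    {
        try
        {
            new XElement(name);
            return true;
        }
        catch (XmlException)
        {
            // Thrown when the name violates XML naming rules
            return false;
        }
    }

    public static void Main()
    {
        string irregular = "first name";   // a space is not allowed in an XML name
        Console.WriteLine(IsAcceptedName(irregular));

        // EncodeName replaces each invalid character with an _xHHHH_ escape
        string encoded = XmlConvert.EncodeName(irregular);
        Console.WriteLine(encoded);
        Console.WriteLine(IsAcceptedName(encoded));
    }
}
```

A consumer of the resulting XML can recover the original text with XmlConvert.DecodeName.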

XDocument

The XDocument class represents an XML Infoset document instance. We can create document instances starting from a params Object[] list of objects of the following types: XElement, XDeclaration, XProcessingInstruction, XDocumentType, and XComment.

Surprisingly, XDocument does not have a constructor with a parameter of type XmlReader, Stream, or anything else that describes a source file or Uniform Resource Identifier (URI). Instead, XDocument, like XElement, provides a set of static Load methods that accept a String (the URI of the source), an XmlReader, or a TextReader. To persist an XDocument instance, the class provides a set of Save methods that accept a String (the target file name), an XmlWriter, or a TextWriter. Generally, an XDocument instance is useful whenever you need to create processing instructions or document type declarations on top of the XML document; otherwise, XElement is a better choice and is easier to use.
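To make the Load and Save pairing concrete, here is a small sketch that builds a document with a declaration and a comment, saves it to a TextWriter, and loads it back from a TextReader. The DocumentRoundTrip class and its method names are assumptions made for illustration, and an in-memory StringWriter stands in for a real file:

```csharp
using System;
using System.IO;
using System.Xml.Linq;

public static class DocumentRoundTrip
{
    // Builds a small document that carries a declaration and a comment.
    public static XDocument Build()
    {
        return new XDocument(
            new XDeclaration("1.0", "utf-8", "yes"),
            new XComment("Sample customer document"),
            new XElement("customer",
                new XElement("firstName", "Paolo")));
    }

    // Saves the document to a TextWriter and loads it back from a TextReader.
    public static XDocument RoundTrip(XDocument document)
    {
        StringWriter writer = new StringWriter();
        document.Save(writer);
        using (StringReader reader = new StringReader(writer.ToString()))
        {
            return XDocument.Load(reader);
        }
    }

    public static void Main()
    {
        XDocument copy = RoundTrip(Build());
        Console.WriteLine(copy.Root.Name);
        Console.WriteLine(copy.Root.Element("firstName").Value);
    }
}
```

The same two calls work with a file path or an XmlReader/XmlWriter pair in place of the string-based reader and writer.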

Important 

As we have already seen, Visual Basic 9.0 XML literals are parsed by the Visual Basic 9.0 compiler to generate standard LINQ to XML API syntax. During this parsing phase, the compiler supports a subset of constructors provided by the various LINQ to XML types. For instance, whenever you need to create an XDocument using Visual Basic 9.0 XML literals, the only constructor supported is the one that requires a first argument of type XDeclaration (that is, the XML declaration, which is written like a processing instruction) at the top of the document. Any XML literal missing the leading XDeclaration will be assumed to be an XElement instance.

XAttribute

This class represents an XML attribute instance and can be added to any XElement by passing it to the element constructor through LINQ to XML functional construction. Notice that the XAttribute class is independent from XNode and, consequently, from XElement and XDocument. It has only the base XObject class in common with all other X* classes. Like the XElement class, it provides a rich set of conversion operators so that it can provide its content already typed from a .NET point of view. From a practical point of view, working with attributes is quite similar to working with elements. However, from an internal point of view, attributes are handled as name/value pairs mapped to the container element. Each XAttribute provides a couple of properties, called NextAttribute and PreviousAttribute, that are useful for browsing the sequence of attributes of an element.
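A short sketch of both features follows: a typed read through an explicit conversion operator, and a walk over the attribute sequence with NextAttribute. The AttributeWalk class and the AttributeNames helper are illustrative names only:

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

public static class AttributeWalk
{
    // Collects attribute names by following NextAttribute from FirstAttribute.
    public static List<string> AttributeNames(XElement element)
    {
        List<string> names = new List<string>();
        for (XAttribute a = element.FirstAttribute; a != null; a = a.NextAttribute)
        {
            names.Add(a.Name.LocalName);
        }
        return names;
    }

    public static void Main()
    {
        XElement order = new XElement("order",
            new XAttribute("idProduct", "P01"),
            new XAttribute("quantity", 10));

        // Typed read of the attribute value through the explicit operator
        int quantity = (int)order.Attribute("quantity");
        Console.WriteLine(quantity);
        Console.WriteLine(string.Join(", ", AttributeNames(order).ToArray()));
    }
}
```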

XNode

XNode is the base class for many of the X* classes, and it implements the entire tree-node management infrastructure, providing methods to add, move, remove, and replace nodes within the XML Infoset. For instance, the AddAfterSelf and AddBeforeSelf methods are useful for inserting one or more nodes after or before the current one. Listing 6-8 provides an example: it shows how to use AddAfterSelf to insert a couple of addresses into the previously seen customer, just after the first address.

Listing 6-8: Sample usage of the AddAfterSelf method of XNode

  XElement customer = XElement.Load(@"..\..\customer.xml");
  XElement firstAddress =
      (customer.Descendants("addresses").Elements("address")).First();
  firstAddress.AddAfterSelf(
      new XElement("address",
          new XAttribute("type", "IT-blog"),
          "http://blogs.devleap.com/"),
      new XElement("address",
          new XAttribute("type", "US-blog"),
          "http://weblogs.asp.net/PaoloPia/"));

As you can see, we can add a set of nodes because these methods provide a couple of overloads, which are shown here:

 public void AddAfterSelf(Object content);
 public void AddBeforeSelf(Object content);
 public void AddAfterSelf(params Object[] content);
 public void AddBeforeSelf(params Object[] content);

The first two overloads in the preceding list require a single parameter of type Object, while the second two overloads accept a params Object[] variable list of parameters. You might be wondering why these methods, like many of the previously seen constructors, accept the type Object instead of XNode or any other X* class instance. The answer is quite simple but very interesting: whenever we provide an object to methods and constructors of X* classes, the API checks its type. X* nodes and attributes are added as they are, and a String becomes text content. Any other object that implements IEnumerable is enumerated and its items are handled recursively; anything else is converted to a String by calling its ToString() implementation. Null parameters are simply ignored.
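The rules just described can be observed with a small sketch that passes a null, a non-X* value, and a sequence to the same constructor. The ContentRules class and its Build method are hypothetical names used only for this illustration:

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class ContentRules
{
    public static XElement Build()
    {
        XElement[] children = { new XElement("a"), new XElement("b") };
        return new XElement("root",
            null,        // ignored
            42,          // neither an X* object nor a sequence: stored as text
            children);   // a sequence: each item is added in turn
    }

    public static void Main()
    {
        XElement root = Build();
        Console.WriteLine(root.Nodes().Count());      // text node plus two elements
        Console.WriteLine(root.Elements().Count());
        Console.WriteLine(root.ToString(SaveOptions.DisableFormatting));
    }
}
```

This is the same mechanism that lets a LINQ query be passed directly as the content of an element.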

Because sequences are accepted as content, we can combine functional construction with LINQ queries in C# to generate a set of nodes, as in the following code block. In Listing 6-9, we use the well-known customers sequence (which we used in Chapter 4, “LINQ Syntax Fundamentals”) to build an XML document based on those customers.

Listing 6-9: A LINQ to XML sentence merged with LINQ queries

  XElement xmlCustomers = new XElement("customers",
      from   c in customers
      where  c.Country == Countries.Italy
      select new XElement("customer",
                 new XAttribute("name", c.Name),
                 new XAttribute("city", c.City),
                 new XAttribute("country", c.Country)));

The result looks like the following XML document:

 <?xml version="1.0" encoding="utf-8"?>
 <customers>
   <customer name="Paolo" city="Brescia" country="Italy" />
   <customer name="Marco" city="Torino" country="Italy" />
 </customers>

The same result can be achieved by using Visual Basic 9.0 XML literals with the code shown in Listing 6-10.

Listing 6-10: A LINQ to XML sentence merged with LINQ queries, using Visual Basic 9.0 XML literals

  Dim xmlCustomers As XElement = _
    <customers>
      <%= From c In customers _
          Where (c.Country = Countries.Italy) _
          Select _
          <customer name=<%= c.Name %>
                    city=<%= c.City %>
                    country=<%= c.Country %>/> %>
    </customers>

Another interesting method provided by XNode is DeepEquals. It is a static method, useful for fully comparing a couple of XML nodes for equality, as the name suggests. It works by comparing nodes using an internal abstract instance method, also called DeepEquals. In this way, every type inherited from XNode implements its own DeepEquals behavior. For example, XElement compares element names, element content, and element attributes. The XNodeEqualityComparer class that we will use later in this chapter, within LINQ to XML queries, is based on DeepEquals.
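The difference between reference equality and the structural comparison can be sketched as follows. The DeepEqualsSketch class and the MakeCustomer helper are illustrative names; the comparison calls are XNode.DeepEquals and XNodeEqualityComparer from System.Xml.Linq:

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class DeepEqualsSketch
{
    // Hypothetical factory: builds a fresh customer tree on every call.
    public static XElement MakeCustomer(string firstName)
    {
        return new XElement("customer",
            new XAttribute("id", "C01"),
            new XElement("firstName", firstName));
    }

    public static void Main()
    {
        XElement first = MakeCustomer("Paolo");
        XElement second = MakeCustomer("Paolo");
        XElement third = MakeCustomer("Marco");

        Console.WriteLine(first == second);                  // reference comparison
        Console.WriteLine(XNode.DeepEquals(first, second));  // structural comparison
        Console.WriteLine(XNode.DeepEquals(first, third));

        // The comparer applies the same structural comparison inside a query
        XNode[] all = { first, second, third };
        Console.WriteLine(all.Distinct(new XNodeEqualityComparer()).Count());
    }
}
```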

XName and XNamespace

When defining XML contents and node graphs, usually you must also map nodes to their XML namespace. In Listing 6-11, you can see how to define nodes with an XML namespace by using a classic DOM approach.

Listing 6-11: XML namespace handling using classic DOM syntax

  XmlDocument document = new XmlDocument();
  XmlElement customer = document.CreateElement("c", "customer",
      "http://schemas.devleap.com/Customer");
  document.AppendChild(customer);
  XmlElement firstName = document.CreateElement("c", "firstName",
      "http://schemas.devleap.com/Customer");
  customer.AppendChild(firstName);

As you can see, we use an overload of the CreateElement method, which requires three parameters: a namespace prefix, a tag local name, and the full namespace URI. The same can be done for XML attributes, using CreateAttribute of XmlDocument or SetAttribute of XmlElement. To tell the truth, this way of working is not all that difficult to understand and implement. Nevertheless, developers often get confused when using this approach and complain that XML namespaces are difficult to manage. The real issue probably derives from namespace prefixes, which are just aliases for the real XML namespaces. Theoretically, prefixes are used to simplify namespace references; in reality, they might cause confusion.

To address feedback from developers, the LINQ to XML API was designed to provide an easier way of working with XML namespaces, avoiding any explicit use of prefixes. Every node name is an instance of the XName class, which can be defined by a String or by a pairing of an XNamespace and a String. In Listing 6-12, you can see how to define XML content by using a single default XML namespace.

Listing 6-12: LINQ to XML namespace declaration

  XNamespace ns = "http://schemas.devleap.com/Customer";
  XElement customer = new XElement(ns + "customer",
      new XAttribute("id", "C01"),
      new XElement(ns + "firstName", "Paolo"),
      new XElement(ns + "lastName", "Pialorsi"));

As you can see, the XNamespace definition looks like a String, but it is not: the String literal is converted to an XNamespace instance by an implicit conversion operator. Here is the output of the preceding code:

 <?xml version="1.0" encoding="utf-8"?>
 <customer id="C01" xmlns="http://schemas.devleap.com/Customer">
   <firstName>Paolo</firstName>
   <lastName>Pialorsi</lastName>
 </customer>

Using Visual Basic 9.0 syntax, we can define the namespace directly inside the XML content, as Listing 6-13 shows.

Listing 6-13: Visual Basic 9.0 XML literals used to declare XML content with a default XML namespace

  Dim customer As XDocument = _
    <?xml version="1.0" encoding="utf-8"?>
    <customer xmlns="http://schemas.devleap.com/Customer">
      <firstName>Paolo</firstName>
      <lastName>Pialorsi</lastName>
    </customer>

Now consider Listing 6-14, where we use a couple of XML namespaces.

Listing 6-14: Multiple XML namespaces within a single XElement declaration

  XNamespace nsCustomer = "http://schemas.devleap.com/Customer";
  XNamespace nsAddress = "http://schemas.devleap.com/Address";
  XElement customer = new XElement(nsCustomer + "customer",
      new XAttribute("id", "C01"),
      new XElement(nsCustomer + "firstName", "Paolo"),
      new XElement(nsCustomer + "lastName", "Pialorsi"),
      new XElement(nsAddress + "addresses",
          new XElement(nsAddress + "address",
              new XAttribute("type", "email"),
              "paolo@devleap.it"),
          new XElement(nsAddress + "address",
              new XAttribute("type", "home"),
              "Brescia - Italy")));

Again, the output is a document with all qualified XML nodes:

 <?xml version="1.0" encoding="utf-8"?>
 <customer id="C01" xmlns="http://schemas.devleap.com/Customer">
   <firstName>Paolo</firstName>
   <lastName>Pialorsi</lastName>
   <addresses xmlns="http://schemas.devleap.com/Address">
     <address type="email">paolo@devleap.it</address>
     <address type="home">Brescia - Italy</address>
   </addresses>
 </customer>

At this point, we have seen that XNamespace is quite simple to use and that the LINQ to XML API automatically handles namespace declaration, avoiding the explicit use of prefixes. You are probably curious about what happens when we define an XName as a concatenation of an XNamespace instance and a String to represent the local name of the node. Each XName instance can be represented as a String, using its ToString method:

 Console.WriteLine(customer.Name.ToString());

Here is the result of the preceding line of code:

 {http://schemas.devleap.com/Customer}customer

Let’s try to use this “resolved” text instead of the concatenation (XNamespace instance plus local name) used previously:

 XElement testCustomer =
     new XElement("{http://schemas.devleap.com/Customer}customer");
 Console.WriteLine(testCustomer.Name);

In the System.Xml.Linq API, the resolved text “{namespace}local-name” is called the “expanded name” and is semantically equivalent to defining the XNamespace separately. The concatenation of an XNamespace and a String produces a new XName equivalent to the expanded name.
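The equivalence between the three ways of obtaining the same XName can be checked with a short sketch. The ExpandedNames class and its Split helper are hypothetical names; XName.Get, LocalName, and NamespaceName are members of the System.Xml.Linq API:

```csharp
using System;
using System.Xml.Linq;

public static class ExpandedNames
{
    // Hypothetical helper: splits a name into its namespace URI and local part.
    public static string[] Split(XName name)
    {
        return new string[] { name.NamespaceName, name.LocalName };
    }

    public static void Main()
    {
        XNamespace ns = "http://schemas.devleap.com/Customer";

        XName concatenated = ns + "customer";
        XName expanded = "{http://schemas.devleap.com/Customer}customer";
        XName viaGet = XName.Get("customer", "http://schemas.devleap.com/Customer");

        // All three forms identify the same name
        Console.WriteLine(concatenated == expanded);
        Console.WriteLine(concatenated == viaGet);

        string[] parts = Split(expanded);
        Console.WriteLine(parts[0]);
        Console.WriteLine(parts[1]);
    }
}
```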

Now we are missing only XML namespace prefixes. We have seen that this new API handles namespace declaration by itself. However, sometimes we might need to influence how to serialize nodes and represent namespaces by overriding the default behavior of LINQ to XML. To achieve this goal, we can explicitly define the prefixes to use for namespaces by using xmlns attributes within our elements, as we do in the example in Listing 6-15.

Listing 6-15: LINQ to XML declaration of an XML namespace with a custom prefix

  XNamespace ns = "http://schemas.devleap.com/Customer";
  XElement customer = new XElement(ns + "customer",
      new XAttribute(XNamespace.Xmlns + "c", ns),
      new XAttribute("id", "C01"),
      new XElement(ns + "firstName", "Paolo"),
      new XElement(ns + "lastName", "Pialorsi"));

The output looks like the following:

 <?xml version="1.0" encoding="utf-8"?>
 <c:customer xmlns:c="http://schemas.devleap.com/Customer" id="C01">
   <c:firstName>Paolo</c:firstName>
   <c:lastName>Pialorsi</c:lastName>
 </c:customer>

As you can see, we defined “c” as the prefix of nodes associated with the XNamespace instance named ns.

Once again, the corresponding Visual Basic 9.0 syntax, which is also the simplest, is shown in Listing 6-16.

Listing 6-16: Visual Basic 9.0 XML literals used to declare an XML namespace with a custom prefix

  Dim customer As XDocument = _
    <?xml version="1.0" encoding="utf-8"?>
    <c:customer xmlns:c="http://schemas.devleap.com/Customer">
      <c:firstName>Paolo</c:firstName>
      <c:lastName>Pialorsi</c:lastName>
    </c:customer>
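Back in C#, the prefix declared through the xmlns attribute can also be read from the tree, which is a convenient way to verify what the serializer will use. The following sketch assumes an element built like the one in Listing 6-15; the PrefixLookup class and its Build method are illustrative names, while GetPrefixOfNamespace is a method of XElement:

```csharp
using System;
using System.Xml.Linq;

public static class PrefixLookup
{
    public static XElement Build()
    {
        XNamespace ns = "http://schemas.devleap.com/Customer";
        return new XElement(ns + "customer",
            new XAttribute(XNamespace.Xmlns + "c", ns),
            new XElement(ns + "firstName", "Paolo"));
    }

    public static void Main()
    {
        XElement customer = Build();
        XNamespace ns = customer.Name.Namespace;

        // The prefix comes from the xmlns:c attribute declared on the element
        Console.WriteLine(customer.GetPrefixOfNamespace(ns));
        // Child elements see the declaration made on their ancestor
        Console.WriteLine(customer.Element(ns + "firstName").GetPrefixOfNamespace(ns));
        Console.WriteLine(customer);
    }
}
```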

You might think that, starting from LINQ to XML, namespaces are simpler to handle and prefixes are transparently taken out of your control. On the other hand, you might now have the impression that if you need to influence prefixes, you need to do a little more work, at least using C# 3.0. In fact, Visual Basic 9.0 XML literals also simplify namespace declaration, leveraging a feature called global XML namespaces. This new feature allows you to globally declare an XML namespace URI with its corresponding prefix within a Visual Basic 9.0 code file so that you can reuse it many times in code. In Listing 6-17, you can see an example.

Listing 6-17: Visual Basic 9.0 XML literals and global XML namespaces

  Imports System.Xml.Linq
  Imports System.Linq
  Imports <xmlns:c="http://schemas.devleap.com/Customer">

  Public Class Program
    Private Shared Sub Listing6_17()
      Dim xmlCustomers As XDocument = _
        <?xml version="1.0" encoding="utf-8"?>
        <c:customers>
          <c:customer name="Paolo" city="Brescia" country="Italy"/>
          <c:customer name="Marco" city="Torino" country="Italy"/>
          <c:customer name="James" city="Dallas" country="USA"/>
          <c:customer name="Frank" city="Seattle" country="USA"/>
        </c:customers>
    End Sub
  End Class

The key point of this sample is the Imports statement, which declares the global namespace prefix c for the namespace http://schemas.devleap.com/Customer. This particular kind of Imports syntax is reserved for XML namespace declarations. In the released Visual Basic 9.0 compiler the prefix is optional: an Imports <xmlns="..."> statement without a prefix declares the default XML namespace for the XML literals in the file.

Let’s look at a final example, shown in Listing 6-18, using C# 3.0 to define a default namespace and a custom prefixed one.

Listing 6-18: C# 3.0 syntax used to define a default namespace and a custom prefix for one

  XNamespace nsCustomer = "http://schemas.devleap.com/Customer";
  XNamespace nsAddress = "http://schemas.devleap.com/Address";
  XElement customer = new XElement(nsCustomer + "customer",
      new XAttribute("id", "C01"),
      new XElement(nsCustomer + "firstName", "Paolo"),
      new XElement(nsCustomer + "lastName", "Pialorsi"),
      new XElement(nsAddress + "address", "Brescia - Italy",
          new XAttribute(XNamespace.Xmlns + "a", nsAddress)));

The code in Listing 6-18 produces an XML fragment like the following one:

 <?xml version="1.0" encoding="utf-8"?>
 <customer id="C01" xmlns="http://schemas.devleap.com/Customer">
   <firstName>Paolo</firstName>
   <lastName>Pialorsi</lastName>
   <a:address xmlns:a="http://schemas.devleap.com/Address">Brescia - Italy</a:address>
 </customer>

To query the previous XML content for the purpose of extracting the lastName node, we can just write a line of code like the following one:

 Console.WriteLine(customer.Element(nsCustomer + "lastName"));

Using Visual Basic 9.0 and global XML namespaces, we can use code like this:

 Console.WriteLine(customer.<c:lastName>)

Later in this chapter, we will examine in detail how to query XML contents using LINQ to XML queries with both C# 3.0 and Visual Basic 9.0 syntax.

Other X* Classes

This new API has other available classes that define processing instructions (XProcessingInstruction), document types (XDocumentType), comments (XComment), and text nodes (XText). They are all derived from XNode and are typically used to build XDocument instances.

XObject and Annotations

XObject represents the base class of the whole LINQ to XML API, and it mainly provides methods and properties to work with annotations on nodes. Annotations are a new mechanism that maps metadata to XML nodes. For instance, we can add custom user information to our nodes as shown in Listing 6-19.

Listing 6-19: Annotations applied to an XElement instance

  XElement customer = XElement.Load(@"..\..\customer.xml");
  CustomerAnnotation annotation = new CustomerAnnotation();
  annotation.Notes = "This is a good customer!";
  customer.AddAnnotation(annotation);

CustomerAnnotation is a custom type and can be any .NET type. We can then retrieve annotations from XML nodes by using one of the two generic methods, Annotation<T> and Annotations<T>. These generic methods search for an annotation of type T or one that is derived from T in the current node, and if one exists, Annotation<T> and Annotations<T> return the first one or the full set of them, respectively.

 annotation = customer.Annotation<CustomerAnnotation>();

Because XObject is the base class of every kind of X* class that is used to describe an XML node, annotations can be added to any node. Usually, annotations are used to keep state information, such as the mapping to source entities or documents used to build XML, while the code handles real XML content.
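The following self-contained sketch puts the annotation methods together. The chapter does not show the definition of CustomerAnnotation, so the class below, with a single Notes property, is an assumption, as are the AnnotationSketch class and its helper method; the element is built in memory rather than loaded from customer.xml:

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Assumed shape of the annotation type; only the Notes member appears in the text.
public class CustomerAnnotation
{
    public string Notes { get; set; }
}

public static class AnnotationSketch
{
    public static XElement BuildAnnotatedCustomer()
    {
        XElement customer = new XElement("customer",
            new XElement("firstName", "Paolo"));
        customer.AddAnnotation(new CustomerAnnotation { Notes = "This is a good customer!" });
        customer.AddAnnotation(new CustomerAnnotation { Notes = "Prefers e-mail contact" });
        return customer;
    }

    public static void Main()
    {
        XElement customer = BuildAnnotatedCustomer();

        // Annotation<T> returns the first match, Annotations<T> all of them
        Console.WriteLine(customer.Annotation<CustomerAnnotation>().Notes);
        Console.WriteLine(customer.Annotations<CustomerAnnotation>().Count());

        // Annotations are metadata only and do not appear in the XML content
        Console.WriteLine(customer);

        customer.RemoveAnnotations<CustomerAnnotation>();
        Console.WriteLine(customer.Annotation<CustomerAnnotation>() == null);
    }
}
```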




Introducing Microsoft LINQ
ISBN: 0735623910
EAN: 2147483647
Year: 2007
Pages: 78
