XML and Perl


Summary

As I've demonstrated in this chapter, XML can be easily integrated with database-driven applications and represent relational data. The inherent hierarchy of XML fits in very well with the relational database model. In addition, because XML is plain text, it can easily be transferred between two applications on different platforms (for example, Microsoft Windows and Linux) written in different languages (for example, Perl and Java). This chapter illustrates the power of XML: it is platform and language independent and can easily be used as the common middleware format when converting or working with two or more foreign data types.


Exercises

  1. What needs to be changed in the example in Listing 6.8 to support a two-step process (similar to what was discussed in the chapter)?

  2. Assume that you need to develop a two-step application because the source database server and the destination database server are located in two different buildings. If you can't FTP between the machines, how would you transfer the XML middleware document? What would you need to change in the Perl DBI calls to connect to another database server on another host?

For suggested solutions to these exercises, be sure to check out the web site: http://www.xmlproj.com/book/chapter6.


Relevant Links

Perl DBI Home Page: http://dbi.perl.org.

XML and Database Links: http://www.rpbourret.com/xml/XMLDBLinks.htm.

XML DBMS Middleware: http://www.rpbourret.com/xmldbms/.


Chapter 7. Transforming Miscellaneous Data Formats to XML (and Vice-Versa)


Chapter Roadmap

This chapter discusses generating XML documents from various input data formats. The concept seems pretty simple; however, there is more to it than you might think. I'll also demonstrate the power of XML SAX technology when it is applied to other formats. You will see how the SAX interface can be implemented to process other data formats and how powerful this implementation can be. Although SAX stands for Simple API for XML, it's now becoming a lot more than that, thanks to its well-defined standard and the generality of its implementation. SAX-like interfaces are now being implemented in other communities that deal with different data formats, so eventually the SAX acronym might stand for Simple API for X, with X being the unknown, and you fill in the blank.

Here is a quick summary of the topics discussed in this chapter.

  • The section, "Why Convert Another Data Format to XML?" provides the answer to the question it poses.

  • The section, "XML::SAXDriver::CSV Perl Module" covers XML generation from CSV (comma-separated values) input.

  • The section, "XML::SAXDriver::Excel Perl Module" covers XML generation based on Microsoft Excel binary data.

  • The section, "Developing a Custom Event Handler" goes over the concepts and contains an example of how to write a SAX driver for non-XML data.

To run the examples in this chapter, you will need to install the following Perl modules:

  • XML::SAXDriver::CSV

  • XML::SAXDriver::Excel

Note

If you have any questions about Perl modules (for example, where do you get them, how do you install them, and so forth), please refer to Appendix B, "Basic Perl Concepts."



Why Convert Another Data Format to XML?

XML is everywhere. You see it online, read about it in publications, see it in the bookstore, and that's just the beginning. XML has quickly become the de facto way for applications to communicate. Although XML enjoys huge popularity, it is just now beginning to be widely implemented in applications; new standards are evolving almost weekly, and the technology still isn't even close to reaching its full potential.

Now, you might ask, how does that apply to this chapter? XML is powerful and there are numerous tools available to make it work. A majority of new applications are incorporating XML as the primary data interchange and data storage format. Given that, applications that want to integrate seamlessly with XML-based applications must adapt to the XML standard. Many legacy applications that are still deployed communicate using either their native binary format or other widely adopted data formats. One of the most popular formats is CSV. Because CSV-formatted documents are just plain text, CSV is an easy format to adopt. One problem with CSV is that the data is not self-describing: fields are identified only by their position, so the data can't be labeled or nested as it can with XML.
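To make the difference concrete, here is a minimal sketch in plain Perl that takes one CSV line and wraps each value in a named element. The column names, the sample row, and the helper names xml_escape and csv_line_to_xml are all made up for illustration. The naive split on commas also assumes there are no quoted fields or embedded commas, which a real CSV parser would have to handle.

```perl
use strict;
use warnings;

# Hypothetical column names. The CSV line itself carries no such labels;
# the program has to know them in advance.
my @columns = qw(id name city);

# Escape the characters that are special in XML text content.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Turn one simple CSV line (no quoted fields) into a <record> element.
sub csv_line_to_xml {
    my ($line, @names) = @_;
    my @values = split /,/, $line, -1;    # -1 keeps trailing empty fields
    my $xml = "<record>\n";
    for my $i (0 .. $#names) {
        my $value = defined $values[$i] ? $values[$i] : '';
        $xml .= "  <$names[$i]>" . xml_escape($value) . "</$names[$i]>\n";
    }
    return $xml . "</record>\n";
}

# Hypothetical sample row.
print csv_line_to_xml('101,Mark,Chicago', @columns);
```

Notice that the meaning of each field lives in the program (the @columns list), not in the CSV data. Once the row is written out as XML, the labels travel with the data.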

Another format that is commonly used and distributed among users on Microsoft Windows platforms is the binary Microsoft Excel format. Microsoft Excel is a spreadsheet application that supports data input, manipulation, and storage. This being the case, many applications enable their users to enter their data in Excel and then upload it into the system by accepting the binary Microsoft Excel files as input.

Because these types of problems exist, there are Perl modules designed to help you solve them. Remember, someone has probably already run into the problem you're having. SAXDriver modules are designed to facilitate converting other data formats into XML by giving you the flexibility of providing your own conversion rules and by being very efficient and lightweight (that is, having a small memory footprint). These are important qualities that enable you to deploy these applications in a critical production environment and deal with large XML documents. Imagine the tasks that you could accomplish if you create an XML communication middleware application, and then effortlessly adapt any other format by converting it to XML without having to change the middleware business rules. Currently, two Perl modules enable you to easily accomplish this task: XML::SAXDriver::CSV and XML::SAXDriver::Excel. Let's take a closer look at these modules.
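Before turning to the modules themselves, the general idea of a SAX driver for non-XML data can be sketched in plain Perl. This is only an illustration under stated assumptions, not the code or the API of either module: the package SketchXMLWriter and the function drive_csv are hypothetical names, the sample rows are invented, and the comma split ignores quoted fields. The driver reads the data and fires start_element, characters, and end_element events; the handler decides what to do with them, which is where your own conversion rules would go. Passing each event a hash reference with Name or Data keys mirrors the usual Perl SAX convention.

```perl
use strict;
use warnings;

# Hypothetical handler: collects SAX-style events into an XML string.
package SketchXMLWriter;

sub new            { my ($class) = @_; return bless { out => '' }, $class; }
sub start_document { }
sub end_document   { }
sub start_element  { my ($self, $el) = @_; $self->{out} .= "<$el->{Name}>"; }
sub end_element    { my ($self, $el) = @_; $self->{out} .= "</$el->{Name}>"; }

sub characters {
    my ($self, $chars) = @_;
    my $data = $chars->{Data};
    # Escape the characters that are special in XML text content.
    $data =~ s/&/&amp;/g;
    $data =~ s/</&lt;/g;
    $data =~ s/>/&gt;/g;
    $self->{out} .= $data;
}

sub output { my ($self) = @_; return $self->{out}; }

package main;

# Hypothetical driver: walks simple CSV lines and reports what it finds
# to the handler as events, without building any XML itself.
sub drive_csv {
    my ($handler, $names, @lines) = @_;
    $handler->start_document({});
    $handler->start_element({ Name => 'records' });
    for my $line (@lines) {
        my @values = split /,/, $line, -1;
        $handler->start_element({ Name => 'record' });
        for my $i (0 .. $#$names) {
            my $value = defined $values[$i] ? $values[$i] : '';
            $handler->start_element({ Name => $names->[$i] });
            $handler->characters({ Data => $value });
            $handler->end_element({ Name => $names->[$i] });
        }
        $handler->end_element({ Name => 'record' });
    }
    $handler->end_element({ Name => 'records' });
    $handler->end_document({});
}

# Hypothetical sample data.
my $writer = SketchXMLWriter->new;
drive_csv($writer, [qw(id name)], '1,Ann', '2,Bob');
print $writer->output, "\n";
```

Because the driver only emits events, you could swap SketchXMLWriter for a handler that filters records, renames elements, or loads a database, and the driver would not need to change. Only one line of input is held in memory at a time, which is why this style suits large documents.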