A data language has elements that specify, in human-readable form, data intended for a computer program. Such languages arise when users need to describe data in an input file. For example, users may need to type in a record of orders taken while the system was down. In the example treated in this chapter, a marketing department passes a list of featured coffees to the group that writes a monthly brochure. This group has a computer application that reads the list and provides links to other information about the coffee types in the list. In cases like these, existing code usually applies some variety of string matching to read the data, and this code can be complex.
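To make the point about string matching concrete, here is a minimal sketch of the kind of ad hoc reading code described above. The line format (name, roast, country, price, separated by commas), the optional former name in parentheses, the optional "/variant" suffix on the roast, and the function name `read_coffee_line` are all assumptions made for illustration; they are not taken from the marketing department's actual file.

```python
# Hypothetical input line format, assumed for illustration only:
#   name [(former name)], roast[/variant], country, price
def read_coffee_line(line):
    """Read one coffee record using plain string matching."""
    parts = [p.strip() for p in line.split(",")]
    if len(parts) != 4:
        raise ValueError(f"expected 4 fields, got {len(parts)}: {line!r}")
    name, roast, country, price_text = parts

    # Special case 1: an optional former name in parentheses after the name.
    former_name = None
    if name.endswith(")") and "(" in name:
        name, _, rest = name.partition("(")
        name = name.strip()
        former_name = rest[:-1].strip()  # drop the closing parenthesis

    # Special case 2: an optional variant after a slash in the roast field.
    roast, _, variant = roast.partition("/")

    return {
        "name": name,
        "former_name": former_name,
        "roast": roast.strip(),
        "variant": variant.strip() or None,
        "country": country,
        "price": float(price_text),
    }


record = read_coffee_line("Brimful (Simplebest), Regular/French, Kenya, 6.95")
```

Even in this small sketch, each irregularity in the input adds another branch of string handling, which is how such code tends to grow complex.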
The role of a data language parser is to simplify the reading of a legacy language, and this chapter shows how to create such a parser. If you are creating a new language, you should consider using XML (Extensible Markup Language) rather than writing a parser. XML makes it easy to create new languages without writing parsers at all. In XML, you specify the pattern of a new language in a text file. Standard, freely available tools read that pattern and validate input files against it. After explaining how to parse existing languages, this chapter gives a brief introduction to the use of XML as a language creation tool.
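As a rough illustration of reading an XML-based language without writing a parser, the sketch below uses Python's standard `xml.etree.ElementTree` module. The element names (`coffees`, `coffee`, `name`, `roast`, `country`, `price`) and the function name `read_coffees` are hypothetical, chosen only for this example. Note that this module checks only that the input is well-formed; validating a file against a declared pattern, as described above, requires a validating parser, which this sketch does not include.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML form of a coffee list; the element names are assumptions.
COFFEE_XML = """<coffees>
  <coffee>
    <name>Brimful</name>
    <roast>Regular</roast>
    <country>Kenya</country>
    <price>6.95</price>
  </coffee>
  <coffee>
    <name>Caress</name>
    <roast>French</roast>
    <country>Sumatra</country>
    <price>7.95</price>
  </coffee>
</coffees>"""


def read_coffees(text):
    """Return one dict per <coffee> element, keyed by child element name."""
    root = ET.fromstring(text)  # raises ET.ParseError if not well-formed
    return [
        {child.tag: (child.text or "").strip() for child in coffee}
        for coffee in root.findall("coffee")
    ]


coffees = read_coffees(COFFEE_XML)
```

The standard library does all of the text recognition here; the application code only walks the resulting tree.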