Domain Languages

The limits of language are the limits of one's world.

Ludwig Wittgenstein

Computer languages influence how you think about a problem, and how you think about communicating. Every language comes with a list of features (buzzwords such as static versus dynamic typing, early versus late binding, and inheritance models: single, multiple, or none), all of which may suggest or obscure certain solutions. Designing a solution with Lisp in mind will produce different results than a solution based on C-style thinking, and vice versa. Conversely, and we think more importantly, the language of the problem domain may also suggest a programming solution.

We always try to write code using the vocabulary of the application domain (see The Requirements Pit, where we suggest using a project glossary). In some cases, we can go to the next level and actually program using the vocabulary, syntax, and semantics (the language) of the domain.

When you listen to users of a proposed system, they might be able to tell you exactly how the system should work:

Listen for transactions defined by ABC Regulation 12.3 on a set of X.25 lines, translate them to XYZ Company's format 43B, retransmit them on the satellite uplink, and store for future analysis.

If your users have a number of such well-bounded statements, you can invent a mini-language tailored to the application domain that expresses exactly what they want:

    From X25LINE1 (Format=ABC123) {
      Put TELSTAR1 (Format=XYZ43B);
      Store DB;
    }

This language need not be executable. Initially, it could simply be a way of capturing the user's requirements: a specification. However, you may want to consider taking this a step further and actually implementing the language. Your specification has become executable code.

After you've written the application, the users give you a new requirement: transactions with negative balances shouldn't be stored, and should be sent back on the X.25 lines in the original format:

    From X25LINE1 (Format=ABC123) {
      if (ABC123.balance < 0) {
        Put X25LINE1 (Format=ABC123);
      }
      else {
        Put TELSTAR1 (Format=XYZ43B);
        Store DB;
      }
    }

That was easy, wasn't it? With the proper support in place, you can program much closer to the application domain. We're not suggesting that your end users actually program in these languages. Instead, you're giving yourself a tool that lets you work closer to their domain.

Tip 17

Program Close to the Problem Domain



Whether it's a simple language to configure and control an application program, or a more complex language to specify rules or procedures, we think you should consider ways of moving your project closer to the problem domain. By coding at a higher level of abstraction, you are free to concentrate on solving domain problems, and can ignore petty implementation details.

Remember that there are many users of an application. There's the end user, who understands the business rules and the required outputs. There are also secondary users: operations staff, configuration and test managers, support and maintenance programmers, and future generations of developers. Each of these users has their own problem domain, and you can generate mini-environments and languages for all of them.

Domain-Specific Errors

If you are writing in the problem domain, you can also perform domain-specific validation, reporting problems in terms your users can understand. Take our switching application on the facing page. Suppose the user misspelled the format name:

 From X25LINE1 (Format=AB123) 

If this happened in a standard, general-purpose programming language, you might receive a standard, general-purpose error message:

 Syntax error: undeclared identifier 

But with a mini-language, you would instead be able to issue an error message using the vocabulary of the domain:

    "AB123" is not a format. Known formats are ABC123,
            XYZ43B, PDQB, and 42.
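
A check of this kind might be sketched in Python as follows. This is only an illustration: the `KNOWN_FORMATS` list, the `DomainError` class, and the `check_format` function are hypothetical names invented for the sketch, not part of any real switching system.

```python
# Hypothetical registry of the formats the switching application knows about.
KNOWN_FORMATS = ["ABC123", "XYZ43B", "PDQB", "42"]


class DomainError(Exception):
    """An error phrased in the vocabulary of the problem domain."""


def check_format(name, known=KNOWN_FORMATS):
    """Return name if it is a known format; otherwise raise DomainError."""
    if name in known:
        return name
    listing = ", ".join(known[:-1]) + ", and " + known[-1]
    raise DomainError(f'"{name}" is not a format. Known formats are {listing}.')


try:
    check_format("AB123")
except DomainError as err:
    message = str(err)
    print(message)
```

Because the validator knows the set of legal formats, it can name them in the message instead of reporting a generic undeclared identifier.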

Implementing a Mini-Language

At its simplest, a mini-language may be in a line-oriented, easily parsed format. In practice, we probably use this form more than any other. It can be parsed simply using switch statements, or using regular expressions in scripting languages such as Perl. The answer to Exercise 5 on page 281 shows a simple implementation in C.
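
To give a feel for how little code a line-oriented format needs, here is a minimal sketch in Python rather than C or Perl. It borrows the single-letter drawing commands from Exercise 5; the `COMMANDS` table, the `LINE` pattern, and the `parse` function are our own assumptions for illustration, and it is not the book's answer to that exercise.

```python
import re

# One entry per command: letter -> (takes a number?, description).
# Adding a new command means adding one line to this table.
COMMANDS = {
    "P": (True, "select pen"),
    "D": (False, "pen down"),
    "U": (False, "pen up"),
    "N": (True, "draw north"),
    "E": (True, "draw east"),
    "S": (True, "draw south"),
    "W": (True, "draw west"),
}

# A command letter, an optional number, and an optional trailing comment.
LINE = re.compile(r"^\s*([A-Za-z])\s*(\d+)?\s*(?:#.*)?$")


def parse(text):
    """Turn a script into a list of (command, argument) pairs."""
    program = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and comment-only lines
        match = LINE.match(line)
        if match is None:
            raise ValueError(f"line {number}: cannot parse {line!r}")
        letter, arg = match.group(1).upper(), match.group(2)
        if letter not in COMMANDS:
            raise ValueError(f"line {number}: unknown command {letter!r}")
        needs_arg, _ = COMMANDS[letter]
        if needs_arg and arg is None:
            raise ValueError(f"line {number}: {letter} needs a number")
        if not needs_arg and arg is not None:
            raise ValueError(f"line {number}: {letter} takes no number")
        program.append((letter, int(arg) if arg is not None else None))
    return program


script = """\
P 2 # select pen 2
D   # pen down
W 2 # draw west 2cm
U   # pen up
"""
print(parse(script))
```

Keeping the command definitions in a table, separate from the parsing loop, is what makes the language cheap to extend.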

You can also implement a more complex language, with a more formal syntax. The trick here is to define the syntax first using a notation such as BNF. [7] Once you have your grammar specified, it is normally trivial to convert it into the input syntax for a parser generator. C and C++ programmers have been using yacc (or its freely available implementation, bison [URL 27]) for years. These programs are documented in detail in the book Lex and Yacc [LMB92]. Java programmers can try JavaCC, which can be found at [URL 26]. The answer to Exercise 7 on page 282 shows a parser written using bison. As it shows, once you know the syntax, it's really not a lot of work to write simple mini-languages.

[7] BNF, or Backus-Naur Form, lets you specify context-free grammars recursively. Any good book on compiler construction or parsing will cover BNF in (exhaustive) detail.

There's another way of implementing a mini-language: extend an existing one. For example, you could integrate application-level functionality with (say) Python [URL 9] and write something like this: [8]

[8] Thanks to Eric Vought for this example.

    record = X25LINE1.get(format=ABC123)
    if (record.balance < 0):
        X25LINE1.put(record, format=ABC123)
    else:
        TELSTAR1.put(record, format=XYZ43B)
        DB.store(record)

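
For a script like this to run, the host application has to supply the objects it names. The following sketch shows one way that supporting layer might look. Everything in it (the `Record`, `Line`, and `Database` classes, the format constants, and the `route` wrapper) is a hypothetical stand-in; a real system would wrap actual line drivers and a real database.

```python
# Hypothetical format identifiers made visible to the script.
ABC123 = "ABC123"
XYZ43B = "XYZ43B"


class Record:
    def __init__(self, balance, format):
        self.balance = balance
        self.format = format


class Line:
    """A communications line that queues inbound records and logs sends."""

    def __init__(self, name, inbound=()):
        self.name = name
        self.inbound = list(inbound)
        self.sent = []

    def get(self, format):
        record = self.inbound.pop(0)
        if record.format != format:
            raise ValueError(f"{self.name}: expected format {format}")
        return record

    def put(self, record, format):
        self.sent.append((record, format))


class Database:
    def __init__(self):
        self.stored = []

    def store(self, record):
        self.stored.append(record)


def route(source, uplink, db):
    """The domain-level script from the text, wrapped in a function."""
    record = source.get(format=ABC123)
    if record.balance < 0:
        source.put(record, format=ABC123)
    else:
        uplink.put(record, format=XYZ43B)
        db.store(record)


X25LINE1 = Line("X25LINE1", inbound=[Record(-5, ABC123), Record(10, ABC123)])
TELSTAR1 = Line("TELSTAR1")
DB = Database()
route(X25LINE1, TELSTAR1, DB)  # negative balance: sent back on the X.25 line
route(X25LINE1, TELSTAR1, DB)  # positive balance: uplinked and stored
```

The domain script stays readable because the plumbing lives in the supporting classes, not in the routing rules.
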
Data Languages and Imperative Languages

The languages you implement can be used in two different ways.

Data languages produce some form of data structure used by an application. These languages are often used to represent configuration information.

For example, the sendmail program is used throughout the world for routing e-mail over the Internet. It has many excellent features and benefits, which are controlled by a thousand-line configuration file, written using sendmail's own configuration language:

    Mlocal, P=/usr/bin/procmail,
            F=lsDFMAw5 :/@qSPfhn9,
            S=10/30, R=20/40,
            T=DNS/RFC822/X-Unix,
            A=procmail -Y -a $h -d $u

Obviously, readability is not one of sendmail's strengths.

For years, Microsoft has been using a data language that can describe menus, widgets, dialog boxes, and other Windows resources. Figure 2.2 on the next page shows an excerpt from a typical resource file. This is far easier to read than the sendmail example, but it is used in exactly the same way: it is compiled to generate a data structure.

[Figure 2.2. Windows .rc file]

Imperative languages take this a step further. Here the language is actually executed, and so can contain statements, control constructs, and the like (such as the script on page 58).

You can also use your own imperative languages to ease program maintenance. For example, you may be asked to integrate information from a legacy application into your new GUI development. A common way of achieving this is by screen scraping; your application connects to the mainframe application as if it were a regular human user, issuing keystrokes and "reading" the responses it gets back. You could script the interaction using a mini-language. [9]

[9] In fact, you can buy tools that support just this kind of scripting. You can also investigate open-source packages such as Expect, which provide similar capabilities [URL 24].

    locate prompt "SSN:"
    type "%s" social_security_number
    type enter
    waitfor keyboardunlock
    if text_at(10,14) is "INVALID SSN" return bad_ssn
    if text_at(10,14) is "DUPLICATE SSN" return dup_ssn
    # etc...

When the application determines it is time to enter a Social Security number, it invokes the interpreter on this script, which then controls the transaction. If the interpreter is embedded within the application, the two can even share data directly (for example, via a callback mechanism).
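
An interpreter for a script like this can be quite small. The sketch below is one possible shape, in Python, under several assumptions of our own: the `FakeTerminal` class stands in for a real mainframe session, the `RULES` table and `run` function are invented for illustration, and comments are stripped naively (a `#` inside a quoted string would break it).

```python
import re


class FakeTerminal:
    """Hypothetical stand-in for a mainframe session; logs what it is sent."""

    def __init__(self, reply):
        self.reply = reply  # text the "mainframe" shows at row 10, column 14
        self.typed = []

    def locate(self, prompt):
        self.typed.append(("locate", prompt))

    def send(self, text):
        self.typed.append(("type", text))

    def wait_unlock(self):
        pass  # a real terminal would block until the keyboard unlocks

    def text_at(self, row, col, length):
        return self.reply[:length] if (row, col) == (10, 14) else ""


def check_screen(terminal, variables, match):
    """Return the named result if the expected text is on screen."""
    row, col, expected, result = int(match[1]), int(match[2]), match[3], match[4]
    if terminal.text_at(row, col, len(expected)) == expected:
        return result
    return None


# Each statement form is a pattern plus the action it triggers.
RULES = [
    (re.compile(r'locate prompt "(.*)"'),
     lambda t, v, m: t.locate(m[1])),
    (re.compile(r'type "%s" (\w+)'),
     lambda t, v, m: t.send(str(v[m[1]]))),
    (re.compile(r'type enter'),
     lambda t, v, m: t.send("\n")),
    (re.compile(r'waitfor keyboardunlock'),
     lambda t, v, m: t.wait_unlock()),
    (re.compile(r'if text_at\((\d+),\s*(\d+)\) is "(.*)" return (\w+)'),
     check_screen),
]


def run(script, terminal, variables):
    """Execute the script; return the first 'return' value hit, else 'ok'."""
    for line in script.splitlines():
        line = line.split("#", 1)[0].strip()  # naive comment stripping
        if not line:
            continue
        for pattern, action in RULES:
            match = pattern.fullmatch(line)
            if match:
                result = action(terminal, variables, match)
                if result is not None:
                    return result
                break
        else:
            raise SyntaxError(f"unrecognized statement: {line!r}")
    return "ok"


SCRIPT = '''
locate prompt "SSN:"
type "%s" social_security_number
type enter
waitfor keyboardunlock
if text_at(10,14) is "INVALID SSN" return bad_ssn
if text_at(10,14) is "DUPLICATE SSN" return dup_ssn
# etc...
'''

terminal = FakeTerminal(reply="DUPLICATE SSN")
outcome = run(SCRIPT, terminal, {"social_security_number": "123-45-6789"})
print(outcome)
```

The `variables` dictionary is how the application shares data with the script in this sketch; a callback mechanism would be another option.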

Here you're programming in the maintenance programmer's domain. When the mainframe application changes, and the fields move around, the programmer can simply update your high-level description, rather than groveling around in the details of C code.

Stand-Alone and Embedded Languages

A mini-language doesn't have to be used directly by the application to be useful. Many times we may use a specification language to create artifacts (including metadata) that are compiled, read-in, or otherwise used by the program itself (see Metaprogramming).

For example, on page 100 we describe a system in which we used Perl to generate a large number of derivations from an original schema specification. We invented a common language to express the database schema, and then generated all the forms of it we needed: SQL, C, Web pages, XML, and others. The application didn't use the specification directly, but it relied on the output produced from it.
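
The idea of one specification feeding several generators can be sketched briefly. The example below is written in Python rather than Perl, and the schema notation (`table` and `field` lines), the type mappings, and the function names are all hypothetical, invented here to illustrate the approach rather than to reproduce the system described above. It handles a single table only.

```python
# Hypothetical schema mini-language: "table NAME", then "field NAME TYPE".
SPEC = """\
table employee
field id int
field name string
field salary float
"""

# Assumed type mappings for the two output forms.
SQL_TYPES = {"int": "INTEGER", "string": "VARCHAR(255)",
             "float": "DOUBLE PRECISION"}
C_TYPES = {"int": "int {0};", "string": "char {0}[256];",
           "float": "double {0};"}


def parse_schema(text):
    """Return (table_name, [(field_name, field_type), ...])."""
    table, fields = None, []
    for line in text.splitlines():
        words = line.split()
        if not words:
            continue
        if words[0] == "table" and len(words) == 2:
            table = words[1]
        elif words[0] == "field" and len(words) == 3 and words[2] in SQL_TYPES:
            fields.append((words[1], words[2]))
        else:
            raise ValueError(f"bad schema line: {line!r}")
    if table is None:
        raise ValueError("schema does not name a table")
    return table, fields


def to_sql(table, fields):
    """Generate a CREATE TABLE statement from the parsed schema."""
    columns = ",\n".join(f"  {name} {SQL_TYPES[kind]}" for name, kind in fields)
    return f"CREATE TABLE {table} (\n{columns}\n);"


def to_c(table, fields):
    """Generate a matching C struct declaration from the parsed schema."""
    members = "\n".join("  " + C_TYPES[kind].format(name)
                        for name, kind in fields)
    return f"struct {table} {{\n{members}\n}};"


table, fields = parse_schema(SPEC)
print(to_sql(table, fields))
print(to_c(table, fields))
```

Adding another output form means writing one more small generator over the same parsed structure; the specification itself does not change.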

It is common to embed high-level imperative languages directly into your application, so that they execute when your code runs. This is clearly a powerful capability; you can change your application's behavior by changing the scripts it reads, all without compiling. This can significantly simplify maintenance in a dynamic application domain.

Easy Development or Easy Maintenance?

We've looked at several different grammars, ranging from simple line-oriented formats to more complex grammars that look like real languages. Since it takes extra effort to implement, why would you choose a more complex grammar?

The trade-off is extendibility and maintenance. While the code for parsing a "real" language may be harder to write, it will be much easier for people to understand, and to extend in the future with new features and functionality. Languages that are too simple may be easy to parse, but can be cryptic, much like the sendmail example on page 60.

Given that most applications exceed their expected lifetimes, you're probably better off biting the bullet and adopting the more complex and readable language up front. The initial effort will be repaid many times in reduced support and maintenance costs.

Related sections include:
  • Metaprogramming

Challenges
  • Could some of the requirements of your current project be expressed in a domain-specific language? Would it be possible to write a compiler or translator that could generate most of the code required?

  • If you decide to adopt mini-languages as a way of programming closer to the problem domain, you're accepting that some effort will be required to implement them. Can you see ways in which the framework you develop for one project can be reused in others?

Exercises

5.

We want to implement a mini-language to control a simple drawing package (perhaps a turtle-graphics system). The language consists of single-letter commands. Some commands are followed by a single number. For example, the following input would draw a rectangle.

    P 2 # select pen 2
    D   # pen down
    W 2 # draw west 2cm
    N 1 # then north 1
    E 2 # then east 2
    S 1 # then back south
    U   # pen up

Implement the code that parses this language. It should be designed so that it is simple to add new commands.

6.

Design a BNF grammar to parse a time specification. All of the following examples should be accepted.

    4pm, 7:38pm, 23:42, 3:16, 3:16am

7.

Implement a parser for the BNF grammar in Exercise 6 using yacc, bison, or a similar parser-generator.

8.

Implement the time parser using Perl. [Hint: Regular expressions make good parsers.]



The Pragmatic Programmer: From Journeyman to Master
ISBN: 020161622X
Year: 2005
Pages: 81