Another set of patterns isn't as generic as the Design Patterns discussed so far. One example is the set of patterns for building enterprise applications.
Defining an enterprise application is tricky, but you can think of it as a large-scale information system with many users and/or a lot of data.
The main book dealing with the patterns in this category is Martin Fowler's Patterns of Enterprise Application Architecture [Fowler PoEAA].
At first sight, the patterns here might not seem as cool or amazing as some of the Design Patterns, but they are extremely useful, cover a lot of ground, and contain a lot of experience and knowledge. As I said, they are less generic than the other Design Patterns and are focused just on large-scale information systems.
They come into play for the chosen structure of the logic; for example, the Domain Model. The patterns here aren't so much about how the Domain Model itself (or any of the other models for structuring the main logic) should be structured, but more about the infrastructure for supporting the Domain Model.
To make it more concrete, I'd like to discuss an example, and I have chosen Query Objects [Fowler PoEAA].
An Example: Query Objects
Let's assume for a moment that you have a Domain Model for a SalesOrder application. There is a Customer class and an Order class, and the Order class in particular is composed of a number of other classes. This is simple and clear.
There are several different solutions from which to choose in order to navigate the Domain Model. One solution is to have a global root object that has references to root-like collections. In this case, a customer collection would be an example of one of these. So what the developer does is to start with the global root object and navigate from there to the customer collection, and then iterate the collection until what is needed is found, or perhaps navigate to the customer's sales orders if that's what's of interest.
A similar paradigm is that all collections are global so you can directly access the customer collection and iterate over it.
Both those paradigms are easy to understand and simple to use, but one drawback is that they are lacking somewhat from the perspective of a distributed system. Assume you have the Domain Model running at the client (each client has one Domain Model, or rather a small subset of the Domain Model instances, and no shared Domain Model instances) and the database is running at a database server (a pretty common deployment model). What should happen when you ask the root object to get the customer collection of one million customers? You can get all the customers back to the client so the client can iterate over the collection locally. Not so nice to wait for that huge collection to be transmitted.
Another option is to add an application server to the picture and ask it to only send over a collection reference to the client side, and then much less data is transmitted, of course. On the other hand, there will be an incredible amount of network calls when the client is iterating the list and asking for the next customer over the network one million times. (It will be even worse if the customer instances themselves aren't marshaled by value but only by reference.) Yet another option is to page the customer collection so the client perhaps gets 100 customers from the server at a time.
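The paging option could be sketched roughly as follows. This is only an illustration: the GetPage helper and the idea of representing each customer by a plain string are assumptions made for the example, not something a particular framework provides.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical server-side helper: hands out one page of customers at a time.
public static class CustomerPaging
{
    // pageIndex is zero-based; the last page may hold fewer than pageSize items.
    public static List<string> GetPage(IEnumerable<string> allCustomers, int pageIndex, int pageSize)
    {
        if (pageIndex < 0) throw new ArgumentOutOfRangeException(nameof(pageIndex));
        if (pageSize <= 0) throw new ArgumentOutOfRangeException(nameof(pageSize));

        // Skip the pages already delivered, then take the next slice.
        return allCustomers.Skip(pageIndex * pageSize).Take(pageSize).ToList();
    }
}
```

The client then asks for page 0, page 1, and so on, so each network call carries at most pageSize customers instead of one customer or one million.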
I know, all these solutions have one problem in common: you don't often want to look at all the customers. You need a subset, in which case it's time to discuss the next problem.
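The problem is that the users want a form where they can search for customers flexibly. They want to be able to ask for all customers who have a name with "aa" in it, have ordered something last month, have an order with a total amount greater than one million, and have a reference person called "Stig."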
But on the same form, they should also be able to ask for just customers in a certain part of Sweden. Again, the search form needs to be pretty flexible.
I'm going to discuss three different solution proposals, namely "filtering within Domain Model," "filtering in database with huge parameter lists," and "Query Objects."
Solution Proposal One: Filter Within Domain Model
Let's take a step back and admit that we could use any of the solutions already discussed so that the collection is materialized somewhere and then the filter is checked for every instance. All instances meeting the filter criteria are added to a new collection, and that is the result.
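A minimal sketch of this approach might look like the following. The Customer and Order classes here are simplified stand-ins for the Domain Model, and the InMemoryFilter helper is a hypothetical name chosen for the example.

```csharp
using System;
using System.Collections.Generic;

// Simplified stand-ins for the Domain Model classes.
public class Order
{
    public decimal TotalAmount { get; set; }
    public DateTime OrderDate { get; set; }
}

public class Customer
{
    public string CustomerName { get; set; } = "";
    public List<Order> Orders { get; } = new List<Order>();
}

public static class InMemoryFilter
{
    // Checks every materialized customer against the filter and copies
    // the matching instances into a new collection.
    public static List<Customer> Filter(IEnumerable<Customer> customers, Func<Customer, bool> criteria)
    {
        var result = new List<Customer>();
        foreach (Customer customer in customers)
        {
            if (criteria(customer))
            {
                result.Add(customer);
            }
        }
        return result;
    }
}
```

Note that the filter can only inspect what is already in memory, so every customer, and every order the filter touches, has to be materialized before the first comparison is made.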
This is a pretty simple solution, but practically unusable in many real-world situations. You will waste space and time. Not only were there one million customers, but you also had to materialize the orders for the customers. Phew, that solution is just impossible to use and it's even worse when you scale up the problem....
Of course, the conclusion here depends to a large degree on the execution platform. Remember what I said about the deployment model: a subset of the Domain Model instances in each client, the database at a database server, no shared Domain Model instances.
If instead there was one shared set of Domain Model instances at an application server (which has its own problems; more about that in later chapters), this might have been a suitable solution, but only for server-side logic. For clients asking for a subset of the shared Domain Model instances, the clients must express their criteria somehow.
Solution Proposal Two: Filtering in Database with Huge Parameter Lists
Databases are normally good at storing and querying, so let's use them to our advantage here. We just need to express what we want with a SQL statement and then transform the result into instances in our Domain Model.
A SQL statement like the following could solve the first problem:
SELECT Id, CustomerName, ...
FROM Customers
WHERE CustomerName LIKE '%aa%'
  AND Id IN
    (SELECT CustomerId FROM ReferencePersons
     WHERE FirstName = 'Stig')
  AND Id IN
    (SELECT CustomerId FROM Orders
     WHERE TotalAmount > 1000000)
  AND Id IN
    (SELECT CustomerId FROM Orders
     WHERE OrderDate BETWEEN '20040601' AND '20040630')
It's debatable whether I can combine the two subselects targeting Orders into a single subselect. As the requirement was stated, I don't think so (because the meaning would change slightly if I combined them).
Anyway, this isn't really important for the discussion here.
Here we just materialize the instances that are of interest to us. However, we probably don't want the layer containing the Domain Model to have to contain all that SQL code. What's the point of the Domain Model in that case? The consumer layer just gets two models to deal with.
So we now have a new problem. How shall the consumer layer express what it wants? Ah, the Domain Layer which is responsible for the mapping between the database and Domain Model can provide the consumer layer with a search method. Proposal number two is the following:
public IList SearchForCustomers
    (string customerNameWithWildCards,
     bool mustHaveOrderedSomethingLastMonth,
     int minimumOrderAmount,
     string firstNameOfAtLeastoneReferencePerson)
This probably solves the requirement for the first query, but not the second. We need to add a few more parameters like this:
public IList SearchForCustomers
    (string customerNameWithWildCards,
     bool mustHaveOrderedSomethingLastMonth,
     int minimumOrderAmount,
     string firstNameOfAtLeastoneReferencePerson,
     string country,
     string town)
Do you see where this is going? The parameter list quickly gets impractical because there are probably a whole bunch of other parameters that are also needed. Sure, editors showing placeholders for each parameter help when calling the method, but using the method will still be error-prone and impractical. And when another parameter is needed, you have to go and change all the old calls, or at least provide a new overload.
Another problem is how to express certain things in that pretty powerless way of primitive datatypes in a list of parameters. A good example of that is the parameter called mustHaveOrderedSomethingLastMonth. What about the month before that? Or last year? Sure, we could use two dates instead as parameters and move the responsibility of defining the interval to the consumer of the method, but what about when we only care about customers in a certain town? What should the date parameters be then? I guess I could use minimum and maximum dates to create the biggest possible interval, but it's not extremely intuitive that that's the way to express "all dates."
Gregory Young commented on the problem of how to express precedence. Expressing something like this with a parameter list is troublesome: (criterion1 and criterion2) or (criterion3 and criterion4)
I think we have quickly outgrown this solution, too. I came to the same conclusion back in the VB6 days, so I used an array-based solution. The first column of the array was the field name (such as CustomerName), the second column was the operator (such as Like from an enumeration), and the third column was the criterion value, such as "*aa*". Each criterion had one row in the array.
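To illustrate the shape of that array, here is a rough C# rendering of the idea. The original was VB6; the Op enumeration members, the ArrayCriteria helper, and the second row about Sweden are assumptions made for this sketch.

```csharp
using System;

// Hypothetical operator enumeration for the sketch.
public enum Op { Equal, Like, GreaterThan }

public static class ArrayCriteria
{
    // One row per criterion: field name, operator, criterion value.
    public static object[,] Example()
    {
        return new object[,]
        {
            { "CustomerName", Op.Like, "*aa*" },
            { "Country", Op.Equal, "Sweden" },
        };
    }

    // Renders one row as readable text, for example for logging.
    public static string Describe(object[,] criteria, int row)
    {
        return criteria[row, 0] + " " + criteria[row, 1] + " " + criteria[row, 2];
    }
}
```

Adding a criterion means adding a row, so the signature of the search method never changes. The price is that every cell is an untyped object, which hints at why this structure struggles with anything more advanced.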
That solution solved some of the problems with the parameter list, but it had its own problems. When a new possible criterion was added, I didn't have to change any of the old consumer code. That was good, but the solution was pretty powerless for advanced criteria, so I stepped back and exposed the database schema. For example, to deal with the criterion "Have any orders with a total amount larger than one million?" I used the complete IN-clause as the criterion.
The array-based solution was a step in the right direction, but it would have become a little more flexible with objects instead. Unfortunately, it wasn't really possible to write marshal-by-value components in VB6. There were solutions to the problem, such as using a more flexible array structure, but the whole thing is so much more natural in .NET. Over to the Query Object pattern.
Solution Proposal Three: Query Objects
The idea of the Query Object pattern is to encapsulate the criteria in a Query instance and then send that Query instance to another layer where it is translated into the required SQL. The UML diagram for the general solution could look like that shown in Figure 2-5.
Figure 2-5. Class diagram for general Query Object solution
The criterion could use another query (even though it's not apparent in the typical description of this as in Figure 2-5), and that way it's easy to create the equivalent of a subquery in SQL.
Let's make a first attempt at a Query Object language to apply to the problem. First though, let's assume that the Domain Model is as shown in Figure 2-6.
Figure 2-6. Domain Model to be used for the example
Let's see what our newly created naïve query language could look like in C#:
Query q = new Query("Customer");
q.AddCriterion("CustomerName", Op.Like, "*aa*");

Query sub1 = new Query("Order");
sub1.AddCriterion("TotalAmount", Op.GreaterThan, 1000000);
q.AddCriterion(sub1);

Query sub2 = new Query("Order");
sub2.AddCriterion("OrderDate", Op.Between,
    DateTime.Parse("2004-06-01"), DateTime.Parse("2004-06-30"));
q.AddCriterion(sub2);

q.AddCriterion("ReferencePersons.FirstName", Op.Equal, "Stig");
Note that the parameter to the Query constructor is not a table name but a Domain Model class name. The same goes for the parameters to AddCriterion(): they name fields or properties of the Domain Model classes, not table columns.
Also note that in this specific example, I didn't need a subquery for the criterion regarding the ReferencePersons because the Domain Model was navigable from Customer to ReferencePerson. On the other hand, subqueries were needed for the Orders for the opposite reason.
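For readers who want to see what could sit behind that syntax, here is a minimal sketch of Query and Criterion classes that would accept the calls in the example. It is an assumption about one possible shape, not the implementation discussed later in the book; the member names (ClassName, Criteria, IsSubQuery, and so on) are invented for the sketch, and nothing here talks to a database.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical operator enumeration covering the operators used in the example.
public enum Op { Equal, Like, GreaterThan, Between }

// One criterion: either a comparison on a property or a nested subquery.
public class Criterion
{
    public string PropertyName { get; }
    public Op Operator { get; }
    public object[] Values { get; }
    public Query SubQuery { get; }

    public Criterion(string propertyName, Op op, params object[] values)
    {
        PropertyName = propertyName;
        Operator = op;
        Values = values;
    }

    public Criterion(Query subQuery)
    {
        SubQuery = subQuery;
        Values = new object[0];
    }

    public bool IsSubQuery => SubQuery != null;
}

public class Query
{
    private readonly List<Criterion> criteria = new List<Criterion>();

    // Name of a Domain Model class, not of a table.
    public string ClassName { get; }

    public IReadOnlyList<Criterion> Criteria => criteria;

    public Query(string className)
    {
        ClassName = className;
    }

    // One value for most operators, two for Between.
    public void AddCriterion(string propertyName, Op op, params object[] values)
    {
        criteria.Add(new Criterion(propertyName, op, values));
    }

    // Adds the equivalent of a SQL subquery.
    public void AddCriterion(Query subQuery)
    {
        criteria.Add(new Criterion(subQuery));
    }
}
```

The Query instance is just a passive description of what the consumer wants. All knowledge about tables, columns, and joins stays in the layer that later receives and interprets it.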
If you are SQL-literate, your first impression might be that the SQL version was more expressive, easier to read, and just better. SQL is certainly a powerful query language, but remember what we want to accomplish. We want to be able to work as much as possible with the Domain Model (within limits) and thereby achieve a more maintainable solution. Also note that the C# code just shown was needlessly verbose. Later on in the book we will discuss how the syntax could be improved by writing a thin layer on top of a general query object implementation.
So what we gained was further transparency of our code with regard to the database schema. Generally, I think this is a good thing. When we really need to, we can always go out of this little sandbox of ours to state SQL queries with the full power of the database and without a lifeline.
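To give a feeling for the translation step that the receiving layer performs, here is a deliberately tiny sketch. It handles only flat criteria (no subqueries, no navigation), and it assumes that the table and column names are handed over by trusted mapping code. The SqlTranslator class, its ToSql method, and the "@p0" parameter naming are all hypothetical.

```csharp
using System;
using System.Collections.Generic;

public enum Op { Equal, Like, GreaterThan }

public class Criterion
{
    public string PropertyName;
    public Op Operator;
    public object Value;
}

public static class SqlTranslator
{
    // Builds a parameterized SELECT. Names are concatenated into the SQL text,
    // so they must come from mapping metadata, never from user input.
    // Values always travel as parameters.
    public static string ToSql(string tableName, IList<Criterion> criteria,
                               IDictionary<string, object> parameters)
    {
        var clauses = new List<string>();
        for (int i = 0; i < criteria.Count; i++)
        {
            string parameterName = "@p" + i;
            object value = criteria[i].Value;

            // The query language uses * as wildcard; SQL uses %.
            if (criteria[i].Operator == Op.Like && value is string text)
            {
                value = text.Replace('*', '%');
            }

            clauses.Add(criteria[i].PropertyName + " " + Symbol(criteria[i].Operator) + " " + parameterName);
            parameters[parameterName] = value;
        }

        string sql = "SELECT * FROM " + tableName;
        if (clauses.Count > 0)
        {
            sql += " WHERE " + string.Join(" AND ", clauses);
        }
        return sql;
    }

    private static string Symbol(Op op)
    {
        switch (op)
        {
            case Op.Equal: return "=";
            case Op.Like: return "LIKE";
            case Op.GreaterThan: return ">";
            default: throw new ArgumentOutOfRangeException(nameof(op));
        }
    }
}
```

Even this toy version hints at the work involved: a real implementation also has to map class and property names to the schema, handle subqueries and joins, and deal with database-specific SQL dialects.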
Another thing I'd like to point out is that creating a competent Query Object implementation will quickly become very complex, so watch out that you don't take on too much work.
A nice little side effect is that you can also use query objects pretty easily for local filtering, such as holding on to a cached list of all products. For the developer consuming the Domain Model, he or she just creates a Query Object as usual, but it is then used in a slightly different manner, without touching the database.
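As a rough sketch of that local use, the criteria can be evaluated directly against objects in memory. Everything below is an assumption made for illustration: the Product class, the IsSatisfiedBy method, the LocalFilter helper, and the choice of reflection to read property values.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public enum Op { Equal, Like, GreaterThan }

// Hypothetical cached Domain Model class.
public class Product
{
    public string Name { get; set; }
    public decimal Price { get; set; }
}

public class Criterion
{
    public string PropertyName { get; }
    public Op Operator { get; }
    public object Value { get; }

    public Criterion(string propertyName, Op op, object value)
    {
        PropertyName = propertyName;
        Operator = op;
        Value = value;
    }

    // Evaluates the criterion against an in-memory object via reflection.
    public bool IsSatisfiedBy(object candidate)
    {
        var property = candidate.GetType().GetProperty(PropertyName);
        if (property == null) throw new ArgumentException("Unknown property: " + PropertyName);
        object actual = property.GetValue(candidate);

        switch (Operator)
        {
            case Op.Equal:
                return Equals(actual, Value);
            case Op.GreaterThan:
                // Value must have the same type as the property (e.g. decimal).
                return ((IComparable)actual).CompareTo(Value) > 0;
            case Op.Like:
                // Turn the * wildcard into a case-insensitive regular expression.
                string pattern = "^" + Regex.Escape((string)Value).Replace("\\*", ".*") + "$";
                return Regex.IsMatch((string)actual, pattern, RegexOptions.IgnoreCase);
            default:
                throw new ArgumentOutOfRangeException();
        }
    }
}

public static class LocalFilter
{
    // Returns the cached items that satisfy all criteria; no database involved.
    public static List<T> Apply<T>(IEnumerable<T> cached, IList<Criterion> criteria)
    {
        var result = new List<T>();
        foreach (T item in cached)
        {
            bool matches = true;
            foreach (Criterion criterion in criteria)
            {
                if (!criterion.IsSatisfiedBy(item)) { matches = false; break; }
            }
            if (matches) result.Add(item);
        }
        return result;
    }
}
```

The consumer code that builds the criteria is the same whether they end up as SQL or are evaluated like this, which is what makes the side effect so convenient.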
I know, I know. Caching is just as cool and useful as it is dangerous. Watch out, it can backfire. You have been warned.
Some DDD-literate readers would probably prefer the Specification pattern [Evans DDD] as the solution to this problem. That provides a neat connection over to the third and final pattern category we are going to focus on: Domain Patterns.