Architecture Styles to Value


Architects and developers must make design decisions based on the requirements of the project. This isn't an exhaustive list of values regarding architecture styles; rather, I have chosen a few main areas to comment on to some extent: model focus, domain models, databases, distribution, and messaging. First, I think it's wise to keep a model focus.

Focus on the Model

I have liked the object-oriented paradigm a lot for a long time, but it wasn't until pretty recently that I made the full move to that paradigm myself. Platform problems used to get in the way of using the paradigm, but now the platforms are mature.

Note

As a reviewer pointed out, maturity depends on the platform we are talking about. If you come from a VB background, what was just said is reasonable, but if you come from Java, Smalltalk, C#, and so on, the platform has been mature for quite some time.


Around 13 years ago, I tried to use a visual model to communicate my understanding of the requirements of a system I was about to build, and I used some OMT [Rumbaugh OMT] sketches. (OMT stands for Object Modeling Technique and was a method with a process and a notation. The notation was very similar to the one in Unified Modeling Language, UML, which isn't just a coincidence, because OMT was the main inspiration for the notation in UML.) We were discussing multiplicity between classes, where behavior should belong, and so on. I realized after a few sessions that using my technique for discussion with expert users, instead of their technique, was a complete failure. They answered my questions randomly and didn't see or add much value at all in the discussion. (The system at large wasn't a failure; it's still being used and it's the core system there, but the development process didn't go as smoothly as it could have. I had to change my method.)

Use Case Focus

I have thought about that experience several times and have actually joked about how naïve I was. In the years following, I always tried to play by the methodologies of the target groups; I mean the methodologies of the users with the users, and the methodologies of software development with developers. One technique that worked pretty well in both camps was the use case technique [Jacobson OOSE]. (At least it worked out well in the informal manner of XP [Beck XP] stories, short text descriptions, which was what I used. Such a description is a short description of a piece of functionality in a system. An example is "Register Order for Company Customer.")

It was very natural to users and could be made natural to developers, too, so it became my bridging tool for years. The way I did the bridging was to have one class per use case in the software.

Eventually it came to my attention that, thanks to my way of applying use cases, I had become pretty procedural in my thinking. I was designing a little bit like Transaction Script [Fowler PoEAA], but I tried to balance it by generalizing as much behavior as possible (or at least as much as was suitable).

A few years ago I heard Ivar Jacobson talking about use cases, and I was pretty surprised when I realized that he didn't encapsulate the use cases in classes of their own as I had expected and had done for a long time.

Another thing that got me kind of worried about this method was the constant struggle I was having with my friend and Valhalla developer Christoffer Skjoldborg when we worked with the Domain Model pattern [Fowler PoEAA]. He saw little value in the use case classes, maintaining that they could even become a hindrance when mixing and matching use case parts.

Different Patterns for Dealing with the Main Logic

Before we continue, I must say a few words about Transaction Script and Domain Model, which we have already touched upon.

Martin Fowler discusses three ways of structuring the main logic in application architecture in his book Patterns of Enterprise Application Architecture [Fowler PoEAA]. Those are Transaction Script, Table Module, and Domain Model.

Transaction Script is similar to batch programs in that all functionality is described from start to end in a single method. Transaction Script is very simple and useful for simple problems, but it breaks down when used to deal with high complexity; duplication will be hard to avoid. Still, many very large systems are built this way. Been there, done that, and I've seen the evidence of duplication even though we tried hard to reduce it. It crops up.
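To make the style concrete, here is a minimal C# sketch of a Transaction Script. It's my illustration, with hypothetical names and schema, not the book's example application: the whole use case lives in one method, business rule and database access together.

using System;
using System.Data.SqlClient;

//A minimal sketch of Transaction Script; names and schema are hypothetical.
public class OrderScripts
{
    //The whole "register order" use case, from start to end, in one method.
    public void RegisterOrder(string connectionString, int customerId,
        decimal amount)
    {
        //Business rule, inline in the script. Another script needing the
        //same rule has to repeat it; hence the duplication problem.
        if (amount <= 0)
            throw new ArgumentException("Amount must be positive.");

        using (SqlConnection connection = new SqlConnection(connectionString))
        {
            SqlCommand command = new SqlCommand(
                "INSERT INTO Orders (CustomerId, Amount) " +
                "VALUES (@customerId, @amount)", connection);
            command.Parameters.AddWithValue("@customerId", customerId);
            command.Parameters.AddWithValue("@amount", amount);
            connection.Open();
            command.ExecuteNonQuery();
        }
    }
}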

Table Module encapsulates a Recordset [Fowler PoEAA]; you then call methods on the Table Module to get information about, say, the customer with id 42. To get the name, you call a method and send id 42 as a parameter. The Table Module uses a Recordset internally for answering the request. This certainly has its strengths, especially in environments where you have decent implementations of the Recordset pattern. One problem, though, is that it also has a tendency to introduce duplication between different Table Modules. Another drawback is that you typically can't use polymorphic solutions to problems, because the consumer won't see the object identities at all, only value-based identities. It's a bit like using the relational model instead of the object-oriented model.
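Correspondingly, here is a small, hedged sketch of Table Module in C#, using a DataTable in the role of the Recordset; the names are mine, not the book's. Note how the consumer works with value-based identities (the id), never with object instances.

using System.Data;

//A minimal sketch of Table Module; names are hypothetical.
public class CustomerTableModule
{
    private readonly DataTable _customers;

    public CustomerTableModule(DataTable customers)
    {
        _customers = customers;
    }

    //"What is the name of the customer with id 42?" is answered by a
    //method call with a value-based identity as the parameter.
    public string GetName(int customerId)
    {
        DataRow[] rows = _customers.Select("Id = " + customerId);
        return (string)rows[0]["Name"];
    }
}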

Domain Model instead uses object orientation to describe the model as close to the chosen abstractions of the domain as possible. It shines when dealing with complexity, both because it makes it possible to use the full power of object orientation and because it is easier to be true to the domain. Usage of the Domain Model pattern isn't without problems, of course. A typical one is the steep learning curve for using it effectively.
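And to complete the comparison, a minimal sketch of the Domain Model style (again with hypothetical names): domain concepts become classes that carry both data and behavior, and the consumer works with object instances and their relationships.

using System.Collections;

//A minimal sketch of Domain Model; names are hypothetical.
public class Order
{
    private readonly ArrayList _orderLines = new ArrayList();

    public void AddOrderLine(OrderLine orderLine)
    {
        _orderLines.Add(orderLine);
    }

    //Behavior lives together with the data it concerns.
    public decimal TotalAmount()
    {
        decimal total = 0;
        foreach (OrderLine orderLine in _orderLines)
            total += orderLine.Amount();
        return total;
    }
}

public class OrderLine
{
    public int Quantity;
    public decimal UnitPrice;

    public decimal Amount()
    {
        return Quantity * UnitPrice;
    }
}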


Domain-Driven Design Focus

The real eye-opener for me regarding the usage of use cases was Domain-Driven Design (DDD) [Evans DDD]. (We will talk a lot about DDD soon, but for now, let's just say it's about focusing on the domain and letting it affect the software very much.) I still think that the use case technique is a great way of communicating with users, but DDD has made me think it can help a lot if we can manage to actively involve the users in discussions about the core model. It can head off mistakes very early and help the developers understand the domain better.

Note

"Model" is one of those extremely overloaded terms in software. I think we all have an intuitive idea about what it is, but I'd like to describe how I think about it. Instead of trying to come up with the definition, here in all simplicity (this could fill a book of its own by the right author) are a few properties of a model that you can compare to your own understanding of the term. A model is

Partial

For a certain purpose

A system of abstractions

A cognitive tool

Also,

A model has several presentations (for example: language, code, and diagrams).

There are several models in play in a system.

If you would like to read more, very much has been written about what a model is. A good place to start is the first pages in Eric Evans' book [Evans DDD]. (A special thanks to Anders Hessellund and Eric Evans for inspiration.)


The model is a great tool for communication between developers and users, and the better the communication is between those groups, the better the software will become, both in the short and the long run.

We (me included) have been using models forever, of course, but the difference between one of my old models and those built with the ideas of DDD is that my old models focused much more on infrastructure and technical concepts. My new models are much cleaner, free of such distractions, and instead focus totally on the core domain, its main concepts, and the domain problems at hand. This is a big mind shift.

Another way of expressing it is that technicalities (such as user interface fashion) come and go. Core business lasts. And when the core business changes, we want to change the model and the software.

It's not rocket science so far, and I'm probably just kicking in open doors. I think, however, this lies at the very core of how to achieve efficient software development, and in my experience it is rarely used. By this I mean focusing on the model, having a model (with different presentations) that is used by both users and developers, and not getting distracted by unimportant details.

Why Model Focus?

I'd like to exaggerate this a bit. Let's play a game: What kind of developer would be the best one for building a vertical system in a special field, let's say finance? It goes without saying that the developer should be technically savvy, very skilled and experienced, have a huge social network, and so on. It would also be very nice if he were extremely skilled in the business domain itself, and it wouldn't hurt if he had worked as, say, a trader for ten years, if that's the kind of system to be built.

Finding developers like that does happen, but in my experience it's more the exception than the rule. And if the developer should be able to move on to another domain for a new system when the financial system is up and running, that exception becomes even rarer. Developers simply can't have ten years of experience each in logistics, health care, insurance, and the rest.

Another solution can be to let the domain expert users develop the software themselves. This has been a dream for decades, and it's slowly becoming more viable all the time. At the same time, it takes time away from them, time they can often use better for their core work. There are also still too many technical problems involved.

So what's the next best thing? The answer is obvious (at least from today's standpoint): make the developer learn as much as possible about the domain he or she is working on, and add users to the picture to bring domain expertise to the project team and to work actively and constructively within the project. The users would not just set the requirements (although that is also very important) but would actually help out with designing the core of the system. If we can manage to create this atmosphere, nobody is more skilled than the domain expert user at deciding on what the core is, what the key abstractions are, and so on.

Of course, as developers we are also necessary. The whole thing is done in cooperation. For example, what might be seen as a tiny detail to businesspeople can be a huge thing to us in the domain model. They might say: "Oh, sometimes it's like this instead," and everything is totally different, but because it's not the most common variant, they think it's unimportant.

Time to Try Discussing the Model with Customers Again

Why would I now succeed in discussing the model with the customers when I failed before? Many things have changed. First, it's important to have a sense of the context and the project group. Second, use cases can help a lot in getting started, and then you can switch the focus to the core model itself later. Third, building systems is something many more people have experience in and have been taught about now. Fourth, to be picky, the model represented as a graphical, UML-like sketch, for example, is not the most important thing to have users work on; it's still typically a representation that developers like better than users do. What we are looking for is the ubiquitous language [Evans DDD].

The ubiquitous language is not something like UML, XML Schema, or C#; it's a natural, but very distilled, language for the domain. It's a language that we share with the users for describing the problem at hand and the domain model. Instead of us listening to the users and trying to translate what they say into our own words, a ubiquitous language creates fewer reasons for misunderstandings, and it becomes easier for the users to understand our sketches and actually help correct mistakes, giving us new knowledge about the domain. If it's suitable, you can discuss the ubiquitous language in the context of the graphical/code models with the users. If not, you have still done the most important thing if you capture the ubiquitous language.

Note

Eric Evans commented on the previous with the following: "Something to make clear is that the ubiquitous language isn't just the domain expert's current lingo. That has too many ambiguities and assumptions and also probably too large a scope. The ubiquitous language evolves as a collaboration between the domain experts and the software experts. (Of course, it will resemble a subset of the domain jargon.)"


The ubiquitous language is something that you should work hard to keep well-defined and in sync with the software. For example, a change in the ubiquitous language should lead to a change in the software, and vice versa. Both artifacts should influence each other.

If You Have a Model Focus, Use the Domain Model Pattern

If we have agreed on having a deep model focus and using object-orientation, I believe the natural result is to use the Domain Model pattern [Fowler PoEAA] for structuring the core logic of the applications and services. You find an example of a small Domain Model in Figure 1-1.

Figure 1-1. An example of a Domain Model, an early sketch for the example application, further discussed in Chapter 4, "A New Default Architecture"


Note

If you aren't up to speed on UML, I think a good book to read is UML Distilled [Fowler UML Distilled]. It's not essential that you understand UML, but I think it will be helpful to know the basics, because I will be using UML as a sketch tool here and there (and of course you will benefit from it at other times as well, not just with this book).


Even though Figure 1-1 shows just an early sketch of a small domain model, I think we can see that the model expresses domain concepts and not technical distractions. We can also see that it contains several cooperating small pieces, which together form a whole.

Note

If you wonder why Figure 1-1 is handwritten, it is to stress the point that it is a sketch. We will explore and develop the details in code, for example, with the help of Test-Driven Development (TDD). (We will discuss TDD later in this chapter, as well as in Chapter 3, "TDD and Refactoring.")


Before we talk more about the specific style I used for the Domain Model, I think it's time to look back again.

Domain Model History from My Rear View Mirror

I still remember the first time I tried to use the Domain Model, even though I didn't call it that at the time. It was around 1991, and I couldn't understand how it should match up with the UI. I don't remember what the application was, but I remember that I tried to decide what the menus should look like in order to be object-oriented. I didn't get over that first hurdle that time. I think I was looking for a very close mapping between the UI and the Domain Model and couldn't figure out how to achieve it.

A few years prior to that, I worked on an application that was said to use object-oriented ideas, for example, for the UI. There were no big breakthroughs, perhaps, but it managed to achieve small things. For example, instead of first choosing a function and then deciding to what object (such as a portfolio) to apply it, the user first chose the object and then decided between the different functions that were possible for the chosen object. At the time this was pretty different from how most business applications were navigated in the UI.

A Few Words About Naked Objects

Perhaps I wasn't totally off with my intentions. Perhaps what I was trying to achieve was something like Naked Objects [Pawson/Matthews Naked Objects], but I really didn't get very far with it.

Without going into details about Naked Objects, I think a short explanation is in order. The basic idea is that not only developers but also users like object-orientation. They like it even more than we normally think, and they would quite often be happy with a UI that is very close to the Domain Model itself, so close that the UI can be created automatically by a framework directly based on the Domain Model.

A system based on naked objects automatically presents a form to a user that contains widgets exposing the properties in the Domain Model class.

For more complex tasks, there is some need for customization (as always), but the idea is to cut down on that as well, again thanks to the framework that understands what UI to create from the Domain Model fragments. So when the Domain Model is "done," the UI is more or less done as well.

For the moment I don't have first-hand experience with the concept and technique, but an appealing twist of the idea, to me, is that the users really get a chance to see and feel the model, which should be very helpful in bridging the gap between developers and users.


After that, I tried Domain Model approaches many times over the years. I found problems with using it, especially related to performance overhead (particularly in distributed scenarios). That said, some of my real-world applications were built in that style, especially the simpler ones. What is also important to say is that however much I wanted to build very powerful, well-designed Domain Models, they didn't turn out as I had anticipated.

That wasn't the only problem I had, though....

Old Truths Might Be Wrong

I've believed in Domain Models and tried to use them several times in the past, but most attempts to implement them in an object-oriented fashion led me to the conclusion that it doesn't work. That was, for example, the case with my COM+ applications in Visual Basic 6 (VB6). The main problem was that the performance overhead was too high.

When I wrote my previous book [Nilsson NED], which described a default architecture for .NET applications, I reused my old VB6/COM+ knowledge and didn't make the discussed architecture Domain Model-based at all.

Later on I started up new experiments again, trying to see if I could get down to something like 10% in response-time overhead in common scenarios (such as fetching a single order, fetching a list of orders for a customer, or saving an order) when using a Domain Model in .NET compared with Transaction Scripts and Recordsets. To my surprise, I could get even lower overhead than that. The old truths had become wrong. A Domain Model-based architecture is very much more viable in .NET than it was in VB6. One reason for this is that instantiation time is reduced by orders of magnitude, so the overhead is just much lower when you have very many instances. Another reason is that it's trivial to write Marshal By Value components in .NET (so that not only the reference to the instance is moved, but the whole instance, at least conceptually). That was not simple in the COM world. (It wasn't even possible to do in the ordinary sense in VB6.)
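To illustrate how trivial Marshal By Value is in .NET, here is a minimal sketch (a hypothetical class, not the book's code): marking a class as serializable is enough for .NET remoting to copy the whole instance to the caller rather than passing a reference.

using System;

//A minimal sketch: a [Serializable] class is marshaled by value across
//a remoting boundary, so the whole instance travels, not a reference.
[Serializable]
public class CustomerSnapshot
{
    public int Id;
    public string Name;
}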

Note

Marshal By Value is of less importance now, because we most often prefer not to send the Domain Model as-is over the wire, but my early tests were using that.


Yet another reason is that .NET better supports object-orientation; it's just a much better toolbox.

Note

Also worth mentioning is that there hasn't been much emphasis on the Domain Model in the Microsoft community in the past. The Java community, on the other hand, is almost the opposite. I remember when I went to a workshop with mostly Java people. I asked them what their favorite structure for the logic was, perhaps whether they used Recordset (added in JDBC version 3) a lot. They looked at me funnily, as if they hadn't understood the question. I quickly realized that Domain Model was more or less the de facto standard in their case.


To set all this straight, I'd like to clarify two things. First, I'm not saying that you should choose the Domain Model pattern for performance reasons, but rather that you often can choose the Domain Model pattern without getting performance problems.

Second, I'm not saying that to be able to use a model focus you need a certain technology. Different technologies are more or less appropriate for expressing the model in software in close resemblance to the domain. Of course, no single technology is always best. Different domains and different problems determine what an appropriate technology is. So I see technology as an enabler; some technologies are better enablers than others.

To summarize this, whether the Domain Model pattern can be used efficiently enough is very much a matter of design.

Some good news regarding this is that there is lots of good information to gain. Domain-Driven Design and its style for structuring Domain Models provides lots of valuable help, and that is something we will spend a lot of time trying out in the book.

One Root Structure

So if we choose to focus on a Domain Model, it means that we will get all the people on the project to buy into the model. This goes for the developers, of course.

It might even be that the DBAs can agree on seeing the Domain Model as the root, even though the database design will be slightly different. If so, we have probably accomplished getting the DBA to speak the ubiquitous language, too! (As a matter of fact, the more you can get your DBA to like the Domain Model and adhere to it, the easier it will be to implement it. This is because you will need to create less mapping code if the database design isn't radically different from the Domain Model, but rather is a different view of the same model.) Oh, and even the users could buy into the Domain Model. Sure, different stakeholders will have different needs from the Domain Model, different views regarding the level of detail, for example, but it's still one root structure: a structure to live with, grow with, change....

As you will find, I'm going to put a lot of energy into discussing Domain Model-based solutions in this book from now on, but before doing that, I'd like to talk a bit about other architecture values. Let's leave the domain focus for a while and discuss some more technically focused areas. First is how to deal with the database.

Handle the Database with Care

In the past, I've thought a lot about performance when building systems. For example, I used to write all database access as handwritten stored procedures. Doing this is usually highly effective performance-wise during runtime, especially if you don't just have CRUD (Create, Read, Update, Delete) behavior for one instance, such as a single customer, at a time, but have "smarter" stored procedures; for example, doing several operations and affecting many rows in each call.

The Right Efficiency

Even though hardware capability is probably still growing in accordance with something like Moore's law, the problem of performance is an everyday one. At the same time as the hardware capabilities increase, the size of the problems we try to solve with software also increases.

In my experience, performance problems are more often than not the result of bad database access code, bad database structure, or other such things. One common reason for all this is that no effort at all has been spent on tuning the database, only on object-orientation purity. This in turn has led to an extreme number of roundtrips, inefficient and even incorrect transaction code, bad indexing schemes, and so on.

To make it a bit more concrete, let's look at an example where object orientation has been thought of as important while database handling has not. Let's assume you have a Customer class with a list of Orders. Each Order instance has a list of OrderLine instances. It looks something like Figure 1-2.

Figure 1-2. UML diagram for Customer/Order example


Moreover, let's assume that the model is very important to you and you don't care very much about how the data is fetched from and stored to the relational database that is being used. Here, a possible (yet naïve) scheme would be to let each class (Customer, Order, and OrderLine) be responsible for persisting/depersisting itself. That could be done by adding a Layer Supertype [Fowler PoEAA] class (called PersistentObject in the example shown in Figure 1-3) which implements, for example, GetById() and from which the Domain Model classes inherit (see Figure 1-3).

Figure 1-3. UML diagram for Customer/Order example, with Layer Supertype
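As a rough C# sketch of the design in Figure 1-3 (the method bodies are hypothetical placeholders, not the book's code), the Layer Supertype could look like this:

//A minimal sketch of the Layer Supertype idea from Figure 1-3.
public abstract class PersistentObject
{
    public int Id;

    //Each subclass fills itself (and its children) from the database.
    public abstract void GetById(int id);
}

public class Customer : PersistentObject
{
    public string Name;

    public override void GetById(int id)
    {
        Id = id;
        //SELECT ... FROM Customers WHERE Id = @id, set the fields,
        //then fetch the child Orders one by one (as the following
        //snippets show).
    }
}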


Now, let's assume that you need to fetch a Customer and all its Orders, and for each order, all OrderLines. Then you could use code like this:

//A consumer
Customer c = new Customer();
c.GetById(id);


Note

If you wonder about //A consumer in the previous code snippet, the idea is to show the class name (and sometimes methods) for code snippets like that to increase clarity. Sometimes the specific class name (as in this case) isn't important, and then I use a more generic name instead.

C++ kind of has this concept built in because method names are written like ClassName::MethodName, but I think some small comments should provide the same effect.


What happens now is that the following SQL statements will execute:

SELECT CustomerName, PhoneNumber, ... FROM Customers WHERE Id = 42


Next the customer's GetById() method will fetch a list of orders, but only the keys will be fetched by calling GetIdsOfChildren(), which will execute something like this:

SELECT Id FROM Orders WHERE CustomerId = 42


After that, the Customer will instantiate Order after Order by iterating over the DataReader for Order identifiers, delegating the real work to the GetById() method of the Order like this:

//Customer.GetById()
...
Order o;
while (theReader.Read())
{
    o = new Order();
    o.GetById(theReader.GetInt32(0));
    AddOrder(o);
}


Then for each Order it's time to fetch the identifiers of all the OrderLines...well, you get the picture. What happened here was that the object perspective was used to a certain extent (at least that was the intention of the designer) instead of thinking in sets, which is what relational databases are based on. Because of that, the number of roundtrips to the database was extremely high (one for every row to fetch, plus a few more). The efficiency plummeted through the floor.

Note

Worth mentioning is that the behavior just described might be exactly what you need to avoid massive loading of data in a specific scenario. It's hard to point out something that is always bad. Remember the context.


On the other hand, if we think about the other extreme, handwritten and hand-optimized stored procedures, it could look like this:

CREATE PROCEDURE GetCustomerAndOrders(@customerId INT)
AS
    SELECT Name, PhoneNumber, ...
    FROM Customers
    WHERE Id = @customerId

    SELECT Id, ...
    FROM Orders
    WHERE CustomerId = @customerId

    SELECT Id, ...
    FROM OrderLines
    WHERE OrderId IN
        (SELECT Id
        FROM Orders
        WHERE CustomerId = @customerId)


They are often efficient (as I said) during runtime. They are very inefficient during maintenance. They will give you lots of code to maintain by hand. What's more, stored procedures in Transact SQL (T-SQL, the SQL dialect used for Sybase SQL Server and Microsoft SQL Server), for example, don't lend themselves well to ordinary techniques for avoiding code duplication, so there will be quite a lot of duplicate code.

Note

I know, some of my readers will now say that the previous example could be solved very efficiently without stored procedures, or that that stored procedure wasn't the most efficient one in all circumstances. I'm just trying to point out two extreme examples from an efficiency point of view: badly designed code (at least from an efficiency point of view) that uses dynamic SQL, compared to better design and well-written stored procedures.


So the question is whether runtime efficiency is the most important factor when choosing how to design.

Maintainability Focus

If I had to choose only one "ability" as the most important one, these days I would choose maintainability. Not that it's the only one you need (it absolutely is not), but I think it's often more important than scalability, for example. With good maintainability, you can achieve the other abilities easily and cost-effectively. Sure, this is a huge simplification, but it makes an important point.

Another way to see it is to compare the total cost of producing a new system with the total cost of the maintenance of the system during its entire life cycle. In my experience, the maintenance cost will be much greater for most successful systems.

To conclude this, my current belief is that it is worth giving some attention to the database. You should see it as your friend and not your enemy. At the same time, however, handwriting all the code that deals with the database is most often not the "right" approach. It has little or nothing to do with the model that I want to focus on. To use a common quote from Donald Knuth, "Premature optimization is the root of all evil." It's better to avoid optimizations until you have metrics saying that you need them.

To make a quick and long jump here, I think that decent design, together with Object Relational Mapping (O/R Mapping), is often good enough, and when it isn't, you should hand tune. O/R Mapping is a bit like the optimizer of database servers: most of the time it is smart enough, but there are situations where you might need to give it a hand.

Note

Decent design and O/R Mapping will be discussed a lot throughout the book. For now, let's define O/R Mapping as a technique for bridging between object orientation and relational databases. You describe the relationship between your object-oriented model and your relational database and the O/R Mapper does the rest for you.


So one approach is to use tools such as O/R Mappers for most of the database access code and then hand write the code that needs to be handwritten for the sake of execution efficiency.

Let's take a closer look at the problem something like an O/R Mapper will have to deal with, namely mapping between two different worlds, the object-oriented and the relational.

The Impedance Mismatch Between Domain Model and Relational Database

I mentioned previously that mapping to the UI is one problem when using Domain Models. Another well-known problem is mapping to relational databases. That problem is commonly referred to as impedance mismatch between the two worlds of relational and object-orientation.

I'd like to give my thoughts on what that impedance mismatch is, although this is something of a pop version. For further and more formal information, see Cattell [Cattell ODM].

First of all, there are two type systems if you use both a relational database and an object-oriented model. One part of the problem is caused by the fact that the type systems are in different address spaces (even if not on different machines), so you have to move data between them.

Secondly, not even primitive types are exactly the same. For example, a string in .NET is of variable length, but in Microsoft SQL Server a string is typically a varchar, a char, or a text. If you use varchar/char, you have to decide on a maximum width. If you use text, the programming model is totally different from that of the other string types in SQL Server.

Another example is DateTime. DateTime in .NET is pretty similar to SQL Server's DATETIME, but there are differences. For instance, the precision is down to 100 nanoseconds in .NET, but "only" down to 3/1000 of a second in SQL Server. Another "fun" difference is that if you set a DateTime in .NET to DateTime.MinValue, it will lead to an exception if you try to store it in a SQL Server DATETIME.

Yet another difference is that of nullability. You can't store null in an ordinary int in .NET, but that's perfectly valid in SQL Server.
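As a small, hypothetical illustration of dealing with the DateTime range difference just mentioned (this helper is my sketch, not a library API; only SqlDateTime.MinValue is from the framework):

using System;
using System.Data.SqlTypes;

//A minimal sketch of guarding against the DateTime range mismatch.
public static class SqlDateTimeGuard
{
    public static object ToSqlValue(DateTime value)
    {
        //SQL Server's DATETIME starts at 1753-01-01 (SqlDateTime.MinValue);
        //.NET's DateTime.MinValue (year 1) would fail on insert.
        if (value < SqlDateTime.MinValue.Value)
            return DBNull.Value; //or throw, or map to a sentinel value
        return value;
    }
}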

Note

The problems mentioned so far exist whether you use a Domain Model or not.


A big difference is how relationships are dealt with. In a relational database, relationships are formed by duplicated values. The primary key of the parent (for example, Customers.Id) is duplicated as a foreign key in the children (for example, Orders.CustomerId), effectively letting the child rows "point" to their parents. So everything in a relational model is data, even the relationships. In an object-oriented model, relationships can be set up in many different ways (for example, via values similar to those in the relational model, but that is not typical). The most typical solution is to use the built-in object identifiers, letting the parent hold references to its children. This is, as you can see, a completely different model.

Navigation in a relational model can be done in two ways. First, one can use a parent primary key and then use a query to find all children that have foreign keys with the same value as the primary key of the parent. Then, for each child, the child primary key can be used with a new query to ask for all the children of the child, and so on. The other and probably more typical way of navigating in the relational model is to use relational joins between the parent set and the children set. In an object-oriented model, the typical way of navigating is to simply traverse the relationships between instances.

Next you find two code snippets where I have an order with order number 42, and I want to see the name of the customer for that order.

C#:

anOrder.Customer.Name


SQL:

SELECT Name
FROM Customers
WHERE Id IN
    (SELECT CustomerId
    FROM Orders
    WHERE OrderNumber = 42)


Note

I could just as well have used a JOIN for the SQL case, of course, but I think a subquery was clearer here.


Another navigation-related difference is that for objects, the navigation is one-way. If you need bidirectional navigation, it is actually done with two separate mechanisms. In the relational model, the navigation is always bidirectional.

Note

The question of directionality can also be seen as the opposite of what I just explained, because in the relational database there is just a "pointer" in one direction. I still think that "pointer" is bidirectional, because you can use it for traversal in both directions, and that's what I think is important.


As we have already touched upon, the relational model is set-based. Every operation deals with sets. (A set can be just one row, of course, but it is still a set.) However, in the object-oriented model, we deal with one object at a time instead.

Moreover, the data in the relational model is "global," while we strive hard to maintain privacy of the data in an object-oriented model.

When it comes to design, the granularity is quite different. Let's take an example to make this clear. Assume that we are interested in keeping track of one home phone number and one work phone number for certain people. In a relational model, it is normally all right to have one table called People (plural is the de facto naming standard, at least among database people, to make it obvious we are dealing with a set).

Note

If I'm picky, I should use the word "relation" instead of "table" when I'm talking about the relational model.


The table has three columns (probably more, but so far we have only talked about the phone numbers) representing one primary key and two phone numbers. Perhaps there could be five columns, because we might want to break down the phone numbers into two columns each for area code and local number, or seven if we add country code as well. See Figure 1-4 for a comparison.

Figure 1-4. The same model expressed as relational and object-oriented


What's important here is that even for 1:1, all columns are normally defined in one table. In an object-oriented model, it would be usual to create two classes, one called Person and one called PhoneNumber. Then a Person instance would be a composition of two PhoneNumber instances. We could do a similar thing in the relational model, but it would not usually make any sense. We try not to reuse definitions a lot in the relational model, especially because we don't have behavior tied to the table definitions. It's just the opposite in the object-oriented model. Another way to say this is that in the relational model, we don't increase the satisfied normal form if we move PhoneNumbers out into a separate table. We have probably just increased the overhead. What it comes down to is that the relational model is for dealing with tabular, primitive data, and this is both good and bad. The object-oriented model deals neatly with complex data as well.
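To show the finer-grained object-oriented side of Figure 1-4 in code, here is a minimal sketch (hypothetical names): PhoneNumber is a concept of its own, reused by composition in Person.

//A minimal sketch of the finer-grained object-oriented design.
public class PhoneNumber
{
    public readonly string CountryCode;
    public readonly string AreaCode;
    public readonly string LocalNumber;

    public PhoneNumber(string countryCode, string areaCode,
        string localNumber)
    {
        CountryCode = countryCode;
        AreaCode = areaCode;
        LocalNumber = localNumber;
    }
}

public class Person
{
    //A Person is a composition of two PhoneNumber instances.
    public PhoneNumber HomePhone;
    public PhoneNumber WorkPhone;
}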

Note

The relational model has a pretty strong concept when it comes to definition reuse, which is called domain. Unfortunately, the support for that concept still isn't very strong in today's products.

It's also the case that many of the products have support for complex datatypes, but it's still in its infancy.


We just discussed one example of granularity where a relational model is more coarse-grained than an object-oriented model. We could also look at it the other way around. For example, in a relational model an order might have many orderLines, but the orderLines stand on their own, so to speak. Each orderLine is just a row, and each order is just another row; there is a relationship between them, but the rows are the units. In an object-oriented model, it might be a good solution to see the order as the unit and let it be a composition of orderLines. This time the relational model was finer-grained than the object-oriented model.

Note

I'm not implying that there won't be an OrderLine class in the Domain Model. There should be. What I'm saying is that what I ask for and work with as the unit is an order, and the orderLines are part of the order.


Last but not least, the relational model doesn't support inheritance (again, at least it's not mainstream in the most popular products). Inheritance is at the core of the object-oriented model. Sure, you can simulate inheritance in the relational model, but that is all it is, a simulation. The different simulation solutions to choose among are all compromises and carry overhead in storage, speed, and/or relational ugliness.
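As a tiny sketch of what such a simulation can look like, here is one common compromise expressed in T-SQL (a hypothetical schema, one of several possible simulation solutions): a single table for the whole hierarchy, with a discriminator column and nullable columns for the subclass-specific data.

--A minimal sketch of single-table inheritance simulation.
CREATE TABLE Parties
(
    Id INT PRIMARY KEY,
    PartyType CHAR(1) NOT NULL,  --'P' = Person, 'C' = Company
    Name VARCHAR(100) NOT NULL,
    BirthDate DATETIME NULL,     --only meaningful for Person rows
    OrgNumber VARCHAR(20) NULL   --only meaningful for Company rows
)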

Note

Deep and native XML integration in the database seems to be the newest way of trying to lessen the problem of impedance mismatch. But XML is actually also a third model, a hierarchical one, which has impedance mismatch with the object-oriented world and the relational world.


Because it has been typical for my applications to use relational databases, the impedance mismatch has created a big problem.

The Data Mapper pattern [Fowler PoEAA] can be used for dealing with the problem. The Data Mapper pattern is about describing the relationship between the Domain Model and the database, after which the shuffling work is taken care of automatically. Unfortunately, the Data Mapper pattern itself is a tough one, especially on the .NET platform, where Object-Relational Mapper (O/R Mapper) products are a couple of years behind. In Java-land there are several mature products and even a standardized spec called JDO [Jordan/Russell JDO], which makes the two platforms, as similar as they are, totally different from each other in this respect.
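To give a feel for what a Data Mapper does, here is a minimal hand-rolled sketch in C# (hypothetical names and schema; a real O/R Mapper generalizes this kind of shuffling code based on mapping metadata instead of hardcoding it):

using System.Data.SqlClient;

//A minimal, hand-rolled sketch of the Data Mapper idea.
public class Customer
{
    public int Id;
    public string Name;
}

public class CustomerMapper
{
    private readonly string _connectionString;

    public CustomerMapper(string connectionString)
    {
        _connectionString = connectionString;
    }

    //The mapper, not the Domain Model class, knows about the database.
    public Customer GetById(int id)
    {
        using (SqlConnection connection =
            new SqlConnection(_connectionString))
        {
            SqlCommand command = new SqlCommand(
                "SELECT Name FROM Customers WHERE Id = @id", connection);
            command.Parameters.AddWithValue("@id", id);
            connection.Open();

            Customer customer = new Customer();
            customer.Id = id;
            customer.Name = (string)command.ExecuteScalar();
            return customer;
        }
    }
}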

So it's time to leave the area of Domain Models and databases and end the architecture section by talking about distribution.

Handle Distribution with Care

Do you recognize the "I have a new tool, let's really use it" syndrome? I do. I've been known to suffer from it from time to time. It can be problematic, but I like to think about it not just negatively, but actually a bit positively, too. After all, it could be a sign of productive curiosity, healthy progressive thinking, and a never-ending appetite for improvements.

OK, now I have set up some excuses before telling you some horror stories from my past. About ten years ago, out came Microsoft Transaction Server (MTS), and suddenly it became pretty easy for all COM developers to build distributed systems. What happened? Loads of distributed systems were built. I was there with the rest of them.

Note

I'm going to use MTS as the name here, but you could just as well use COM+ or COM+ Component Services or Enterprise Services.


Fine, for some systems it was very suitable to be distributed. Moving some processing out from the client and from the database to an application server had its benefits, like resource sharing (for example, connection pooling at the server-side so that hundreds of clients could be served by just a handful of database connections). These benefits were very much needed for some systems, and it became easier and more productive to get those benefits.

Overuse Is Never Good

Unfortunately, the benefits were so appealing that things went a bit overboard from time to time. For example, the server-side work might be split across several different application servers. Not that the application servers were cloned so that all clients could use any of them and get all the services needed; rather, the client called one of the application servers, that server called another application server, which called yet another application server. There were small benefits, but huge drawbacks because of increased latency and lower availability.

Another mistake was that simple applications with just two simultaneous users were written for and executed in MTS. It's a very good example of making a mountain out of a molehill because of a belief that the application will probably be very popular in some distant future and be used by thousands of users simultaneously. The problem wasn't just a more complex operational environment, but that the design of the application had lots of implications if you wanted to be a good MTS citizen. A typical example of this was to avoid chatty communication between certain (or even worse, between all) layers. Those implications did increase the complexity of the design to quite an extent. Well, overuse is never good.

Did you notice that I avoided talking about the normal pitfalls that beginners encounter when making their first attempts at distributed systems, such as chatty communication? Well, even if, against all odds, those mistakes were avoided, there are other, slightly more subtle problems.

All these problems led Martin Fowler to come up with his First Law of Distributed Object Design, which is "Don't distribute" [Fowler PoEAA]. If you absolutely don't have to, don't do it. The cost and complexity will just go sky high. It's a law of nature.

Distribution Might Be Useful

Still, there are good things about distribution too, both now and in the future, for example:

  • Fault tolerance

    If there is just one single machine that runs everything, you have a problem if that machine starts burning. For some applications that risk is OK; for others it's unthinkably bad.

  • Security

There is a heated debate over whether or not splitting the application across several machines increases security. In any case, security concerns are a common reason why it is done.

  • Scalability

For some applications, the load is just too high for a single machine to cope with. It might also be that, financially, you don't want to buy the most expensive machine around on day one. You may instead want to throw more machines at the problem as the load increases, without throwing away the old machines.

If you do need a distributed system, it's extremely important to think long and hard about the design, because then you have one big design challenge that can easily create big throughput problems, for example. When talking about distribution, messaging is an important concept.

Messaging Focus

The focus of this book is design of one application or one service, not so much about how to orchestrate several services. Even so, I think it's good to at least start thinking about messaging (if you haven't already). I guess this will just become more and more important for us in the future.

Messages as a Core Programming Model Thingy

I used to think it was good to abstract away or hide the network in distributed systems. That way, the client programmer didn't know whether a method call would translate to a network jump or just a local function call. I liked that approach because I wanted to simplify the life of the client programmer so he or she could focus on what's important for creating great user interfaces and not get distracted by things such as network communication.

Let's take a look at a simple example. Assume you have a Customer class, as shown in Figure 1-5.

Figure 1-5. A Customer class


Instances of that Customer class would typically (and hopefully) live their life in the same address space as the consumer code, so that when the consumer asks for the name of the Customer, it's just a local call.

There is nothing, however, to physically stop you from letting Customer just be a Proxy [GoF Design Patterns], which in its turn will relay the calls to a Customer instance living at an application server. (OK, it's a bit simplified, but that's more or less what is going on behind the scenes if you configure the Customer class in MTS at an application server, for example.)

Szyperski points out [Szyperski Component Software] that location transparency is both an advantage and a burden. The advantage is that all types of communication (in process, interprocess, and intermachine) are mapped to one abstraction, the procedure call. The burden is that it hides the significant cost difference between the different types of calls. The difference is normally orders of magnitude in execution time.

I think the current trend is the opposite of location transparency: to make the costly messages obvious by making messages a core thing in the programming model. Trend-conscious as I am, I do like that evolution. I like it because it's obvious to the client programmer that there is going to be a network call; it doesn't really mean that it must be hard to make. For example, message factories could be provided, yet it is still much clearer what will probably take time and what will not.
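As a sketch of what an explicit message can look like (hypothetical names and queue path, using MSMQ through System.Messaging as one possible transport; the queue is assumed to exist):

using System;
using System.Messaging;

//A minimal sketch of making the message itself the core abstraction.
[Serializable]
public class RegisterOrderMessage
{
    public int CustomerId;
    public decimal Amount;
}

public class OrderRequests
{
    public void Send(RegisterOrderMessage message)
    {
        //The queue makes it obvious to the client programmer that the
        //request leaves the process and is handled asynchronously.
        using (MessageQueue queue =
            new MessageQueue(@".\private$\orders"))
        {
            queue.Send(message);
        }
    }
}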

You might wonder how messaging will affect a Domain Model. The need for a Domain Model won't go away, but there will probably be slight changes. The first example that comes to mind is that inserts, as in a journal, are favored over updates (even for modifications). That makes asynchronous communication much more usable.

If Possible, Put It Off Until It's Done Better

One very important advantage of messaging is that it greatly increases execution flexibility. In the past, I think I've been pretty good at using batch jobs for long executions that didn't have to execute in real time. It's an old, and in my opinion underused, method for vastly improving response times for the real-time part of the work, because the long executions can then run in windows of low load, typically at night.

So far, I have written too many of my applications to be synchronous. When a piece of functionality takes too long to execute and doesn't have to execute in real time, I then have to change the functionality into a batch process instead. What executes in real time then shrinks from the complete piece of functionality to just the request itself.

A more efficient solution to the problem would be to think in asynchronous messages from the start, as often as possible. Then the functionality could run in real time when appropriate, or be put on a message queue to be executed as soon as possible or at given intervals. (The batch solution would then be built in from the beginning.)

A solution based on asynchronous messages might require quite a different mindset when you build your user interface, but if you really challenge the different design problems, I think you will find that asynchronicity is possible and a suitable way to go more often than you may have thought in the past.

A few words to end this discussion: In my opinion it's a good idea to focus on the core model itself, no matter what the execution environment is. Then you can probably use it in several different situations, as the need arises.

Those were a few words about architecture values to value. Let's move over to the process side.



