Captive UML


So far I've talked about using special filesystems to import the external state of outside entities into a UML instance where it can be manipulated through a filesystem. An extension of this is to import the internal state of an application into a UML instance to be manipulated in the same way.

This would be done by actually embedding the UML instance within the application. The application would link UML in as a library, and a UML instance would be booted when the application runs. The application would export to the UML instance whatever internal state it considers appropriate as a filesystem. Processes or users within that UML instance could then examine and manipulate that state through this filesystem, with side effects inside the application whenever anything is changed.

Secure mod_perl

Probably the best example of a real-world use for a captive UML that I know of is Apache's mod_perl. This loadable module for Apache contains a Perl interpreter and allows the use of Perl scripts running inside Apache to handle requests, generate HTML, and generally control the server. It is very powerful and flexible, but it can't be used securely in a shared Apache hosting environment, where a hosting company uses a single Apache server to serve the Web sites of a number of unrelated customers.

Since a Perl script runs in the context of the Apache server and can control it, one customer using mod_perl could take over the entire server, cause it to crash or exit, or misbehave in any number of other ways. The only way to generate HTML dynamically with a shared Apache is to use CGI, which is much slower than mod_perl. CGI creates a new process for every request, which can be a real performance drag on a busy server. This is especially the case when the Web site is generated with Perl, or something similar, because of the overhead of starting the Perl interpreter.

With some captive UML instances inside the Apache server, you could get most of the performance of standard mod_perl, plus a lot of its flexibility, and do so securely, so that no customer could interfere with other sites hosted on the same server or with the server itself. You would do this by having the customer's Perl scripts running inside the instances, isolating them from anything outside. Communication with the Apache server would occur through a special filesystem that would provide access to some of Apache's internal state.

The most important piece of state is the stream of requests flowing to a Web site. These would be available in this filesystem, and in a very stripped-down implementation, they would be the only thing available. So, with the special Apache filesystem mounted on /apache, there could be a file called /apache/request that the Perl script would read. Whenever a request arrived, it would appear as the contents of this file. The response would be generated and written back to that file, and the Apache server would forward it to the remote browser.
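
A minimal sketch of such a request loop, assuming that a read of /apache/request blocks until the next request arrives and that writing to the same file sends the response (the filesystem and its exact semantics are hypothetical):

#!/bin/sh
# Request loop inside the captive UML instance, with the
# hypothetical Apache filesystem mounted on /apache.
while true; do
    request=$(cat /apache/request)      # block until a request arrives

    # Pull the path out of the request line, e.g. "GET /index.html HTTP/1.1".
    path=$(printf '%s\n' "$request" | awk 'NR == 1 { print $2 }')

    # Write the response back; Apache forwards it to the remote browser.
    printf 'Content-Type: text/html\n\n<p>You asked for %s</p>\n' "$path" \
        > /apache/request
done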

One advantage of this approach is immediately evident. Since the HTML generation is happening inside a full Linux host and communication with the host Apache server is through a set of files, the script can be written in any language: Perl, Python, Ruby, shell, or even compiled C, if maximum performance is desired. It could even be written in a language that didn't exist at the time this version of Apache was released. The new language environment would simply need to be installed in the captive UML instance.

Another advantage is that the Web site can be monitored in real time, in any manner desired, from inside the UML instance. This includes running an interactive debugger on the script that's generating the Web site, in order to trace problems that might occur only in a production deployment. Obviously, this should be done with caution, considering that debuggers generally slow down whatever they're debugging and can freeze everything while stopped at a breakpoint. However, for tracking down tricky problems, this is a capability that doesn't exist in mod_perl currently but comes for free with a captive UML instance.

So far, I've talked about using a single file, /apache/request, to receive HTTP requests and to return responses. This Apache filesystem can be much richer and can provide access to anything in the mod_perl API, which is safe within a shared server. For example, the API provides access to information about the connection over which a request came, such as what IP the remote host has and whether the connection supports keepalives. This information could be provided through other files in this filesystem.
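
The layout of these files is speculative, but a script could consult them as easily as the request itself:

UML% cat /apache/connection/remote_ip
10.0.0.7
UML% cat /apache/connection/keepalive
1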

The API also provides access to the Apache configuration tree, which is the in-memory representation of the httpd.conf file. Since this information is already a tree, it can be naturally represented as a directory hierarchy. Obviously, full access to this tree should not be provided to a customer in a shared server. However, the portions of the tree associated with a particular customer could be. This would allow customers to change the configuration of their own Web sites without affecting anyone else or the server as a whole.

For example, the owner of a VirtualHost could change its configuration or add new VirtualHosts for the same Web site. Not only would this be more convenient than asking the hosting company to change the configuration file, it also could be done on the fly. This would allow the site to be reconfigured as much and as often as desired without having to involve the hosting company.
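
As a sketch, with an invented layout under /apache/config, a customer could reconfigure a VirtualHost on the fly like this:

UML% ls /apache/config/VirtualHost/www.example.com
DocumentRoot  ServerAdmin  ServerAlias
UML% echo /home/customer/site-v2 > \
    /apache/config/VirtualHost/www.example.com/DocumentRoot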

It is common to have Apache running inside a UML instance. This scheme turns that inside-out, putting the UML instance inside Apache. Why do things this way instead of the standard Apache-inside-UML way? The reasons mirror the reasons that people use a shared Apache provider rather than colocating a physical machine and running a private Apache on it.

It's cheaper since it involves less hardware, and it doesn't require a separate IP address for every Web site. The captive UML instance has less running in it compared to running Apache inside UML. All Web sites on the server share the same Apache instance, and the only resources they don't share are those dedicated to generating the individual Web sites. It's also easier to administer: the hosting company manages the Apache server and the host as a whole, and the customers are responsible only for their own Web sites.

Evolution

Putting a UML instance inside Apache is probably the most practical use of a captive UML instance, but my favorite example is Evolution. I use Evolution on a daily basis, and there are useful things that I could make it do if there were a UML instance inside it with access to its innards. For example, I have wanted an easy way to turn an e-mail message into a task by forwarding the e-mail to some special address. With a UML instance embedded inside Evolution, I would put the instance on the network, running a mail server that accepts this e-mail. Then a procmail script, or something similar, would create the task via the filesystem through which the UML instance has access to Evolution's data.

So, given an e-mail whose subject is "frobnitz is broken" and whose message is "The frobnitz utility crashes whenever I run it," the script would do something like this:

UML% cat > /evolution/tasks/"frobnitz is broken" << EOF
The frobnitz utility crashes whenever I run it
EOF


This would actually create this task inside Evolution, and it would immediately appear in the GUI. Here, I am imagining that the "Evolution filesystem" would be mounted on /evolution and would contain subdirectories such as tasks, calendar, and contacts that would let you examine and manipulate your tasks, appointments, and contacts, respectively. Within /evolution/tasks would be files whose names were the same as those assigned to the tasks through the Evolution GUI. Given this, it's not too much of a stretch to think that creating a new file in this directory would create a new task within Evolution, and the contents of the task would be the text added to the file.

In reality, an Evolution task is a good deal more complicated and contains more than a name and some text, so tasks would likely be represented by directories containing files for their attributes, rather than being simple files.
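
In that representation, creating the same task might look something like this, with invented attribute file names:

UML% mkdir "/evolution/tasks/frobnitz is broken"
UML% cd "/evolution/tasks/frobnitz is broken"
UML% echo "The frobnitz utility crashes whenever I run it" > description
UML% echo high > priority
UML% echo 2007-06-01 > due_date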

This example demonstrates that, with a relatively small interface to Evolution and the ability to run scripts that use that interface, you can easily make useful customizations. This example, using the tools found on any reasonable Linux system, would require a few lines of procmail script to provide Evolution with a fundamental new capability: receiving e-mail and converting it into a new task.
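
A sketch of that procmail script, assuming mail for tasks is delivered to a dedicated address inside the instance (the address and the helper script are invented):

# ~/.procmailrc inside the captive UML instance: pipe mail sent
# to the task address into a helper script.
:0
* ^TO_tasks@
| $HOME/bin/mail-to-task

The helper itself needs only a few lines:

#!/bin/sh
# mail-to-task: read one message on stdin, create an Evolution task
# named after its Subject: line, with the message body as the content.
# (A real script would sanitize the subject before using it as a name.)
msg=$(mktemp)
cat > "$msg"
subject=$(formail -zxSubject: < "$msg")   # formail ships with procmail
sed '1,/^$/d' "$msg" > "/evolution/tasks/$subject"
rm -f "$msg"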

The new script would also make Evolution network-aware in a sense that it wasn't before, since the virtual machine embedded within it is a full network node.

I can imagine making it network-aware in other ways as well:

  • By having a bug-tracking system send it bug reports when they are assigned to you so they show up automatically in your task list, and by having it send a message back to the bug-tracking system to close a bug when you finish the task

  • By allowing a task to be forwarded from one person to another with one embedded UML sending it to another, which recreates the task by creating the appropriate entries in the virtual Evolution filesystem

The fact that the captive UML instance could be a fully functional network node means that the containing application could be, too. The data exported through the filesystem interface could then be exported to the outside world in any way desired. Similarly, any data on the outside could be imported to the application through the filesystem interface. The application could export a Web interface, send and receive e-mail, and communicate with any other application through its captive UML instance.

Any application whose data needs to be moved to or from other applications could benefit from the same treatment. Our bug-tracking system could forward bugs to another bug tracker, receive bug reports as e-mail, or send statistics to an external database, even when the bug tracker couldn't do any of these itself. If it can export its data to the captive UML instance, scripts inside the instance can do all of these.

Given sufficient information exported to the captive UML instance, any application can be made to communicate with any other application. An organization could configure its applications to communicate with each other in suitable ways, without being constrained by the communication mechanisms built into the applications.

Application Administration

Some applications, such as databases and those that contain databases, require dedicated administration, and sometimes dedicated administrators. These applications try to be operating systems, in the sense that they duplicate and reimplement functionality that is already present in Linux. A captive UML within the application could provide these functions for free, allowing it to either throw out the duplicated functionality or avoid implementing it in the first place.

For example, databases and many Web sites require that users log in. They have different ways to store and manage account information. Almost everyone who uses Linux is familiar with adding users and changing passwords, but doing the same within a database requires learning some new techniques. However, with a captive UML instance handling this, the familiar commands and procedures suffice. The administrator can log in to the UML instance and add or modify accounts in the usual Linux way.
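
Inside the captive UML instance, this is just the standard procedure; no application-specific account tooling is involved:

UML% useradd -m rob
UML% passwd rob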

The captive UML instance can handle authentication and authorization. When a user logs in to such a Web site, the site passes the user ID and password to the UML instance to be checked against the password database.

If there are different levels of access, authorization is needed as well. After the captive UML instance validates the login, it can start a process owned by that user. This process can generate the HTML for requests from that user. With the site's data within this UML instance and suitably protected, authorization is provided automatically by the Linux file permission system. If a request is made for data that's inaccessible to the user, this process will fail to access it because it doesn't have suitable permissions.
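
A sketch of how this could look, with an invented data layout and an invented handle-request helper; the point is that ordinary file ownership does the authorization work:

UML% ls -l /site/data
-rw-------  1 alice  alice   4096 Jun  1 10:00 alice-report.html
-rw-------  1 bob    bob     8192 Jun  1 10:05 bob-report.html
UML% su - alice -c /usr/local/bin/handle-request

If alice's handler asks for bob's file, the open simply fails with "Permission denied," with no authorization code in the application itself.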

The same is true with other tasks such as making backups. Databases have their own procedures for doing this, which differ greatly from the way it's done on Linux. With a captive UML instance having access to the application's data, the virtual filesystem that the instance sees can be backed up in the same way as any other Linux machine. The flip side of this is restoring a backup, which would also be done in the usual Linux way.
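
For example, assuming the application exports its data under /data inside the instance, a backup and a restore are the familiar one-liners:

UML% tar czf /backup/appdata-$(date +%F).tar.gz /data
UML% tar xzf /backup/appdata-2007-06-01.tar.gz -C /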

The added convenience of not having to learn new ways to perform old tasks is obvious. Moreover, there are security advantages. Doing familiar tasks in a familiar way reduces the likelihood of mistakes; for example, an account added in the familiar way is less likely to be added incorrectly in a way that inadvertently opens a security hole.

There is another security benefit, namely, that the application administrator logs in to the application's captive UML instance to perform administration tasks. This means that the administrator doesn't need a special account on the host, so there are fewer accounts, and thus fewer targets, on the host. When the administrator doesn't need root privileges on the host, there is one fewer person with root access, one fewer person who can accidentally do something disastrous to the host, and one fewer account that can be used as a springboard to root privileges.

A Standard Application Programming Interface

Another side of a captive UML instance can be inferred from the discussion above, but I think it's worth talking about it specifically. A Linux environment, whether physical or virtual, normally comes with a large variety of programming tools. Add to this the ability of a captive UML instance to examine and manipulate the internal state of its application, and you have a standard programming environment that can be imported into any application.

The current state of application programmability and extensibility is that the application provides an API to its internals, and that API can be used by one of a small number of programming languages. To extend Emacs, you have to use Lisp. For GIMP, you have Scheme, TCL, and Perl. For Apache, there is Perl and Python. With a reasonable Linux environment, you get all of these and more. With an API based on the virtual filesystem I have described, application development and extension can be done with any set of tools that can manipulate files.

With an embedded UML instance providing the application's development environment, the developers don't need to spend time creating an API for every language they wish to support. They spend the time needed to embed a UML instance and export internal state through a UML virtual filesystem, and they are done. Their users get to choose what languages and tools they will use to write extensions.

Application-Level Clustering

A captive UML can also be used to provide application access to kernel functionality. Clustering is my favorite example. In Chapter 12 we saw two UML instances being turned into a little cluster, which is a simple example of process-level clustering. There is at least one real-world, commercial example of this: Oracle clusters, where the database instances on multiple systems cooperate to run a single database.

There would be more examples like this if clustering were easier to do. Oracle did its own clustering from scratch, and any other product, commercial or open source, would have to do the same. With the clustering technologies that are currently in Linux and those that are on their way, UML can provide a much easier way to "clusterize" an application.

With UML, any clustering technology in the Linux kernel is automatically running in a process, assuming that it is not hardware-dependent. To clusterize an application, we need to integrate UML into the application in such a way that it can use that technology.

Integrating UML into the application is a matter of making UML available as a linkable library. At that point, the application can call into the UML library to get access to any functionality within it.

I am envisioning this as an enabling technology for much deeper Internet-wide collaborations than we've seen so far. At this point, most such collaborations have been Web-based. Why isn't that sufficient? Why do we need some new technology? The answer is the same as that for the question of why you don't do all of your work within a Web browser. You create a lot, likely all, of your work with other applications because these other tools are specialized for the work you are doing, and your Web browser isn't. Your tools have interfaces that make it easy to do your work, and they understand your work in ways that enable them to help. Web browsers don't. Even when it is possible to do the same work in your Web browser, the Web interface is invariably slower, harder to use, and less functional than that of the specialized application.

Imagine taking one of these applications and making it possible for many people to work within it at the same time, working on the same data without conflicting with each other. Clusterizing the application would allow this.

To make our example a bit more concrete, let's take the ocfs2 UML cluster we saw in Chapter 12 and assume that an application wants to use it as the basis for making a cluster from multiple instances of itself. The ocfs2 cluster makes a shared disk accessible to multiple nodes in such a way that all the nodes see the same data at all times. The application shares some of its data between instances by storing it in an ocfs2 volume.
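
Inside each captive UML instance, joining the share is an ordinary ocfs2 mount, much as in Chapter 12; the device name and mount point here are illustrative:

UML% mount -t ocfs2 /dev/ubdb /shared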

Let us say that this application is a document editor, and the value it gains from being clusterized is that many people can work on the same document at the same time without overwriting each other's work. In this case, the document is stored in the cluster filesystem, which is stored in a file on the host.

When an instance of this editor starts, the captive UML inside it boots enough that the kernel is initialized. It attaches itself to the ocfs2 shared disk and brings itself up as a cluster node. The editor knows how the document is stored within the shared disk and accesses it by directly calling into the Linux filesystem code rather than making system calls, such as open and read, as a normal process would.

With multiple instances of the editor attached to the same document, and the captive UML instances as nodes within the cluster, a user can make changes to the document at the same time as other users, without conflicting with them.

The data stored within the cluster filesystem needs to be the primary copy of the document, in the sense that changes are reflected more or less immediately in the filesystem. Otherwise, two users could change the same part of the document, and one would end up overwriting the other when the changes made it to the filesystem.

How quickly changes need to be reflected in the filesystem is affected to some extent by the organization of the document and the properties of the cluster being used. A cluster ensures that two nodes can't change the same data at the same time by locking the data so that only one node has access to it at any given time. If the locking is done on a per-file basis, and this editor stores its document in a single file, then the first user will have exclusive access to the entire document for the entire session. This is obviously not the desired effect.

Alternatively, the document could be broken into pieces, such as a directory hierarchy that reflects the organization of the document. The top-level directories could be volumes, with chapter subdirectories below that, sections below the chapters, and so on. The actual contents would reside within files at the lowest level. These would likely be at the level of paragraphs. A cluster that locks at the file level would let different people work on different paragraphs without conflict.
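
A sketch of what such a layout might look like inside the shared volume; with file-granularity locking, two writers conflict only if they touch the same paragraph file:

UML% ls /shared/document/volume-1/chapter-03/section-2
paragraph-001  paragraph-002  paragraph-003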

There are other advantages to doing this. It allows the Linux file permission system to be applied to the document with any desired granularity. When each contributor to the document is assigned a section to work on, this section would be contained inside some directory. The ownerships on these directories and files would be such that those people assigned to the section can edit it, and others can't, although they may have permission to read it. Groups can be set up so that some people, such as editors, can modify larger pieces of the document.
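
A sketch of the permissions for one section, using an invented group for its assigned contributors:

UML% groupadd section-3.2-writers
UML% usermod -a -G section-3.2-writers alice
UML% chgrp -R section-3.2-writers /shared/document/volume-1/chapter-03/section-2
UML% chmod -R g+w,o-w /shared/document/volume-1/chapter-03/section-2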

At first glance, it would appear that this could be implemented by running the application within a cluster, rather than having the cluster inside the application, as I am describing. However, for a number of reasons, that wouldn't work.

The mechanics of setting up the cluster require that it be inside the application. Consider the case where this idea is being used to support an Internet-wide collaboration. Running the application within a cluster requires the collaboration to have a cluster, and everyone contributing to it must boot their systems into this cluster. This immediately runs into a number of problems.

First, many people who would be involved in such an effort have no control over the systems they would be working from. They would have to persuade their system administrators to join this cluster. For many, such as those in a corporate environment where systems are installed from defined images, this would be impossible. However, when the application comes with its own clustering, this is much less of a problem. Installing a new application is much less problematic than having the system join a cluster.

Even if you can get your system to join this cluster, you need your system either to be a permanent member or to join when you run the application that needs it. These requirements pose logistical and security problems. To be a cluster node means sharing data with the other nodes, so having to do this whenever the system is booted is undesirable. To join the cluster only when the application is running requires the application to have root privileges or to be able to call on something with those privileges. This is also impossible for some types of clustering, which require that nodes boot into the cluster. Both of these options are risky from a security perspective. With the cluster inside the application, these problems disappear. The application boots into the cluster when it is started, and this requires no special privileges.

Second, there may be multiple clustered applications running on a given system. Having the system join a different cluster for each one may be impossible, as this would require that the system be a member of multiple clusters at the same time. For a cluster involving only a shared filesystem, this may be possible. But it also may not. If the different clusters require different versions of the same clustering software, they may be incompatible with each other. There may be stupid problems like symbol conflicts with the two versions active on the host at the same time. For any more intrusive clustering, being a member of multiple clusters at once just won't work. The extreme case is a Single-System Image (SSI) cluster, where the cluster acts as a single machine. It is absolutely impossible to boot into multiple instances of these clusters at once. However, with the cluster inside the application, this is not an issue. There can't be conflicts between different versions of the same clustering software, or between different types of clusters, because each cluster is contained in its own application. They are completely separate from each other and can't conflict.

Consider the case where the large-scale collaboration decides to upgrade the cluster software it is using or decides to change the cluster software entirely. This change would require the administrators of all the involved systems to upgrade or change them. This logistical nightmare would knock most of the collaboration offline immediately and leave large parts of it offline for a substantial time. The effects of attempting this could even kill the collaboration. An upgrade would create two isolated groups, and the nonupgrading group could decide to stay that way, forking the collaboration. With the cluster as part of the application, rather than the other way around, an upgrade or change of cluster technologies would involve an upgrade of the application. This could also fail to go smoothly, but it is obviously less risky than upgrading the system as a whole.

Security also requires that the cluster be within the application. Any decent-size collaboration needs accountability for contributions and thus requires members to log in. This requires a unified user ID space across the entire cluster. For any cluster that spans organization boundaries, this is clearly impossible. No system administrator is going to give accounts to a number of outsiders for the benefit of a single application. It may also be mathematically impossible to assign user IDs such that they are the same across all of the systems in the cluster. With the application being its own cluster, this is obviously not a problem. With the captive UML instances being members of the cluster, they have their own separate, initially empty, user ID space. Assigning user IDs in this case is simple.

Now, consider the case where the application requires an SSI cluster. For it to require the system to be part of the cluster is impossible for logistical reasons, as I pointed out above. It's also impossible from a security standpoint. Every resource of every member of the cluster would be accessible to every other member. This is unthinkable for any but the smallest collaborations. This is not a problem if the cluster is inside the application. The application boots into the cluster, and all of its resources are available to the cluster. Since the application is devoted to contributing to this collaboration, it's expected that all of its information and resources are available to the other cluster nodes.

Earlier, I used the example of a UML cluster based on ocfs2 to show that process-level clustering using UML is possible and is the most practical way to clusterize an application. However, for implementing the large-scale collaborations I have described, ocfs2 is inadequate as the underlying cluster technology, for a number of reasons.

  • It requires a single disk shared among all of its nodes. For a UML cluster, this means a single file that's available to all nodes. This is impractical for any collaboration that extends much beyond a single host. It could work for a local network, sharing the file with something like NFS, but won't work beyond that. What is needed for a larger collaboration is a cluster technology that enables each node to have its own local storage, which it would share with the rest of the cluster as needed.

  • ocfs2 clusters are static. The nodes and their IP addresses are defined in a cluster-wide configuration file. The shared filesystem has a maximum cluster size built into it. This can't work for a project that has contributors constantly coming and going. What is required is something that allows nodes to be added and removed dynamically and that does not impose a maximum size on the cluster.

  • ocfs2 doesn't scale anywhere near enough to underlie a large collaboration. I am envisioning something with the scale of Wikipedia, with hundreds or thousands of contributors, requiring the clustering to scale to that number of nodes. ocfs2 is used for sharing a database, which is typically done with a number of systems in the two-digit range or less.

While ocfs2 doesn't have the ability to power such a project, I know of one clustering technology, GFS, that might. It stores data throughout the cluster. It claims to scale to tens of thousands of clients, a level that would support a Wikipedia-scale collaboration. It does seem to require good bandwidth (gigabit Ethernet or better) between nodes, which the Internet as a whole can't yet provide. Whether this is a problem probably depends on the quantity of data that needs to be exchanged between nodes, and that depends on the characteristics of the collaboration.

These projects probably will not be well served by existing technologies, at least at first. They will start with something that works well enough to get started and put pressure on the technology to develop in ways that serve them better. We will likely end up with clusters with different properties than we are familiar with now.



