Mounting Host Directories within a UML


There are two ways to mount a host directory as a UML directory: hostfs and humfs. hostfs is the older and more limited method, but it does have the advantage of greater convenience. Both are virtual filesystems, in the sense that they are not stored within a UML block device. You can think of them as nondevice filesystems whose data is maintained without benefit of a storage device that's known to UML. In many cases, the data is simply stored inside the kernel. If you look at /proc/filesystems on any modern Linux system, UML or physical, you will see a great number of these:

host% cat /proc/filesystems
nodev   sysfs
nodev   rootfs
nodev   bdev
nodev   proc
nodev   sockfs
nodev   binfmt_misc
nodev   debugfs
nodev   usbfs
nodev   pipefs
nodev   futexfs
nodev   tmpfs
nodev   eventpollfs
nodev   devpts
        ext2
nodev   ramfs
nodev   hugetlbfs
        iso9660
nodev   mqueue
nodev   selinuxfs
        ext3
nodev   rpc_pipefs
nodev   autofs


All of the nodev entries are virtual filesystems. Most of these make internal kernel information available as a filesystem. Probably the most familiar example is proc, which is normally mounted on /proc. This filesystem makes internal kernel variables and data structures visible as files. Most of the others do, as well, with the exception of tmpfs. This is a normal filesystem, in the sense that it can be mounted and arbitrary files created within it. Those files are temporary, disappearing when the filesystem is unmounted, rather than being stored permanently on a disk, and written to the system's swap when memory is tight, rather than to a dedicated disk partition. Figure 6.1 illustrates the differences between the various kinds of Linux filesystems. hostfs and humfs are similar to these in that the filesystem data seems to be fabricated from within the kernel, but different in that the data is permanently stored. They are conceptually most similar to a network filesystem, such as NFS. In both cases, the data is stored outside the machine and transparently made available by a filesystem that knows how to access it.

Figure 6.1. The various types of filesystems available in UML. / is a traditional disk-based filesystem contained in the UML device /dev/ubda, which itself is contained in a file on the host. /proc and /tmp are virtual filesystems whose data is contained entirely within the UML kernel. /proc exports internal kernel data structures, so its data is always within the kernel. /tmp is a tmpfs mount, which looks like a normal filesystem, but its data is stored in the kernel's filesystem cache and swapped to the UML instance's swap space if necessary. /nfs is also virtual in the sense that the data contained in the filesystem isn't stored on this system. It resides on a remote system, and remote procedure calls (RPCs) are made in order to access it. /host is another sort of virtual filesystem, except that its data comes from a directory hierarchy on the host. This is fairly similar to an NFS mount, except that the remote files are accessed using system calls to the host rather than a network protocol.


With a network filesystem, file accesses are translated into network requests to the server, which sends data and status back. With hostfs and humfs, file accesses are translated into file accesses on the host. You can think of this as a one-to-one translation of requests: a read, write, or mkdir within UML translates directly into a read, write, or mkdir on the host. This is actually not true in the most literal sense. An operation such as mkdir within one of these filesystems must create a directory on the host; therefore, it must translate into a mkdir there, but it won't necessarily do so immediately. Because of caching within the filesystem, the operation may not happen until much later. Operations such as a read or write may not translate into a host read or write at all. They may, in fact, translate into an mmap followed by directly reading or writing memory. And in any case, the lengths of the read and write operations will certainly change when they reach the host. Linux filesystem operations typically have page granularity: the minimum I/O size is a machine page, 4K on most extant systems. For example, a sequence of 1-byte reads will be converted into a single page-length read to the host, followed by simply passing out bytes one at a time from the buffer into which that page was read.
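You can watch this coalescing from the host by stracing the UML instance while doing byte-at-a-time reads inside it. This is a sketch; the PID, file descriptor, and output are illustrative, assuming /mnt is a hostfs mount:

host% strace -f -e trace=read -p 1234

UML# dd if=/mnt/etc/hostname bs=1 count=8

On the host side, you should see a single page-granularity read rather than eight 1-byte reads, something like:

read(21, "mybox\n", 4096) = 6

with the UML instance then handing out the requested bytes one at a time from its own copy of the page.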

So, while it is conceptually true that hostfs and humfs operations correspond one-to-one to host operations, the reality is somewhat different. This difference will become relevant later in this chapter when we look at simultaneous access to data from a UML and the host, or from two UMLs.

hostfs

hostfs is the older and simpler of the two ways to mount a host directory as a UML directory. It uses the most obvious mapping of UML file operations to host operations in order to provide access to the host files. This is complicated only by some technical aspects, such as making use of the UML page cache. This simplicity results in a number of limitations, which we will see shortly and which I will use to motivate humfs.

So, let's get a UML instance and make a hostfs mount inside it:

UML# mount none /mnt -t hostfs


Now we have a new filesystem mounted on /mnt:

UML# mount
/dev/ubd0 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=4,mode=620)
shm on /dev/shm type tmpfs (rw)
none on /mnt type hostfs (rw)


Its contents show that it looks a lot like the host's root filesystem:

UML# ls /mnt
bin   etc    lib         media opt  sbin    sys  usr
boot  home   lib64       misc  proc selinux tmp  var
dev   initrd lost+found  mnt   root srv     tools


You can do the same ls on the host's / to verify this. Basically, we have mounted the host's root on the UML instance's /mnt, creating a completely normal Linux filesystem within the UML. For getting access to files on the host within UML, this is very convenient. You can do anything within this filesystem that you can do with a disk-based filesystem, with some restrictions that we will talk about later.

By default, when you make a hostfs mount, you get the host's root filesystem. This isn't always desirable, so there is an option to mount a different host directory:

UML# mkdir /mnt-home
UML# mount none /mnt-home/ -t hostfs -o /home
UML# ls /mnt-home/
jdike  lost+found


The -o option specifies the host directory to mount. From that mount point, it is impossible to access any files outside that directory. In our case, the /mnt-home mount point gives us access to the host's /home, but, from there, we can't access anything outside of that. The obvious trick of using .. to try to access files outside of /home won't work because it's the UML that will interpret the .., not the host. Trying to "dotdot" your way out of this will get you to the UML instance's /, not the host's /.
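You can see this for yourself from inside the instance; the output here is illustrative:

UML# cd /mnt-home/..
UML# pwd
/

The .. walked up to the UML instance's own root, not the host's /, because pathname traversal happens in the UML kernel before hostfs ever sees a request.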

Using -o is at the discretion of the user within the instance. Often, the host administrator wants all hostfs mounts confined to a host subdirectory, making it impossible to access the host's /. There is a UML command-line option for this: hostfs=/path/to/UML/jail. With this enabled, hostfs mounts within the UML will be restricted to the specified host subdirectory. If the UML user does a mount specifying a mount path with -o, that path will be appended to the directory on the command line. So, -o can be used to mount subdirectories of whatever directory the UML's hostfs has been confined to, but can't be used to escape it.
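For example, a jailed instance might be booted and used like this (the jail path and the abbreviated boot command are hypothetical):

host% linux ubda=root_fs hostfs=/home/jdike/jail

UML# mount none /mnt -t hostfs -o /www

The resulting /mnt shows the host's /home/jdike/jail/www, and no choice of -o argument will name anything above /home/jdike/jail.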

Now, let's create a file within the host mount:

UML# touch /mnt/tmp/uml-file
UML# ls -l /mnt/tmp/uml-file
-rw-r--r--  1 500 500 0 Jun 10 13:02 /mnt/tmp/uml-file


The ownerships on this new file are somewhat unexpected. We are root inside the UML, and thus expect that any new files we create will be owned by root. However, we are creating files on the host, and the host is responsible for the file, including its ownerships. The UML instance is owned by user ID (UID) 500, so from its point of view, a process owned by UID 500 created a file in /tmp. It's perfectly natural that it would end up being owned by that UID. The host doesn't know or care that the process contains another Linux kernel that would like that file to be owned by root.

This seems perfectly reasonable and innocent, but it has a number of consequences that make hostfs unusable for a number of purposes. To demonstrate this, let's become a different, unprivileged user inside UML and see how hostfs behaves:

UML# su user
UML% cd /mnt/tmp
UML% echo foo > x
UML% ls -l x
-rw-r--r--  1 500 500 4 Jun 10 14:31 x
UML% echo bar >> x
sh: x: Permission denied
UML% rm x
rm: remove write-protected regular file `x'? y
rm: cannot remove `x': Operation not permitted
UML% chmod 777 x
chmod: changing permissions of `x': Operation not permitted


Here we see a number of unexpected permission problems arising from the ownership of the new file. We created a file in the host's /tmp and found that we couldn't subsequently append to it, remove it, or change its permissions.

It is created with the owner UID 500 on the host and is writable by that UID. However, I became user, with UID 1001, inside the UML instance, so my attempts to modify the file don't even make it past the UML's permission checking. When the file was created on the host, it was given its ownership and permissions by the host. hostfs shows those permissions, rather than the ones the UML instance provided, because they are more "real."

The ownership and permissions are interpreted locally by the UML when seeing whether a file operation should succeed. The fact that the file ownerships are set by the host to something different from what the UML expects can cause files to be unmodifiable by their owner within UML.

This isn't a problem for the root user within UML because the superuser doesn't undergo the same permission checks as a normal user, so the permission checks occur on the host.

However, this issue does make it impossible for multiple users within the UML to use hostfs. In fact, only root within the UML can realistically use it. The only way for a normal UML user to use hostfs is for its UID to match the host UID that the UML is running as. So, if user within UML had UID 500 (matching the UML instance's UID on the host), the previous example would have been more successful.
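To demonstrate, you could align the UIDs and retry the operations that failed earlier. This is a sketch; usermod is standard, but the right UID obviously depends on who owns the instance on your host:

UML# usermod -u 500 user
UML# su user
UML% echo bar >> /mnt/tmp/x
UML% rm /mnt/tmp/x

With the UML UID matching the host UID, the UML's permission check and the host's agree, so both the append and the removal succeed.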

Let's look at another problem, in which root within the UML doesn't have permission to do some things that it should be able to do:

UML# mknod ubda b 98 0
mknod: `ubda': Operation not permitted


Here, creating a device node for ubda doesn't work, even for root. Again, the reason is that the operation is forwarded to the host, where it is attempted as the nonroot UML user, and fails because this operation requires root privileges. You will see similar problems with creating a couple of other types of files.

If you experiment long enough with hostfs, you will discover other problems, such as with UNIX domain sockets. If the hostfs mount contains sockets, they were created by processes on the host. When one is opened on the host, it can be used to communicate with the process that created it. The sockets are visible within a hostfs mount, but a UML process opening one will fail to communicate with anything. The UML kernel, not the host kernel, interprets the open request and attempts to find the process that created the socket. Within the UML kernel, this will fail because there is no such process.

Creating a directory on the host with a UML root filesystem in it, and booting from it, is also problematic. The filesystem, by and large, should be owned by root, and it won't be. All of the files are owned by whoever created them on the host. At this writing, there is a kludge in the hostfs code that changes (internally to the UML kernel) the ownerships of these files to root when the hostfs filesystem is the UML root filesystem. This makes booting from hostfs work, more or less, but all the problems described above are still there. Other kernel developers have objected to this ownership changing, and this kludge likely won't be available much longer. When this "feature" does disappear, booting from a hostfs root filesystem likely won't work anymore.

I've spent a good amount of time describing the deficiencies of hostfs, but I'd like to point out that, for a common use case, hostfs is exactly what you want. If you have a private UML instance, are logged in to it as root, and want access to your own files on the host, hostfs is perfect. The filesystem semantics will be exactly what you expect, and no prior host setup is needed. Just run the hostfs mount command, and you have all of your files available.

Most of the problems with hostfs that I've described stem from the fact that all hostfs file operations go through both the UML's and the host's permission checking. This is because both systems look at the same data, the file metadata on the host, in order to decide what's allowed and what's not.

UNIX domain sockets and named pipes are a sort of reflection of a process within the filesystem: there is supposed to be a process at the other end of each one. When the filesystem (including the sockets) is exported to another system, whether a UML instance with a hostfs mount or another system with an NFS mount, the process isn't present on the other system. In this case, the file doesn't have the meaning it does on its home system.

humfs

We can fix these problems by making sure that the file ownerships, permissions, and types seen inside UML are kept separate from those on the host. To achieve this, UML can store them in a separate place, freeing itself from the host's permission checks. This is what humfs does. The actual file data is stored in exactly the same way that hostfs stores it: in a directory hierarchy on the host. However, the permissions information is stored separately, by default in a parallel directory hierarchy.

For example, here are the data and metadata for a file stored in this way:

host% ls -l data/usr/bin/ls
-rwxr-x--x  1 jdike jdike 201642 May  1 10:01 data/usr/bin/ls
host% ls -l file_metadata/usr/bin/ls
-rw-r--r--  1 jdike jdike 8 Jun 10 18:04 file_metadata/usr/bin/ls
host% cat file_metadata/usr/bin/ls
493 0 0


The actual ls binary is stored in data/usr/bin/ls, while its ownership and permissions are stored in file_metadata/usr/bin/ls. Notice that the permissions on the binary are wide open for the file's owner. This, in effect, disables permission checking on the host, allowing UML's ideas about what's allowed and what's not to prevail.

Next, notice the contents of the metadata file. For a normal file, such as /usr/bin/ls, the permissions and ownerships are stored here. In the last line of the output, 493 is the decimal equivalent of 0755, and the zeros are UID root and group ID (GID) root.
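If you want to check that conversion, printf can do it from any shell:

host% printf '%o\n' 493
755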

We can see this by looking at this file inside UML:

UML# ls -l usr/bin/ls
-rwxr-xr-x  1 root root 201642 May  1 10:01 usr/bin/ls


The humfs filesystem has taken the file size and date from data/usr/bin/ls and merged the ownership and permission information from file_metadata/usr/bin/ls.

By storing this metadata as the contents of a file on the host, UML may modify it in any way it sees fit. We can go through the list of hostfs problems I described earlier and see why this approach fixes them all.

In the case of a new file having unexpected ownerships, we can see that this just doesn't happen in humfs. The data file's ownership will, in fact, be determined by the UID and GID of the UML process, but this doesn't matter since the ownerships you will see inside UML will be determined by the contents of the file_metadata file.

So, you will be able to create a file on a humfs mount and do anything with it, such as append to it, remove it, or change permissions.

Now, let's try to make a block device:

UML# mknod ubda b 98 0
UML# ls -l ubda
brw-r--r--  2 root root 98, 0 Jun 10 18:46 ubda


This works, and it looks as we would expect. To see why, let's look at what occurred on the host:

host% ls -l data/tmp/ubda
-rwxrw-rw-  1 jdike jdike 0 Jun 10 18:46 data/tmp/ubda
host% ls -l file_metadata/tmp/ubda
-rw-r--r--  1 jdike jdike 15 Jun 10 18:46 file_metadata/tmp/ubda
host% cat file_metadata/tmp/ubda
420 0 0 b 98 0


The file is empty, just a token to let the UML filesystem know a file is there. Almost all of the device's data is in the metadata file. The first three elements are the same permissions and ownership information that we saw earlier: 420 is the decimal equivalent of 0644, matching the brw-r--r-- we saw inside the UML, and the zeros are UID and GID root. The rest, which don't appear for normal files, describe the type of file, namely, a block device with major number 98 and minor number 0.

The host definitely won't recognize this as a block device, which is why this works. Creating a device requires root privileges, so hostfs can't create one unless the UML is run by root. Under humfs, creating a device is simply a matter of creating this new file with contents that describe the device.

It is apparent that the host socket and named pipe problem can't happen on this filesystem. Everything in this directory hierarchy on the host is a normal file or directory. Host sockets and named pipes just don't exist there. If a UML process makes a UNIX domain socket or a named pipe, the file's type will simply appear in the metadata file.
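For example, with a humfs mount at /mnt-test backed by a host directory humfs-test (like the one built below), creating a named pipe inside the UML should leave only an ordinary placeholder file on the host. The transcript is illustrative:

UML# mkfifo /mnt-test/fifo
UML# ls -l /mnt-test/fifo
prw-r--r--  1 root root 0 Jun 10 18:50 /mnt-test/fifo

host% ls -l humfs-test/data/fifo
-rw-r--r--  1 jdike jdike 0 Jun 10 18:50 humfs-test/data/fifo

The host sees a plain, empty file; the fact that it is a pipe is recorded in the metadata, which only the UML interprets.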

Along with these advantages, humfs has one disadvantage: It needs to be set up beforehand. You can't just take an arbitrary host subdirectory and mount it as a humfs filesystem. So, humfs is not really useful for quick access to your files on the host.

In order to set up humfs, you need to decide what's going to be in your humfs mount, create an empty directory, copy the files to the data subdirectory, and run a script that will create the metadata. As a quick example, here's how to create a humfs version of your host's /bin.

host% mkdir humfs-test
host% cd humfs-test
host# cp -a /bin data
host# perl ../humfsify.pl jdike jdike 100M
host% ls -al
total 24
drwxrw-rw-   5 jdike jdike 4096 Jun 10 19:40 .
drwxrw-r--  16 jdike jdike 4096 Jun 10 19:40 ..
drwxrwxrwx   2 jdike jdike 4096 May 23 12:12 data
drwxr-xr-x   2 jdike jdike 4096 Jun 10 19:40 dir_metadata
drwxr-xr-x   2 jdike jdike 4096 Jun 10 19:40 file_metadata
-rw-r--r--   1 jdike jdike   58 Jun 10 19:40 superblock


Two of the commands, the creation of the data subdirectory and the running of humfsify, have to be run as root. The copying of the directory needs to preserve file ownerships so that humfsify can record them in the metadata, and humfsify needs to change those ownerships so that you own all the files.

We now have two metadata directories, one for files and one for directories, and a superblock file. This file contains information about the filesystem as a whole, rather like the superblock on a disk-based filesystem:

host% cat superblock
version 2
metadata shadow_fs
used 6877184
total 104857600


This tells the UML filesystem:

  • What version of humfs it is dealing with

  • What metadata format is being used

  • How much disk space is used

  • How much total disk space is available

The shadow_fs metadata format describes the parallel metadata directories. There are some other possibilities, which will be described later in this section. The total disk space amount is simply the number given to humfsify. This number is used by the filesystem within UML to enforce the limit on disk consumption. Quotas on the host can be used, but they are not necessary.
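Inside the UML, this limit is what df reports for the mount. Assuming the mount made later in this section, you would see something like this (the numbers are illustrative):

UML# df -h /mnt-test
Filesystem            Size  Used Avail Use% Mounted on
none                  100M  6.6M   94M   7% /mnt-test

The 100M comes directly from the total line in the superblock file, and the used figure is maintained by the filesystem as files come and go.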

You may have noticed that it would be particularly easy to change the amount of disk space in this filesystem. Simply changing the total field by editing this file would seem to do the trick, and it does. At this writing, this ability is not implemented, but it is simple enough and easy enough to do that it will be implemented at some point.

Now, having created the humfs directory, we can mount it within the UML:

UML# mkdir /mnt-test
UML# mount none /mnt-test -t humfs -o \
    path=/home/jdike/linux/humfs-test
UML# cd /mnt-test


If you do an ls at this point, you'll see your copy of the host's /bin. Note that the mount command is very similar to the hostfs mount command. It's a virtual filesystem, so we're telling it to mount none since there is no block device associated with it, and we specify the filesystem type and the host directory. In the case of humfs, specifying the host directory is mandatory because it must be prepared ahead of time. humfs is passed the root of the humfs tree, which is the directory in which the data and metadata directories were created.

You can now do all the things that didn't work under hostfs and see that they do work here. humfs works as expected in all cases, with no interference from the host's permission checking. So, humfs is usable as a UML root filesystem, whereas hostfs can be used only with some trickery.

Now I'll cover some aspects of humfs that I didn't explain earlier. First, version 2 of humfs was created because version 1 had a bug, and fixing that bug led to the separate file_metadata and dir_metadata directories. As we've seen, the metadata files for files are straightforward. Directories have ownerships and permissions and need metadata files too, but they introduce problems in some corner cases.

The initial shadow_fs design required a file called metadata in each directory in the metadata tree that would hold the ownerships and permissions for the parent directory. Of course, each file in the original directory would have a file in the metadata tree with the same name. But I missed this case: What metadata file should be used for a file called metadata? Both the file and the parent directory would want to use the same metadata file, metadata.

Another problem occurs with a subdirectory called metadata. In this case, metadata will want to be both a directory (because the metadata directory structure is identical to the data directory structure) and a file (because the parent directory will want to put its metadata there).

The solution I chose was to separate the file and directory metadata from each other. With them in separate directory trees, the first collision I described doesn't exist. However, the second still does. The solution to that is to allow the metadata directory to be created and to rename the parent directory's metadata file. It turns out that it can be renamed to anything that doesn't collide with a subdirectory, because in the dir_metadata tree there will be only one normal file in each directory. If metadata is a directory, the humfs filesystem scans the directory for a normal file, and that is the metadata file for the parent directory.
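A layout sketch may make the two collisions concrete (all names here are hypothetical). If data/dir contains an ordinary file named metadata, version 2 stores that file's attributes at file_metadata/dir/metadata, while dir's own attributes live over in dir_metadata/dir, so the version 1 collision is gone. If data/dir instead contains a subdirectory named metadata, then:

dir_metadata/dir/metadata/       must be a directory, mirroring the data tree
dir_metadata/dir/<other-name>    dir's own metadata file, renamed out of the way

Because each directory in the dir_metadata tree contains exactly one normal file, the filesystem recovers the renamed file by scanning for the only regular file there.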

The next question is this: Why do we specify the metadata format in the superblock file? When I first introduced humfs, with the version 1 shadow_fs format, there were a bunch of suggestions for alternate formats. They generally have advantages and disadvantages compared to the shadow_fs format, and I thought it would be interesting to support some of them and let system administrators choose among them.

These proposals came in two classes: those that preserved some sort of shadow metadata directory hierarchy, and those that put the metadata in some sort of database. An interesting example of the first class was to make all of the metadata files symbolic links, rather than normal files, and store the metadata in the link target. This would make them dangling links, as the targets would not exist, but it would allow somewhat more efficient reading of the metadata.

Reading a file requires three system calls: an open, a read, and a close. Reading the target of a symbolic link requires one: a readlink. Against this slight performance gain, there would be some loss of manageability, as system administrators and their tools expect to read the contents of files, not the targets of symbolic links.
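You can see the three calls with strace on the host; the output below is heavily trimmed and illustrative, and cat will also issue a final zero-length read before closing:

host% strace -e trace=open,read,close cat file_metadata/usr/bin/ls
open("file_metadata/usr/bin/ls", O_RDONLY) = 3
read(3, "493 0 0\n", 4096)               = 8
close(3)                                 = 0

Under the proposed symbolic-link format, a single readlink() would return the same string as the link's target, with no file descriptor to open or close.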

The second class of proposals, storing metadata in databases of various sorts, is also interesting. Depending on the database, it could allow for more efficient retrieval of metadata, which is nice. However, what makes it more interesting to me is that the database could be used on the host to do queries much more quickly than with a normal filesystem. The host administrator could ask questions about what files had been modified recently or what files are setuid root and could get answers very quickly, without having to search the entire filesystem.

Even more interesting would be the ability to import this capability into the UML, where the UML administrator, who probably cares about the answers more than the host administrator does, could ask these questions. I'm planning to allow this through yet another filesystem, which would make a database look like a filesystem. The UML admin would mount this filesystem inside the UML and query the database underneath it like this:

UML# cat /sqlfs/"select name from root_fs where setuid = 1"
/usr/bin/newgrp
/usr/bin/traceroute6
/usr/bin/chfn
/usr/bin/chsh
/usr/bin/gpasswd
/usr/bin/passwd


The "file" associated with a query would contain the results of that query. In the example above, we searched the database for all setuid files, and the results came back as the contents of a file.

With humfs, only the file metadata would be indexed in the database. It is possible to do the same thing with the contents of files. This would take a different framework from the one that enables humfs, but it is still not difficult. It would be possible to load a UML filesystem into a database, be it SQL, Glimpse, or Google, and have that database imported into UML as a bootable filesystem. Queries to the database would be provided by a separate filesystem, as described earlier. In this way, UML users would have access to their files through any database the host administrator is willing to provide.

An alternate use of this is to load some portion of your data, such as your mail, into such a database-backed filesystem. These directories and files will remain accessible in the normal way, but the database interface to them will allow you to search the file contents more quickly than is possible with utilities such as find and grep. For example, loading your mail directory into a filesystem indexed by something like Glimpse would give you a very fast way to search your mail. It would still be a normal Linux filesystem, so mail clients and the like would still work on it, and the index would be kept up to date constantly since the filesystem sees all changes and feeds them into the index. This means that you could search for something soon after it is created (and find it) rather than waiting for the next indexing run, which would probably be in the wee hours, making the change visible in the index the following day.


