Host Kernel Version


Technically, UML will run on any x86 host kernel from a stable series (Linux kernel versions 2.2, 2.4, or 2.6) since 2.2.15. However, the 2.2 kernel is of historic interest only; if you have such a machine on which you plan to run UML instances, you should upgrade it. The 2.4 and 2.6 kernels make good hosts, but 2.6 is preferred. UML will run on any x86_64 (Opteron/AMD64 or Intel EM64T) host; this is a newer architecture and has had the necessary basic support since the beginning, but UML is stable on x86_64 hosts only with kernel 2.6.12 or later. On S/390, a fairly new 2.6 host kernel is required because of bugs that were found and fixed during the UML port to that architecture.

UML makes use of the AIO and O_DIRECT facilities in the 2.6 kernels for better performance and lower memory consumption. AIO is kernel-level asynchronous I/O, where a number of I/O requests can be issued at once, and the process that issued them can receive notifications asynchronously as they finish. The kernel issues the notifications when the data is available, and the order in which that happens may bear no relation to the order in which the requests were issued.

The alternative, which is necessary on earlier kernels, is either to make normal read and write system calls, which are synchronous and make the process sleep until the operation finishes, or to dedicate a thread (or multiple threads) to I/O operations. Doing I/O synchronously allows only one operation to be pending at any given time. Doing I/O asynchronously by having a separate thread do synchronous I/O at least allows the process to do other work while the operation is pending. On the other hand, only one operation can be pending for each such I/O thread, and the process must context-switch back and forth from these threads and communicate with them as operations are issued and completed. Having one thread for each pending I/O operation is hugely wasteful.

glibc provides AIO support on all kernels, even those without kernel-level AIO support, and it implements this using threads, potentially one thread per outstanding I/O request. UML, on such hosts, emulates AIO in a similar way. It creates a single thread, allowing one I/O request to be pending at a time.

The AIO facility present in the 2.6 kernel series allows processes to do true AIO. UML uses this by having a separate thread handle all I/O requests, but now, this thread can have many operations pending at once. It issues operations to the host and waits for them to finish. As they finish, the thread interrupts the main UML kernel so that it can finish the operations and wake up anything that was waiting for them.

This allows better I/O performance because more parallel I/O is possible, which allows data to be available earlier than if only one I/O request can be pending.

O_DIRECT allows a process to ask that an I/O request be done directly to and from its own address space without being cached in the kernel, as shown in Figure 9.1. At first glance, the lack of caching would seem to hurt performance. If a page of data is read twice with O_DIRECT enabled, it will be read from disk twice, rather than the second request being satisfied from the kernel's page cache. Similarly, write requests will go straight to disk, and the request won't be considered finished until the data is on the disk.

Figure 9.1. O_DIRECT I/O compared to buffered I/O. When a process does a buffered read, the data is first read from disk and stored in the kernel's page cache. Then it is copied into the address space of the process that initiated the read. Buffering it in the page cache provides faster access to the data if it is needed again. However, the data is copied and stored twice. When a process performs an O_DIRECT read, the data is read directly from the disk into the process address space. This eliminates the extra copy operation and the extra memory consumption caused by a buffered read. However, if another process needs the data, it must be read from disk rather than simply copied from the kernel's page cache. The figure also shows a read done by the kernel for its own purposes, to compare it to the O_DIRECT read. In both cases, the data is read directly from disk and stored only once. When the process doing the O_DIRECT read is UML reading data into its own page cache, the two cases are identical.


However, O_DIRECT is intended for specialized applications that implement their own caching and use AIO. For an application like this, using O_DIRECT can improve performance and lower its total memory requirements, including memory allocated on its behalf inside the kernel. UML is such an application, and use of O_DIRECT actually makes it behave more like a native kernel.

A native kernel must wait for the disk when it writes data, and there is no caching level below it (except perhaps for the on-disk cache), so if it reads data, it must again wait for the disk. This is exactly the behavior imposed on a process when it uses O_DIRECT I/O.

The elimination of the caching of data at the host kernel level means that the only copy of the data is inside the UML instance that read it. So, this eliminates one copy of the data, reducing the memory consumption of the host. Eliminating this copy also improves I/O latency, making the data available earlier than if it was read into the host's page cache and then copied (or mapped) into the UML instance's address space.

For these reasons, for x86 hosts, a 2.6 host kernel is preferable to 2.4. As I pointed out earlier, running UML on x86_64 or S/390 hosts requires a 2.6 host because of host bugs that were fixed fairly recently.



User Mode Linux
ISBN: 0131865056
Pages: 116
Authors: Jeff Dike
