The UML Networking Transports

Now that we've had an in-depth look at using TUN/TAP devices on the host to get a UML instance on the network, it's time to look at the other mechanisms that can be used. There are a total of six, probably two of which are by far the most commonly used. However, there are situations in which you would choose to use one of the other four, albeit very rare situations for some of them.

In order to classify them, we can first divide them between transports that can be used to connect a UML to the host and those that can be used only to connect UML instances to each other. In the first group are TUN/TAP, Ethertap, SLIP, and Slirp. In the second are the virtual switch and multicast. Blurring this distinction somewhat is that uml_switch has an option to attach itself to a host TUN/TAP device, thereby providing access to the host. The final transport, pcap, is fundamentally different from the others and doesn't really belong in either group. It does connect to the host, but it can only receive packets, not transmit them. pcap allows you to use a UML instance as a preconfigured packet sniffer.

Access to the Host Network

TUN/TAP and Ethertap

Among the transports that can provide access to the host network, TUN/TAP is very much the preferred option. Ethertap is an older interface that does the same thing, only worse. Ethertap was the standard for this on Linux version 2.2, and early in 2.4. At that point, TUN/TAP entered the Linux kernel in its current form. It supplanted Ethertap because it lacked various problems that made Ethertap hard to work with.

These problems are pretty well hidden by the uml_net helper and the UML Ethertap driver, but they do affect performance and possibly security. These effects are caused by the fact that there needs to be a root helper to create the Ethertap device and to handle every packet going through the device. It's impossible for the helper to open a file descriptor to the Ethertap interface and pass it to UML, as is the case with TUN/TAP. So, UML sends and receives packets over a pipe to the helper, which communicates with the interface. This extra step hurts latency and throughput compared to TUN/TAP. Having a root helper running continuously may also be a security issue, as it would be a continuous target for any attacks.

The one advantage that Ethertap has over TUN/TAP is that it's available on Linux kernels that predate early version 2.4. So, if you have a host running such a kernel, and it can't be updated, you have to use Ethertap for your UML networking.

SLIP

The SLIP transport exists because it was the first networking mechanism for UML. Ethertap was available on the first host on which I developed UML, but SLIP was the first mechanism I learned about. There is essentially no reason to use it now. The only one I can think of is that maybe some UML hosts don't have either TUN/TAP or Ethertap available, and this can't be changed. Then SLIP would be the mechanism of choice, even though it's a poor choice.

The following issues are among its disadvantages.

It can carry only IP traffic. Important non-IP protocols such as DHCP and ARP, and other lesser-known protocols from the likes of Apple and Novell, can't be carried over it.
The encapsulation required by the SLIP protocol is a performance drag.
It can't carry Ethernet frames, so it can't talk directly to an Ethernet network. All packets must be routed through the host, which will convert them into Ethernet frames.

Slirp

Slirp is interesting but little used. The Slirp networking emulator provides network access without needing any root privileges or help whatsoever. It is unique in this regard, as all of the other transports require some sort of root assistance.

However, it has a number of disadvantages.

It is slow. Slirp contains a network stack of its own that is used to parse the packets coming from the UML network stack. Slirp opens a normal socket connection to the target and sends the packet payload to it. When receiving packets, the process is reversed. The data coming from the remote side is assembled into a network packet that is immediately disassembled by the UML network stack.
It can't receive connections on well-known ports. Since it receives connections by attaching to host ports as an unprivileged process, it can attach only to ports numbered 1024 and above. Since it doesn't act as a full network node, it can't have its own ports that the host can route packets to.
The disadvantages of SLIP also apply, since Slirp provides an emulated SLIP connection.

Nevertheless, in some situations, Slirp is the only mechanism for providing a UML instance access to the outside network. I've seen cases where people are running UML instances on hosts on which they have no privileges. In one case, the "host" was a vserver instance on which the user had "root" privileges, but the vserver was so limited that Slirp was the only way to get the UML instance on the network. Cases like these are rare, but when they do happen, Slirp is invaluable, despite its limitations.

Isolated Networks

There are two purely virtual transports, which can connect a UML only to other UML instances: uml_switch and multicast.

uml_switch

uml_switch is a process that implements a virtual switch. UML instances connect to it and communicate with it over a UNIX domain socket on the host. It can act as a switch, which is its normal operation, or as a hub, which is sometimes useful when you want to sniff the traffic between two UML instances from a third. It also has the ability to connect to a preconfigured TUN/TAP device, allowing the UML instances attached to it to communicate with the host and outside network.

Multicast

Multicast is the second purely virtual network transport for UML. As its name suggests, it uses a multicast network on the host in order to transmit Ethernet frames from one UML instance to another. The UML instances all join the same multicast network, so that a packet sent from any instance is seen by all of the others. This is somewhat less efficient than the virtual switch because it behaves like a hub: all packets are received by all nodes attached to it. So, each UML instance will have to process and drop any packets that aren't intended for it, unnecessarily consuming host CPU time.

pcap

The last transport is unlike the others, in that it doesn't provide two-way network traffic. A UML interface based on pcap is read-only: it receives packets but doesn't transmit them. This allows UML to act as a preconfigured network sniffer. A variety of network sniffing and traffic analysis tools are available, and they can be complicated to configure. This transport makes it possible to install a set of network analysis tools in a UML root filesystem, configure them, and distribute the filesystem.

Users can then boot UML on this filesystem and specify the pcap interface on the command line or with uml_mconsole. The traffic analysis will then work, with no further configuration needed.

As the name suggests, this transport is based on the pcap library, which underlies tcpdump and other tools. Use of this may require some familiarity with libpcap or tcpdump, especially if you want to filter packets before the tools inside UML see them. In this case, you will need to provide a filter expression to select the desired packets. Anyone who has done anything similar with tcpdump will know how to write an appropriate expression. For those who have not used tcpdump, the man page contains a good reference to the expression language.

How to Choose the Right Transport

Now that we've seen all of the UML network transports, we can make decisions about when to use each one. The advantages and disadvantages discussed earlier should make this pretty clear, but it's useful to summarize them.

If you need to give the UML instances access to the outside network, TUN/TAP is preferred. This has been standard in Linux kernels since early version 2.4, so virtually all Linux machines that might host UML instances should be sufficiently new to have TUN/TAP support. If you have one that is not, upgrading would probably be a better idea than falling back to Ethertap.

Once you've decided to use TUN/TAP, the next decision is whether to give each UML its own TUN/TAP device or to connect them with uml_switch and have it forward packets to the host through its own TUN/TAP interface. Using the switch instead of individual TUN/TAP devices has a number of trade-offs.

The switch is a single point of control, meaning that bandwidth tracking and management as well as filtering can be done at a single interface, but it is also a single point of failure.
The switch is more efficient than individual TUN/TAP devices for traffic between the UML instances because the packets experience only Ethernet routing by the switch rather than IP routing by the host. However, for external traffic, there's one more process handling the packets, so that will introduce more latency.
The switch may be less of a security worry. If you are concerned about making /dev/net/tun world accessible (or even group accessible by a uml-users group), you may be happier having it owned by a user whose only purpose is to run uml_switch. In this way, faked packets can be injected into the host only by an attacker who has managed to penetrate that one account.

Against this, there is the UNIX socket that uml_switch uses to set up connections with UML instances. This needs to be writable by any users who are allowed to connect UML instances to the switch. A rogue process could possibly connect to it and inject packets to the switch, for forwarding to the UML instances or the outside network.

This would seem to be a wash, where we are replacing a security concern about /dev/net/tun with the same concern about the UNIX socket used by the switch. However, access to /dev/net/tun allows the creation of new interfaces, which aren't subject to whatever filtering is applied to "authorized" TUN/TAP interfaces. Any packets injected through the UNIX socket that go to the outside network will need to pass through the filters on the TUN/TAP interface used by the switch. On balance, I would have to call this a slight security gain.

SLIP and Slirp are useful only in very limited circumstances. Again, I would recommend fixing the host so that TUN/TAP can be used before using either SLIP or Slirp. If you must get a UML with network access, and you have absolutely no way to get root assistance, you may need to use Slirp.

For an isolated network, the choice is between uml_switch and multicast. Multicast is trivial to set up, as we will see in the next section. However, the switch isn't that difficult either. If you want a quick-and-dirty isolated network, multicast is likely the better choice. However, multicast is less efficient because of the hub behavior I mentioned earlier.

Configuring the Transports

We need to take care of one loose end. The usage of the transports varies somewhat because of their differing configuration needs. In most cases, these differences are confined to the configuration string provided to UML on the command line or to uml_mconsole. In the case of uml_switch, we also need to look at the invocation of the switch.

Despite the differences, there are some commonalities. The parameters to the device are separated by commas. Many parameters are optional; to exclude one, just specify it as an empty string. Trailing commas can be omitted. For example, a TUN/TAP interface that the uml_net helper will set up can look like this:

eth0=tuntap,,fe:fd:1:2:3:4,192.168.0.1

Leaving out the Ethernet MAC would make it look like this:

eth0=tuntap,,,192.168.0.1

Omitted parameters will be provided with default values. In the case above, the omitted MAC will be initialized as described below. The omitted TUN/TAP interface name will be determined by the uml_net helper when it configures the interface.
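The comma rules above are easy to get wrong by hand, so here is a minimal sketch of them as a shell function. The name uml_dev_config is hypothetical and not part of the UML tools; it only assembles a string under the rules just described (empty arguments stand for omitted parameters, and trailing commas are dropped).

```shell
#!/bin/sh
# uml_dev_config is a hypothetical helper, not a UML utility.
# Usage: uml_dev_config DEVICE PARAM...
# An empty PARAM ("") stands for an omitted parameter.
uml_dev_config() {
    dev=$1
    shift
    spec=
    for param in "$@"; do
        spec="$spec,$param"
    done
    spec=${spec#,}                      # drop the leading comma added by the loop
    while [ "${spec%,}" != "$spec" ]; do
        spec=${spec%,}                  # trailing commas can be omitted
    done
    printf '%s=%s\n' "$dev" "$spec"
}

uml_dev_config eth0 tuntap "" fe:fd:1:2:3:4 192.168.0.1
uml_dev_config eth0 tuntap "" "" 192.168.0.1
uml_dev_config eth0 tuntap my-uml-tap ""
```

The three calls print a string with an explicit MAC, one with the MAC omitted, and one in which the trailing empty parameter has been trimmed away.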

The transports that create an Ethernet device inside UML can take an Ethernet MAC in the device specification. If one is not specified, the device will be assigned a MAC when it is first assigned an IP address. The MAC will be derived from the IP: the first two bytes are 0xfe and 0xfd, and the last four are the IP address. This makes the MAC as unique as the IP address. Normally, the MAC can be left out. However, when you want the UML instance to be able to use DHCP, you must specify a MAC because the device will not operate without one and it must have a MAC in order for the DHCP server to provide an IP address. When it is acceptable for the UML interface to not work until it is assigned an IP address, you can let the driver assign the MAC.

However, if the interface is already up before it is assigned an IP address, the driver cannot change the MAC address on its own. Some distributions enable interfaces like this. In this case, the MAC will end up as fe:fd:00:00:00:00. If you are running several UML instances, it is likely that these MACs will conflict, causing mysterious network failures. The easiest way to fix this problem is to provide the MAC on the command line. You can also take the interface down and bring it back up by hand. When you bring it back up, you should specify the IP address on the ifconfig command line. This will ensure that the driver knows the IP address when the interface is enabled, so it can be assigned a reasonable MAC.
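To make the derivation concrete, here is a small sketch that computes the MAC the driver would be expected to assign for a given IP address. It takes the fe:fd prefix used by the explicit MACs in this chapter's TUN/TAP examples as an assumption, and uml_default_mac is a hypothetical name, not something the UML tools provide.

```shell
#!/bin/sh
# uml_default_mac is a hypothetical helper. It assumes the default MAC
# is the prefix fe:fd followed by the four bytes of the IP address.
uml_default_mac() {
    old_ifs=$IFS
    IFS=.
    set -- $1                 # split the dotted quad into $1..$4
    IFS=$old_ifs
    printf 'fe:fd:%02x:%02x:%02x:%02x\n' "$1" "$2" "$3" "$4"
}

uml_default_mac 192.168.0.1
uml_default_mac 0.0.0.0       # the conflicting MAC of an interface with no IP yet
```

The second call shows why several instances brought up without an IP address collide: they all end up with the same all-zero suffix.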

Whenever there is a network interface on the host that the transport communicates through, such as a TUN/TAP or Ethertap device, the IP address of that interface, the host-side IP, can be included. As we saw earlier in the chapter, when an IP address is specified in the device configuration, the driver will run the uml_net helper in order to set up the device on the host. When it is omitted, a preconfigured host device should be included in the configuration string.

As we have already seen, the configuration syntax for a device is identical whether it is being configured on the UML command line or being hot-plugged with an MConsole client.

TUN/TAP

The TUN/TAP configuration string comes in two forms, depending on whether you are assigning the UML interface a preconfigured host interface or whether you want the uml_net helper to create and configure the host interface.

In the first case, you specify

tuntap
The host interface name
Optionally, the MAC of the UML interface

For example:

eth0=tuntap,my-uml-tap,fe:fd:1:2:3:4

eth0=tuntap,my-uml-tap

In the second case, you specify

tuntap
An empty parameter, in place of the host interface name
Optionally, the MAC of the UML interface
The IP address of the host interface to be configured

For example:

eth0=tuntap,,fe:fd:1:2:3:4,192.168.0.1

eth0=tuntap,,,192.168.0.1

The three commas mean that parameters two and three (the host interface name and Ethernet MAC) are empty and will be assigned values by the driver.

Ethertap

The Ethertap configuration string is nearly identical, except that the device type is ethertap and that you must specify a host interface name. When the host interface doesn't exist and you provide an IP address, uml_net will configure that device. This example tells the driver to use a preconfigured Ethertap interface:

eth0=ethertap,tap0

This results in the uml_net helper creating and configuring a new Ethertap interface:

eth0=ethertap,tap0,,192.168.0.1

SLIP

The SLIP configuration is comparatively simple: only the IP address of the host SLIP device needs to be specified. It must be there since uml_net will always run in order to configure the SLIP interface. There is no possibility of specifying a MAC since the UML interface will not be an Ethernet device. This means that DHCP and other Ethernet protocols, such as ARP, can't be used with SLIP.

eth0=slip,192.168.0.1

Slirp

The Slirp configuration requires

slirp
Optionally, the MAC of the UML interface
The command line of the Slirp executable

If you decide to try this, you should probably first configure and run Slirp without UML. Once you can run it by hand, you can put the Slirp command line in the configuration string and it will work as it did before.

Adding the Slirp command line requires that it be transformed somewhat in order to not confuse the driver's parser. First, the command and its arguments should be separated by commas rather than spaces. Second, any spaces embedded in an argument should be changed to underscores. However, in the normal case Slirp takes no arguments, and only the path to the Slirp executable needs to be specified.

If some arguments need to be provided, Slirp will read options from your ~/.slirprc. Putting the requisite information there will simplify the UML command line. It is also possible to pass the name of a wrapper script that will invoke slirp with the correct arguments.
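The transformation of the Slirp command line described above (commas between arguments, underscores in place of embedded spaces) can be sketched as a shell function. The name slirp_config_args, the path /usr/bin/slirp, and the argument in the second call are all placeholders chosen for illustration; substitute whatever works when you run Slirp by hand.

```shell
#!/bin/sh
# slirp_config_args is a hypothetical helper, not a UML utility.
# It joins a command and its arguments with commas and replaces
# spaces inside any single argument with underscores.
slirp_config_args() {
    out=
    for arg in "$@"; do
        arg=$(printf '%s' "$arg" | tr ' ' '_')
        out="$out,$arg"
    done
    printf '%s\n' "${out#,}"
}

# Normal case: only the path to the executable (assumed location).
echo "eth0=slirp,,$(slirp_config_args /usr/bin/slirp)"

# Placeholder argument containing a space, to show the underscore rule.
echo "eth0=slirp,,$(slirp_config_args /usr/bin/slirp "placeholder argument")"
```

The empty second field in both strings is the omitted MAC of the UML interface.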

Multicast

Multicast is the simplest transport to configure, if you want the defaults:

eth0=mcast

The full configuration contains

mcast
Optionally, the MAC of the UML interface
Optionally, the address of the multicast network
Optionally, the port to bind to in order to send and receive multicast packets
Optionally, the time to live (TTL) for transmitted packets

Specifying the MAC is the same with mcast as with all the other transports.

The address determines which multicast group the UML instance will join. You can have multiple, simultaneous, mcast-based virtual networks by assigning the interfaces to different multicast groups. All IP addresses within the range 224.0.0.0 to 239.255.255.255 are multicast addresses. If a value isn't specified, 239.192.168.1 will be used.

The TTL determines how far the packets will propagate.

0: The packet will not leave the host.
1: The packet will not leave the local network and will not cross a router.
Less than 32: The packet will not leave the local organization.
Less than 64: The packet will not leave the region.
Less than 128: The packet will not leave the continent.
All other values: The packet is unrestricted.

Obviously, the terms "local organization," "region," and "continent" are not well defined in terms of networking hardware, even if they are well-defined geographically, which they often aren't. It is up to the router administrators to decide whether or not their equipment is on the border of one of these areas and configure it appropriately. Once configured, the routers will drop any multicast packets that have insufficient TTLs to cross the border.

The default TTL is 1, so the packet can leave the host but not the local Ethernet.
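The TTL thresholds listed above can be written down as a small lookup, which makes the boundaries explicit (31 is still "local organization," 32 is already "region"). This is only a sketch of the table in this section; mcast_ttl_scope is a hypothetical name, and real routers enforce whatever their administrators configured.

```shell
#!/bin/sh
# mcast_ttl_scope is a hypothetical helper that maps a TTL value to
# the scope named in the list above.
mcast_ttl_scope() {
    ttl=$1
    if   [ "$ttl" -eq 0 ];   then echo "host"
    elif [ "$ttl" -eq 1 ];   then echo "local network"
    elif [ "$ttl" -lt 32 ];  then echo "local organization"
    elif [ "$ttl" -lt 64 ];  then echo "region"
    elif [ "$ttl" -lt 128 ]; then echo "continent"
    else                          echo "unrestricted"
    fi
}

mcast_ttl_scope 1     # the default TTL
mcast_ttl_scope 0     # keeps the virtual network on the host
```

A TTL of 0 is the natural choice when all of the UML instances on the multicast network run on the same host.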

The port should be specified if there are multiple UML instances on different multicast networks on the host so that instances on different networks are attached to different ports. The default port is 1102.
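As a sketch of running several independent multicast networks on one host, the function below generates a configuration string that gives each network its own group address and its own port. The numbering scheme (group 239.192.168.N, port 1101+N) is my assumption for illustration, chosen so that network 1 matches the defaults; mcast_net_config is a hypothetical name.

```shell
#!/bin/sh
# mcast_net_config is a hypothetical helper.
# Usage: mcast_net_config DEVICE NETWORK_NUMBER
# The MAC field is left empty so the driver assigns it.
mcast_net_config() {
    dev=$1
    n=$2
    printf '%s=mcast,,239.192.168.%d,%d\n' "$dev" "$n" $((1101 + n))
}

mcast_net_config eth0 1   # same group and port as the defaults
mcast_net_config eth0 2   # a second, separate virtual network
```

Instances configured with the first string will see only each other, and likewise for the second, since both the group and the port differ.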

However, not all hosts provide multicast support. The CONFIG_IP_MULTICAST and CONFIG_IP_MROUTE options (under "IP: Multicast router" in the kernel configuration) must be enabled. Without these, you'd see:

mcast_open: IP_ADD_MEMBERSHIP failed, error = 19
There appears not to be a multicast-capable network interface on the host.
eth0 should be configured in order to use the multicast transport.

uml_switch

The daemon transport differs from all the others in requiring a process to be started before the network will work. The process is uml_switch, which implements a virtual switch, as its name suggests. The simplest invocation is this:

host% uml_switch
uml_switch attached to unix socket '/tmp/uml.ctl'

The corresponding UML device configuration would be:

eth0=daemon

The defaults of both uml_switch and the UML driver are such that they will interoperate with each other. So, if you want a single switch on the host, the configurations above will work.

If you want multiple switches on the host, then all but one of them, and the UML instances that will connect to them, need to be configured differently. The switch and the UML instances communicate with datagrams over UNIX domain sockets. The default socket is /tmp/uml.ctl, as the message from the switch indicates.

A different socket can be specified with:

host% uml_switch -unix /tmp/uml-2.ctl

In order to attach to this switch, the same socket must be provided to the UML network driver:

eth0=daemon,,unix,/tmp/uml-2.ctl

unix specifies the type of socket to use, and the following argument specifies the socket address. At this writing, only UNIX domain sockets are supported, but this is intended to extend to allowing communication over IP sockets as well. In this case, the socket address would consist of an IP address or host name and a port number.

Some distributions (notably Debian) change the default location of the socket used by uml_switch (to /var/run/uml-utilities/uml_switch.ctl in Debian's case). If you use the defaults as described above and there is no connection between the UML instance and the uml_switch process, you need to figure out where the uml_switch socket is and configure the UML interface to use it.
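One way to track down the socket is to test the likely locations on the host and build the device configuration from the first one that exists. The sketch below assumes the two candidate paths are the upstream default and the Debian location as I understand it; find_switch_socket is a hypothetical name, and you should add any other paths your distribution might use.

```shell
#!/bin/sh
# find_switch_socket is a hypothetical helper. It prints a daemon
# configuration string for the first argument that is an existing
# UNIX domain socket, and returns nonzero if none is.
find_switch_socket() {
    for candidate in "$@"; do
        if [ -S "$candidate" ]; then
            printf 'eth0=daemon,,unix,%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

find_switch_socket /tmp/uml.ctl /var/run/uml-utilities/uml_switch.ctl \
    || echo "no uml_switch socket found; is uml_switch running?"
```

The test only shows that a socket exists at that path. The user running UML still needs write permission on it in order to connect.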

As I mentioned earlier, uml_switch normally acts as a switch, so that it remembers what Ethernet MACs it has seen on what ports and transmits packets only to the port that the UML instance with the destination MAC is attached to. This saves the switch from having to forward all packets to all its instances, and it also saves the UML instances from having to receive and parse them and discard all packets not addressed to them.

uml_switch can be configured as a hub by using the -hub switch. In this case, all instances attached to it will see all packets on the network. This is sometimes useful when you want to sniff traffic between two UML instances from a third.

Normally, the switch provides an isolated virtual network, with no access to the host network. There is an option to have it connect to a preconfigured TUN/TAP device, in which case, that device will be another port on the switch, and packets will be forwarded to the host through it as appropriate. The command line would look like this:

uml_switch -tap switch-tap

switch-tap must be a TUN/TAP device that has already been created and configured as described in the TUN/TAP section earlier. Either bridging or routing, IP packet forwarding, and proxy arp should already be configured for this device.

The full UML device configuration contains

daemon
Optionally, the MAC of the UML interface
Optionally, the socket type, which currently must be unix
Optionally, the socket that the switch has attached to

pcap

The oddball transport, pcap, has a configuration unlike any of the others. The configuration comprises

pcap
The host interface to sniff
A filter expression that determines which packets the UML interface will emit
Up to two options from the set promisc, nopromisc, optimize, and nooptimize

The host interface may be the special string any. This will cause all host interfaces to be opened and sniffed.

The filter expression is a pcap filter program that specifies which packets should be selected.

The promisc flag determines whether libpcap will explicitly set the interface as promiscuous. The default is 1, so promisc has no effect, except for documentation. Even if nopromisc is specified, the pcap library may make the interface promiscuous for some other reason, such as being required to sniff the network.

The optimize and nooptimize flags control whether libpcap optimizes the filter expression.

Here is an example of configuring a pcap interface to emit only TCP packets to the UML interface:

eth0=pcap,eth0,tcp

This configures a second interface that would emit only non-TCP packets:

eth1=pcap,eth0,\!tcp