Section 1.1. Service Discovery with Zeroconf | Zero Configuration Networking: The Definitive Guide

1.1. Service Discovery with Zeroconf

None of the examples that took advantage of Zeroconf began with someone thinking, "You know what I could really use right now? An IP address." Certainly, it's a rare person who takes the time to say, "Now that I have an IP address, I could use a friendly domain name. I should learn how to set up DNS on my laptop." A typical user of Zeroconf should not be aware of the infrastructure required. She just wants to use a printer, stream music, exchange photos, or use some other service.

The architecture of Zeroconf is built around simplicity. It should be as easy for an end user to connect to a printer or locate streamed music as it is for him to turn on a light bulb. The simplicity extends to implementers as well. A vendor of an inexpensive device who desires to use Zeroconf should not find it hard to implement Zeroconf, even in devices with extremely limited memory capacity.

Zeroconf's Many Names

The seeds of Zeroconf were planted in some postings by Stuart Cheshire on the Net-Thinkers mailing list in 1997. This led to the IETF holding two "Birds of a Feather" (BOF) sessions at the March and July 1999 IETF meetings on the subject of "Networking in the Small" (NITS), co-chaired by Stuart Cheshire and Peter Ford.

Out of the NITS BOF meetings, the Zero Configuration Networking (Zeroconf) Working Group was formed in September 1999.

In May 2002, Apple announced its trademark "Rendezvous" for the Zeroconf technologies, a little like the way Apple uses its trademark "AirPort" for IEEE 802.11 wireless networking.

Unfortunately for Apple, another company also had a networking product by the name of "Rendezvous," and in April 2005, Apple announced the new Apple name for the Zeroconf technologies: "Bonjour." Other third-party products can also carry the Bonjour name and logo. Apple doesn't charge any money to license the name and logo; the products just have to pass Apple's Bonjour Conformance Test to verify that they do in fact implement the specifications properly.

Meanwhile, other open source implementations of the Zeroconf technologies have also been created, including Howl and Avahi.

The terms "Bonjour" and "Zeroconf" are often used interchangeably, but as a general rule, this book uses the term "Zeroconf" when referring to the technology in general and "Bonjour" when referring to it in an Apple-specific context. For example, iChat on Mac OS X doesn't have a "Zeroconf" window; it has a "Bonjour" window (it says "Bonjour" at the top of the window).

1.1.1. Service Discovery

To the end user, the most important facet of Zeroconf is the ability to "easily browse for available services." It is worth taking a moment to appreciate the significance of the concepts encapsulated in that short phrase. Start with those five words as the prime directive for Zeroconf.

1.1.1.1. Browse for services

With Zeroconf, you browse for services, not for hardware. The reason for this is simple but important: if you want to print, there is little benefit to discovering hardware that doesn't do printing. Similarly, there is little benefit to discovering things that are printers but speak only a printing protocol that your client does not support, since you wouldn't be able to use those printers.

Conversely, suppose that there is a device on the network in a legal office that functions, protocol-wise, as a printer, but instead of printing on paper, it archives documents as date-stamped PDF files on recordable CDs. You would want your printing client to discover this service, since it's a service your printing client can use.

Suppose there were an inexpensive USB printer (which doesn't have PostScript or networking) connected to a desktop computer (which does), with software making PostScript printing service available to other machines on the network via IPP (Internet Printing Protocol). You would want your PostScript IPP printing client to discover this service, since it's a service you can use. What is it that your printing client is discovering, in this case? The USB printer? The desktop computer? The software? No. The insight here is to realize that what your printing client is discovering is the aggregate service offered by the computer, the printer, and the software working in concert, and it is that aggregate service that is being advertised as a logical entity on the network in its own right. The USB printer could break and be replaced, and the logical service being offered would remain the same. The desktop computer could break and be replaced, and the logical service being offered would remain the same. Even the software could be upgraded or replaced, while the logical PostScript IPP printing service being offered to network clients would remain unchanged.

The important principle here is that when you're looking for services on the network, the relevant question is not "What are you?" or even "What do you do?" but "Do you speak my language?"
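The "Do you speak my language?" test can be pictured as a simple filter on the protocol each advertised service speaks. The following Python sketch is only an illustration: the Advertisement class, the instance names, and the backing descriptions are hypothetical, and the service type strings "_ipp._tcp" and "_printer._tcp" are assumed here as the DNS-SD types for IPP and LPR printing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Advertisement:
    """One advertised logical service (hypothetical structure)."""
    name: str          # human-readable instance name
    service_type: str  # the protocol the service speaks
    backing: str       # what implements it; the client never looks at this


def usable_services(ads, supported_types):
    """Keep only the services whose protocol the client can speak."""
    return [ad for ad in ads if ad.service_type in supported_types]


# Illustrative data only; none of these services come from the text.
ADS = [
    Advertisement("Legal Archive", "_ipp._tcp", "PDF-to-CD archiver"),
    Advertisement("Front Desk Inkjet", "_ipp._tcp", "USB printer + desktop + software"),
    Advertisement("Old Plotter", "_printer._tcp", "network plotter"),
    Advertisement("Shared Music", "_daap._tcp", "music server"),
]

if __name__ == "__main__":
    # An IPP-only printing client sees both IPP services, whatever backs them.
    for ad in usable_services(ADS, {"_ipp._tcp"}):
        print(ad.name)
```

Note that the filter never inspects the backing field: the archiver and the USB printer setup are equally valid results because both speak the client's protocol.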

1.1.1.2. Available services

The list that the user gets should contain only services that are currently available to them. They should be able to see the list of currently available printers, select one, and use it.

As with all such network protocol designs, there is a trade-off between timeliness of information and network efficiency. Continuously querying the network to find what services are available gives accurate, up-to-date information but can impose an unreasonable burden on the network. Querying the network just once is much more efficient, but the client's information soon gets out of date, necessitating a "refresh" button in the UI, which then puts the burden on the human user to keep clicking the refresh button (which in turn puts a burden on the network).

Zeroconf solves these problems using a variety of techniques. For efficiency, clients query the network infrequently, as little as once per hour. To avoid long delays before new services are discovered, when a service starts up it sends a few multicast announcement packets, so clients become aware of the new service even before performing their next scheduled query. IP Multicast addresses are special destination addresses that cause packets to be delivered to all interested parties on the local network, rather than just to a single machine.

When services go away, they send multicast "goodbye" packets, so they are promptly removed from all clients' UI lists. In the event that a service is unceremoniously disconnected without getting a chance to send its "goodbye" packet, stale data may remain in lists for a while, but even this case is handled by Zeroconf. When a client attempts to contact a stale service that is no longer present, the failure is noted, and the service is promptly removed from the list of available services. This prompt removal occurs not only on the client that directly experienced the failure but also on all the other clients on the same network link, which passively observe the failure and update their own lists too.

Zeroconf uses these and a variety of other techniques to provide timely, accurate information while keeping the network traffic to a minimum.
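The bookkeeping described above (announcements add an entry, goodbyes remove it, and an observed failure removes it on every client on the link) can be modeled in a few lines of Python. This is a toy in-memory model, not the wire protocol: the ServiceList and Link classes, their method names, and the sample address and port are all hypothetical.

```python
class ServiceList:
    """One client's view of the services it believes are available."""

    def __init__(self):
        self._services = {}  # instance name -> (host, port)

    def on_announcement(self, name, host, port):
        # A service started up and announced itself.
        self._services[name] = (host, port)

    def on_goodbye(self, name):
        # A service shut down cleanly and said so.
        self._services.pop(name, None)

    def on_failure_observed(self, name):
        # Some client on the link failed to reach the service.
        self._services.pop(name, None)

    def names(self):
        return sorted(self._services)


class Link:
    """Stand-in for multicast: every event reaches every client."""

    def __init__(self):
        self.clients = []

    def multicast(self, event, *args):
        for client in self.clients:
            getattr(client, event)(*args)


if __name__ == "__main__":
    link, a, b = Link(), ServiceList(), ServiceList()
    link.clients += [a, b]
    link.multicast("on_announcement", "Shared Music", "169.254.10.20", 3689)
    # Client a fails to connect; client b hears about it and updates too.
    link.multicast("on_failure_observed", "Shared Music")
    print(a.names(), b.names())
```

The point of the Link class is the last step: a failure experienced by one client is visible to all of them, so no client keeps showing the stale entry.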

This kind of peer-to-peer, multicast-based protocol is great for small networks because it is very reliable and requires no dedicated service-discovery infrastructure, but no matter how efficient the protocol, there will come a network size where it no longer makes sense. In an organization with thousands of machines, having every single machine multicasting to every other machine all the time would not be reasonable. Beyond a certain size, every service-discovery protocol has to transition from using peer-to-peer multicast to some kind of centralized repository to hold service information. Services and clients communicate with the centralized repository using a wide-area protocol. In Zeroconf, the centralized repository is one that most companies already have, a DNS server, and the wide-area protocol is the standard DNS protocol with two small extensions, Update Leases and Long-Lived Queries. Update Leases allow a DNS server to expire server records if the service that created them crashes, and Long-Lived Queries allow a client to be notified as services come and go, rather than having to keep polling the server to find out what's new.
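The ideas behind the two extensions can be sketched without any DNS machinery. In the Python model below, a record disappears unless its lease is renewed (the Update Lease idea), and subscribers are called back when records come and go (the Long-Lived Query idea). The LeasedRegistry class and its methods are hypothetical and say nothing about the actual message formats; time is passed in explicitly to keep the example deterministic.

```python
class LeasedRegistry:
    """Toy central repository with leases and change notifications."""

    def __init__(self, lease_seconds):
        self.lease_seconds = lease_seconds
        self._expires_at = {}   # service name -> time its lease runs out
        self._subscribers = []  # callbacks taking (event, name)

    def subscribe(self, callback):
        # Long-Lived Query idea: be told about changes instead of polling.
        self._subscribers.append(callback)

    def _notify(self, event, name):
        for callback in self._subscribers:
            callback(event, name)

    def register(self, name, now):
        # Registering again before expiry renews the lease.
        is_new = name not in self._expires_at
        self._expires_at[name] = now + self.lease_seconds
        if is_new:
            self._notify("added", name)

    def expire(self, now):
        # Update Lease idea: drop records whose owner stopped renewing.
        stale = [n for n, t in self._expires_at.items() if t <= now]
        for name in stale:
            del self._expires_at[name]
            self._notify("removed", name)

    def live(self):
        return sorted(self._expires_at)
```

A service that crashes simply stops calling register, so its record is dropped at the next expire call and every subscriber hears about it.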

1.1.1.3. Easy browsing

Zeroconf would never have been so widely adopted if using it required popping open a terminal window and typing in obscure commands. Command-line tools are great for developers and network administrators, but end users will be browsing for services within a context. They are not conscious that they are requesting a list of services that implement a protocol. For example, when running iTunes, users simply see a list called "Shared Music." They don't need to be aware that iTunes is performing a query for Zeroconf service type _daap._tcp to find the list of local servers offering the Digital Audio Access Protocol (DAAP) service.
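To make the hidden query concrete, here is a hedged Python sketch that builds the bytes of a DNS question asking for PTR records of a service type. It uses the standard DNS message layout; the "local" domain and the multicast group and port constants are assumptions taken from the Multicast DNS specification rather than from this section, and the function names are hypothetical. The sketch only constructs the packet and does not send it.

```python
import struct

# Assumed mDNS destination (from the Multicast DNS specification).
MDNS_GROUP = "224.0.0.251"
MDNS_PORT = 5353

TYPE_PTR = 12  # DNS record type PTR
CLASS_IN = 1   # DNS class IN


def encode_name(name):
    """Encode a dotted name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("utf-8")
        if not 0 < len(raw) <= 63:
            raise ValueError("each label must be 1 to 63 bytes")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_ptr_query(service_type, domain="local"):
    """Build a one-question DNS query for PTR records of a service type."""
    # Header fields: id, flags, question count, answer, authority, additional.
    header = struct.pack("!6H", 0, 0, 1, 0, 0, 0)
    question = encode_name(f"{service_type}.{domain}")
    question += struct.pack("!2H", TYPE_PTR, CLASS_IN)
    return header + question


if __name__ == "__main__":
    packet = build_ptr_query("_daap._tcp")
    print(len(packet), packet.hex())
```

An application such as iTunes would hand this job to the platform's Zeroconf library rather than build packets itself; the sketch is only meant to show how little is being asked on the user's behalf.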

Another thing you'll notice is that the names of shared music sources displayed in iTunes don't need to look like "thing.company.com," all lowercase with no spaces or other punctuation. In the example at the beginning of this chapter, the printer was named "Third Floor Meeting Rooms," not "f3mr.company.com." In command-line user interfaces, you want names to be short and quick to type. In graphical user interfaces, you don't need to type names because you just select them from a list of choices, so they can be long and descriptive and can contain rich punctuation, accented letters, and non-roman characters, such as Kanji.
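One way to see why such names are possible: in DNS-based service discovery, the descriptive instance name travels as a single length-prefixed label of UTF-8 text, so spaces and non-roman characters are just bytes. The Python sketch below assumes that encoding and the 63-byte DNS label limit; the function names, the "_ipp._tcp" service type, and the "local" domain are illustrative assumptions.

```python
def encode_label(text):
    """Encode one DNS label as a length byte followed by UTF-8 bytes."""
    raw = text.encode("utf-8")
    if not 1 <= len(raw) <= 63:
        raise ValueError("a label must be 1 to 63 bytes of UTF-8")
    return bytes([len(raw)]) + raw


def encode_service_instance(instance, service_type, domain="local"):
    """Encode <instance>.<service type>.<domain> for use in a DNS message.

    The instance name is kept as ONE label, so it may contain spaces,
    punctuation (even dots), accented letters, or Kanji.
    """
    labels = [instance] + service_type.split(".") + domain.split(".")
    return b"".join(encode_label(label) for label in labels) + b"\x00"


if __name__ == "__main__":
    wire = encode_service_instance("Third Floor Meeting Rooms", "_ipp._tcp")
    print(wire)
    print(encode_label("会議室"))
```

The limit is counted in bytes, not characters, so a name written in Kanji reaches the limit with fewer characters than one written in ASCII.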

1.1.2. Names and Addresses

Although service discovery is the most visible element of Zeroconf, Zeroconf is more than just that. Zeroconf is a three-layer foundation for IP networking, with service discovery sitting atop the two lower layers, addressing and naming.

1.1.2.1. Claiming an IP address

The first requirement for IP networking is an IP address. There are existing mechanisms for IPv4 address allocation, such as using manual configuration or a DHCP server, but when neither of these is available, Zeroconf-capable devices will use a self-assigned IPv4 link-local address instead. In brief, the mechanism behind self-assigned addresses is that the device selects an address at random within a prescribed range, sends some ARP requests, and then, if no answers are received, proceeds to use that address. Self-assigned IPv4 link-local addresses are discussed in detail in Chapter 2. IPv6 also has self-assigned link-local addresses, though sadly, at the present time, even though Mac OS X, Windows, and Linux all support IPv6, most of the low-cost peripherals that they talk to, such as printers and cameras, don't yet support IPv6.
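The pick, probe, and claim loop can be sketched as follows. This Python example assumes the range 169.254.1.0 through 169.254.254.255 given in the IPv4 link-local specification (RFC 3927), and it replaces real ARP probing with a caller-supplied address_in_use function, which is a hypothetical stand-in. Timing, repeated probes, and later conflict defense are all left out.

```python
import ipaddress
import random

# Assumed selectable range for self-assigned addresses (RFC 3927).
FIRST = int(ipaddress.IPv4Address("169.254.1.0"))
LAST = int(ipaddress.IPv4Address("169.254.254.255"))


def pick_candidate(rng):
    """Pick a random address within the link-local range."""
    return ipaddress.IPv4Address(rng.randint(FIRST, LAST))


def claim_address(address_in_use, rng=None, max_attempts=10):
    """Try random candidates until one appears to be unused.

    address_in_use(candidate) stands in for sending ARP requests and
    waiting for an answer; it returns True if someone answered.
    """
    if rng is None:
        rng = random.Random()
    for _ in range(max_attempts):
        candidate = pick_candidate(rng)
        if not address_in_use(candidate):
            return candidate  # nobody answered, so the address is ours
    raise RuntimeError("could not claim a link-local address")


if __name__ == "__main__":
    taken = {ipaddress.IPv4Address("169.254.1.1")}
    print(claim_address(lambda addr: addr in taken))
```

Because the range holds tens of thousands of addresses, a collision on a small network is unlikely, which is why a handful of attempts is normally more than enough.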

1.1.2.2. Claiming a name

The second requirement is that the typical usage model for IP networking expects hosts to have names, not just numerical addresses. Having to remember and type numerical addresses is cumbersome at best, and when the addresses are being picked randomly, it may not even be possible. We need a way to associate a stable name with each device, in order to determine what address it has picked for itself, at this instant. The Internet's existing mechanism for associating names with addresses is a DNS server, but when no DNS server is available, Zeroconf-capable devices will use Multicast DNS (mDNS) to achieve substantially the same effect on the local link, without having to set up and maintain a dedicated DNS server. In brief, the mechanism behind mDNS names is very similar to self-assigned addresses: the device sends a few mDNS queries for its desired name, and if no answers are received, the device can then use that name. Multicast DNS naming is discussed in detail in Chapter 3.
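Name claiming follows the same probe-then-use pattern as address claiming. In the Python sketch below, name_in_use is a hypothetical stand-in for sending mDNS queries and listening for answers, and the fallback of appending a numeric suffix on conflict is an assumption made for illustration, not something this section specifies.

```python
def claim_name(desired, name_in_use, max_attempts=100):
    """Return the first candidate name that nobody else answers for.

    name_in_use(candidate) stands in for sending a few mDNS queries
    and waiting for an answer; it returns True if someone answered.
    """
    candidate = desired
    for suffix in range(2, max_attempts + 2):
        if not name_in_use(candidate):
            return candidate  # no answers received, so the name is ours
        # Assumed fallback: try "name-2", "name-3", and so on.
        candidate = f"{desired}-{suffix}"
    raise RuntimeError("could not claim a name")


if __name__ == "__main__":
    already_claimed = {"laptop", "laptop-2"}
    print(claim_name("laptop", already_claimed.__contains__))
```

Unlike a random address, the name stays stable across restarts as long as no conflict appears, which is what lets other devices find the address the host has picked at any given moment.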