1.1. Service Discovery with Zeroconf

None of the examples that took advantage of Zeroconf began with someone thinking, "You know what I could really use right now? An IP address." Certainly, it's a
rare person who takes the time to say, "Now that I have an IP address, I could use a friendly domain name. I should learn how to set up DNS on my laptop." A typical user of Zeroconf should not be aware of the infrastructure required. She just wants to use a printer, stream music, exchange photos, or use some other service.

The architecture of Zeroconf is built around simplicity. It should be as easy for an end user to connect to a printer or locate streamed music as it is for him to turn on a light bulb. The simplicity extends to implementers as well. A vendor of an inexpensive device who desires to use Zeroconf should not find it hard to implement Zeroconf, even in devices with extremely limited memory capacity.

1.1.1. Service Discovery

To the end user, the most important facet of Zeroconf is the ability to "easily browse for available services." It is worth taking a moment to appreciate the significance of the concepts encapsulated in that short phrase. Start with these five highlighted words as the prime directive for Zeroconf.

1.1.1.1. Browse for services

With Zeroconf, you browse for services, not for hardware. The reason for this is simple but important: if you want to print, there is little benefit to discovering hardware that doesn't do printing. Similarly, there is little benefit to discovering things that are printers but speak only a printing protocol that your client does not support, since you wouldn't be able to use those printers.

Conversely, suppose that there is a device on the network in a legal office that functions, protocol-wise, as a printer, but instead of printing on paper, it archives documents as date-stamped PDF files on recordable CDs. You would want your printing client to discover this service, since it's a service your printing client can use.
Suppose there were an inexpensive USB printer (which doesn't have PostScript or networking) connected to a desktop computer (which does), with software making PostScript printing service available to other machines on the network via IPP (Internet Printing Protocol). You would want your PostScript IPP printing client to discover this service, since it's a service you can use. What is it that your printing client is discovering, in this case? The USB printer? The desktop computer? The software? No. The insight here is to realize that what your printing client is discovering is the aggregate service offered by the computer, the printer, and the software working in concert, and it is that aggregate service that is being advertised as a logical entity on the network in its own right.

The USB printer could break and be replaced, and the logical service being offered would remain the same. The desktop computer could break and be replaced, and the logical service being offered would remain the same. Even the software could be upgraded or replaced, while the logical PostScript IPP printing service being offered to network clients would remain unchanged.

The important principle here is that when you're looking for services on the network, the relevant question is not "What are you?" or even "What do you do?" but "Do you speak my language?"

1.1.1.2. Available services

The list that users get should contain only services that are currently available to them. They should be able to see the list of currently available printers, select one, and use it.

As with all such network protocol designs, there is a trade-off between timeliness of information and network efficiency. Continuously querying the network to find what services are available gives accurate, up-to-date information but can impose an unreasonable burden on the network.
Querying the network just once is much more efficient, but the client's information soon gets out of date, necessitating a "refresh" button in the UI, which then puts the burden on the human user to keep clicking the refresh button (which puts a burden on the network).

Zeroconf solves these problems using a variety of techniques. For efficiency, clients query the network infrequently, as little as once per hour. To avoid long delays before new services are discovered, when a service starts up it sends a few multicast announcement packets, so clients become aware of the new service even before performing their next scheduled query. IP Multicast addresses are special destination addresses that cause packets to be delivered to all interested parties on the local network, rather than just to a single machine. When services go away, they send multicast "goodbye" packets, so they are promptly removed from all clients' UI lists.

In the event that a service is unceremoniously disconnected without getting a chance to send its "goodbye" packet, stale data may remain in lists for a while, but even this case is handled by Zeroconf. When a client attempts to contact a stale service that is no longer present, the failure is noted, and the service is promptly removed from the list of available services. This prompt removal occurs not only on the client that directly experienced the failure but also on all the other clients on the same network link, which passively observe the failure and update their own lists too. Zeroconf uses these and a variety of other techniques to provide timely, accurate information while keeping the network traffic to a minimum.

This kind of peer-to-peer, multicast-based protocol is great for small networks because it is very reliable and requires no dedicated service-discovery infrastructure, but no matter how efficient the protocol, there will come a network size where it no longer makes sense.
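
The list maintenance described above (announcements, goodbyes, and failure-driven removal) can be pictured with a small toy model. This is only a sketch under loud assumptions: the Client and Link classes, the event names, and the "Lobby Printer" service are hypothetical, and a failed contact is modeled as a single event that every client on the link sees, which hides the actual packet exchange.

```python
class Client:
    """Toy model of one browsing client's list of available services."""

    def __init__(self) -> None:
        self.services: set[str] = set()

    def handle(self, event: str, name: str) -> None:
        if event == "announce":
            # A service that just started has multicast its presence.
            self.services.add(name)
        elif event in ("goodbye", "failure"):
            # Either the service said goodbye, or some client on the link
            # failed to reach it; in both cases it leaves the list.
            self.services.discard(name)


class Link:
    """Toy model of one network link: every event reaches every client."""

    def __init__(self) -> None:
        self.clients: list[Client] = []

    def join(self) -> Client:
        client = Client()
        self.clients.append(client)
        return client

    def multicast(self, event: str, name: str) -> None:
        for client in self.clients:
            client.handle(event, name)


if __name__ == "__main__":
    link = Link()
    first, second = link.join(), link.join()

    link.multicast("announce", "Third Floor Meeting Rooms")
    link.multicast("announce", "Lobby Printer")  # hypothetical service name
    link.multicast("goodbye", "Lobby Printer")   # clean shutdown

    # The remaining printer is unplugged without a goodbye. One client tries
    # to use it and fails; every client on the link drops the stale entry.
    link.multicast("failure", "Third Floor Meeting Rooms")
    print(first.services, second.services)
```

Note that no client in this model ever polls: the lists change only in response to events that arrive on the link, which is the property that keeps the traffic low.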
In an organization with thousands of machines, having every single machine multicasting to every other machine all the time would not be reasonable. Beyond a certain size, every service-discovery protocol has to transition from using peer-to-peer multicast to some kind of centralized repository to hold service information. Services and clients communicate with the centralized repository using a wide-area protocol. In Zeroconf, the centralized repository is one that most companies already have (a DNS server), and the wide-area protocol is the standard DNS protocol with two small extensions, Update Leases and Long-Lived Queries. Update Leases allow a DNS server to expire server records if the service that created them crashes, and Long-Lived Queries allow a client to be notified as services come and go, rather than having to keep polling the server to find out what's new.

1.1.1.3. Easy browsing

Zeroconf would never have been so widely adopted if using it required popping open a terminal window and typing in obscure commands. Command-line tools are great for developers and network administrators, but end users will be browsing for services within a context. They are not conscious that they are requesting a list of services that implement a protocol. For example, when running iTunes, users simply see a list called "Shared Music." They don't need to be aware that iTunes is performing a query for Zeroconf service type _daap._tcp to find the list of local servers offering the Digital Audio Access Protocol (DAAP) service.

Another thing you'll notice is that the names of shared music sources displayed in iTunes don't need to look like "thing.company.com," all lowercase with no spaces or other punctuation. In the example at the beginning of this chapter, the printer was named "Third Floor Meeting Rooms," not "f3mr.company.com." In command-line user interfaces, you want names to be short and quick to type.
In graphical user interfaces, you don't need to type names because you just select them from a list of choices, so they can be long and descriptive and can contain rich punctuation, accented letters, and non-roman characters, such as Kanji.

1.1.2. Names and Addresses

Although service discovery is the most visible element of Zeroconf, Zeroconf is more than just that. Zeroconf is a three-layer foundation for IP networking, with service discovery sitting atop the two lower layers, addressing and naming.

1.1.2.1. Claiming an IP address

The first requirement for IP networking is an IP address. There are existing mechanisms for IPv4 address allocation, such as using manual configuration or a DHCP server, but when neither of these is available, Zeroconf-capable devices will use a self-assigned IPv4 link-local address instead. In brief, the mechanism behind self-assigned addresses is that the device selects an address at random within a prescribed range, sends some ARP requests, and then, if no answers are received, proceeds to use that address. Self-assigned IPv4 link-local addresses are discussed in detail in Chapter 2.

IPv6 also has self-assigned link-local addresses, though sadly, at the present time, even though Mac OS X, Windows, and Linux all support IPv6, most of the low-cost peripherals that they talk to, such as printers and cameras, don't yet support IPv6.

1.1.2.2. Claiming a name

The second requirement is that the typical usage model for IP networking expects hosts to have names, not just numerical addresses. Having to remember and type numerical addresses is cumbersome at best, and when the addresses are being picked randomly, it may not even be possible. We need a way to associate a stable name with each device, in order to determine what address it has picked for itself, at this instant.
The Internet's existing mechanism for associating names with addresses is a DNS server, but when no DNS server is available, Zeroconf-capable devices will use Multicast DNS (mDNS) to achieve substantially the same effect on the local link, without having to set up and maintain a dedicated DNS server. In brief, the mechanism behind mDNS names is very similar to self-assigned addresses: the device sends a few mDNS queries for its desired name, and if no answers are received, the device can then use that name. Multicast DNS naming is discussed in detail in Chapter 3.
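
Because address claiming and name claiming follow the same "probe, then claim if nobody answers" pattern, both can be sketched with one small function. What follows is a minimal illustration, not the protocol itself. The `answered` callback is a hypothetical stand-in for sending one ARP request or one mDNS query and reporting whether anyone replied. The default of three probes, the fallback of appending "-2", "-3", and so on to a contested name, and the ".local" suffix are assumptions made for the example rather than details taken from this section. The address range shown is the 169.254.1.0 through 169.254.254.255 block reserved for IPv4 link-local use.

```python
import ipaddress
import random
from typing import Callable, Iterable, Iterator

# IPv4 link-local range available for self-assignment.
FIRST = int(ipaddress.IPv4Address("169.254.1.0"))
LAST = int(ipaddress.IPv4Address("169.254.254.255"))


def claim(candidates: Iterable[str],
          answered: Callable[[str], bool],
          probes: int = 3) -> str:
    """Return the first candidate for which no probe gets an answer.

    `answered(candidate)` is a hypothetical hook standing in for one probe
    (an ARP request or an mDNS query); it returns True if someone replied.
    """
    for candidate in candidates:
        if not any(answered(candidate) for _ in range(probes)):
            return candidate
    raise RuntimeError("ran out of candidates")


def random_link_local(rng: random.Random) -> Iterator[str]:
    """Yield randomly chosen addresses from the link-local range, forever."""
    while True:
        yield str(ipaddress.IPv4Address(rng.randint(FIRST, LAST)))


def numbered_names(label: str) -> Iterator[str]:
    """Yield label.local, then label-2.local, label-3.local, ... (assumed fallback)."""
    yield f"{label}.local"
    n = 2
    while True:
        yield f"{label}-{n}.local"
        n += 1


if __name__ == "__main__":
    # Simulated link state: these sets play the role of the other hosts.
    taken_addresses: set[str] = set()
    taken_names = {"printer.local"}

    address = claim(random_link_local(random.Random()),
                    lambda a: a in taken_addresses)
    name = claim(numbered_names("printer"), lambda n: n in taken_names)
    print(address, name)
```

In this sketch the only thing that differs between the two layers is the stream of candidates: random addresses in one case, a preferred name followed by numbered variants in the other.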