5.5. DNS Long-Lived Queries (DNS-LLQ)

Software development teams that are collocated experience synergies that are harder to replicate in teams working remotely. You may be working in one area of the code and overhear two people discussing another area of the code that you have quite a lot of experience with. You can easily jump in and join the conversation. There are also benefits to working remotely. Once you subtract the time spent commuting, the time taken up by incidental conversation with colleagues, and the time spent trying to block out the noise of others working, you find many more hours in the day to get work done. How, in this setting, do you make yourself available to exchange ideas with your colleagues? Email, net meetings, instant messaging, and IRC chat rooms all contribute to the virtual work environment, but it is not the same as being there.

The same synergies exist for DNS-SD in a Multicast DNS environment. When you're on the local network, you hear announcements when new services arrive. You hear goodbye packets when services go away. If a service goes away without sending its goodbye packet, and later another client attempts (unsuccessfully) to contact it, you hear that too. What we want to do is provide similar timeliness for remote clients that may be far removed from the local network.

A standard DNS query gives you the answer that's true at that moment in time. If you want to find out later what's changed, you have to do another query. Querying very frequently puts a large load on the network and on the DNS server. Querying only occasionally imposes lower load, but your information may become out of date. When using polling, there is no good answer.

Instead of polling, DNS-SD extends DNS to support long-lived queries (LLQ). In addition to asking a question of the server, it uses an EDNS0 extension to say, in effect, "...and tell me in the future if the answer to this question changes." DNS long-lived queries are described at http://files.dns-sd.org/draft-sekar-dns-llq.txt.

5.5.1. LLQ Message Format

LLQ messages extend the standard DNS message format described in RFC 1035 (http://www.ietf.org/rfc/rfc1035.txt) with a new OPT-RR and RDATA format, similar to the way that Dynamic DNS Update was extended as described earlier. This time, the RDATA triples are of the form OPTION-CODE, OPTION-LENGTH, and LLQ-Metadata. The OPTION-CODE is filled with the value of the EDNS0 Option Code for LLQ, which is 1.

The LLQ-Metadata consists of a VERSION field, used to identify the version of the LLQ protocol implemented, and an LLQ-OPCODE field that contains one of the following codes: LLQ-SETUP (1), LLQ-REFRESH (2), or LLQ-EVENT (3). The ERROR field is next and contains one of the following error codes: NO-ERROR (0), SERV-FULL (1), STATIC (2), FORMAT-ERR (3), NO-SUCH-LLQ (4), BAD-VERS (5), or UNKNOWN-ERR (6). The remaining two fields are LLQ-ID, which is a unique identifier for a particular LLQ, and LEASE-LIFE, which indicates how long the LLQ will remain in effect.
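The layout just described can be sketched in a few lines of Python. The field order comes from the text above, but the field widths (16-bit OPTION-CODE, OPTION-LENGTH, VERSION, LLQ-OPCODE, and ERROR, a 64-bit LLQ-ID, and a 32-bit LEASE-LIFE, all in network byte order) are taken from the LLQ draft rather than from this section, so treat them as an assumption. The function names are invented for illustration.

```python
import struct

LLQ_OPTION_CODE = 1  # EDNS0 option code for LLQ

# Opcodes and error codes as listed in the text
LLQ_SETUP, LLQ_REFRESH, LLQ_EVENT = 1, 2, 3
(NO_ERROR, SERV_FULL, STATIC, FORMAT_ERR,
 NO_SUCH_LLQ, BAD_VERS, UNKNOWN_ERR) = range(7)

# Assumed widths: OPTION-CODE u16, OPTION-LENGTH u16, then
# VERSION u16, LLQ-OPCODE u16, ERROR u16, LLQ-ID u64, LEASE-LIFE u32.
_HEADER = struct.Struct("!HH")
_METADATA = struct.Struct("!HHHQI")


def pack_llq_option(version, opcode, error, llq_id, lease_life):
    """Build one OPTION-CODE / OPTION-LENGTH / LLQ-Metadata triple."""
    metadata = _METADATA.pack(version, opcode, error, llq_id, lease_life)
    return _HEADER.pack(LLQ_OPTION_CODE, len(metadata)) + metadata


def unpack_llq_option(data):
    """Parse one triple from the start of data and return its fields."""
    code, length = _HEADER.unpack_from(data, 0)
    if code != LLQ_OPTION_CODE or length != _METADATA.size:
        raise ValueError("not an LLQ option")
    version, opcode, error, llq_id, lease_life = _METADATA.unpack_from(
        data, _HEADER.size)
    return {"version": version, "opcode": opcode, "error": error,
            "llq_id": llq_id, "lease_life": lease_life}
```

Packing and then unpacking a triple should give back the same five values, which makes a handy sanity check when you are experimenting with the format.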

5.5.2. LLQ Setup Four-way Handshake

The setup of long-lived queries is a four-way handshake consisting of the following steps.

5.5.2.1. Step 1: initial request

The initial request is sent by the client to the server. The format for this request is an extension of a standard DNS query using an OPT-RR containing LLQ metadata in its Additionals section. The RDATA triple consists of an OPTION-CODE, OPTION-LENGTH, and LLQ-Metadata. This triple may appear one or more times. Values should be as follows:

  • The OPTION-CODE should be set to LLQ (1).

  • The LLQ-Metadata section consists of fields for the VERSION, LLQ-OPCODE, ERROR, LLQ-ID, and LEASE-LIFE.

  • In an initial setup request, the LLQ-OPCODE is set to LLQ-SETUP and the LLQ-ID is set to 0.

  • In requests, the ERROR field should be set to NO-ERROR and the LEASE-LIFE should contain the desired life of the LLQ request in seconds.
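As a rough illustration of the values listed above, the following Python sketch fills in the metadata for an initial setup request, one set of fields per question. The LLQMetadata class, the function name, and the default version number of 1 are assumptions made for this example; nothing here builds or sends a real DNS packet.

```python
from dataclasses import dataclass

LLQ_SETUP = 1   # LLQ-OPCODE for setup
NO_ERROR = 0    # ERROR value used in requests


@dataclass(frozen=True)
class LLQMetadata:
    """Hypothetical in-memory form of one LLQ-Metadata section."""
    version: int
    opcode: int
    error: int
    llq_id: int
    lease_life: int


def initial_setup_metadata(question_count, desired_lease, version=1):
    """Return one metadata section per question for a setup request.

    The LLQ-ID is 0 because the server has not issued one yet, and
    LEASE-LIFE carries the lifetime the client would like, in seconds.
    """
    return [LLQMetadata(version, LLQ_SETUP, NO_ERROR, 0, desired_lease)
            for _ in range(question_count)]
```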

5.5.2.2. Step 2: challenge

In response to an LLQ setup request, a server will send a setup challenge to the requestor. The reason for the challenge is to prevent abuse of the LLQ feature by rogue machines that might otherwise use spoofed source addresses to set up LLQs on behalf of some other unsuspecting machine. The challenge packet contains a large number selected at random by the DNS server. A legitimate client setting up an LLQ receives the challenge and answers it correctly. An impostor generating fake packets with spoofed source addresses will not receive the challenge packet and will be unable to fake a correct response to the challenge it never received.

This challenge is a DNS Response, with the DNS message ID matching that of the request and with all questions contained in the request present in the questions section of the response. The challenge contains one OPT-RR with an LLQ metadata section for each LLQ request, which will indicate the success or failure of each request.

The LLQ-Metadata section consists of a VERSION field, which indicates the version of the LLQ protocol implemented in the server, and an LLQ-OPCODE field with value LLQ-SETUP. The remaining fields are ERROR, LLQ-ID, and LEASE-LIFE. Possible values for the ERROR field include the following:

  • NO-ERROR

  • FORMAT-ERR, which indicates the LLQ was improperly formatted

  • SERV-FULL, which indicates the server is overloaded by the number of LLQs being managed or by the rate at which the requests are being received

  • STATIC, which indicates the data for this name and type is not expected to change frequently, so the server does not consider LLQ the appropriate mechanism for this service

  • BAD-VERS, which indicates the protocol version in the client request is not supported on the server

  • UNKNOWN-ERR

On error, LLQ-ID is set to 0.

On success, the server generates a large random number that is unique for the requested name, type, and class, and stores it as the LLQ-ID. The LEASE-LIFE is set to the actual life of the LLQ in seconds. This value may be less than, equal to, or greater than the lease life the client requested.

In the case of a SERV-FULL error, the LEASE-LIFE is used for a different purpose. It is set to the time in seconds after which the client may retry the LLQ Setup. For all other errors, the LEASE-LIFE is set to 0.
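The server's side of the challenge might look something like the following Python sketch. It returns the ERROR, LLQ-ID, and LEASE-LIFE values for a single setup request according to the rules above. The lease limits, the retry interval, the capacity check, and the supported version number are all hypothetical policy choices, and a 64-bit LLQ-ID is assumed from the LLQ draft.

```python
import secrets

(NO_ERROR, SERV_FULL, STATIC, FORMAT_ERR,
 NO_SUCH_LLQ, BAD_VERS, UNKNOWN_ERR) = range(7)

SUPPORTED_VERSION = 1          # assumption
MIN_LEASE, MAX_LEASE = 60, 7200  # hypothetical server policy, seconds
RETRY_AFTER = 300              # hypothetical SERV-FULL retry hint, seconds


def new_llq_id(in_use):
    """Pick a random nonzero 64-bit ID that is not already in use."""
    while True:
        candidate = secrets.randbits(64)
        if candidate != 0 and candidate not in in_use:
            return candidate


def answer_setup(version, requested_lease, in_use, capacity, is_static=False):
    """Return (ERROR, LLQ-ID, LEASE-LIFE) for one setup request."""
    if version != SUPPORTED_VERSION:
        return BAD_VERS, 0, 0
    if is_static:
        return STATIC, 0, 0
    if len(in_use) >= capacity:
        # LEASE-LIFE doubles as "retry after this many seconds"
        return SERV_FULL, 0, RETRY_AFTER
    llq_id = new_llq_id(in_use)
    in_use.add(llq_id)
    # The granted lease may be shorter or longer than the one requested
    granted = max(MIN_LEASE, min(requested_lease, MAX_LEASE))
    return NO_ERROR, llq_id, granted
```

Using the secrets module rather than random reflects the purpose of the challenge: the LLQ-ID only protects against spoofing if an attacker cannot guess it.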

5.5.2.3. Step 3: challenge response

The client has been listening for a response to the original setup request. If no response is received, the client retransmits the request, sending up to three requests in total: two seconds separate the first and second requests, and four seconds separate the second and third. If another eight seconds pass after the third request with no response, the client should assume that the server is down or unreachable, and should begin the process again no more than once per hour.
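The retransmission timing is easier to see when it is laid out on a timeline. This small Python sketch computes when each request is sent and when the client gives up; the constant and function names are made up for the example.

```python
from itertools import accumulate

# Seconds to wait after the first, second, and third transmissions
WAITS_AFTER_SEND = (2, 4, 8)
# After giving up, start over no more often than once per hour
RESTART_INTERVAL = 60 * 60


def setup_timeline(start=0):
    """Return (send_times, give_up_at) for one setup attempt."""
    offsets = [0, *accumulate(WAITS_AFTER_SEND)]
    send_times = [start + offset for offset in offsets[:-1]]
    give_up_at = start + offsets[-1]
    return send_times, give_up_at
```

With a start time of 0, the requests go out at 0, 2, and 6 seconds, and the client gives up at 14 seconds if nothing has come back.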

When the client receives a successful setup challenge, it sends a challenge response, which is a DNS request with questions from the request and challenge, and a single OPT-RR in the Additionals section with the RDATA that echoes the random LLQ-ID and granted LEASE-LIFE for each set of fields in the order that the questions were issued.

If the client receives a challenge with an error, it responds as follows:

  • For a STATIC error, the client honors the resource record TTLs in the response and does not poll the server.

  • In the case of a SERV-FULL error, the client may retry the LLQ Setup Request after an interval equal to that contained in the LEASE-LIFE field.

  • If there is another type of error or the server is determined not to support LLQ, the client may resort to polling the server not more than once every 30 minutes for a given query.

5.5.2.4. Step 4: ACK and answers

The final step of the handshake is the acknowledgment that the server sends when it receives a successful challenge response. A successful challenge response is one in which the LLQ-ID and LEASE-LIFE echoed by the client match the values issued by the server. The server sends a DNS response containing all available answers to the questions contained in the original setup request, additional resource records for those answers in the Additionals section, and, finally, an OPT-RR with the RDATA format as follows:

  • An OPTION-CODE with value LLQ, followed by an OPTION-LENGTH field.

  • The LLQ-Metadata portion is the now familiar VERSION and LLQ-OPCODE, which is set to LLQ-SETUP.

  • The ERROR field should be set to NO-ERROR.

  • The LLQ-ID is the originally granted long identification number.

  • The LEASE-LIFE is the remaining life of LLQ in seconds.
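Here is one way the server's check of the challenge response, and the metadata of the resulting ACK, could be sketched in Python. The table that maps each LLQ-ID to its granted lease and grant time is a hypothetical data structure, and the answers and additional records that accompany the real ACK are left out.

```python
LLQ_SETUP = 1
NO_ERROR = 0


def make_ack(granted, echoed_id, echoed_lease, now):
    """Return ACK metadata, or None if the echo does not match.

    granted maps LLQ-ID -> (granted_lease_seconds, granted_at_time).
    """
    entry = granted.get(echoed_id)
    if entry is None or entry[0] != echoed_lease:
        return None  # wrong LLQ-ID or LEASE-LIFE: not a valid response
    granted_lease, granted_at = entry
    remaining = max(0, int(granted_lease - (now - granted_at)))
    return {"opcode": LLQ_SETUP, "error": NO_ERROR,
            "llq_id": echoed_id, "lease_life": remaining}
```

Note that the ACK reports the time remaining rather than the full lease, so a client that took ten seconds to answer the challenge sees a LEASE-LIFE ten seconds shorter than the one it echoed.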

The reason for the challenge/response precaution in steps 2 and 3 is to prevent a kind of network attack called a byte-multiplication attack. Suppose you were a mischievous individual with a desire to cause trouble for bigcompany.com. You might decide to try to flood their network with traffic. You send nonsense data to their web server as fast as you can over your DSL line at home, but you find two things: your DSL line is so slow compared to their connection that they don't even notice you, and if they do notice you, they can easily trace the packet stream back to its source, and you go to jail.

Imagine how much better your attack could be if you could convince other machines, with much fatter network pipes, to flood the victim's machine on your behalf. This is the essence of a byte-multiplication attack. You send request packets to well-connected machines, using the IP address of your intended victim as the fake source address in your packets, so that all the replies go to the victim's machine instead of yours. If a reply packet is 100 times bigger than the request packet, then 1 Mbps of requests can generate 100 Mbps of responses directed at the victim's machine.

The challenge/response phase prevents DNS LLQ from being abused in this way. Before it begins sending answers, the DNS server sends the challenge to the target machine, requesting positive confirmation that it truly requested that stream of answer packets. Because the challenge packet is about the same size as the initial request packet, this phase of the protocol itself can't be used to mount a very effective byte-multiplication attack: it only multiplies the attack size by one! Given that existing conventional DNS queries can already be crafted to result in a multiplier ratio larger than this, DNS LLQ doesn't add any new byte-multiplication potential to the DNS protocol.
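The arithmetic behind the multiplier is simple enough to put in a one-line function. In this Python sketch the packet sizes are made-up numbers chosen only to reproduce the ratios mentioned above.

```python
def reflected_rate_mbps(request_rate_mbps, request_bytes, reply_bytes):
    """Traffic arriving at the victim for a given rate of spoofed requests."""
    return request_rate_mbps * reply_bytes / request_bytes


# A reply 100 times the size of the request turns 1 Mbps into 100 Mbps
big_reply = reflected_rate_mbps(1, request_bytes=100, reply_bytes=10000)

# A challenge about the same size as the request multiplies by one
challenge_only = reflected_rate_mbps(1, request_bytes=100, reply_bytes=100)
```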

5.5.3. Refreshes and Expiration

In order to extend the LLQ beyond the granted LEASE-LIFE, the client sends a Refresh request when 80% of its lease life has elapsed. This request is identical to the LLQ Challenge Response, with the exception that the LLQ-OPCODE is set to LLQ-REFRESH instead of LLQ-SETUP. The client should coalesce the refresh messages for all LLQs established with a given server whenever at least one of them has used up 80% of its LEASE-LIFE. If including all of the LLQs causes the message to no longer fit in a single packet, the client should include all that will fit, preferring those closest to expiration. The requested LEASE-LIFE for a single LLQ should equal the original granted LEASE-LIFE. For multiple LLQs, the client should request the same LEASE-LIFE for all of them: the one granted for the LLQ that is soonest to expire.
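The refresh rules can be sketched as a small planning function in Python. The Lease class is hypothetical, and max_per_message is a stand-in for "as many LLQs as fit in a single packet," which a real client would work out from the actual message size. It is assumed to be at least 1.

```python
from dataclasses import dataclass

REFRESH_FRACTION = 0.8  # refresh once 80% of the lease has elapsed


@dataclass
class Lease:
    llq_id: int
    granted_at: float   # time the lease was granted, in seconds
    lease_life: int     # granted LEASE-LIFE, in seconds

    @property
    def expires_at(self):
        return self.granted_at + self.lease_life

    def refresh_due(self, now):
        return now - self.granted_at >= REFRESH_FRACTION * self.lease_life


def plan_refresh(leases, now, max_per_message):
    """Return (leases_to_refresh, requested_lease_life).

    Nothing is sent until at least one lease is due. Once one is due,
    all leases with this server are coalesced, soonest to expire first.
    """
    if not any(lease.refresh_due(now) for lease in leases):
        return [], None
    by_expiry = sorted(leases, key=lambda lease: lease.expires_at)
    chosen = by_expiry[:max_per_message]
    # Request, for all of them, the lease granted to the soonest to expire
    return chosen, chosen[0].lease_life
```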

The server responds to an LLQ refresh message with a response similar to the ACK described in step 4 above with the LLQ-OPCODE set to LLQ-REFRESH. If the client attempts to refresh an expired or nonexistent LLQ, the server returns an ERROR value of NO-SUCH-LLQ. If the client fails to extend the LLQ beyond the granted LEASE-LIFE, or if the client terminates a lease by sending a request with LEASE-LIFE equal to 0, the lease expires.
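On the server, refresh handling comes down to a table lookup. The following Python sketch assumes a hypothetical table that maps each LLQ-ID to its expiry time, and it grants whatever lease the client asks for, which a real server would be free to adjust.

```python
NO_ERROR = 0
NO_SUCH_LLQ = 4


def handle_refresh(table, llq_id, requested_lease, now):
    """Return (ERROR, LEASE-LIFE) for one refresh request.

    table maps LLQ-ID -> expiry time and is updated in place.
    """
    expires_at = table.get(llq_id)
    if expires_at is None or expires_at <= now:
        table.pop(llq_id, None)      # forget an expired entry
        return NO_SUCH_LLQ, 0
    if requested_lease == 0:
        del table[llq_id]            # client is terminating the lease
        return NO_ERROR, 0
    table[llq_id] = now + requested_lease
    return NO_ERROR, requested_lease
```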

5.5.4. Event Responses

Once the LLQ has been successfully set up, the server delivers change notifications to the client. There are two kinds of change notification that can occur and require action:


Add Events

These occur when a new resource record appears that answers an LLQ. Often, these are the result of a dynamic DNS update. This added record is sent in the Answer section of the event to the client.


Remove Events

These occur when a resource record becomes invalid. The deleted resource record is sent in the Answer section of the event to the client, with the TTL of the resource record set to -1 to indicate the record has been removed.

The format of the OPT-RR RDATA begins with the OPTION-CODE with value LLQ and the OPTION-LENGTH field. The VERSION is the version of the LLQ protocol implemented in the server, and the LLQ-OPCODE is set to LLQ-EVENT. The ERROR field has value 0, the LLQ-ID is as above, and the LEASE-LIFE is set to 0.

Upon receiving a change notification from the server, the client sends an acknowledgment back to the server. This acknowledgment is a DNS response echoing the OPT-RR contained in the change notification, with the message ID of the notification echoed in the message header.
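To make the event handling concrete, here is a Python sketch of a client applying Add and Remove events to its own set of known records and then building the acknowledgment. Records are represented as plain tuples and the acknowledgment as a dictionary, both invented for this example. Because the TTL occupies a 32-bit field, a parser that reads it as unsigned will see -1 as 0xFFFFFFFF, so a small conversion helper is included.

```python
REMOVED_TTL = -1  # TTL value that marks a Remove event


def ttl_from_wire(value):
    """Interpret an unsigned 32-bit TTL as a signed number."""
    return value - (1 << 32) if value & 0x80000000 else value


def apply_event(known, answers):
    """Update the set of known records from an event's Answer section.

    known is a set of (name, rrtype, rdata) tuples; answers is a list
    of (name, rrtype, rdata, ttl) tuples with ttl already signed.
    """
    for name, rrtype, rdata, ttl in answers:
        record = (name, rrtype, rdata)
        if ttl == REMOVED_TTL:
            known.discard(record)   # Remove event
        else:
            known.add(record)       # Add event
    return known


def make_event_ack(message_id, opt_rr):
    """Acknowledge an event: a response echoing the ID and the OPT-RR."""
    return {"qr": 1, "id": message_id, "opt_rr": opt_rr}
```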

5.5.5. Identifying Whether the Local DNS Cache Supports LLQ

A client can first try to issue its LLQ request to the local DNS caching server, just as it does for normal DNS queries. However, most DNS caches today don't implement LLQ and will return a NOTIMP or FORMERR error.

In this case, the client should contact the authoritative server directly to issue its LLQ request. The client first uses an SOA query to determine the zone and authoritative server responsible for the name it's querying. It then does an SRV query for the name _dns-llq._udp.zone to find the target host and port number where LLQ service is offered for this zone. Usually, it will be the same host as the master DNS server but on a port number other than the normal DNS port 53. If the client receives an NXDOMAIN response to its SRV query, the client concludes that the zone does not support LLQs and instead resorts to low-rate polling to keep its data reasonably up to date.
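The fallback procedure can be sketched as follows in Python. To keep the example free of any particular DNS library, the SRV lookup is passed in as a function that returns a (host, port) pair, or None for NXDOMAIN; that function, the return labels, and the assumption that the zone name has already been found with the SOA query are all part of the sketch rather than the protocol. The RCODE numbers are the standard DNS values.

```python
RCODE_NOERROR, RCODE_FORMERR, RCODE_NOTIMP = 0, 1, 4


def llq_srv_name(zone):
    """Build the SRV name that advertises LLQ service for a zone."""
    return "_dns-llq._udp." + zone.strip(".") + "."


def pick_llq_target(cache_addr, cache_rcode, zone, srv_lookup):
    """Decide where to send LLQ requests, or fall back to polling.

    cache_rcode is the RCODE the local cache returned for an LLQ
    request; zone is the zone name learned from the SOA query.
    """
    if cache_rcode == RCODE_NOERROR:
        return "llq", cache_addr          # the local cache speaks LLQ
    target = srv_lookup(llq_srv_name(zone))
    if target is None:
        return "poll", None               # zone does not offer LLQ
    return "llq", target                  # (host, port) from the SRV record
```

Because the lookup is injected, you can try the logic out with a dictionary's get method standing in for a real resolver.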