E-Mail and SMTP | Lan Tutorial With Glossary of Terms: A Complete Introduction to Local Area Networks (Lan Networking Library)

How E-mail Travels the Internet

Whether you're creating and sending your message from a proprietary e-mail package or an online service, your e-mail travels the same road as all other Internet-based information, such as an ftp file transfer, a telnet session, or a Web page download. That is, your e-mail traverses the Internet.

Figure 1 shows a simplified model of how e-mail travels from one user to another via the Internet. The sender creates an e-mail message on an application. The client system is known as a user agent, or UA. When the user sends the message, it is transmitted to the user's Internet mail server.

Figure 1: The Simple Mail Transfer Protocol (SMTP) allows e-mail-enabled client systems, or user agents (UAs), to send messages in ASCII text to other UAs. Internet mail servers act as message transfer agents (MTAs), relying on SMTP, conventional e-mail addresses, and the domain name service database to relay the e-mail message from server to server, until it reaches the mail server that services the recipient UA.

Once the message reaches the Internet mail server, it enters the Internet's message transfer system, or MTS. The MTS relies on other Internet mail servers to act as message transfer agents (MTAs), which relay the message towards the receiving UA. Once an MTA passes the message to the recipient's Internet mail server, the receiving UA can access the message.

The Format For E-Mail

RFC 822 defines the standard format for e-mail messages, treating an e-mail message as having two parts: an envelope and its contents. According to RFC 822, the envelope contains the information needed to transmit and deliver an e-mail message to its destination. The contents are the message that the sender wants delivered to the recipient.

The envelope contains the e-mail address of the sender, the e-mail address of the receiver, and a delivery mode, which in our case states that the message is to be sent to a recipient's mailbox. We can divide the contents of the message into two parts, a header and a body. The header is a required part of the message format, and the sending UA automatically includes it at the top of the message; the user does not input this information. The receiving UA may reformat the header information or delete it entirely to make the message easier for the recipient to read.

Take a look at Listing 1, which shows the e-mail message I sent to my corporate e-mail account from my CompuServe account.

Listing 1: Sample E-mail

 Received: from arl-img-5.compuserve.com by mfi.com (SMTPLINK V2.11)
     ; Wed, 18 Dec 96 09:22:58 PST
 Return-Path: <71154.2131@CompuServe.COM>
 Received: by arl-img-5.compuserve.com (8.6.10/5.950515)
     id MAA28405; Wed, 18 Dec 1996 12:24:54 -0500
 Date: 18 Dec 96 12:04:12 EST
 From: Lee Chae <71154.2131@CompuServe.COM>
 To: Lee Chae <lchae@mfi.com>
 Subject: Sample e-mail message
 Message-ID: <961218170411_71154.2131_DHB86-1@CompuServe.COM>

 Here is the sample message you wanted to show in your Tutorial.
 -Lee

In this sample message, the header information takes up the first 10 lines. An individual header consists of a field and a field value. For example, To: is a field, and Lee Chae <lchae@mfi.com> is its value. As you can see, the header contains detailed information about who sent the message, who is to receive the message, and how the message got from the sending point to the receiving point.

In this case, the header tells you that the message was sent from Lee Chae at the client system identified as 71154.2131@compuserve.com. It was received by an MTA identified as arl-img-5.compuserve.com and sent to mail server mfi.com, which is the mail server that services my corporate e-mail system. The message is directed to my e-mail account, lchae@mfi.com.

In addition, the header displays the date of the message, the times at which the different MTAs received the message, and the unique ID of the message.

The body of the message contains the actual text the sender typed and is separated from the header by a "null" line. RFC 822 doesn't define the message body, as it can be anything the user enters, as long as it is ASCII text.
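The header/body split described above can be tried out with a short Python sketch. This is only an illustration, not part of the original tutorial: it uses the standard library's email package, the line breaks in the sample message are assumed, and the helper name split_message is made up for this example.

```python
from email import message_from_string

# The sample message from Listing 1. The line breaks and the indentation of
# the continuation lines are assumptions about how the listing was laid out.
RAW_MESSAGE = """\
Received: from arl-img-5.compuserve.com by mfi.com (SMTPLINK V2.11)
    ; Wed, 18 Dec 96 09:22:58 PST
Return-Path: <71154.2131@CompuServe.COM>
Received: by arl-img-5.compuserve.com (8.6.10/5.950515)
    id MAA28405; Wed, 18 Dec 1996 12:24:54 -0500
Date: 18 Dec 96 12:04:12 EST
From: Lee Chae <71154.2131@CompuServe.COM>
To: Lee Chae <lchae@mfi.com>
Subject: Sample e-mail message
Message-ID: <961218170411_71154.2131_DHB86-1@CompuServe.COM>

Here is the sample message you wanted to show in your Tutorial.
-Lee
"""


def split_message(raw):
    """Return (list of (field, value) pairs, body text) for a raw message.

    The parser treats the first empty ("null") line as the boundary
    between the header and the body.
    """
    msg = message_from_string(raw)
    return msg.items(), msg.get_payload()


headers, body = split_message(RAW_MESSAGE)
for field, value in headers:
    print(f"{field}: {value}")
print("--- body ---")
print(body)
```

Each header comes back as a field and a field value, and a field that appears more than once (such as Received:) shows up once per occurrence, in the order the MTAs added them.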

Aside from the message format, the other important standard is the e-mail address. You're probably familiar with the e-mail addressing system, but I'll go over it quickly, in case you aren't. Basically, a standard e-mail address takes the following form:

mailbox ID@domain name

The mailbox ID is the name of an individual mailbox on a local machine. In my case, my mailbox ID would be lchae. The domain name is the name of a valid domain registered in the Domain Name System (DNS), which is the distributed database that keeps track of the different host names, network names, and IP addresses used on the Internet. The DNS mail entries are the keys to Internet e-mail because they allow MTAs to find the machine specified in the recipient's e-mail address.

To finish with my example, the domain name for my e-mail address is mfi.com. Put it together and you get lchae@mfi.com.
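As a rough illustration of how software might take an address apart into these two pieces, here is a minimal Python sketch. It relies on parseaddr from the standard library's email.utils module; the function name split_address is hypothetical and chosen only for this example.

```python
from email.utils import parseaddr


def split_address(address):
    """Split 'Name <mailbox@domain>' or 'mailbox@domain' into its parts.

    Returns (display name, mailbox ID, domain name).
    """
    display_name, addr_spec = parseaddr(address)
    # Split on the last "@" so the domain name is everything after it.
    mailbox_id, at_sign, domain = addr_spec.rpartition("@")
    if not at_sign or not mailbox_id or not domain:
        raise ValueError(f"not a mailbox@domain address: {address!r}")
    return display_name, mailbox_id, domain


print(split_address("Lee Chae <lchae@mfi.com>"))
print(split_address("lchae@mfi.com"))
```

The domain name that comes out of this split is the part an MTA hands to the DNS when it needs to find the recipient's mail server.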

SMTP: The Transport

The transmission of an e-mail message through the Internet relies on the Simple Mail Transfer Protocol, which is defined in RFC 821. SMTP governs the way a UA establishes a connection with an MTA and the way it transmits its e-mail message. MTAs also use SMTP to relay the e-mail from MTA to MTA, until it reaches the appropriate MTA for delivery to the receiving UA.

The interactions that happen between two machines, whether a UA to an MTA or an MTA to another MTA, have similar processes and follow a basic call-and-response procedure. The main difference between a UA-to-MTA transaction and an MTA-to-MTA transaction is that with the latter, the sending MTA must locate a receiving MTA.

To do this, the sending MTA contacts the DNS to look up the domain name specified in the recipient e-mail address. The DNS may return the IP address of the domain name, in which case the sending MTA tries to establish a mail connection to the host at that domain. Alternatively, the DNS may return a set of mail relaying records (MX records) that contain the domain names of intermediate MTAs that can act as relays to the recipient. In this case, the sending MTA tries to establish a mail connection to the first host listed in the mail relaying record.
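The choice the sending MTA makes here can be sketched in a few lines of Python. This is a simplified illustration that performs no real DNS lookup: the function name candidate_hosts, the host names, and the record values are all hypothetical, and the sketch assumes the usual convention that each mail relaying (MX) record carries a preference number, with lower numbers tried first.

```python
def candidate_hosts(domain, mx_records):
    """Order the hosts a sending MTA would try for `domain`.

    mx_records is a list of (preference, host) pairs, as a DNS lookup
    might return them. An empty list means the domain has no mail
    relaying records.
    """
    if not mx_records:
        # No relaying records: connect to the host at the domain itself.
        return [domain]
    # Lower preference values are tried first.
    return [host for _, host in sorted(mx_records)]


# Hypothetical records, for illustration only.
records = [(20, "backup.example.net"), (10, "mail.example.com")]
print(candidate_hosts("example.com", records))
print(candidate_hosts("example.org", []))
```

Returning the whole ordered list, rather than only the first host, leaves room for the sending MTA to attempt another route when the first host is unavailable.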

Now I'll use the case of two MTAs, a sending MTA and a receiving MTA, to illustrate the call-and-response mail transaction. First, as mentioned, the sending MTA chooses a receiving MTA, which may be the final destination of the message or an intermediate MTA that will relay the message to another MTA.

Next, the sending MTA requests a TCP connection to the receiving MTA. The receiving MTA responds with a server ID and a status report, which indicates whether or not it is available for the mail transaction. If it isn't, the transaction is over; the sending MTA can try again later or attempt another route. If the receiving MTA is free to handle a session, it will accept the TCP connection.

The sending MTA then sends a HELO command followed by its domain name information to the receiving MTA, which responds with a greeting. Next, the sending MTA sends a Mail From command that identifies the e-mail address from which the message originated, as well as a list of the MTAs that the message has passed through. This information is also known as a return path. If the receiving MTA can accept mail from that address, it responds with an OK reply.

The sending MTA then sends a Rcpt To command, which identifies the e-mail address of the recipient. If the receiving MTA can accept mail for that recipient (it may perform a DNS lookup to verify this), it responds with an OK reply. If not, it rejects that recipient. (An e-mail message may be addressed to more than one recipient, in which case this process is repeated for each recipient address.)

Once the receiving MTA identifies the recipient's address, the sending MTA sends the Data command. The receiving MTA accepts the command with a go-ahead reply (code 354) and then treats all succeeding lines of data as the message text. Once the sending MTA gets this reply, it starts sending the message. The sending MTA signals the end of the message by transmitting a line that contains only a period (.).

When the receiving MTA receives the signal for the end of the message, it replies with an OK to signal its acceptance of the message. If, for some reason, the receiving MTA can't process the message, it will signal the sending MTA with a failure code. After the message has been sent to the receiving MTA and the sending MTA gets an OK reply, the sending MTA can either start another message transfer or use the Quit command to end the session.
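To make the order of the commands concrete, here is a small Python sketch that lists what the sending MTA transmits for a single message. It is a simplification under stated assumptions: it opens no network connection, it ignores the replies that a real sender waits for after each command, and the function name sending_side is hypothetical. The extra period added to body lines that begin with a period comes from SMTP's transparency rule in RFC 821, which the walk-through above doesn't cover.

```python
def sending_side(helo_domain, sender, recipients, message_text):
    """Return the lines a sending MTA transmits for one message, in order."""
    lines = [f"HELO {helo_domain}", f"MAIL FROM:<{sender}>"]
    # One Rcpt To command per recipient address.
    lines += [f"RCPT TO:<{rcpt}>" for rcpt in recipients]
    lines.append("DATA")
    for line in message_text.splitlines():
        # A message line that starts with a period gets an extra period so
        # it cannot be mistaken for the end-of-message marker.
        lines.append("." + line if line.startswith(".") else line)
    lines.append(".")   # a line containing only a period ends the message
    lines.append("QUIT")
    return lines


message = "Subject: Sample e-mail message\n\nHere is the sample message.\n-Lee\n"
for line in sending_side("arl-img-5.compuserve.com",
                         "71154.2131@CompuServe.COM",
                         ["lchae@mfi.com"],
                         message):
    print(line)
```

In a real session each of these lines is followed by a reply from the receiving MTA, and the sender moves on only when that reply allows it.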

Once the receiving MTA accepts the message, it reverses its role and becomes a sending MTA, contacting the MTA next in line for the relay of the message. The process stops once the message reaches the Internet mail server that services the recipient specified in the Rcpt To e-mail address.

If at any point along the way an MTA can't deliver the e-mail for whatever reason (for instance, if it can't identify the recipient address), it generates an error report, also known as an undeliverable mail notification. The MTA uses MTAs identified in the return path to relay the error report back to the original sender.

Some Important Footnotes

SMTP can handle only messages containing the 7-bit ASCII text defined in RFC 822. This means that, on its own, SMTP is poorly suited to handling other types of data, such as 8-bit binary data and other multimedia formats that more and more people are sending both within the body of e-mail messages and as attachments. As a solution to this limitation, the IETF developed the Multipurpose Internet Mail Extensions (MIME) protocol (RFC 1521), which packs multimedia data into a format that SMTP can handle.
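A short Python sketch shows the idea behind MIME: binary data is re-encoded as plain ASCII characters before it travels, then decoded at the other end. The sketch uses the standard library's email package, which applies base64 encoding to binary attachments. The attachment contents, the filename sample.bin, and the subject line are made up for illustration.

```python
from email import message_from_string, policy
from email.message import EmailMessage

# Every possible 8-bit value, standing in for a binary file.
binary_data = bytes(range(256))

msg = EmailMessage()
msg["From"] = "71154.2131@CompuServe.COM"
msg["To"] = "lchae@mfi.com"
msg["Subject"] = "Sample e-mail message with attachment"
msg.set_content("The attachment holds 8-bit binary data.")
msg.add_attachment(binary_data, maintype="application",
                   subtype="octet-stream", filename="sample.bin")

# The text that would be handed to SMTP. Encoding it as ASCII raises
# UnicodeEncodeError if any non-ASCII character slipped through.
wire_text = msg.as_string()
wire_text.encode("ascii")

# The receiving side parses the text and decodes the attachment again.
received = message_from_string(wire_text, policy=policy.default)
attachment = next(received.iter_attachments())
print(attachment["Content-Transfer-Encoding"])
print(attachment.get_content() == binary_data)
```

The cost of this packing is size: base64 represents every three bytes of binary data as four ASCII characters, so the message grows by roughly a third.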

In addition, you may have noticed that my description of the e-mail relay process stopped at the recipient's Internet mail server, or MTA. The reason for this is that a user agent can employ different methods to access, or retrieve, its e-mail from the MTA. For instance, the majority of companies rely on proprietary e-mail packages, such as cc:Mail, to handle e-mail operations on the local network. And individual users often use e-mail applications offered by online services such as CompuServe and America Online, which you can also consider proprietary e-mail programs. In both cases, the proprietary system acts as the last leg between the final MTA and the recipient UA.

However, you may want to keep an eye out for two important e-mail protocols: Post Office Protocol 3 (POP-3) and Internet Message Access Protocol 4 (IMAP-4). POP-3 is an older standard that defines a method for a client system, or UA, to access its e-mail. IMAP-4 is a new standard, rising in popularity, that offers a more robust UA. Both allow UAs direct access to MTAs for the retrieval of messages, although IMAP-4 gives you more options for handling and storing e-mail. You can find more information on POP-3 and IMAP-4 by looking up their RFCs, 1725 and 2060, respectively, on the Web. (There's a great Web site set up for IMAP at http://www.imap.org.) For more about IMAP-4, see "IMAP's New Territory."

This tutorial, number 103, by Lee Chae, was originally published in the March 1997 issue of LAN Magazine/Network Magazine.