How SMTP Works

As an application protocol, SMTP relies on the error-detection and correction mechanisms of the underlying transport protocol and does not implement these functions itself. For example, TCP uses sequence numbers to keep track of TCP segments sent and acknowledged. Segments that are not acknowledged in a timely fashion are retransmitted. Thus, SMTP, using TCP as a transport protocol, doesn't have to worry about this sort of thing. SMTP has also been implemented using other transport services, including NCP, NITS, and X.25. For purposes of this text, I will be focusing on SMTP using the TCP transport service, because that is the most common model you are likely to see. Because SMTP is an application protocol, it's associated with a port number just like FTP, Telnet, and other applications that make up the TCP/IP suite. The port generally used for SMTP is TCP port 25.

Note

SMTP was first defined by RFC 821, but it has been superseded by RFC 2821, "Simple Mail Transfer Protocol." Additional RFC documents have added functionality to the protocol. One example is RFC 3207, "SMTP Service Extension for Secure SMTP over Transport Layer Security," which provides for both authentication and encryption for the transfer of email, based on TLS. TLS is basically an upgrade to the Secure Sockets Layer (SSL) security used by many Web browsers. For more information on these security protocols, see Chapter 47, "Encryption Technology."

Tip

When testing an email relay, you can use the telnet command to connect directly to the port the service listens on. Open a command prompt or terminal session and connect to port 25 (SMTP) or, to check the POP3 service, port 110. If you can connect and run commands, the service is up and functional. The syntax for this test is

telnet <ip address> <port>

telnet 10.1.1.1 25
telnet 10.1.1.1 110

Once connected to port 25, you can issue SMTP commands such as HELO.
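The same check can be scripted. The following is only a minimal sketch of the telnet test, not a full SMTP client: the function name probe_smtp and the hostname probe.example.com are made up for illustration, and the code assumes the server answers each step with a single reply line.

```python
import socket


def probe_smtp(host: str, port: int = 25, timeout: float = 5.0) -> list[str]:
    """Connect, read the greeting, send HELO and QUIT, and return the reply lines."""
    replies = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        reader = sock.makefile("rb")
        # The server speaks first; a reply starting with 220 means it is ready.
        replies.append(reader.readline().decode("ascii", "replace").rstrip("\r\n"))
        # probe.example.com is a placeholder name for the probing host.
        for command in ("HELO probe.example.com", "QUIT"):
            sock.sendall(command.encode("ascii") + b"\r\n")
            replies.append(reader.readline().decode("ascii", "replace").rstrip("\r\n"))
    return replies
```

Calling probe_smtp("10.1.1.1") against the sample address from the tip would return three reply lines if the relay is up, and raise an OSError if the connection is refused or times out.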

SMTP is used to send email from a client to an SMTP server and for SMTP servers to exchange mail. Other protocols, such as POP3 and IMAP, are used by clients to retrieve mail from mailboxes that reside on SMTP servers. SMTP is not used for that purpose, as you can see in Figure 26.1.

Figure 26.1. SMTP is used to send and receive email. SMTP is used primarily to upload email to the server, whereas POP3 is generally used to download mail.

In this figure, you can see that Computer A sends outgoing email to its local SMTP server operated by the Internet Service Provider (ISP). Computer A uses the POP3 protocol (Post Office Protocol) to check for and retrieve messages stored on the server. If Computer A needs to send an email to Computer B, the message travels first through SMTP to the local SMTP server. This server looks up the mail server for the domain in which the recipient on Computer B resides and sends the message, again using SMTP, to Computer B's SMTP server. When Computer B decides to check messages, it uses POP3 and gets the email sent by Computer A. Note that if either computer wants to send email to Computer C, then still another SMTP server becomes involved.

Note, however, that there isn't real centralization. SMTP servers communicate among themselves directly and do not go through any central clearinghouse. It's possible that a mail message will take a route through several SMTP servers to reach the eventual mailbox that is the destination of the email. When a client initially starts a session with an SMTP server, it can give the server a source-route (list of hosts) through which the message should travel to get to its destination. This is called a forward-path. In addition, the client can give the server a reverse-path, which is a source-route to return error messages to the client if something happens during the transmission of the email message.

Because it is a decentralized system, the operation is simplified. The failure of an SMTP server, here and there, doesn't affect the entire Internet. The only people who get to complain are those who use the downed SMTP servers for their email.

When mail is downloaded from an SMTP server using POP3, the messages are typically deleted from the SMTP server's database and stored locally on the user's computer. If you then delete the email messages stored locally on your computer, they are gone forever!

The SMTP Model

RFC 2821 recognizes that an SMTP server has to both service local clients and relay mail messages to other SMTP servers when the destination is not a client of the original server. In the original RFC, names were given for the different processes involved, depending on who is doing what to whom. For example, SMTP applications can act as either of the following, depending on the direction of the flow of information:

Sender-SMTP The client establishes a two-way (full-duplex) session with the local SMTP server.
Receiver-SMTP The SMTP server receives commands from the Sender-SMTP. The Receiver-SMTP process can be an SMTP server that can deliver the message to its recipient's mailbox or to another SMTP server.

To bring these definitions up-to-date using more modern terminology, the RFCs now refer to the Sender-SMTP as the SMTP Client and to the Receiver-SMTP as the SMTP Server.

Note that when a message passes through several SMTP servers, one server becomes the SMTP Client and the server to which the message is being sent becomes the SMTP Server. The SMTP Client process does not always indicate the original client that created the email message in the first place.

In fact, there are four types of SMTP server roles that are dependent on the services provided:

Originator A server that originates an email message and sends it out onto the Internet (or an intranet).
Delivery A server that receives email messages and stores them for the client to retrieve.
Relay An SMTP server that relays an email message from one SMTP server to another, and is not the originator or delivery SMTP server.
Gateway A server that acts as a go-between for SMTP and another messaging system. The gateway may modify the contents of the SMTP message to accommodate the other messaging system.

Most of the original definition of SMTP from RFC 821 remains intact. A few other RFCs over the years have added minor changes to the protocol, but it has remained basically a system for request/reply messages, or in the words of the RFC, a lock-step method. A request is made and a reply is sent. In the original version, a client sends a command to the server and the server responds with a single reply. The connection between the client and the SMTP server is a simple two-way channel (using the single TCP port 25).

SMTP Service Extensions

SMTP was first defined in 1982, in RFC 821. Over time it has been necessary to add functionality to the protocol in the form of service extensions. These were first added to SMTP by RFC 1425, "SMTP Service Extensions." Later RFCs expanded on these services, which are now covered by RFC 2821. An additional extension has been added by RFC 2920, "SMTP Service Extension for Command Pipelining."

Service extensions are negotiated between SMTP servers to find out which extensions are supported by each server. There are four basic categories of service extensions:

Delivery
Authentication and Security
Command Pipelining
Enhanced Status Codes

The Internet Assigned Numbers Authority (IANA) is responsible for maintaining a list of SMTP extensions. You can consult the IANA at www.iana.org.
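To make the negotiation concrete, here is a rough sketch of how a client might read the list of extensions a server advertises in reply to EHLO. It assumes the usual multiline reply form, in which every line starts with 250 and each line after the first names one extension keyword. The function name parse_ehlo_reply, the hostnames, and the sample values are illustrative only.

```python
def parse_ehlo_reply(lines: list[str]) -> dict[str, str]:
    """Map each advertised extension keyword to its parameter string."""
    extensions = {}
    # The first line carries the server's greeting, not an extension.
    for line in lines[1:]:
        if not line.startswith("250"):
            continue
        # Skip the code and its separator ("-" on continuation lines, " " on the last).
        keyword, _, params = line[4:].strip().partition(" ")
        if keyword:
            extensions[keyword.upper()] = params
    return extensions


# A made-up reply from a hypothetical server.
sample = [
    "250-mail.example.com greets client.example.com",
    "250-SIZE 10240000",
    "250-PIPELINING",
    "250-STARTTLS",
    "250 ENHANCEDSTATUSCODES",
]
```

With this sample, parse_ehlo_reply(sample) yields one entry per keyword, and the SIZE entry carries the advertised limit as its parameter.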

SMTP Commands and Response Codes

The first command that the Sender-SMTP client sends is either the HELO command or the EHLO command. EHLO is now the preferred command and is part of the basic service extensions. If an SMTP server does not support service extensions, it will respond to EHLO with an error message indicating a syntax error, and the client can then fall back to HELO.
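The choice between EHLO and HELO can be sketched in a few lines. This is an illustration rather than production code: send is a hypothetical callable that transmits one command line and returns the three-digit reply code as an integer, and open_session is a name chosen only for this example.

```python
from typing import Callable


def open_session(send: Callable[[str], int], client_name: str) -> str:
    """Try EHLO first; fall back to HELO if the server rejects it.

    Returns the name of the command the server accepted.
    """
    code = send(f"EHLO {client_name}")
    if 200 <= code < 300:
        return "EHLO"
    if 500 <= code < 600:
        # A 5xx reply here means the server did not understand EHLO.
        code = send(f"HELO {client_name}")
        if 200 <= code < 300:
            return "HELO"
    raise RuntimeError(f"server refused the session (reply {code})")
```

Passing the transport in as a callable keeps the decision logic separate from the socket handling, which also makes it easy to exercise against a fake server.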

This is the basic syntax for SMTP commands:

<command> <arguments> <CRLF>

In this syntax, <CRLF> indicates that a carriage-return followed by a line-feed character is used to signal the end of the command line.
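As a small illustration of this syntax, the helper below assembles a command line and appends the CRLF terminator. The name format_command is hypothetical, and uppercasing the command is a stylistic choice of this sketch, not a protocol requirement.

```python
CRLF = "\r\n"


def format_command(command: str, *arguments: str) -> bytes:
    """Build one SMTP command line: <command> <arguments> <CRLF>."""
    line = " ".join([command.upper(), *arguments])
    # CR or LF inside the line would end the command early, so reject them.
    if "\r" in line or "\n" in line:
        raise ValueError("a command line must not contain CR or LF")
    return (line + CRLF).encode("ascii")
```

For example, format_command("helo", "client.example.com") produces the bytes for the line HELO client.example.com followed by CRLF.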

In the following commands, the term forward-path is a list of hosts the message travels through to reach its destination. The term reverse-path is used to indicate how to get back to the sender of the email, which can be helpful when returning error or other informational messages.

Note

One important thing to note about SMTP commands is that they are not case-sensitive. The client or server code must accept both upper- and lowercase text for commands and not differentiate between the two. Commands can even be a mixture of upper- and lowercase letters. Hostnames (that is, the portion of the email address following the "@" sign) also are not case-sensitive. This is not true, however, of user mailbox names. Because SMTP allows mailbox names to be case-sensitive, some servers may treat names that differ only in case as different mailboxes, so the mailbox name should be preserved by the server and transmitted exactly as received.
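These comparison rules can be expressed as follows. The function names are invented for this sketch, and the address handling is deliberately simple: it splits on the last "@" and does not validate the address.

```python
def same_command(a: str, b: str) -> bool:
    """Command names compare without regard to case."""
    return a.upper() == b.upper()


def same_mailbox(a: str, b: str) -> bool:
    """Compare two addresses: mailbox name exactly, hostname without regard to case."""
    local_a, _, host_a = a.rpartition("@")
    local_b, _, host_b = b.rpartition("@")
    return local_a == local_b and host_a.lower() == host_b.lower()
```

Under these rules "helo" and "HeLo" are the same command, and Smith@Example.COM matches Smith@example.com, but Smith@example.com and smith@example.com are treated as different mailboxes.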

The basic SMTP commands include the following:

HELO This command (or the next one in this list) is sent by the Sender-SMTP client to the SMTP server to begin the mail transfer session. The argument to this command is the hostname of the Sender-SMTP computer.
EHLO This is now the preferred command that replaces the HELO command and indicates that the Sender-SMTP client wants to use SMTP extensions. The argument is the domain name of the client rather than its IP address. If the SMTP server supports SMTP extensions, it returns a code of 250 to the client. If the server does not support the extensions, it returns a code of 500.
AUTH This stands for authenticate. The user provides a username/password to the SMTP server to authenticate the client before mail can be sent.
ATRN This stands for authenticated TURN. After a client has been authenticated to the SMTP server, this command asks the Receiver-SMTP to reverse roles. If the server returns an OK response, it must assume the function of the sender of the mail. Otherwise, the SMTP server can return a refusal (reply code 502, command not implemented) and remain in the role of Receiver-SMTP.
DATA This command is followed by actual data that makes up the email message. This includes both the body text and such things as the subject line.
EXPN This stands for expand. This command requests the SMTP server to tell the client whether the argument included with the command is a mailing list. If so, the server returns a list of the members of the mailing list to the client.
HELP This command instructs the SMTP server to return help information to the sender. The HELP command might or might not contain arguments.
MAIL This command includes the reverse-path as its argument, preceded by the text FROM:. This is the mailbox of the sender, but it also can include a list of hosts that were used to relay the mail message from its original Sender-SMTP. In a list of hosts, the first host is the most recent relay (the host sending the command), and the last entry is the sender's mailbox.
NOOP This is the "no operation" command. The server responds with OK.
QUIT The Sender-SMTP sends this command when it is finished. The server should return an OK message and then close down the transmission channel (that is, TCP connection).
RCPT This stands for recipient. The argument for this command is a single recipient, specified by using a forward-path list preceded by the letters TO:. If a mail message is being sent to more than one recipient, a separate RCPT command must be issued for every recipient.
RSET This aborts the current email transaction. The SMTP server should respond with an OK message.
SAML This stands for Send and Mail. Mail is the typical use today with SMTP. The send method is meant to be used when the SMTP server has been implemented to deliver mail directly to a recipient that is actively connected. With SAML, the message is sent to the recipient's terminal if the recipient is connected and is delivered to the mailbox in any case. The argument for this command, again, is a reverse-path showing the path back to the sender of the email. The reverse-path text is preceded by the text FROM:.
SEND This command, not often implemented, specifies that the mail message be delivered directly to the destination, if it's actively connected. If this cannot be done, the server returns a message code of 450 (the mailbox is not available). Similar to the SAML command, the argument for this command is the text FROM: followed by the reverse-path back to the sender's mailbox.
SIZE This extension lets the Sender-SMTP inform the server of the size of the mail message it wants to send, given as a parameter on the MAIL command. This is supported only by SMTP implementations that use the SMTP Service Extensions. The server can return a message indicating that it cannot handle mail of the size requested, or it can accept the message.
SOML This stands for Send or Mail. Similar to SAML, this command requests that the mail be "sent" (that is, delivered directly to the actively connected recipient) or mailed. The server tries the Send method first, and if that fails, the server attempts to deliver the message to the destination mailbox.
TURN This command instructs the Receiver-SMTP to assume the role of the sender of the mail (in which case an OK response is returned). The server can refuse (with a code 502) and remain in the role of Receiver-SMTP.
VRFY This command asks the Receiver-SMTP to check whether the username passed as an argument is valid. If the username is a valid one, the full name and mailbox of the user are returned.
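To show how several of these commands fit together, the sketch below lists the command lines a client might send for one message addressed to several recipients. It is an outline only: build_transaction and the example.com addresses are invented for illustration, replies are not checked, and the message content that follows DATA is left out.

```python
def build_transaction(client_name: str, sender: str, recipients: list[str]) -> list[str]:
    """Return the command lines, in order, for one mail transaction."""
    if not recipients:
        raise ValueError("at least one recipient is required")
    commands = [f"EHLO {client_name}", f"MAIL FROM:<{sender}>"]
    # One RCPT command per recipient; the message itself is sent only once.
    commands += [f"RCPT TO:<{rcpt}>" for rcpt in recipients]
    # The message content would be sent between DATA and QUIT.
    commands += ["DATA", "QUIT"]
    return commands
```

A message from alice@example.com to two recipients therefore produces one MAIL command and two RCPT commands before DATA.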

Because SMTP allows for sending a single message to multiple recipients, a large mailing list could generate a lot of network traffic. Thus, the original SMTP RFC 821 recommends that only one copy of the actual email be sent to the server in this sort of situation. The SEND command (and its associated commands) was intended originally to send a message directly to a user's terminal. In today's world of PCs and workstations, it isn't typical to find a user sitting at a terminal. Even where this function is still supported in your network, it usually is not desirable to have mail pop up suddenly on a user's terminal. Instead, the MAIL command and its other associated commands are usually used.

The DATA portion of the mail message is terminated using the period (.) character by itself on a single line, which is followed by the line-terminating characters (CRLF). Typically this will be <CRLF>.<CRLF>, because the first <CRLF> terminates the last line of actual data.
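A rough sketch of preparing message text for transmission after the DATA command follows. It adds one detail beyond the text above, flagged as such in the code: a line of the message that itself begins with a period is sent with an extra leading period so that it cannot be mistaken for the terminator. The name encode_data is hypothetical, and the sketch assumes ASCII-only text.

```python
def encode_data(body: str) -> bytes:
    """Prepare message text for sending after the DATA command."""
    lines = body.splitlines()
    # Detail not covered in the text above: a line that begins with "." gets
    # an extra "." so the server does not read it as the end of the data.
    stuffed = ["." + line if line.startswith(".") else line for line in lines]
    # Every line ends with CRLF; the lone "." line marks the end of the data.
    return "".join(line + "\r\n" for line in stuffed).encode("ascii") + b".\r\n"
```

Because the last data line already ends with CRLF, the result always finishes with the five bytes that make up <CRLF>.<CRLF>.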

Note

SMTP commands are sent as single lines of commands or data terminated with <CRLF>. However, the term mail object is used to describe an email message. A mail object consists of an envelope and the content of the mail message. The envelope contains such information as the sender and recipient addresses and protocol extensions. The content consists of the data sent by the message.

SMTP Response Codes

Remember that for each command issued by the Sender-SMTP, a single response is expected from the Receiver-SMTP. This simple lock-step method keeps things synchronized so that both sides of the connection are aware of the current state of the transaction. The three-digit response codes that the Receiver-SMTP can use are similar in format to those returned by FTP servers. The first digit indicates the general meaning (or category) of the response.

These are the first digits for SMTP response codes:

1 This is a positive response. The command has been accepted by the SMTP server and the server is waiting for further information to determine whether it should continue or abort processing. At this time, no SMTP commands allow this kind of reply message.
2 This is a positive response. The function requested by the Sender-SMTP has been completed, and the server is ready for another command.
3 This is a positive response. This is similar to category 2 but indicates that the action requested is being held up, waiting for further information or commands from the Sender-SMTP.
4 This is a negative response. It indicates that something went wrong and the command could not be completely processed. The Sender-SMTP should retry the command, or sequence of commands, that led up to this response.
5 This is a negative response. Unlike the "4" response code category, this one indicates that an error has occurred that prevents the command from being executed, such as a misspelling. The Sender-SMTP should not try the command again until it has determined the problem and corrected it.
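The practical effect of the first digit is that a client can decide what to do next without understanding the full code. The following sketch shows one way to express that decision; the function name next_step and the wording of the returned strings are my own, not terms from the RFC.

```python
def next_step(reply_code: int) -> str:
    """Decide how a client should proceed, based only on the first digit."""
    first_digit = reply_code // 100
    if first_digit in (1, 2, 3):
        # Positive replies: carry on with the transaction.
        return "continue"
    if first_digit == 4:
        # The same commands may succeed if tried again.
        return "retry later"
    if first_digit == 5:
        # Repeating the same command unchanged will fail again.
        return "fix and resubmit"
    raise ValueError(f"not an SMTP reply code: {reply_code}")
```

For instance, a 450 reply leads to a later retry, whereas a 550 reply calls for correcting the problem first.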

The second digit of the response code provides a further subdivision within that category to further indicate the response. The second digit can have the following values:

0 This indicates a syntax error or that the particular command is not supported by the server. For example, if the Sender-SMTP client supports the SMTP service extensions but the Receiver-SMTP server does not, it returns this value.
1 This is used in replies that return help messages to the client.
2 This is used for replies that refer to the transmission channel.
5 This is used in replies that are reporting the status of the mail system, as it pertains to the requested mail transfer or the current command.

Note in the preceding list that numbers 3 and 4 are omitted. They are undefined at this time.
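The first two digits can be decoded mechanically. The sketch below is one possible way to do so: the short labels in the two dictionaries are my own summaries of the descriptions above, and describe_reply is a hypothetical name.

```python
# Short labels summarizing the first-digit categories described above.
CATEGORIES = {
    1: "positive preliminary",
    2: "positive completion",
    3: "positive intermediate",
    4: "transient negative",
    5: "permanent negative",
}

# Short labels for the defined second-digit values; 3 and 4 are undefined.
SUBJECTS = {
    0: "syntax",
    1: "information",
    2: "transmission channel",
    5: "mail system",
}


def describe_reply(code: int) -> tuple[str, str]:
    """Return (category, subject) labels for a three-digit reply code."""
    first, second = code // 100, (code // 10) % 10
    if first not in CATEGORIES:
        raise ValueError(f"not an SMTP reply code: {code}")
    return CATEGORIES[first], SUBJECTS.get(second, "undefined")
```

Decoding 250 this way gives a positive completion concerning the mail system, and 500 gives a permanent negative reply concerning syntax.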

The third digit further delineates the response indicated by the first two digits, and the list of possible codes that result from this combination is much too long to list here. Table 26.1 lists just a few of the more common numeric responses that are used in most SMTP implementations today.

Table 26.1. Typical Reply Codes Used by SMTP
Reply Code	Definition
500	Syntax error or unrecognized command.
501	Syntax error in parameters or arguments.
502	Command not implemented.
503	Bad sequence of commands.
504	Command parameter not implemented.
211	System status or system help reply.
214	Help message (useful only for the user).
250	The requested mail action is okay or completed.
251	The user is not a local user, so the email will be forwarded.
252	Cannot `VRFY` (see preceding commands), but will attempt delivery.
450	Mailbox unavailable (that is, busy), so requested action not taken.
550	Mailbox not found, not accessible, or rejected due to policy.
451	Action aborted due to error in processing.
551	User is not local, followed by a possible forward-path to use.
452	Action not taken due to insufficient system storage.
552	Action aborted because it will exceed the storage allocation.
553	Action not taken because mailbox syntax is incorrect.
354	Start mail input and end with `<CRLF>.<CRLF>`.
554	Transaction failed, or (in reply to a connection attempt) no SMTP service is available.
