28.4 SMTP Futures

Changes are taking place with Internet mail. Recall the three pieces that comprise Internet mail: the envelope, headers, and body. New SMTP commands are being added that affect the envelope, non-ASCII characters can be used in the headers, and structure is being added to the body (MIME). In this section we consider the extensions to each of these three pieces in order.

Envelope Changes: Extended SMTP

RFC 1425 [Klensin et al. 1993a] defines the framework for adding extensions to SMTP. The result is called extended SMTP (ESMTP). As with other new features that we've described in the text, these changes are being added in a backward compatible manner, so that existing implementations aren't affected.

A client that wishes to use the new features initiates the session with the server by issuing an EHLO command, instead of HELO. A compatible server responds with a 250 reply code. This reply is normally multiline, with each line containing a keyword and an optional argument. These keywords specify the SMTP extensions supported by the server. New extensions will be described in an RFC and will be registered with the IANA. (In a multiline reply all lines except the last have a hyphen after the numeric reply code. The last line has a space after the numeric reply code.)

We'll show the initial connection to four SMTP servers, three of which support extended SMTP. We connect to them using Telnet, but have removed the extraneous Telnet client output.

    sun % telnet vangogh.cs.berkeley.edu 25
    220-vangogh.CS.Berkeley.EDU Sendmail 8.1C/6.32 ready at Mon, 2 Aug 1993 15:47:48 -0700
    220 ESMTP spoken here
    ehlo sun.tuc.noao.edu
    250-vangogh.CS.Berkeley.EDU Hello sun.tuc.noao.edu [140.252.1.29], pleased to meet you
    250-EXPN
    250-SIZE
    250 HELP

This server gives a multiline 220 reply for its greeting message. The extended commands listed in the 250 reply to the EHLO command are EXPN, SIZE, and HELP. The first and last are from the original RFC 821 specification, but they are optional commands. ESMTP servers state which of the optional RFC 821 commands they support, in addition to newer commands.
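The multiline reply format can be illustrated with a short Python sketch. This is only a sketch under stated assumptions: the function name parse_ehlo_reply is hypothetical, the reply is assumed to be already read into a string, and the first line is assumed to carry the server's greeting rather than a keyword, as in the reply shown above.

```python
def parse_ehlo_reply(reply: str) -> dict:
    """Return {keyword: argument} for the extension lines of a 250 reply to EHLO."""
    extensions = {}
    lines = reply.splitlines()
    for index, line in enumerate(lines):
        code, separator, text = line[:3], line[3:4], line[4:]
        if code != "250":
            raise ValueError(f"unexpected reply code in {line!r}")
        is_last = index == len(lines) - 1
        # A hyphen marks a continuation line; a space marks the final line.
        if separator != (" " if is_last else "-"):
            raise ValueError(f"malformed multiline reply at {line!r}")
        if index == 0:
            continue  # assumed to be the greeting line, not a keyword
        keyword, _, argument = text.partition(" ")
        extensions[keyword.upper()] = argument
    return extensions


VANGOGH_REPLY = (
    "250-vangogh.CS.Berkeley.EDU Hello sun.tuc.noao.edu [140.252.1.29], pleased to meet you\n"
    "250-EXPN\n"
    "250-SIZE\n"
    "250 HELP\n"
)

print(parse_ehlo_reply(VANGOGH_REPLY))
```

A keyword with no argument maps to an empty string, so a client can test for support with a simple membership check such as "SIZE" in extensions.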

The SIZE keyword that this server supports is defined in RFC 1427 [Klensin, Freed, and Moore 1993]. It lets the client specify the size of the message in bytes on the MAIL FROM command line. This lets the server verify that it will accept a message of that size, before the client starts to send it. This command was added since the size of Internet mail messages is growing, with the support for message content other than ASCII lines (i.e., images, audio, etc.).

The next host also supports ESMTP. Notice that the 250 reply specifying that the SIZE keyword is supported contains an optional argument. This indicates that this server will accept a message size up to 461 Mbytes.

    sun % telnet ymir.claremont.edu 25
    220 ymir.claremont.edu -- Server SMTP (PMDF V4.2-13 #4220)
    ehlo sun.tuc.noao.edu
    250-ymir.claremont.edu
    250-8BITMIME
    250-EXPN
    250-HELP
    250-XADR
    250 SIZE 461544960

The keyword 8BITMIME is from RFC 1426 [Klensin et al. 1993b]. This allows the client to add the keyword BODY to the MAIL FROM command, specifying whether the body contains NVT ASCII characters (the default) or 8-bit data. Unless the client receives the 8BITMIME keyword from the server in response to an EHLO command, the client is forbidden from sending any characters other than NVT ASCII. (When we talk about MIME in this section, we'll see that an 8-bit SMTP transport is not required by MIME.)

This server also advertises the XADR keyword. Any keyword that begins with an X refers to a local SMTP extension.
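To make the use of the SIZE and 8BITMIME keywords concrete, here is a minimal Python sketch of how a client might build its MAIL FROM command from the keywords the server advertised. The function name, the sender address, and the dictionary layout are hypothetical. Treating an advertised limit of 0 as "no fixed limit" is also an assumption of this sketch.

```python
def mail_from_command(sender: str, message: bytes, extensions: dict) -> str:
    """Build a MAIL FROM line, adding SIZE= and BODY= only when advertised."""
    parameters = []
    if "SIZE" in extensions:
        limit = extensions["SIZE"]
        # Assumption: an empty or zero argument means no fixed limit was stated.
        if limit and int(limit) > 0 and len(message) > int(limit):
            raise ValueError("message is larger than the server's advertised limit")
        parameters.append(f"SIZE={len(message)}")
    has_8bit = any(byte > 0x7F for byte in message)
    if "8BITMIME" in extensions:
        parameters.append("BODY=8BITMIME" if has_8bit else "BODY=7BIT")
    elif has_8bit:
        # Without 8BITMIME the client may send only NVT ASCII.
        raise ValueError("server did not advertise 8BITMIME")
    return " ".join([f"MAIL FROM:<{sender}>"] + parameters)


# Keywords advertised by ymir.claremont.edu in the session above.
YMIR = {"8BITMIME": "", "EXPN": "", "HELP": "", "XADR": "", "SIZE": "461544960"}

print(mail_from_command("user@sun.tuc.noao.edu", b"hello\r\n", YMIR))
```

With a server that advertises neither keyword, the function returns a plain RFC 821 MAIL FROM line, which keeps the client backward compatible.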

The next server also supports ESMTP, advertising the HELP and SIZE keywords that we've already seen. It also supports three local extensions that begin with an X.

    sun % telnet dbc.mtview.ca.us 25
    220 dbc.mtview.ca.us Sendmail 5.65/3.1.090690, it's Mon, 2 Aug 93 15:48:50 -0700
    ehlo sun.tuc.noao.edu
    250-Hello sun.tuc.noao.edu, pleased to meet you
    250-HELP
    250-SIZE
    250-XONE
    250-XVRB
    250 XQUE

Finally we see what happens when the client tries to use ESMTP by issuing the EHLO command to a server that doesn't support it.

    sun % telnet relay1.uu.net 25
    220 relay1.UU.NET Sendmail 5.61/UUNET-internet-primary ready at Mon, 2 Aug 93 18:50:27 -0400
    ehlo sun.tuc.noao.edu
    500 Command unrecognized
    rset
    250 Reset state

Instead of receiving a 250 reply to the EHLO command, the client receives a 500 reply. The client should then issue the RSET command, followed by a HELO command.
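The fallback sequence just described can be sketched in Python as follows. The send_command callable is a hypothetical stand-in for whatever sends one command line over the connection and returns the numeric reply code. The fake server exists only so the sketch runs without a network.

```python
def greet(send_command, client_name: str) -> str:
    """Try EHLO first; fall back to RSET followed by HELO if it is rejected."""
    if send_command(f"EHLO {client_name}") == 250:
        return "ESMTP"
    send_command("RSET")
    if send_command(f"HELO {client_name}") != 250:
        raise RuntimeError("server rejected both EHLO and HELO")
    return "SMTP"


def fake_rfc821_server(log: list):
    """Return a send_command that behaves like a server without ESMTP support."""
    def send_command(line: str) -> int:
        log.append(line)
        return 500 if line.upper().startswith("EHLO") else 250
    return send_command


log = []
print(greet(fake_rfc821_server(log), "sun.tuc.noao.edu"))
print(log)
```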

Header Changes: Non-ASCII Characters

RFC 1522 [Moore 1993] specifies a way to send non-ASCII characters in RFC 822 message headers. The main use of this is to allow additional characters in the sender and receiver names, and in the subject.

The header fields can contain encoded words. They have the following format:

    =?charset?encoding?encoded-text?=

charset is the character set specification. Valid values are the two strings us-ascii and iso-8859-X, where X is a single digit, as in iso-8859-1.

encoding is a single character to specify the encoding method. Two values are supported.

  1. Q encoding means quoted-printable, and is intended for Latin character sets. Most characters are sent as NVT ASCII (with the high-order bit set to 0, of course). Any character to be sent whose eighth bit is set is sent instead as three characters: first the character =, followed by two hexadecimal digits. For example, the character é (whose binary 8-bit value is 0xe9) is sent as the three characters =E9. Spaces are always sent as either an underscore or the three characters =20. This encoding is intended for text that is mostly ASCII, with a few special characters.

  2. B means base-64 encoding. Three consecutive bytes of text (24 bits) are encoded as four 6-bit values. The 64 NVT ASCII characters used to represent each of the possible 6-bit values are shown in Figure 28.6.

    Figure 28.6. Encoding of 6-bit values (base-64 encoding).

    When the number of characters to encode is not a multiple of three, equal signs are used as the pad characters.
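The two encodings can be sketched in a few lines of Python. The function names are hypothetical, and the Q encoder is deliberately simplified: it escapes only spaces, the characters =, ? and _, and anything outside printable ASCII. It also ignores the length limit on an encoded word and the extra restrictions that apply in some header contexts.

```python
import base64


def q_encode(text: str, charset: str = "iso-8859-1") -> str:
    """Simplified Q encoding of text as a single encoded word."""
    out = []
    for byte in text.encode(charset):
        char = chr(byte)
        if char == " ":
            out.append("_")  # a space is sent as an underscore
        elif byte < 0x21 or byte > 0x7E or char in "=?_":
            out.append(f"={byte:02X}")  # = followed by two hex digits
        else:
            out.append(char)
    return f"=?{charset}?Q?{''.join(out)}?="


def b_encode(text: str, charset: str = "iso-8859-1") -> str:
    """B encoding: base-64 of the encoded bytes, padded with = as needed."""
    encoded = base64.b64encode(text.encode(charset)).decode("ascii")
    return f"=?{charset}?B?{encoded}?="


print(q_encode("Andr\xe9 "))
print(b_encode("If you can read this yo"))
```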

The following example of these two encodings is from RFC 1522:

    From: =?US-ASCII?Q?Keith_Moore?= <moore@cs.utk.edu>
    To: =?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= <keld@dkuug.dk>
    CC: =?ISO-8859-1?Q?Andr=E9_?= Pirard <PIRARD@vm1.ulg.ac.be>
    Subject: =?ISO-8859-1?B?SWYgeW91IGNhbiByZWFkIHRoaXMgeW8=?=
     =?ISO-8859-2?B?dSB1bmRlcnN0YW5kIHRoZSBleGFtcGxlLg==?=

A user agent capable of handling these headers would output:

    From: Keith Moore <moore@cs.utk.edu>
    To: Keld Jørn Simonsen <keld@dkuug.dk>
    CC: André Pirard <PIRARD@vm1.ulg.ac.be>
    Subject: If you can read this you understand the example.
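As an illustration of what such a user agent does, the following Python sketch decodes two of these header values with the standard library's email.header.decode_header. The helper name decode_field is hypothetical, and this is one possible approach rather than a description of any particular user agent.

```python
from email.header import decode_header


def decode_field(value: str) -> str:
    """Decode the encoded words in a header value and return readable text."""
    pieces = []
    for data, charset in decode_header(value):
        if isinstance(data, bytes):
            # Unencoded runs come back with charset None; treat them as ASCII.
            pieces.append(data.decode(charset or "ascii"))
        else:
            pieces.append(data)
    return "".join(pieces)


TO = "=?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= <keld@dkuug.dk>"
SUBJECT = (
    "=?ISO-8859-1?B?SWYgeW91IGNhbiByZWFkIHRoaXMgeW8=?=\n"
    " =?ISO-8859-2?B?dSB1bmRlcnN0YW5kIHRoZSBleGFtcGxlLg==?="
)

print(decode_field(TO))
print(decode_field(SUBJECT))
```

The white space that separates the two encoded words in the subject is dropped when they are joined, which is why "yo" and "u" combine into "you".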

To see how base-64 encoding works, look at the first four encoded characters in the subject line: SWYg. Write out the 6-bit values for these four characters from Figure 28.6 (S=0x12, W=0x16, Y=0x18, and g=0x20) in binary:

 010010 010110 011000 100000 

Then regroup these 24 bits into three 8-bit bytes:

    01001001 01100110 00100000
     =0x49    =0x66    =0x20

which are the ASCII representations for I, f, and a space.
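The same regrouping can be checked with a short Python sketch. The function name decode_quad is hypothetical, and the sketch handles only a complete group of four characters with no padding. The alphabet string is the standard base-64 ordering (A-Z, a-z, 0-9, +, /).

```python
import base64
import string

# The 64 characters, indexed by their 6-bit value.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def decode_quad(quad: str) -> bytes:
    """Turn four base-64 characters into the three bytes they represent."""
    bits = 0
    for char in quad:
        bits = (bits << 6) | ALPHABET.index(char)  # append 6 bits at a time
    return bits.to_bytes(3, "big")  # regroup the 24 bits into three bytes


print([hex(ALPHABET.index(c)) for c in "SWYg"])
print(decode_quad("SWYg"))
print(base64.b64decode("SWYg"))  # the library routine gives the same bytes
```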

Body Changes: Multipurpose Internet Mail Extensions (MIME)

We've said that RFC 822 specifies the body as lines of NVT ASCII text, with no structure. RFC 1521 [Borenstein and Freed 1993] defines extensions that allow structure in the body. This is called MIME, for Multipurpose Internet Mail Extensions.

MIME does not require any of the extensions that we've described previously in this section (extended SMTP or non-ASCII headers). MIME just adds some new headers (in accordance with RFC 822) that tell the recipient the structure of the body. The body can still be transmitted using NVT ASCII, regardless of the mail contents. While some of the extensions we've just described might be nice to have along with MIME (the extended SMTP SIZE command, since MIME messages can become large, and non-ASCII headers), these extensions are not required by MIME. All that's required to exchange MIME messages with another party is for both ends to have a user agent that understands MIME. No changes are required in any of the MTAs.

MIME defines five new header fields:

    Mime-Version:
    Content-Type:
    Content-Transfer-Encoding:
    Content-ID:
    Content-Description:

As an example, the following two header lines can appear in an Internet mail message:

    Mime-Version: 1.0
    Content-Type: TEXT/PLAIN; charset=US-ASCII

The current MIME version is 1.0 and the content type is plain ASCII text, the default for Internet mail. The word PLAIN is considered a subtype of the content type ( TEXT ), and the string charset=US-ASCII is a parameter.
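The split into content type, subtype, and parameter can be seen by parsing these two header lines with Python's standard email package. This is only a sketch: the describe helper and the one-line body are hypothetical additions for illustration.

```python
from email import message_from_string

RAW = (
    "Mime-Version: 1.0\n"
    "Content-Type: TEXT/PLAIN; charset=US-ASCII\n"
    "\n"
    "A one-line body.\n"
)


def describe(raw: str) -> tuple:
    """Return (version, type, subtype, charset parameter) for a message."""
    msg = message_from_string(raw)
    return (
        msg["Mime-Version"],
        msg.get_content_maintype(),  # type names are matched case-insensitively
        msg.get_content_subtype(),
        msg.get_param("charset"),
    )


print(describe(RAW))
```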

Text is just one of MIME's seven defined content types. Figure 28.7 summarizes the 16 different content types and subtypes defined in RFC 1521. Numerous parameters can be specified for certain content types and subtypes.

Figure 28.7. MIME content types and subtypes.

The content type and the transfer encoding used for the body are independent. The former is specified by the Content-Type header field, and the latter by the Content-Transfer-Encoding header field. There are five different encoding formats defined in RFC 1521.

  1. 7bit, which is NVT ASCII, the default.

  2. quoted-printable, which we saw an example of earlier with non-ASCII headers. It is useful when only a small fraction of the characters have their eighth bit set.

  3. base64, which we showed in Figure 28.6.

  4. 8bit containing lines of characters, some of which are non-ASCII and have their eighth bit set.

  5. binary encoding, which is 8-bit data that need not contain lines.

Only the first three of these are valid for an RFC 821 MTA, since these three generate a body containing only NVT ASCII characters. Using extended SMTP with 8BITMIME support allows 8bit encoding to be used.

Although the content type and encoding are independent, RFC 1521 recommends quoted-printable for text with non-ASCII data, and base64 for image, audio, video, and octet-stream application data. This allows maximum interoperability with RFC 821 conformant MTAs. Also, the multipart and message content types must be encoded as 7bit.
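The recommendation above can be sketched as a small Python selection routine. The function names are hypothetical, and the rule is a simplification: it looks only at the content type and at whether any byte has its eighth bit set, and it ignores line-length limits. The standard quopri and base64 modules do the actual encoding.

```python
import base64
import quopri


def choose_encoding(content_type: str, data: bytes) -> str:
    """Pick a transfer encoding that any RFC 821 MTA can carry."""
    main_type = content_type.split("/", 1)[0].lower()
    if main_type in ("multipart", "message"):
        return "7bit"
    if main_type == "text":
        # Plain ASCII text needs no encoding; otherwise use quoted-printable.
        return "7bit" if all(b < 0x80 for b in data) else "quoted-printable"
    return "base64"  # image, audio, video, application data


def encode_body(content_type: str, data: bytes) -> tuple:
    """Return (Content-Transfer-Encoding value, encoded body)."""
    encoding = choose_encoding(content_type, data)
    if encoding == "quoted-printable":
        return encoding, quopri.encodestring(data)
    if encoding == "base64":
        return encoding, base64.encodebytes(data)
    return encoding, data


print(encode_body("text/plain", "caf\xe9".encode("iso-8859-1")))
print(encode_body("image/gif", b"GIF89a"))
```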

As an example of a multipart content type, Figure 28.8 shows a mail message from the RFC distribution list. The subtype is mixed, meaning each of the parts should be processed sequentially, and the boundary between the parts is the string NextPart, preceded by two hyphens at the start of a line.

Figure 28.8. Example of a MIME multipart message.

Each boundary can be followed with a line specifying the header fields for the next part. Everything in the message before the first boundary is ignored, as is everything following the final boundary.

Since a blank line follows the first boundary, and not header fields, the content type of the data between the first and second boundaries is assumed to be text/plain with a character set of us-ascii. This is a textual description of the new RFC.

The second boundary, however, is followed by header fields. It specifies another multipart message, with a boundary of OtherAccess. The subtype is alternative, and two different alternatives are present. The first OtherAccess alternative is to fetch the RFC using electronic mail, and the second is to fetch it using anonymous FTP. A MIME user agent would list the two alternatives, allow us to choose one, and then automatically fetch a copy of the RFC using either mail or anonymous FTP.
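The nesting described here can be explored with Python's standard email package. The message below is a hypothetical stand-in with the same shape as the one discussed (a mixed multipart whose second part is an alternative multipart). It is not a copy of Figure 28.8, and the two alternatives are reduced to plain-text placeholders.

```python
from email import message_from_string

MESSAGE = """\
Mime-Version: 1.0
Content-Type: Multipart/Mixed; Boundary="NextPart"

This text before the first boundary is ignored.
--NextPart

A new RFC is available (placeholder description).

--NextPart
Content-Type: Multipart/Alternative; Boundary="OtherAccess"

--OtherAccess
Content-Type: text/plain

Placeholder: fetch the RFC by electronic mail.

--OtherAccess
Content-Type: text/plain

Placeholder: fetch the RFC by anonymous FTP.

--OtherAccess--

--NextPart--
This text after the final boundary is ignored.
"""


def outline(part, depth: int = 0) -> list:
    """List the content type of a part and, indented, of its subparts."""
    lines = ["  " * depth + part.get_content_type()]
    if part.is_multipart():
        for child in part.get_payload():
            lines.extend(outline(child, depth + 1))
    return lines


print("\n".join(outline(message_from_string(MESSAGE))))
```

The first part has no header fields of its own, so the parser reports the default content type, text/plain, for it.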

This section has been a brief overview of MIME. For additional details and examples of MIME, see RFC 1521 and [Rose 1993].



TCP/IP Illustrated, Vol. 1: The Protocols (Addison-Wesley Professional Computing Series)
ISBN: 0201633469
Year: 1993
Pages: 378
