First, I have to place a disclaimer here: this chapter will provide you with the necessary ideas and walk you through the process of designing and developing a specialized transport channel. It isn't the objective of this chapter to give you a commercial-grade SMTP/POP3 channel. Even though you could just use this channel in your application, you'll need to understand it fully before doing so, because neither the author nor the publisher will provide any support when you include this channel in your applications.
Okay, now let's start with the implementation of this channel. Before you do, however, you have to know the protocols (or at least the relevant parts of them) that you are going to use. Every Internet protocol is described in so-called request-for-comments (RFC) documents. The SMTP protocol in its most recent version is specified in RFC2821 and POP3 in RFC1939. You can get them at http://www.ietf.org/rfc/rfc2821.txt and http://www.ietf.org/rfc/rfc1939.txt.
You can generally search for any RFC at http://www.faqs.org/rfcs/index.html, but you should keep in mind that normally RFCs are identified by the full protocol name and not the more common acronym. For example, you'd have to search for "Simple Mail Transfer Protocol", not "SMTP", to find the mentioned RFCs. If you don't know the real protocol name, you can search for the abbreviation on http://www.webopedia.com/.
Generally the transfer of e-mails is split between two protocols: SMTP and POP3. SMTP is used to send an e-mail from your mail client to a mail server and is then used for communication between mail servers until the e-mail reaches its destination mailbox. POP3 is used to retrieve e-mails from a mailbox on the server to your e-mail client.
To save you from having to read through the whole RFC2821, I provide here a short summary of the relevant parts needed to implement this channel. First, SMTP uses a request/reply syntax normally sent over a TCP connection to port 25 on the server. The client can send a list of commands, and the server replies to them with a status code and a message. The status code is a three-digit number of which the first digit specifies the class. These are shown in Table 10-1. The message that might be sent after the status code is not standardized and should be ignored or used only for reporting errors back to the user.
Table 10-1

| Status code | Meaning |
|-------------|---------|
| 2xx | Positive response. Command accepted and executed. |
| 3xx | Intermediate or transient positive. Command accepted, but more information is needed. In plain English, this means the client can now start to transfer the mail (if received as a reply to the DATA command). |
| 4xx | Transient error. Try again later. |
| 5xx | Permanent error. You quite certainly did something wrong! |
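As a minimal sketch of how a client might act on these classes, the following Python function splits the three-digit code off a reply line and maps its first digit to a class. It is written in Python purely for brevity and is not part of the channel implementation; the function name `classify_smtp_reply` and the class labels are illustrative assumptions.

```python
from typing import Tuple

# Class labels are illustrative; only the first digit is significant.
REPLY_CLASSES = {
    "2": "positive",
    "3": "intermediate",
    "4": "transient error",
    "5": "permanent error",
}


def classify_smtp_reply(line: str) -> Tuple[int, str]:
    """Return (status_code, reply_class) for a single SMTP reply line."""
    code_text = line[:3]
    if len(code_text) != 3 or not (code_text.isascii() and code_text.isdigit()):
        raise ValueError(f"not an SMTP reply line: {line!r}")
    # The free-text message after the code is deliberately ignored.
    return int(code_text), REPLY_CLASSES.get(code_text[0], "unknown")
```

A caller would typically retry later on a transient error and give up on a permanent one, without ever inspecting the message text.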
A successful SMTP conversation can therefore look like the following trace. (The → symbol indicates that the client sends this line and the ← symbol indicates the server's reply.) You can easily try this out yourself by entering telnet <mailserver> 25, but be aware that the commands you type might not be echoed back to you.

← 220 MyRemotingServer Mercury/32 v3.21c ESMTP server ready.
→ HELO localhost
← 250 MyRemotingServer Hi there, localhost.
→ MAIL FROM: <client_1@localhost>
← 250 Sender OK - send RCPTs.
→ RCPT TO: <server_1@localhost>
← 250 Recipient OK - send RCPT or DATA.
→ DATA
← 354 OK, send data, end with CRLF.CRLF
→ <sending message contents inclusive headers>
→ sending <CR><LF>.<CR><LF> (i.e. a "dot" between two CR/LFs)
← 250 Data received OK.
→ QUIT
← 221 MyRemotingServer Service closing channel.
As you can see here, several commands can be sent by the client. At the beginning of the session, after the server announces itself with 220 servername message, the client will send HELO hostname. This starts the SMTP session, and the server responds with 250 servername message.
For each e-mail the client wants to send via this server, the following process takes place. First, the client starts with MAIL FROM: <e-mail address> (note that the e-mail address has to be enclosed in angle brackets). The server replies with 250 message if it allows e-mails from this sender; otherwise, it replies with a 4xx or 5xx status code. The client then sends one or more RCPT TO: <e-mail address> commands that designate the recipients of this e-mail and that are also confirmed by 250 message replies.
As soon as all recipients have been specified, the client sends DATA to notify the server that it's going to send the e-mail's content. The server replies with 354 message and expects the client to send the e-mail and finish with "." (dot) on a single new line (that is, the client sends <CR><LF><DOT><CR><LF>). The server then acknowledges the e-mail by replying with 250 message.
At this point the client can send further e-mails by issuing MAIL FROM or can terminate the session with the QUIT command, which will be confirmed by the server via a 221 message reply. The server will then also close the TCP connection. Sometime after the message is sent, it will be placed in a user's mailbox from where it can be retrieved by the POP3 protocol.
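The command sequence described above can be sketched as a list of (command, expected status code) pairs that is played against a connection. This is only an illustration in Python, not the channel's code: the names `build_smtp_steps` and `run_smtp_steps` are hypothetical, the `send` and `receive` callables stand in for a real TCP socket, and the sketch assumes that no line of the message consists of a single dot.

```python
from typing import Callable, List, Tuple

CRLF = "\r\n"


def build_smtp_steps(client_host: str, sender: str,
                     recipients: List[str], message: str) -> List[Tuple[str, int]]:
    """Return the commands for one e-mail, each with the reply code to expect."""
    # Spacing after "MAIL FROM:" and "RCPT TO:" follows the trace shown earlier.
    steps = [(f"HELO {client_host}", 250), (f"MAIL FROM: <{sender}>", 250)]
    steps += [(f"RCPT TO: <{rcpt}>", 250) for rcpt in recipients]
    steps.append(("DATA", 354))
    # Message text, then a dot on its own line; the final CRLF is added on send.
    steps.append((message + CRLF + ".", 250))
    steps.append(("QUIT", 221))
    return steps


def expect(reply: str, code: int) -> None:
    """Raise if the reply does not start with the expected status code."""
    if not reply.startswith(str(code)):
        raise RuntimeError(f"expected {code}, got {reply!r}")


def run_smtp_steps(steps: List[Tuple[str, int]],
                   send: Callable[[str], None],
                   receive: Callable[[], str]) -> None:
    """Play the steps: wait for the 220 greeting, then send and check each reply."""
    expect(receive(), 220)
    for command, code in steps:
        send(command + CRLF)
        expect(receive(), code)
```

Keeping the expected code next to each command makes it easy to stop at the first 4xx or 5xx reply instead of continuing blindly.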
POP3 works in much the same way as SMTP: it's also a request/response protocol. POP3 messages are generally sent over a TCP connection to port 110. Instead of the status codes SMTP relies on, POP3 supports three kinds of replies. The first two are +OK message to indicate success and -ERR message to indicate failure. The messages after the status indicator aren't standardized, and as such should be used only to report errors to your user and not be parsed by a program.
Another type of reply is a content reply, which is used whenever you request information from the server that might span multiple lines. In this case the server will indicate the end of the response with the same <CR><LF><DOT><CR><LF> sequence that is used by SMTP to end the transfer of the e-mail text.
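Reading such a content reply boils down to collecting lines until a line that contains only a dot arrives. The following Python sketch shows one way to do this; the name `read_multiline` is hypothetical and the input is any iterable of received lines rather than a real socket. One detail comes from RFC1939 rather than from the description above: a content line that itself starts with a dot is sent with an extra leading dot, which the reader has to strip.

```python
from typing import Iterable, List


def read_multiline(lines: Iterable[str]) -> List[str]:
    """Collect the lines of a multi-line POP3 reply up to the terminating dot."""
    content = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == ".":
            # <CR><LF><DOT><CR><LF> reached: the reply is complete.
            return content
        if line.startswith("."):
            # Undo the byte-stuffing described in RFC1939.
            line = line[1:]
        content.append(line)
    raise ConnectionError("reply ended before the terminating '.' line")
```

The caller is expected to have consumed the initial +OK line already, so only the content lines are passed in.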
A sample POP3 session might look like this. To start it, enter telnet <mailserver> 110.
← +OK <1702038211.21388@vienna01>, MercuryP/32 v3.21c ready.
→ USER server_1
← +OK server_1 is known here.
→ PASS server_1
← +OK Welcome! 1 messages (231 bytes)
→ LIST
← +OK 1 messages, 3456 bytes
← 1 231
← .
→ RETR 1
← +OK Here it comes...
← <e-mail text>
← .
→ DELE 1
← +OK Message deleted.
→ QUIT
← +OK vienna01 Server closing down.

As you can see in this connection trace, the client authenticates itself at the server by sending the commands USER username and PASS password. In this case the password is transported in clear text over the network. Most POP3 servers also support an APOP command, which is based on an MD5 digest that incorporates a server-side timestamp to prevent replay attacks. You can read more about this in RFC1939.
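For orientation, here is a small Python sketch of the APOP calculation as RFC1939 defines it: the client takes the timestamp from the server's greeting (including the angle brackets), appends the shared secret, and sends the MD5 digest as lowercase hexadecimal. The function names are hypothetical, and the sketch assumes the greeting contains exactly one bracketed timestamp.

```python
import hashlib
import re


def extract_timestamp(greeting: str) -> str:
    """Return the <...> timestamp from the server's +OK greeting."""
    match = re.search(r"<[^<>]+>", greeting)
    if match is None:
        raise ValueError("greeting carries no APOP timestamp")
    return match.group(0)


def apop_digest(timestamp: str, password: str) -> str:
    """MD5 over timestamp plus shared secret, as 32 lowercase hex digits."""
    return hashlib.md5((timestamp + password).encode("ascii")).hexdigest()


def apop_command(user: str, timestamp: str, password: str) -> str:
    """Build the APOP line that replaces the USER/PASS pair."""
    return f"APOP {user} {apop_digest(timestamp, password)}"
```

Because the timestamp changes with every connection, a captured digest is useless for a later session.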
The server then replies with a message like +OK Welcome! 1 messages (231 bytes). You should never try to parse this reply to obtain the message count; instead, either send the command STAT, which will be answered by +OK number_of_messages total_bytes, or issue a LIST command, which will first return +OK message and then return one line per message in the form message_number bytes. The LIST reply concludes with a <CR><LF><DOT><CR><LF> sequence.
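The two standardized formats can be parsed with a few lines of code. The Python sketch below assumes that the status line of STAT and the content lines of LIST (without the +OK line and without the terminating dot) have already been read; `parse_stat` and `parse_list` are hypothetical helper names.

```python
from typing import Dict, List, Tuple


def parse_stat(reply: str) -> Tuple[int, int]:
    """Parse '+OK number_of_messages total_bytes' into two integers."""
    parts = reply.split()
    if len(parts) < 3 or parts[0] != "+OK":
        raise ValueError(f"unexpected STAT reply: {reply!r}")
    return int(parts[1]), int(parts[2])


def parse_list(content_lines: List[str]) -> Dict[int, int]:
    """Map each message number to its size from 'message_number bytes' lines."""
    sizes = {}
    for line in content_lines:
        number, size = line.split()[:2]
        sizes[int(number)] = int(size)
    return sizes
```

For the session shown earlier, the single LIST line 1 231 would yield a dictionary with message 1 mapped to 231 bytes.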
The client can then issue a RETR message_number command to retrieve the content of a specific message (with a final <CR><LF><DOT><CR><LF> as well) and a DELE message_number command to delete it at the server. The deletion is carried out only after the client sends QUIT to the server, which then closes the connection.
After reading the last few pages, you are almost fully equipped to start with the design of your channel. The last thing you need to know before you can start to write some code is what the resulting e-mail has to look like.
Because there's no standard for the binding of .NET Remoting to custom transfer protocols, I chose Simon Fell's recommendation for SOAP binding to SMTP as the target specification for this implementation. You can find the latest version of this document at http://www.pocketsoap.com/specs/smtpbinding/. In essence, it says that the content has to be either Base64 or Quoted-Printable encoded and that the e-mail needs to supply a given set of headers. So, what does this mean? Essentially, the e-mail format we know today was designed ages ago, when memory was expensive, WAN links were slow, and computers liked to deal with 7-bit ASCII characters.
Nowadays we instead use Unicode, which allows us to have huge numbers of characters so that even languages like Japanese, Chinese, or Korean can be encoded. This is, of course, far from 7 bit, so you have to find a way to bring such data back to a 7-bit format of "printable characters." Base64 does this for you; it is described in detail in section 5.2 of RFC1521, available at http://www.ietf.org/rfc/rfc1521.txt. To encode a given byte array with the Base64 algorithm, you can use Convert.ToBase64String().
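To make the effect of Base64 visible, here is a short Python sketch that uses the standard `base64` module as a stand-in for the .NET call mentioned above. The function name `encode_body` is hypothetical, and breaking the output into lines of at most 76 characters is what Python's `encodebytes` does by default, not something the text above prescribes.

```python
import base64


def encode_body(payload: bytes) -> str:
    """Base64-encode a payload into 7-bit text with CRLF line breaks."""
    # encodebytes() inserts "\n" after every 76 output characters.
    wrapped = base64.encodebytes(payload).decode("ascii")
    return wrapped.replace("\n", "\r\n")


def decode_body(text: str) -> bytes:
    """Reverse encode_body(); line breaks are ignored by the decoder."""
    return base64.b64decode(text)
```

Encoding a UTF-8 string that contains Japanese characters, for example, produces output consisting only of letters, digits, +, /, and =, so it survives a 7-bit mail transport unchanged.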
An e-mail contains not only the body, but also header information. For a sample SOAP-via-SMTP request, the complete e-mail might look like this:
From: client_1@localhost
To: server_1@localhost
Message-Id: <26fc4f4cd8de4567a66ccea6897dc481@REMOTING>
MIME-Version: 1.0
Content-Type: text/xml; charset=utf-8
Content-Transfer-Encoding: BASE64

...encoded SOAP request here...
To match a response message to the correct request, the value of the Message-Id header will be included in the In-Reply-To header of the response message:
From: server_1@localhost
To: client_1@localhost
Message-Id: <97809278530983552398576545869067@REMOTING>
In-Reply-To: <26fc4f4cd8de4567a66ccea6897dc481@REMOTING>
MIME-Version: 1.0
Content-Type: text/xml; charset=utf-8
Content-Transfer-Encoding: BASE64

...encoded SOAP response here...
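The matching of responses to requests can be sketched as a table of pending Message-Id values that is consulted whenever a mail arrives. The Python below is an illustration only: the class `PendingRequests`, its methods, and the use of a random 32-digit hexadecimal value for the Message-Id are assumptions, and the mail is parsed with Python's standard `email` package.

```python
import uuid
from email import message_from_string
from typing import Callable, Dict


def new_message_id() -> str:
    """Create a unique id in the style of the samples, e.g. <...@REMOTING>."""
    return f"<{uuid.uuid4().hex}@REMOTING>"


class PendingRequests:
    """Remembers sent requests until a mail with a matching In-Reply-To arrives."""

    def __init__(self) -> None:
        self._waiting: Dict[str, Callable[[str], None]] = {}

    def register(self, message_id: str, on_reply: Callable[[str], None]) -> None:
        self._waiting[message_id] = on_reply

    def dispatch(self, raw_mail: str) -> bool:
        """Hand the mail body to the waiting caller; False if nobody waits for it."""
        mail = message_from_string(raw_mail)
        request_id = mail["In-Reply-To"]
        on_reply = self._waiting.pop(request_id, None) if request_id else None
        if on_reply is None:
            return False
        on_reply(mail.get_payload())
        return True
```

Removing the entry on dispatch means that a duplicate or stray reply is simply reported as unmatched instead of being delivered twice.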
You also need to include some special headers that are taken from the ITransportHeaders object of the .NET Remoting request. Those will be preceded by X-REMOTING- so that a complete Remoting request e-mail might look like this:
From: client_1@localhost
To: server_1@localhost
Message-Id: <26fc4f4cd8de4567a66ccea6897dc481@REMOTING>
MIME-Version: 1.0
Content-Type: text/xml; charset=utf-8
Content-Transfer-Encoding: BASE64
X-REMOTING-Content-Type: text/xml; charset="utf-8"
X-REMOTING-SOAPAction: "http://schemas.microsoft.com/clr/ns/System.Runtime.Remoting.Activation.IActivator#Activate"
X-REMOTING-URI: /RemoteActivationService.rem

...encoded .NET Remoting request here...
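The mapping between transport headers and X-REMOTING- mail headers works in both directions, which the following Python sketch illustrates. A plain dictionary stands in for the ITransportHeaders object, the function names are hypothetical, and the simple parser assumes that every header fits on one line.

```python
from typing import Dict

PREFIX = "X-REMOTING-"
CRLF = "\r\n"


def build_remoting_mail(sender: str, recipient: str, message_id: str,
                        transport_headers: Dict[str, str], encoded_body: str) -> str:
    """Compose the mail text: fixed headers, prefixed transport headers, body."""
    headers = [
        ("From", sender),
        ("To", recipient),
        ("Message-Id", message_id),
        ("MIME-Version", "1.0"),
        ("Content-Type", "text/xml; charset=utf-8"),
        ("Content-Transfer-Encoding", "BASE64"),
    ]
    headers += [(PREFIX + name, value) for name, value in transport_headers.items()]
    head = CRLF.join(f"{name}: {value}" for name, value in headers)
    # An empty line separates the headers from the body.
    return head + CRLF + CRLF + encoded_body


def extract_transport_headers(mail_text: str) -> Dict[str, str]:
    """Recover the transport headers by stripping the X-REMOTING- prefix."""
    head = mail_text.split(CRLF + CRLF, 1)[0]
    result = {}
    for line in head.split(CRLF):
        name, _, value = line.partition(": ")
        if name.upper().startswith(PREFIX):
            result[name[len(PREFIX):]] = value
    return result
```

The receiving side can thus rebuild the original header collection before it hands the decoded body to the formatter.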
The encoded .NET Remoting request itself can be based on either the binary formatter or SOAP!