Chapter 29: Network Programming | Professional VB 2005 with .NET 3.0 (Programmer to Programmer)

Just as it is difficult to live your life without talking with people, your applications also need to communicate, perhaps with other programs or perhaps with hardware devices. As shown throughout this book, you can use a variety of techniques to have your program communicate, including .NET Remoting, Web Services, and Enterprise Services. This chapter looks at yet another way to communicate: using the basic protocols that the Internet and many networks have been built on. You will learn how the classes in System.Net can provide a variety of techniques for communicating with existing applications such as Web or FTP servers, and how you can use them to create your own communication applications.

Before you start writing applications using these classes, however, it would be good to get some background on how networks are bolted together, and how machines and applications are identified.

Protocols, Addresses, and Ports

No discussion of a network is complete without a huge number of acronyms, seemingly random numbers, and the idea of a protocol. For example, the World Wide Web runs using a protocol called HTTP, or HyperText Transfer Protocol. Similarly, there is File Transfer Protocol (FTP), Network News Transfer Protocol (NNTP), and Gopher, also a protocol. Each application you run on a network communicates with another program using a defined protocol. The protocol is simply the expected messages each program sends the other, in the order they should be sent. For a real-world example, if you want to go see a movie with a friend, a simplified conversation could look like this:

  You: Dials phone
  Friend: Hears phone ringing, answers phone, "Hello"
  You: "Hello. Want to go see 'Freddie and Jason Escape from New York, part 6?'"
  Friend: "No, I saw that one already. What about 'Star Warthogs?'"
  You: "OK, 9:30 showing downtown?"
  Friend: "Yes"
  You: "Later"
  Friend: "See you," hangs up

Apart from a bad taste in movies, you can see a basic protocol here. Someone initiates a communication channel. The recipient accepts the channel and signals the start of the communication. The initial caller then sends a series of messages to which the recipient replies, either to signify they have been received or as either a positive or negative response. Finally, one of the messages indicates the end of the communication channel, and the two disconnect.

Similarly, network applications have their own protocols defined by the application writer. For example, sending an e-mail using SMTP (Simple Mail Transfer Protocol) could look like this:

  220 schroedinger Microsoft ESMTP MAIL Service, Version: 6.0.2600.2180 ready at Wed, 6 Oct 2004 15:58:28 -0700
  HELLO
  250 schroedinger Hello [127.0.0.1]
  FOO
  500 5.3.3 Unrecognized command
  MAIL FROM: me
  250 2.1.0 me@schroedinger....Sender OK
  RCPT TO: him
  250 2.1.5 him@schroedinger
  DATA
  354 Start mail input; end with <CRLF>.<CRLF>
  subject: Testing SMTP
  Hello World, via mail.
  .
  250 2.6.0 <SCHROEDINGERKaq65r500000001@schroedinger> Queued mail for delivery
  QUIT
  221 2.0.0 schroedinger Service closing transmission channel
  Connection to host lost.

In this case, lines beginning with numbers are coming from the server, while the items in uppercase (and the message itself) were sent from the client. If the client sends an invalid message (as in the “FOO” message in the preceding code example), it receives a gentle rebuff from the server, whereas correct messages receive an “OK” or “Go on” reply. Traditionally, for SMTP and many other protocols, the reply is a three-digit number (see the following table) identifying the response (the text after the number, such as “2.1.0 me@schroedinger...Sender OK,” isn’t really needed). The return values for the services generally fall into one of five ranges. Each range identifies a certain family of responses:


Range	Description
100−199	Message is good, but the server is still working on the request
200−299	Message is good, and the server has completed acting on the request
300−399	Message is good, but the server needs more information to work on the request
400−499	Message is good, but the server could not act on the request. You may try the request again to see if it works in the future.
500−599	The server could not act on the request. Either the message was bad or an error occurred. It likely won’t work next time.
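The ranges in the table can be turned into a small lookup. The following is a minimal sketch in Python (used here only for illustration; it is not the Visual Basic code the chapter develops). The names `parse_reply`, `classify_reply`, and `REPLY_FAMILIES` are hypothetical, and the family descriptions are shortened paraphrases of the table.

```python
# Illustrative only: map a three-digit reply code to its family from the table.
# All names here are hypothetical, not part of any library.

REPLY_FAMILIES = {
    1: "good, still working",
    2: "good, completed",
    3: "good, more information needed",
    4: "could not act, retry may work",
    5: "could not act, retry unlikely to work",
}


def parse_reply(line):
    """Split a reply line such as '250 2.1.0 Sender OK' into (code, text)."""
    code_text, _, rest = line.strip().partition(" ")
    code = int(code_text)  # raises ValueError if the line has no numeric code
    return code, rest


def classify_reply(code):
    """Return the family description for a reply code from 100 to 599."""
    if not 100 <= code <= 599:
        raise ValueError(f"reply code out of range: {code}")
    return REPLY_FAMILIES[code // 100]


if __name__ == "__main__":
    # Reply lines taken from the SMTP session shown earlier.
    for line in ("250 2.1.0 me@schroedinger....Sender OK",
                 "354 Start mail input; end with <CRLF>.<CRLF>",
                 "500 5.3.3 Unrecognized command"):
        code, _text = parse_reply(line)
        print(code, "->", classify_reply(code))
```

A client written this way only needs the first digit to decide whether to continue, send more data, retry later, or give up; the text after the number is for human readers.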

Other protocols use this technique as well (leading to the infamous HTTP 404 error for “Page not found”), but they don’t have to. Having a good reference is key to your success, and the best reference for an existing protocol is its Request for Comments (RFC). These are the definitions that protocol authors use to create their implementation of the standard. The RFCs are available online from the IETF (www.ietf.org), and related Web standards are published by the World Wide Web Consortium (www.w3.org).

Addresses and Names

The next important topic necessary to a thorough understanding of network programming is the relationship between the names and addresses of each of the computers involved. Each kind of network (TCP/IP networks such as the Internet, for example) has its own way of mapping the name of a computer (or host) to an address. The reason for this is simple: Computers deal with numbers better than text, and humans can remember text better than numbers (generally). Therefore, while you may have named your computer something clever like “l33t_#4x0R,” applications and other computers know it by its IP (Internet Protocol) address. This address is a 32-bit value, usually written in four parts (each part is a byte, shown as a number from 0 to 255), such as 192.168.1.39. This is the standard the Internet has operated on for many years. However, because only about four billion unique addresses can be generated using this method, another standard, IPv6, has been introduced. It is called IPv6 because it is version 6 of the Internet Protocol (the older 32-bit addresses are often called IPv4 to differentiate them from this new standard). With IPv6, a 128-bit address is used, leading to a maximum of about 3.4 × 10^38 unique addresses, more than enough for every Internet-enabled toaster.
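To make the "four bytes in one 32-bit value" idea concrete, here is a minimal Python sketch (Python is used only for illustration, not as the chapter's Visual Basic). The helper names `ipv4_to_int` and `int_to_ipv4` are hypothetical; the standard `ipaddress` module is used only to cross-check the hand-written conversion.

```python
import ipaddress


def ipv4_to_int(text):
    """Pack a dotted address such as '192.168.1.39' into one 32-bit integer."""
    parts = [int(p) for p in text.split(".")]
    if len(parts) != 4 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError(f"not a dotted IPv4 address: {text!r}")
    value = 0
    for part in parts:
        value = (value << 8) | part  # shift in one byte at a time
    return value


def int_to_ipv4(value):
    """Unpack a 32-bit integer back into dotted notation."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


if __name__ == "__main__":
    packed = ipv4_to_int("192.168.1.39")
    print(packed, int_to_ipv4(packed))
    # The standard library agrees with the manual conversion.
    print(packed == int(ipaddress.IPv4Address("192.168.1.39")))
    # Size of each address space: 32 bits versus 128 bits.
    print(2 ** 32, format(2 ** 128, ".2e"))
```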

This IP address (whether IPv4 or IPv6) must uniquely identify each host on a network (actually a subnetwork, as you’ll see shortly). If not, messages cannot be routed to their destination properly, and chaos ensues. The matter gets more complicated when another 32-bit number, the subnet mask, is brought into the picture. This is a value that is masked (using a bitwise AND operation) over the address to identify the subnetwork of the network that the computer is on. All addresses on the same subnetwork must be unique. Hosts on two different subnetworks may share the same host portion of the address, however, as long as the subnetwork portions differ.

Many common subnetworks use the value 255.255.255.0 for the subnet mask. When this is applied to the network address (see below), the first three bytes identify the subnetwork and only the last byte distinguishes one host from another. Therefore, the subnetwork can include only 254 unique host addresses (0 and 255 are reserved for the subnetwork itself and for broadcasts, respectively).

  Network address:        192.168.  1.107
  Subnet Mask:            255.255.255.  0
  Result:                 192.168.  1.  0
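The masking shown above is a bitwise AND of the two 32-bit values. A minimal Python sketch follows (for illustration only; the function names `network_of` and `same_subnet` are hypothetical, while `ipaddress` is Python's standard module).

```python
import ipaddress


def network_of(address, mask):
    """AND an IPv4 address with a subnet mask and return the result as text."""
    addr_bits = int(ipaddress.IPv4Address(address))
    mask_bits = int(ipaddress.IPv4Address(mask))
    return str(ipaddress.IPv4Address(addr_bits & mask_bits))


def same_subnet(first, second, mask):
    """Two hosts share a subnetwork when their masked addresses match."""
    return network_of(first, mask) == network_of(second, mask)


if __name__ == "__main__":
    mask = "255.255.255.0"
    print(network_of("192.168.1.107", mask))
    print(same_subnet("192.168.1.107", "192.168.1.39", mask))
    print(same_subnet("192.168.1.107", "192.168.2.39", mask))
    # 256 addresses in the block, minus the two reserved ones (0 and 255).
    block = ipaddress.ip_network("192.168.1.0/255.255.255.0")
    print(block.num_addresses - 2)
```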

Because computers and humans use two different means of identifying computers, there must obviously be some way for the two to be related. The term for this process is name resolution. In the case of the Internet, a common means of name resolution is yet another protocol, the Domain Name System (DNS). A computer, when faced with an unknown text-based name, sends a message to the closest DNS server. That server then determines whether it knows the IP address of that host. If it does, then it passes the address back to the requester. If not, then it asks another DNS server. This process continues until either the IP address is found or you run out of DNS servers. Once the IP address is found, all the servers (and the original computer) store that number for a while in case they are asked again.

Keeping in mind the problems that can ensue during name resolution can often solve many development problems. For example, if you are having difficulty communicating with a computer that should be responding, then it may be because your computer simply can’t resolve the name of the remote computer. Try using the IP address instead. This removes any name-resolution problems from the equation, and may allow you to continue developing while someone else fixes the name-resolution problem.
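A quick way to check whether name resolution is the culprit is to ask the operating system's resolver directly. The sketch below is in Python for illustration only; `resolve` is a hypothetical helper, and `socket.getaddrinfo` is the standard-library call it wraps. A name that cannot be resolved yields an empty list, while a literal IP address comes back unchanged because no lookup is needed.

```python
import socket


def resolve(name):
    """Return the sorted unique IP addresses for a name, or [] if unresolved."""
    try:
        infos = socket.getaddrinfo(name, None)
    except socket.gaierror:
        return []  # the resolver could not map the name to an address
    # Each entry ends with a socket address whose first item is the IP text.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    print(resolve("localhost"))   # resolved by the local machine
    print(resolve("127.0.0.1"))   # already an address, no lookup required
```

If `resolve` returns an empty list for the remote computer's name but connecting by IP address works, the problem lies in name resolution rather than in your application.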

Ports: They’re Not Just for Ships

As the previous sections described, each computer or host on a network is uniquely identified by an address. How does your computer determine which running application, possibly one of many, is meant to receive a given message arriving on the network? This is determined by the port the message is targeted at. The port is another number, in this case a 16-bit integer value from 1 to 65,535. The unique combination of address and port identifies the target application.

For example, assume you currently have a Web server (IIS) running, as well as an SMTP server, and a few browser windows open. If a network message comes in, how does the operating system “know” which of these applications should receive the packet? Each of the applications (either client or server) that may receive a message is assigned a unique port number. In the case of servers, this is typically a fixed number, whereas client applications, such as your Web browser, are assigned a random available port.

To make communication with servers easier, they typically use a well-known assigned port. In the case of Web servers, this is port 80, while SMTP servers use port 25. You can see a list of common servers and their ports in the file %SystemRoot%\system32\drivers\etc\services.
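The same services list can be queried from code. This is a minimal Python sketch for illustration only: `socket.getservbyname` is a standard-library call that consults the system's services database, while `well_known_port` and `FALLBACK_PORTS` are hypothetical names, with the fallback table added as an assumption for machines where that database is missing.

```python
import socket

# Hypothetical fallback, used only when the local services database has no
# entry; the two values are the ports named in the text.
FALLBACK_PORTS = {"http": 80, "smtp": 25}


def well_known_port(service, protocol="tcp"):
    """Look up the port assigned to a named service."""
    try:
        return socket.getservbyname(service, protocol)
    except OSError:
        # Raises KeyError if the service is unknown here as well.
        return FALLBACK_PORTS[service]


if __name__ == "__main__":
    for name in ("http", "smtp"):
        print(name, well_known_port(name))
```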

If you’re writing a server application, you can either use these common port numbers (and you should if you’re attempting to write a common type of server) or choose your own. If you’re writing a new type of server, choose a port that has not been assigned to another server; choosing a port higher than 1024 keeps you out of the well-known range, although many higher ports are registered to other applications, so check that your choice is not already in use. When writing a client application, it isn’t usually necessary to assign a port, as a dynamic port is assigned to the client for communication with a server.
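The difference between a server's chosen port and a client's dynamic port can be observed on a single machine. The following Python sketch is for illustration only and makes one loud assumption: the server binds to port 0, which asks the operating system for any free port so the example cannot collide with a running service; a real server would pass its fixed port number instead. The function name `demo_ports` is hypothetical.

```python
import socket


def demo_ports():
    """Connect a client to a loopback server and return the ports involved."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        # Assumption for this sketch: port 0 lets the OS pick a free port.
        server.bind(("127.0.0.1", 0))
        server.listen(1)
        server_port = server.getsockname()[1]

        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
            # The client never chooses a port; the OS assigns one on connect.
            client.connect(("127.0.0.1", server_port))
            client_port = client.getsockname()[1]

            conn, peer = server.accept()
            with conn:
                port_seen_by_server = peer[1]

    return server_port, client_port, port_seen_by_server


if __name__ == "__main__":
    server_port, client_port, seen = demo_ports()
    print("server listens on", server_port)
    print("client was assigned", client_port)
    print("server sees client port", seen)
```

The server learns the client's dynamically assigned port from the incoming connection, which is how its replies find their way back to the right application.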

Tip

Ports below 1024 should be considered secure ports, and applications that use them should have administrative access.

Firewalls: Can’t Live with Them, Can’t Live without Them

Many people have a love/hate relationship with firewalls. While they are invaluable in today’s networks, sometimes it would be nice if they got out of the way. A firewall is a piece of hardware or software that monitors network traffic, either incoming, outgoing, or both. Firewalls can be configured to allow only particular ports or applications to transmit information beyond the firewall. They protect against hackers or viruses that may attempt to connect to open ports and exploit them for their own ends. They also protect against spyware applications that may attempt to communicate from your machine.

They also “protect” against any network programming you may attempt to do. You must invariably cooperate with your network administrators, working within their guidelines for network access. If they make only certain ports available, then your applications should use only those ports. Alternatively, you may be able to get them to configure the firewalls involved to permit the ports needed by your applications. Thankfully, creating network messages is a bit easier with Visual Basic 2005. The following sections demonstrate how.