4.1 Output Streams | Java Network Programming, Third Edition

Java's basic output class is java.io.OutputStream:

 public abstract class OutputStream

This class provides the fundamental methods needed to write data. These are:

 public abstract void write(int b) throws IOException
 public void write(byte[] data) throws IOException
 public void write(byte[] data, int offset, int length)
   throws IOException
 public void flush() throws IOException
 public void close() throws IOException

Subclasses of OutputStream use these methods to write data onto particular media. For instance, a FileOutputStream uses these methods to write data into a file. A TelnetOutputStream uses these methods to write data onto a network connection. A ByteArrayOutputStream uses these methods to write data into an expandable byte array. But whichever medium you're writing to, you mostly use only these same five methods.

Sometimes you may not even know exactly what kind of stream you're writing onto. For instance, you won't find TelnetOutputStream in the Java class library documentation. It's deliberately hidden inside the sun packages. It's returned by various methods in various classes in java.net, like the getOutputStream() method of java.net.Socket. However, these methods are declared to return only OutputStream, not the more specific subclass TelnetOutputStream. That's the power of polymorphism. If you know how to use the superclass, you know how to use all the subclasses, too.
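As a minimal sketch of this idea, the following class writes the same bytes to two different kinds of stream through a single method that knows only about OutputStream. The class name PolymorphicWrite and the helper writeGreeting() are hypothetical names chosen for illustration; they are not part of the Java class library.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

class PolymorphicWrite {

    // Hypothetical helper: it sees only the OutputStream superclass,
    // so it works with whatever subclass the caller supplies.
    static void writeGreeting(OutputStream out) throws IOException {
        byte[] data = "hello\r\n".getBytes(StandardCharsets.US_ASCII);
        out.write(data);
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream memory = new ByteArrayOutputStream();
        writeGreeting(memory);      // stored in an expandable byte array
        writeGreeting(System.out);  // System.out is a PrintStream, also an OutputStream
        System.out.println(memory.size() + " bytes captured in memory");
    }
}
```

The helper never needs to change when a new kind of stream comes along, which is the practical payoff of declaring parameters and return types as plain OutputStream.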

OutputStream's fundamental method is write(int b). This method takes an integer from 0 to 255 as an argument and writes the corresponding byte to the output stream. This method is declared abstract because each subclass must implement it to handle its particular medium. For instance, a ByteArrayOutputStream can implement this method with pure Java code that copies the byte into its array. However, a FileOutputStream will need to use native code that understands how to write data in files on the host platform.

Take note that although this method takes an int as an argument, it actually writes an unsigned byte. Java doesn't have an unsigned byte data type, so an int has to be used here instead. The only real difference between an unsigned byte and a signed byte is the interpretation. They're both made up of eight bits, and when you write an int onto a network connection using write(int b), only eight bits are placed on the wire. If an int outside the range 0-255 is passed to write(int b), the least significant byte of the number is written and the remaining three bytes are ignored. (This is the effect of casting an int to a byte.) On rare occasions, however, you may find a buggy third-party class that does something different, such as throwing an IllegalArgumentException or always writing 255, so it's best not to rely on this behavior, if possible.
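The truncation is easy to observe with a ByteArrayOutputStream, which keeps what was written in memory. The sketch below is only an illustration; the class name LowOrderByteDemo and the method writtenValue() are made up for this example.

```java
import java.io.ByteArrayOutputStream;

class LowOrderByteDemo {

    // Writes one int with write(int) and returns the byte that was stored,
    // converted back to the unsigned range 0-255.
    static int writtenValue(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(value);                    // only the low-order eight bits are kept
        return out.toByteArray()[0] & 0xFF;  // mask so the byte reads as unsigned
    }

    public static void main(String[] args) {
        System.out.println(writtenValue(65));   // prints 65
        System.out.println(writtenValue(255));  // prints 255
        System.out.println(writtenValue(256));  // prints 0
        System.out.println(writtenValue(300));  // prints 44 (300 - 256)
        System.out.println(writtenValue(-1));   // prints 255 (all eight bits set)
    }
}
```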

For example, the character generator protocol defines a server that sends out ASCII text. The most popular variation of this protocol sends 72-character lines containing printable ASCII characters. (The printable ASCII characters are those from 33 to 126 inclusive, a range that excludes the various whitespace and control characters.) The first line contains characters 33 through 104, sorted. The second line contains characters 34 through 105. The third line contains characters 35 through 106. This continues through line 23, which contains characters 55 through 126. At that point, the characters wrap around so that line 24 contains characters 56 through 126 followed by character 33 again. Lines are terminated with a carriage return (ASCII 13) and a linefeed (ASCII 10). The output looks like this:

 !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefgh
 "#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghi
 #$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghij
 $%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijk
 %&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijkl
 &'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklm
 '()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmn

Since ASCII is a 7-bit character set, each character is sent as a single byte. Consequently, this protocol is straightforward to implement using the basic write() methods, as the next code fragment demonstrates:

 public static void generateCharacters(OutputStream out)
   throws IOException {

   int firstPrintableCharacter     = 33;
   int numberOfPrintableCharacters = 94;
   int numberOfCharactersPerLine   = 72;

   int start = firstPrintableCharacter;
   while (true) { /* infinite loop */
     for (int i = start; i < start+numberOfCharactersPerLine; i++) {
       // wrap i back into the range 33-126 before writing it
       out.write((
        (i-firstPrintableCharacter) % numberOfPrintableCharacters)
         + firstPrintableCharacter);
     }
     out.write('\r'); // carriage return
     out.write('\n'); // linefeed
     // the next line starts one character later, wrapping after 126
     start = ((start+1) - firstPrintableCharacter)
       % numberOfPrintableCharacters + firstPrintableCharacter;
   }
 }

The character generator server class (the exact details of which will have to wait until we discuss server sockets in Chapter 10) passes an OutputStream named out to the generateCharacters() method. Bytes are written onto out one at a time. These bytes are given as integers in a rotating sequence from 33 to 126. Most of the arithmetic here is to make the loop rotate in that range. After each 72-character chunk is written, a carriage return and a linefeed are written onto the output stream. The next start character is calculated and the loop repeats. The entire method is declared to throw IOException. That's important because the character generator server will terminate only when the client closes the connection. The Java code will see this as an IOException.
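The way the infinite loop ends can be tried out without a network. The sketch below repeats the byte-at-a-time algorithm with shortened variable names and feeds it a stream that starts throwing IOException after a fixed number of bytes. DisconnectingStream, runUntilDisconnect(), and the class name ChargenTermination are hypothetical stand-ins for a real client connection, invented for this illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

class ChargenTermination {

    // Hypothetical stand-in for a client that disconnects after `limit` bytes.
    static class DisconnectingStream extends OutputStream {
        private final ByteArrayOutputStream received = new ByteArrayOutputStream();
        private final int limit;

        DisconnectingStream(int limit) {
            this.limit = limit;
        }

        @Override
        public void write(int b) throws IOException {
            if (received.size() >= limit) {
                throw new IOException("simulated: client closed the connection");
            }
            received.write(b);
        }

        byte[] received() {
            return received.toByteArray();
        }
    }

    // Same algorithm as the byte-at-a-time generateCharacters() method.
    static void generateCharacters(OutputStream out) throws IOException {
        int first = 33;
        int count = 94;
        int perLine = 72;
        int start = first;
        while (true) {
            for (int i = start; i < start + perLine; i++) {
                out.write((i - first) % count + first);
            }
            out.write('\r');
            out.write('\n');
            start = ((start + 1) - first) % count + first;
        }
    }

    // Runs the generator until the simulated disconnect and returns what arrived.
    static byte[] runUntilDisconnect(int limit) {
        DisconnectingStream out = new DisconnectingStream(limit);
        try {
            generateCharacters(out);  // never returns normally
        } catch (IOException ex) {
            // expected: the exception is the only way out of the loop
        }
        return out.received();
    }

    public static void main(String[] args) {
        byte[] data = runUntilDisconnect(24 * 74);  // room for 24 complete lines
        System.out.println(data.length + " bytes received before the disconnect");
    }
}
```

In a real server, the caller of generateCharacters() would catch the exception in the same way and then clean up the connection.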

Writing a single byte at a time is often inefficient. For example, every TCP segment that goes out your Ethernet card contains at least 40 bytes of TCP and IP header overhead for routing and error detection. If each byte is sent by itself, you may be stuffing the network with 41 times as much data as you think you are! Consequently, most TCP/IP implementations buffer data to some extent. That is, they accumulate bytes in memory and send them to their eventual destination only when a certain number have accumulated or a certain amount of time has passed. However, if you have more than one byte ready to go, it's not a bad idea to send them all at once. Using write(byte[] data) or write(byte[] data, int offset, int length) is normally much faster than writing all the components of the data array one at a time. For instance, here's an implementation of the generateCharacters() method that sends a line at a time by packing a complete line into a byte array:

 public static void generateCharacters(OutputStream out)
   throws IOException {

   int firstPrintableCharacter = 33;
   int numberOfPrintableCharacters = 94;
   int numberOfCharactersPerLine = 72;
   int start = firstPrintableCharacter;
   byte[] line = new byte[numberOfCharactersPerLine+2];
   // the +2 is for the carriage return and linefeed

   while (true) { /* infinite loop */
     for (int i = start; i < start+numberOfCharactersPerLine; i++) {
       line[i-start] = (byte) ((i-firstPrintableCharacter)
         % numberOfPrintableCharacters + firstPrintableCharacter);
     }
     line[72] = (byte) '\r'; // carriage return
     line[73] = (byte) '\n'; // line feed
     out.write(line);
     start = ((start+1)-firstPrintableCharacter)
       % numberOfPrintableCharacters + firstPrintableCharacter;
   }
 }

The algorithm for calculating which bytes to write when is the same as for the previous implementation. The crucial difference is that the bytes are packed into a byte array before being written onto the network. Also, notice that the int result of the calculation must be cast to a byte before being stored in the array. This wasn't necessary in the previous implementation because the single byte write() method is declared to take an int as an argument.
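The three-argument write(byte[] data, int offset, int length) is useful when only part of an array should go out, for example when a buffer is only partly filled. Here is a small sketch of its behavior; the class name PartialArrayWrite, the method slice(), and the sample request line are assumptions made up for the illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class PartialArrayWrite {

    // Writes `length` bytes of `data`, starting at index `offset`,
    // and returns exactly what reached the stream as ASCII text.
    static String slice(byte[] data, int offset, int length) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(data, offset, length);
        return new String(out.toByteArray(), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] request = "GET /index.html HTTP/1.1"
            .getBytes(StandardCharsets.US_ASCII);
        System.out.println(slice(request, 0, 3));   // prints GET
        System.out.println(slice(request, 4, 11));  // prints /index.html
    }
}
```

The offset is an index into the array and the length is a count of bytes, not an end index, which is a common source of off-by-one mistakes.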

Streams can also be buffered in software, directly in the Java code as well as in the network hardware. Typically, this is accomplished by chaining a BufferedOutputStream or a BufferedWriter to the underlying stream, a technique we'll explore shortly. Consequently, if you are done writing data, it's important to flush the output stream. For example, suppose you've written a 300-byte request to an HTTP 1.1 server that uses HTTP Keep-Alive. You generally want to wait for a response before sending any more data. However, if the output stream has a 1,024-byte buffer, the stream may be waiting for more data to arrive before it sends the data out of its buffer. No more data will be written onto the stream until the server response arrives, but the response is never going to arrive because the request hasn't been sent yet! The buffered stream won't send the data to the server until the program writes more data to it, but the program won't write more data until it gets a response from the server, and the server won't respond until it gets the data that's stuck in the buffer! Figure 4-1 illustrates this Catch-22. The flush() method breaks the deadlock by forcing the buffered stream to send its data even if the buffer isn't yet full.
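The effect of a buffer that isn't yet full can be seen in a few lines. In this sketch, a ByteArrayOutputStream stands in for the network connection so the number of bytes actually delivered can be counted. The class name FlushDemo and the method name are hypothetical; the 300-byte write and 1,024-byte buffer mirror the scenario just described.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

class FlushDemo {

    // Returns {bytes delivered before flush(), bytes delivered after flush()}.
    static int[] deliveredBeforeAndAfterFlush() throws IOException {
        // The sink stands in for the socket's output stream.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        BufferedOutputStream out = new BufferedOutputStream(sink, 1024);

        out.write(new byte[300]);  // the 300-byte "request"
        int before = sink.size();  // nothing has left the buffer yet
        out.flush();               // force the buffered bytes out
        int after = sink.size();
        return new int[] { before, after };
    }

    public static void main(String[] args) throws IOException {
        int[] counts = deliveredBeforeAndAfterFlush();
        System.out.println("before flush: " + counts[0]);  // prints before flush: 0
        System.out.println("after flush: " + counts[1]);   // prints after flush: 300
    }
}
```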

Figure 4-1. Data can get lost if you don't flush your streams

It's important to flush your streams whether you think you need to or not. Depending on how you got hold of a reference to the stream, you may or may not know whether it's buffered. (For instance, System.out is buffered whether you want it to be or not.) If flushing isn't necessary for a particular stream, it's a very low cost operation. However, if it is necessary, it's very necessary. Failing to flush when you need to can lead to unpredictable, unrepeatable program hangs that are extremely hard to diagnose if you don't have a good idea of what the problem is in the first place. As a corollary to all this, you should flush all streams immediately before you close them. Otherwise, data left in the buffer when the stream is closed may get lost.

Finally, when you're done with a stream, close it by invoking its close() method. This releases any resources associated with the stream, such as file handles or ports. Once an output stream has been closed, further writes to it generally throw IOExceptions. However, some kinds of streams may still allow you to do things with the object. For instance, a closed ByteArrayOutputStream can still be converted to an actual byte array and a closed DigestOutputStream can still return its digest.
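Both behaviors can be checked with a short experiment. The sketch below assumes a writable temporary directory and uses java.nio.file.Files (Java 7 or later) to create a scratch file. The class name CloseDemo and its method names are invented for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

class CloseDemo {

    // Returns true if writing to a closed FileOutputStream throws IOException.
    static boolean writeAfterCloseFails() throws IOException {
        Path temp = Files.createTempFile("close-demo", ".bin");  // scratch file
        try {
            OutputStream out = new FileOutputStream(temp.toFile());
            out.write(1);
            out.flush();  // flush before closing
            out.close();  // releases the file handle
            try {
                out.write(2);  // the stream is closed at this point
                return false;
            } catch (IOException ex) {
                return true;
            }
        } finally {
            Files.deleteIfExists(temp);  // clean up the scratch file
        }
    }

    // A closed ByteArrayOutputStream can still hand back its contents.
    static byte[] contentsAfterClose() throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(42);
        out.close();
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        System.out.println("write after close fails: " + writeAfterCloseFails());
        System.out.println("bytes still available: " + contentsAfterClose().length);
    }
}
```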