Chapter 12. Input/Output Facilities
In this chapter, we continue our exploration of the Java API by looking at many of the classes in the java.io and java.nio packages. These packages offer a rich set of tools for basic I/O and also provide the framework on which all file and network communication in Java is built. Figure 12-1 shows the class hierarchy of these packages. We start by looking at the stream classes in java.io, which are subclasses of the basic InputStream, OutputStream, Reader, and Writer classes. Then we'll examine the File class and discuss how you can read and write files using classes in java.io. We also take a quick look at the data compression classes provided in java.util.zip. Finally, we begin our investigation of the java.nio package. The NIO, or "new" I/O, package (introduced in Java 1.4) adds significant new functionality for building high-performance services.
12.1. Streams
Most fundamental I/O in Java is based on streams. A stream represents a flow of data, or a channel of communication with (at least conceptually) a writer at one end and a reader at the other. When you are working with the java.io package to perform terminal input and output, reading or writing files, or communicating through sockets in Java, you are using various types of streams. Later in this chapter, we look at the NIO package, which introduces a similar concept called a channel. But for now, let's summarize the available types of streams:
Streams in Java are one-way
Figure 12-2. Basic input and output stream functionality
InputStream and OutputStream are abstract classes that define the
Reader and Writer are very much like InputStream and OutputStream , except that they deal with characters instead of bytes. As true character streams, these classes correctly handle Unicode characters, which was not always the case with byte streams. Often, a bridge is needed between these character streams and the byte streams of physical devices, such as disks and networks. InputStreamReader and OutputStreamWriter are special classes that use a character-encoding scheme to translate between character and byte streams.
This section describes all the interesting stream types with the exception of FileInputStream, FileOutputStream, FileReader, and FileWriter. We
12.1.1. Terminal I/O
The
InputStream stdin = System.in;
OutputStream stdout = System.out;
OutputStream stderr = System.err;
This example glosses over the fact that System.out and System.err aren't plain OutputStream objects, but more specialized and useful PrintStream objects. We'll explain these later, but for now we can reference out and err as OutputStream objects because a PrintStream is a type of OutputStream as well. We can read a single byte at a time from standard input with the InputStream's read( ) method. If you look closely at the API, you'll see that the read( ) method of the base InputStream class is an abstract method. What lies behind System.in is a particular implementation of InputStream that provides the real implementation of the read( ) method:
try {
    int val = System.in.read( );
} catch ( IOException e ) {
    ...
}
Although we said that the read( ) method reads a byte value, the return type in the example is int, not byte. That's because the read( ) method of basic input streams in Java uses a convention from the C language to
int val;
try {
    while( (val=System.in.read( )) != -1 )
        System.out.println((byte)val);
} catch ( IOException e ) { ... }
As we've shown in the examples, the read( ) method can also throw an IOException if there is an error reading from the underlying stream source. Various subclasses of IOException may indicate that a source such as a file or network connection has had an error. Additionally, higher-level streams that read data types more complex than a single byte may throw EOFException , indicating an unexpected or premature end of stream. An overloaded form of read( ) fills a byte array with as much data as possible up to the capacity of the array and returns the number of bytes read:
byte [] buff = new byte [1024];
int got = System.in.read( buff );
We can also check the number of bytes available for reading at a given time on an InputStream with the available( ) method. Using that information, we could create an array of exactly the right
int waiting = System.in.available( );
if ( waiting > 0 ) {
    byte [] data = new byte [ waiting ];
    System.in.read( data );
    ...
}
However, the reliability of this technique depends on the ability of the underlying stream implementation to detect how much data can be retrieved. It
These read( ) methods block until at least some data is read (at least one byte). You must, in general, check the returned value to determine how much data you got and if you need to read more. (We look at nonblocking I/O later in this chapter.) The skip( ) method of InputStream provides a way of jumping over a number of bytes. Depending on the implementation of the stream, skipping bytes may be more efficient than reading them.
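Because read( ) may hand back fewer bytes than the array can hold, a common pattern is to loop until it returns -1. The following is a minimal sketch of that pattern, not code from the Java API: the class name ReadLoopSketch and the method readAll( ) are hypothetical, and a ByteArrayInputStream (covered later in this chapter) stands in for System.in so the example can run without a terminal.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper class, not part of java.io
class ReadLoopSketch {

    // Reads from the stream until end-of-stream and returns everything read
    static byte[] readAll( InputStream in ) throws IOException {
        ByteArrayOutputStream collected = new ByteArrayOutputStream( );
        byte[] buff = new byte[ 4 ];  // deliberately tiny to force several reads
        int got;
        // read( ) blocks until some data arrives; -1 signals end-of-stream
        while ( (got = in.read( buff )) != -1 ) {
            // only the first 'got' bytes of buff are valid on each pass
            collected.write( buff, 0, got );
        }
        return collected.toByteArray( );
    }

    public static void main( String[] args ) throws IOException {
        byte[] source = "Hello, streams!".getBytes( "US-ASCII" );
        byte[] result = readAll( new ByteArrayInputStream( source ) );
        System.out.println( new String( result, "US-ASCII" ) );  // Hello, streams!
    }
}
```

The small buffer is only there to make the loop run more than once; in practice you would pick a size suited to the source.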
Finally, the close( ) method shuts down the stream and
12.1.2. Character Streams
In early versions of Java, some InputStream and OutputStream types included methods for reading and writing strings, but most of them operated by naively
The java.io Reader and Writer character stream classes were introduced as streams that handle character data only. When you use these classes, you think only in terms of characters and string data and allow the underlying implementation to handle the conversion of bytes to a specific character encoding. As we'll see, some direct implementations of Reader and Writer exist, for example, for reading and writing files. But more generally, two special classes, InputStreamReader and OutputStreamWriter, bridge the gap between the world of character streams and the world of byte streams. These are, respectively, a Reader and a Writer that can be wrapped around any underlying byte stream to make it a character stream. An encoding scheme is used to convert between possibly multibyte encoded values and Java Unicode characters. An encoding scheme can be specified by
For example, let's parse a
try {
    InputStreamReader converter = new InputStreamReader( System.in );
    BufferedReader in = new BufferedReader( converter );
    String line = in.readLine( );
    int i = NumberFormat.getInstance( ).parse( line ).intValue( );
} catch ( IOException e ) {
} catch ( ParseException pe ) { }
First, we wrap an InputStreamReader around System.in. This reader converts the incoming bytes of System.in to characters using the default encoding scheme. Then, we wrap a BufferedReader around the InputStreamReader. BufferedReader adds the readLine( ) method, which we can use to grab a full line of text (up to a platform-specific, line-terminator character combination) into a String. The string is then parsed into an integer using the techniques described in Chapter 10. The important thing to note is that we have taken a byte-oriented input stream, System.in, and safely converted it to a Reader for reading characters. If we wished to use an encoding other than the system default, we could have specified it in the InputStreamReader's constructor like so:
InputStreamReader reader = new InputStreamReader( System.in, "UTF-8" );
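To see why the choice of encoding matters, here is a small sketch that decodes the same bytes twice with different encoding names. It is an illustration under stated assumptions: the class EncodingSketch and its firstLine( ) method are hypothetical, and a ByteArrayInputStream stands in for System.in.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

// Hypothetical demonstration class, not part of java.io
class EncodingSketch {

    // Decodes the first line of a byte stream using the named encoding
    static String firstLine( InputStream bytes, String encoding ) throws IOException {
        BufferedReader in = new BufferedReader( new InputStreamReader( bytes, encoding ) );
        return in.readLine( );
    }

    public static void main( String[] args ) throws IOException {
        // "cafe" with an acute accent on the e, encoded as UTF-8:
        // the accented letter occupies two bytes, 0xC3 0xA9
        byte[] utf8 = { 'c', 'a', 'f', (byte)0xC3, (byte)0xA9, '\n' };

        String asUtf8 = firstLine( new ByteArrayInputStream( utf8 ), "UTF-8" );
        String asLatin1 = firstLine( new ByteArrayInputStream( utf8 ), "ISO-8859-1" );

        System.out.println( asUtf8.length( ) );    // 4: two bytes became one character
        System.out.println( asLatin1.length( ) );  // 5: each byte became its own character
    }
}
```

Reading with the wrong encoding does not raise an error here; it silently produces different characters, which is why the encoding should be stated explicitly whenever it is known.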
For each character read from the reader, the InputStreamReader reads one or more bytes and
In Chapter 13, we use an InputStreamReader and a Writer in our simple web server example, where we must use a character encoding specified by the HTTP protocol. We also return to the topic of character encodings when we discuss the java.nio.charset API, which allows you to query for and use encoders and decoders explicitly on buffers of characters and bytes. Both InputStreamReader and OutputStreamWriter can accept a Charset codec object as well as a character encoding name.
12.1.3. Stream Wrappers
What if we want to do more than read and write a sequence of bytes or characters? We can use a "filter" stream, which is a type of InputStream, OutputStream, Reader, or Writer that wraps another stream and adds new features. A filter stream takes the target stream as an argument in its constructor and delegates calls to it after doing some additional processing of its own. For example, you could construct a BufferedInputStream to wrap the system standard input:
InputStream bufferedIn = new BufferedInputStream( System.in );
The BufferedInputStream is a type of filter stream that reads ahead and buffers a certain amount of data. (We'll talk more about it later in this chapter.) The BufferedInputStream wraps an additional layer of functionality around the underlying stream. Figure 12-3 shows this arrangement for a DataInputStream, which is a type of stream that can read higher-level data types, such as Java primitives and strings.
Figure 12-3. Layered streams
As you can see from the previous code snippet, the BufferedInputStream filter is a type of InputStream. Because filter streams are
Java provides base classes for creating new types of filter streams: FilterInputStream, FilterOutputStream, FilterReader, and FilterWriter. These superclasses provide the basic machinery for a "no op" filter (a filter that doesn't do anything) by delegating all their method calls to their underlying stream. Real filter streams subclass these and override various methods to add their additional processing. We'll make an example filter stream later in this chapter.
12.1.3.1 Data streams
DataInputStream and DataOutputStream are filter streams that let you read or write strings and primitive data types composed of more than a single byte. DataInputStream and DataOutputStream implement the DataInput and DataOutput interfaces, respectively. These interfaces define methods for reading or writing strings and all of the Java primitive types, including
You can construct a DataInputStream from an InputStream and then use a method such as readDouble( ) to read a primitive data type:
DataInputStream dis = new DataInputStream( System.in );
double d = dis.readDouble( );
This example wraps the standard input stream in a DataInputStream and uses it to read a double value. readDouble( ) reads bytes from the stream and constructs a double from them. The DataInputStream methods expect the bytes of numeric data types to be in network byte order , a standard that specifies that the high-order bytes are sent first (also known as "big endian," as we discuss later).
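The byte order can be checked directly by writing a few values and inspecting the raw bytes. The sketch below is only an illustration: the class DataRoundTrip and its encode( ) method are hypothetical names, and it relies on the byte array streams described later in this chapter to hold the data in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical demonstration class, not part of java.io
class DataRoundTrip {

    // Writes an int followed by a double and returns the raw bytes
    static byte[] encode( int i, double d ) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream( );
        DataOutputStream out = new DataOutputStream( bytes );
        out.writeInt( i );
        out.writeDouble( d );
        out.flush( );
        return bytes.toByteArray( );
    }

    public static void main( String[] args ) throws IOException {
        byte[] data = encode( 258, 3.5 );  // 258 is 0x00000102

        System.out.println( data.length );              // 12: four bytes + eight bytes
        System.out.println( data[2] + " " + data[3] );  // 1 2 (high-order bytes came first)

        // Values must be read back in the same order they were written
        DataInputStream in = new DataInputStream( new ByteArrayInputStream( data ) );
        System.out.println( in.readInt( ) );     // 258
        System.out.println( in.readDouble( ) );  // 3.5
    }
}
```

Nothing in the byte stream records which types were written, so the reader and writer have to agree on the sequence of calls in advance.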
The DataOutputStream class provides write methods that
The readUTF( ) and writeUTF( ) methods of DataInputStream and DataOutputStream read and write a Java String of Unicode characters using the UTF-8 "transformation format." UTF-8 is an ASCII-compatible encoding of Unicode characters commonly used for the transmission and storage of Unicode text. Not all encodings are
12.1.3.2 Buffered streams
The BufferedInputStream, BufferedOutputStream, BufferedReader, and BufferedWriter classes add a data buffer of a specified size to the stream
BufferedInputStream bis =
    new BufferedInputStream(myInputStream, 4096);
...
bis.read( );
In this example, we specify a buffer size of 4096 bytes. If we leave off the size of the buffer in the constructor, a reasonably
A BufferedOutputStream works in a similar way. Calls to write( ) store the data in a buffer; data is actually written only when the buffer fills up. You can also use the flush( ) method to wring out the contents of a BufferedOutputStream at any time. The flush( ) method is actually a method of the OutputStream class itself. It's important because it allows you to be sure that all data in any underlying streams and filter streams has been sent (before, for example, you wait for a response).
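The effect of flush( ) is easy to observe by wrapping a BufferedOutputStream around an in-memory stream and checking how much data has arrived before and after the call. This is a minimal sketch under that setup; the class FlushSketch and its method are hypothetical names chosen for the example.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

// Hypothetical demonstration class, not part of java.io
class FlushSketch {

    // Returns { bytes in the sink before flush( ), bytes in the sink after flush( ) }
    static int[] sizesAroundFlush( ) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream( );
        BufferedOutputStream out = new BufferedOutputStream( sink, 1024 );

        out.write( new byte[] { 10, 20, 30 } );  // small write: held in the buffer
        int before = sink.size( );

        out.flush( );                            // push buffered bytes to the sink
        int after = sink.size( );

        return new int[] { before, after };
    }

    public static void main( String[] args ) throws IOException {
        int[] sizes = sizesAroundFlush( );
        System.out.println( sizes[0] );  // 0
        System.out.println( sizes[1] );  // 3
    }
}
```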
Some input streams such as BufferedInputStream support the ability to mark a location in the data and later reset the stream to that position. The mark( ) method sets the return point in the stream. It takes an integer value that specifies the number of bytes that can be read before the stream gives up and forgets about the mark. The reset( ) method returns the stream to the
This functionality could be useful when you are reading the stream in a parser. You may occasionally fail to parse a structure and so must try something else. In this situation, you can have your parser generate an error (a homemade ParseException) and then reset the stream to the point before it
BufferedInputStream input;
...
try {
    input.mark( MAX_DATA_STRUCTURE_SIZE );
    return( parseDataStructure( input ) );
}
catch ( ParseException e ) {
    input.reset( );
    ...
}
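The fragment above depends on a parser that isn't shown. As a self-contained illustration of the same mark( ) and reset( ) calls, the following sketch peeks at the first few bytes of a stream and then rereads them. The class MarkResetSketch and the method peekThenRead( ) are hypothetical, and the input is assumed to be ASCII text at least as long as the peek.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;

// Hypothetical demonstration class, not part of java.io
class MarkResetSketch {

    // Reads peekCount bytes, rewinds, then reads the whole stream.
    // Assumes data holds at least peekCount ASCII bytes.
    static String peekThenRead( byte[] data, int peekCount ) throws IOException {
        BufferedInputStream in =
            new BufferedInputStream( new ByteArrayInputStream( data ) );
        StringBuilder seen = new StringBuilder( );

        in.mark( peekCount );              // remember this position
        for ( int i = 0; i < peekCount; i++ )
            seen.append( (char) in.read( ) );

        in.reset( );                       // back to the marked position
        seen.append( '|' );

        int c;
        while ( (c = in.read( )) != -1 )   // the peeked bytes are delivered again
            seen.append( (char) c );
        return seen.toString( );
    }

    public static void main( String[] args ) throws IOException {
        byte[] data = "abcdef".getBytes( "US-ASCII" );
        System.out.println( peekThenRead( data, 3 ) );  // abc|abcdef
    }
}
```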
The BufferedReader and BufferedWriter classes work just like their byte-based counterparts but
12.1.3.3 PrintWriter and PrintStream
Another useful wrapper stream is java.io.PrintWriter. This class provides a suite of overloaded print( ) methods that
PrintWriter is an unusual character stream because it can wrap either an OutputStream or another Writer . PrintWriter is the more capable big brother of the older PrintStream byte stream. The System.out and System.err streams are PrintStream objects; you have already seen such streams strewn throughout this book:
System.out.print("Hello, world...\n");
System.out.println("Hello, world...");
System.out.printf("The answer is %d", 17 );
System.out.println( 3.14 );
PrintWriter and PrintStream have a
When you create a PrintWriter object, you can pass an additional Boolean value to the constructor, specifying whether it should "auto-flush." If this value is true, the PrintWriter automatically performs a flush( ) on the underlying OutputStream or Writer each time println( ), printf( ), or format( ) is called:
boolean autoFlush = true;
PrintWriter p = new PrintWriter( myOutputStream, autoFlush );
When this technique is used with a buffered output stream, it corresponds to the behavior of terminals that send data line by line. The other big advantage that print streams have over regular character streams is that they shield you from exceptions thrown by the underlying streams. Unlike methods in other stream classes, the methods of PrintWriter and PrintStream do not throw IOExceptions. Instead, they record the failure and let you check for errors explicitly, whenever you choose. This makes life a lot easier for printing text, which is a very common operation. You can check for errors with the checkError( ) method:
System.out.println( reallyLongString );
if ( System.out.checkError( ) ) {
    // uh oh
}
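Both print stream behaviors, auto-flush and quiet error tracking, can be observed with in-memory streams. The sketch below makes some assumptions for the sake of illustration: PrintSketch, BrokenStream, and the two helper methods are hypothetical names, and BrokenStream simulates a failing device by throwing on every write.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.PrintWriter;
import java.io.StringWriter;

// Hypothetical demonstration class, not part of java.io
class PrintSketch {

    // An output stream that always fails; for demonstration only
    static class BrokenStream extends OutputStream {
        public void write( int b ) throws IOException {
            throw new IOException( "simulated failure" );
        }
    }

    // Returns what has reached the sink after print( ) and after println( )
    static String[] autoFlushSnapshots( ) {
        StringWriter sink = new StringWriter( );
        PrintWriter p = new PrintWriter( new BufferedWriter( sink ), true );

        p.print( "partial" );                    // stays in the BufferedWriter
        String afterPrint = sink.toString( );

        p.println( " line" );                    // println( ) triggers the flush
        String afterPrintln = sink.toString( );

        return new String[] { afterPrint, afterPrintln };
    }

    // Returns true if a failed write is reported by checkError( )
    static boolean errorIsRecorded( ) {
        PrintStream out = new PrintStream( new BrokenStream( ) );
        out.println( "this write fails quietly" );  // no exception reaches the caller
        return out.checkError( );
    }

    public static void main( String[] args ) {
        String[] snaps = autoFlushSnapshots( );
        System.out.println( snaps[0].length( ) );   // 0
        System.out.print( snaps[1] );               // partial line
        System.out.println( errorIsRecorded( ) );   // true
    }
}
```

Because the error is only recorded, code that cares about delivery has to remember to call checkError( ); nothing forces the check.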
12.1.4. Pipes
Normally, our applications are directly involved with one side of a given stream at a time. PipedInputStream and PipedOutputStream (or PipedReader and PipedWriter), however, let us create two sides of a stream and connect them, as shown in Figure 12-4. This can be used to provide a stream of communication between threads, for example, or as a loopback for testing. Most often it's used as a crutch to interface a stream-oriented API to a non-stream-oriented API.
Figure 12-4. Piped streams
To create a byte-stream pipe, we use both a PipedInputStream and a PipedOutputStream . We can simply choose a side and then construct the other side using the first as an argument:
PipedInputStream pin = new PipedInputStream( );
PipedOutputStream pout = new PipedOutputStream( pin );
Alternatively:
PipedOutputStream pout = new PipedOutputStream( );
PipedInputStream pin = new PipedInputStream( pout );
In each of these examples, the effect is to produce an input stream, pin, and an output stream, pout, that are connected. Data written to pout can then be read by pin. It is also possible to create the PipedInputStream and the PipedOutputStream separately and then connect them with the connect( ) method. We can do exactly the same thing in the character-based world, using PipedReader and PipedWriter in place of PipedInputStream and PipedOutputStream.
Once the two ends of the pipe are connected, use the two streams as you would other input and output streams. You can use read( ) to read data from the PipedInputStream (or PipedReader) and write( ) to write data to the PipedOutputStream (or PipedWriter). If the internal buffer of the pipe fills up, the writer blocks and waits until space is available. Conversely, if the pipe is empty, the reader blocks and waits until some data is available.
One advantage to using piped streams is that they provide stream functionality in our code without compelling us to build new, specialized streams. For example, we can use pipes to create a simple logging or "console" facility for our application. We can send messages to the logging facility through an ordinary PrintWriter, and then it can do whatever processing or buffering is required before sending the messages off to their ultimate destination. Because we are dealing with string messages, we use the character-based PipedReader and PipedWriter classes. The following example shows the skeleton of our logging facility:
//file: LoggerDaemon.java
import java.io.*;
class LoggerDaemon extends Thread {
    // the reading end of the pipe; getWriter( ) supplies the writing end
    PipedReader in = new PipedReader( );
    LoggerDaemon( ) {
        start( );
    }
    public void run( ) {
        BufferedReader bin = new BufferedReader( in );
        String s;
        try {
            // readLine( ) returns null once the writing end has been closed
            while ( (s = bin.readLine( )) != null ) {
                // process line of data
            }
        } catch (IOException e ) { }
    }
    PrintWriter getWriter( ) throws IOException {
        return new PrintWriter( new PipedWriter( in ) );
    }
}
class myApplication {
    public static void main ( String [] args ) throws IOException {
        PrintWriter out = new LoggerDaemon( ).getWriter( );
        out.println("Application starting...");
        // ...
        out.println("Warning: does not compute!");
        // ...
        out.close( );  // closing the writer lets the logger thread finish
    }
}
LoggerDaemon reads strings from its end of the pipe, the PipedReader named in. LoggerDaemon also provides a method, getWriter( ), which returns a PipedWriter that is connected to its input stream. To begin sending messages, we create a new LoggerDaemon and fetch the output stream. In order to read strings with the readLine( ) method, LoggerDaemon wraps a BufferedReader around its PipedReader. For convenience, it also
One advantage of implementing LoggerDaemon with pipes is that we can log messages as easily as we write text to a terminal or any other stream. In other words, we can use all our normal tools and techniques (including printf( )). Another advantage is that the processing happens in another thread, so we can go about our business while any processing takes place.
12.1.5. Streams from Strings and Back
StringReader is another useful stream class; it
String data = "There once was a man from Nantucket...";
StringReader sr = new StringReader( data );
char T = (char)sr.read( );
char h = (char)sr.read( );
char e = (char)sr.read( );
Note that you will still have to catch IOException s thrown by some of the StringReader 's methods. The StringReader class is useful when you want to read data in a String as if it were coming from a stream, such as a file, pipe, or socket. Suppose you create a parser that expects to read from a stream, but you want to provide an alternative method that also parses a big string. You can easily add one using StringReader . Turning things around, the StringWriter class lets us write to a character buffer via an output stream. The internal buffer grows as necessary to accommodate the data. When we are done, we can fetch the contents of the buffer as a String . In the following example, we create a StringWriter and wrap it in a PrintWriter for convenience:
StringWriter buffer = new StringWriter( );
PrintWriter out = new PrintWriter( buffer );
out.println("A moose once bit my sister.");
out.println("No, really!");
String results = buffer.toString( );
First, we print a few lines to the output stream to give it some data and then retrieve the results as a string with the toString( ) method. Alternatively, we could get the results as a StringBuffer object using the getBuffer( ) method.
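The parser scenario mentioned earlier in this section, a method written against a stream plus a convenience version that accepts a String, might look something like the following sketch. Everything here is hypothetical and chosen for illustration: a trivial word counter stands in for a real parser, and the names WordCounter, countWords( ), and report( ) are not from any library.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;

// Hypothetical demonstration class; a word counter stands in for a real parser
class WordCounter {

    // Stream-oriented entry point: counts whitespace-separated words
    static int countWords( Reader source ) throws IOException {
        BufferedReader in = new BufferedReader( source );
        int words = 0;
        String line;
        while ( (line = in.readLine( )) != null ) {
            String trimmed = line.trim( );
            if ( trimmed.length( ) > 0 )
                words += trimmed.split( "\\s+" ).length;
        }
        return words;
    }

    // Convenience overload: present the String as a stream and reuse the code above
    static int countWords( String text ) throws IOException {
        return countWords( new StringReader( text ) );
    }

    // Captures output written through a PrintWriter into a String
    static String report( String text ) throws IOException {
        StringWriter buffer = new StringWriter( );
        PrintWriter out = new PrintWriter( buffer );
        out.print( "words: " + countWords( text ) );
        out.flush( );
        return buffer.toString( );
    }

    public static void main( String[] args ) throws IOException {
        String text = "A moose once bit my sister.\nNo, really!";
        System.out.println( report( text ) );  // words: 8
    }
}
```

The same countWords( Reader ) method would accept a FileReader or an InputStreamReader without modification, which is the point of writing the parsing logic against the stream type.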
The StringWriter class is useful if you want to capture the output of something that normally sends output to a stream, such as a file or the console. A PrintWriter wrapped around a StringWriter is a
The ByteArrayInputStream and ByteArrayOutputStream work with bytes in the same way the previous examples worked with characters. You can write byte data to a ByteArrayOutputStream and retrieve it later with the toByteArray( ) method. Conversely, you can construct a ByteArrayInputStream from a byte array, just as StringReader does with a String. For example, if we want to see exactly what our DataOutputStream is writing when we tell it to encode a particular value, we could capture it with a byte array output stream:
ByteArrayOutputStream bao = new ByteArrayOutputStream( );
DataOutputStream dao = new DataOutputStream( bao );
dao.writeInt( 16777216 );
dao.flush( );
byte [] bytes = bao.toByteArray( );
for( byte b : bytes )
System.out.println( b ); // 1, 0, 0, 0
12.1.6. The rot13InputStream Class
Before we leave streams, let's try making one of our own. We mentioned earlier that specialized stream wrappers are built on top of the FilterInputStream and FilterOutputStream classes. It's quite easy to create our own subclass of FilterInputStream that can be wrapped around other streams to add new functionality.
The following example, rot13InputStream, performs a rot13 (rotate by 13
//file: rot13InputStream.java
package learningjava.io;
import java.io.*;
public class rot13InputStream extends FilterInputStream {
    public rot13InputStream ( InputStream i ) {
        super( i );
    }
    public int read( ) throws IOException {
        return rot13( in.read( ) );
    }
    // should override additional read( ) methods
    private int rot13 ( int c ) {
        if ( (c >= 'A') && (c <= 'Z') )
            c=(((c-'A')+13)%26)+'A';
        if ( (c >= 'a') && (c <= 'z') )
            c=(((c-'a')+13)%26)+'a';
        return c;
    }
}
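The comment in the listing notes that the other read( ) methods should be overridden as well. One way that could be done is sketched below. This is a hypothetical variant, not the listing's own code: the class name Rot13ArrayStream and the helper transform( ) are invented for the example. It relies on the documented behavior that FilterInputStream's read(byte[]) simply calls read(byte[], int, int), so overriding the three-argument form covers both array versions.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical variant of the rot13 filter that also handles array reads
class Rot13ArrayStream extends FilterInputStream {

    Rot13ArrayStream( InputStream i ) {
        super( i );
    }

    public int read( ) throws IOException {
        return rot13( in.read( ) );
    }

    // read( byte[] ) in FilterInputStream delegates here
    public int read( byte[] b, int off, int len ) throws IOException {
        int got = in.read( b, off, len );
        // when got is -1 (end of stream) or 0, this loop does not run
        for ( int i = off; i < off + got; i++ )
            b[i] = (byte) rot13( b[i] );
        return got;
    }

    private static int rot13( int c ) {
        if ( (c >= 'A') && (c <= 'Z') )
            c = (((c - 'A') + 13) % 26) + 'A';
        if ( (c >= 'a') && (c <= 'z') )
            c = (((c - 'a') + 13) % 26) + 'a';
        return c;
    }

    // Convenience for the demo: runs an ASCII string through the filter
    static String transform( String ascii ) throws IOException {
        InputStream in = new Rot13ArrayStream(
            new ByteArrayInputStream( ascii.getBytes( "US-ASCII" ) ) );
        byte[] buff = new byte[ ascii.length( ) ];
        int got = in.read( buff );  // an in-memory source delivers it all at once
        return got <= 0 ? "" : new String( buff, 0, got, "US-ASCII" );
    }

    public static void main( String[] args ) throws IOException {
        System.out.println( transform( "Hello, world!" ) );  // Uryyb, jbeyq!
        System.out.println( transform( "Uryyb, jbeyq!" ) );  // Hello, world!
    }
}
```

Applying the filter twice restores the original text, since rotating by 13 twice is a full turn through the 26-letter alphabet.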
The FilterInputStream needs to be
The primary feature of a FilterInputStream is that it delegates its input
read( ) is the only InputStream method that FilterInputStream
Strictly speaking, rot13InputStream works only on an ASCII byte stream since the underlying algorithm is based on the Roman alphabet. A more generalized character-