In this chapter, we continue our exploration of the Java API by looking at many of the classes in the java.io and java.nio packages. These packages offer a rich set of tools for basic I/O and also provide the framework on which all file and network communication in Java is built.
Figure 11-1 shows the class hierarchy of these packages.
We'll start by looking at the stream classes in java.io , which are subclasses of the basic InputStream , OutputStream , Reader , and Writer classes. Then we'll examine the File class and discuss how you can interact with the filesystem using classes in java.io . We'll also take a quick look at the data compression classes provided in java.util.zip . Finally, we'll begin our investigation of the new java.nio package, introduced in Java 1.4. The NIO package adds significant new functionality for building high performance services.
Most fundamental I/O in Java is based on streams . A stream represents a flow of data, or a channel of communication with (at least conceptually) a writer at one end and a reader at the other. When you are working with the java.io package to perform terminal input and output, reading or writing files, or communicating through sockets in Java, you are using various types of streams. Later in this chapter we'll look at the NIO package, which introduces a similar concept called a channel . But for now we'll start by summarizing the available types of streams.
Abstract classes that define the basic functionality for reading or writing an unstructured sequence of bytes. All other byte streams in Java are built on top of the basic InputStream and OutputStream .
Abstract classes that define the basic functionality for reading or writing a sequence of character data, with support for Unicode. All other character streams in Java are built on top of Reader and Writer .
"Bridge" classes that convert bytes to
Specialized stream filters that add the ability to read and write simple data types, such as numeric primitives and String objects, in a universal format.
Specialized stream filters that are capable of writing whole serialized Java objects and reconstructing them.
Specialized stream filters that add buffering for additional efficiency.
A specialized character stream that makes it simple to print text.
"Double-ended" streams that normally occur in pairs. Data written into a PipedOutputStream or PipedWriter is read from its corresponding PipedInputStream or PipedReader .
Implementations of InputStream , OutputStream , Reader , and Writer that read from and write to files on the local filesystem.
Streams in Java are one-way
InputStream and OutputStream are abstract classes that define the
Reader and Writer are very much like InputStream and OutputStream , except that they deal with characters instead of bytes. As true character streams, these classes correctly handle Unicode characters, which was not always the case with byte streams. Often, a bridge is needed between these character streams and the byte streams of physical devices such as disks and networks. InputStreamReader and OutputStreamWriter are special classes that use an encoding scheme to translate between character and byte streams.
We'll discuss all the interesting stream types in this section, with the exception of FileInputStream, FileOutputStream, FileReader, and FileWriter. We'll
The
InputStream stdin = System.in;
OutputStream stdout = System.out;
OutputStream stderr = System.err;
This example hides the fact that System.out and System.err aren't really OutputStream objects, but more specialized and useful PrintStream objects. We'll explain these later, but for now we can reference out and err as OutputStream objects, because they are a type of OutputStream as well.
We can read a single byte at a time from standard input with the InputStream 's read() method. If you look closely at the API, you'll see that the read() method of the base InputStream class is an abstract method. What lies behind System.in is a particular implementation of InputStream ; the subclass provides a real implementation of the read() method.
try {
int val = System.in.read();
...
}
catch (IOException e) {
...
}
Note that the return type of read() in this example is int , not byte as you'd expect. That's because Java's input stream read() method uses a convention of the C language. Although read() provides only a byte of information, its return type is int . This allows it to use the special return value of an integer -1 , indicating that end of stream has been reached. You'll need to test for this condition when using the simple read() method. If an error occurs during the read, an IOException is thrown. All basic input and output stream commands can throw an IOException , so you should arrange to catch and handle them appropriately.
To retrieve the value as a byte , perform a cast:
byte b = (byte) val;
Be sure to check for the end-of-stream condition before you perform the cast.
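To make the ordering concrete, here is a minimal sketch of a read loop that tests for -1 before casting. The class and method names (EndOfStreamDemo, sumBytes) are hypothetical, and an in-memory stream stands in for System.in so the example is repeatable. It includes a 0xFF data byte, which would be indistinguishable from end of stream if the cast came first.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class EndOfStreamDemo {
    // Sums the unsigned byte values in a stream, stopping at end of stream.
    static int sumBytes(InputStream in) throws IOException {
        int sum = 0;
        int val;
        // read() returns 0..255 for data and -1 at end of stream,
        // so test the int before narrowing it to a byte.
        while ((val = in.read()) != -1) {
            byte b = (byte) val;   // safe: val is known to be data here
            sum += b & 0xFF;       // mask to recover the unsigned value
        }
        return sum;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: a byte array stands in for standard input.
        InputStream in = new ByteArrayInputStream(new byte[] {1, 2, (byte) 0xFF});
        System.out.println(sumBytes(in));
    }
}
```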
An overloaded form of read() fills a byte array with as much data as possible up to the capacity of the array, and returns the number of bytes read:
byte [] buff = new byte [1024];
int got = System.in.read(buff);
We can also check the number of bytes available for reading on an InputStream with the available() method. Using that information, we could create an array of exactly the right
int waiting = System.in.available();
if (waiting > 0) {
byte [] data = new byte [ waiting ];
System.in.read(data);
...
}
However, the reliability of this technique depends on the ability of the underlying stream implementation to detect how much data can be retrieved. It
These read() methods block until at least some data is read (at least one byte). You must, in general, check the returned value to determine how much data you got and if you need to read more.
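One common way to act on this advice is to loop until the buffer is full or the stream ends. The following is a sketch only; the helper name readFully and the in-memory test data are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class ReadLoopDemo {
    // Fills buf as far as possible; returns the number of bytes actually stored.
    static int readFully(InputStream in, byte[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            // A single read() may return fewer bytes than requested.
            int got = in.read(buf, total, buf.length - total);
            if (got == -1) {
                break;             // end of stream before the buffer filled
            }
            total += got;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[10]);
        System.out.println(readFully(in, new byte[4]));   // buffer smaller than the data
        System.out.println(readFully(in, new byte[16]));  // buffer larger than what is left
    }
}
```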
InputStream provides the skip() method as a way of jumping over a number of bytes. Depending on the implementation of the stream, skipping bytes may be more efficient than reading them. The close() method shuts down the stream and
Some InputStream and OutputStream subclasses of early versions of Java included methods for reading and writing strings, but most of them operated by naively
try {
InputStreamReader converter = new InputStreamReader(System.in);
BufferedReader in = new BufferedReader(converter);
String text = in.readLine();
int i = NumberFormat.getInstance().parse(text).intValue();
}
catch (IOException e) { }
catch (ParseException pe) { }
First, we wrap an InputStreamReader around System.in . This object converts the incoming bytes of System.in to characters using the default encoding scheme. Then, we wrap a BufferedReader around the InputStreamReader . BufferedReader gives us the readLine() method, which we can use to convert a full line of text into a String . The string is then parsed into an integer using the techniques described in Chapter 9.
We could have programmed the previous example using only byte streams, and it might have worked for users in the United States, at least. But character streams correctly support Unicode strings. Unicode was designed to support almost all the written languages of the world. If you want to write a program that works in any part of the world, in any language, you definitely want to use streams that don't mangle Unicode.
So how do you decide when you need a byte stream (InputStream or OutputStream) and when you need a character stream? If you want to read or write character strings, use some variety of Reader or Writer.
Another example comes from the Internet. Web servers serve files as byte streams. If you want to read Unicode strings with a particular encoding scheme from a file on the network, you'll need an appropriate InputStreamReader wrapped around the InputStream of the web server's socket (as we'll see in Chapter 12).
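As a small illustration of why the encoding matters, the sketch below decodes the same bytes twice through an InputStreamReader, once with the encoding they were written in and once with a different one. The names EncodingBridgeDemo and readLineAs are hypothetical, and a byte array stands in for the network stream.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingBridgeDemo {
    // Reads one line from a byte stream, decoding it with the given charset.
    static String readLineAs(InputStream in, Charset charset) throws IOException {
        BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset));
        return reader.readLine();
    }

    public static void main(String[] args) throws IOException {
        // "caf\u00e9" is four characters but five bytes in UTF-8.
        byte[] utf8 = "caf\u00e9".getBytes(StandardCharsets.UTF_8);

        String right = readLineAs(new ByteArrayInputStream(utf8), StandardCharsets.UTF_8);
        String wrong = readLineAs(new ByteArrayInputStream(utf8), StandardCharsets.ISO_8859_1);

        System.out.println(right.length());  // decoded with the matching encoding
        System.out.println(wrong.length());  // each byte became its own character
    }
}
```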
What if we want to do more than read and write a sequence of bytes or characters? We can use a "filter" stream, which is a type of InputStream , OutputStream , Reader , or Writer that wraps another stream and adds new features. A filter stream takes the target stream as an argument in its constructor and delegates calls to it after doing some additional processing of its own. For example, you could construct a BufferedInputStream to wrap the system standard input:
InputStream bufferedIn = new BufferedInputStream(System.in);
The BufferedInputStream is a type of filter stream that reads ahead and buffers a certain amount of data. (We'll talk more about it later in this chapter.) The BufferedInputStream wraps an additional layer of functionality around the underlying stream. Figure 11-3 shows this arrangement for a DataInputStream .
As you can see from the previous code snippet, the BufferedInputStream filter is a type of InputStream. Because filter streams are
There are four superclasses corresponding to the four types of filter streams: FilterInputStream , FilterOutputStream , FilterReader , and FilterWriter . The first two are for filtering byte streams, and the last two are for filtering character streams. These superclasses provide the basic machinery for a "no op" filter (a filter that doesn't do anything) by delegating all their method calls to their underlying stream. Real filter streams subclass these and override various methods to add their additional processing. We'll make an example filter stream a little later in this chapter.
DataInputStream and DataOutputStream are filter streams that let you read or write strings and primitive data types composed of more than a single byte.
DataInputStream and DataOutputStream implement the DataInput and DataOutput interfaces, respectively. These interfaces define the methods required for streams that read and write strings and Java primitive numeric and boolean types in a
You can construct a DataInputStream from an InputStream and then use a method such as readDouble() to read a primitive data type:
DataInputStream dis = new DataInputStream(System.in);
double d = dis.readDouble();
This example wraps the standard input stream in a DataInputStream and uses it to read a double value. readDouble() reads bytes from the stream and constructs a double from them. The DataInputStream methods expect the bytes of numeric data types to be in network byte order , a standard that specifies that the high-order bytes are sent first (also known as "big endian," as we'll discuss later).
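The byte order can be observed directly by writing into an in-memory stream. This is a minimal sketch; the class and method names (NetworkByteOrderDemo, encodeInt, roundTrip) are made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;

class NetworkByteOrderDemo {
    // Returns the four bytes that DataOutputStream emits for an int.
    static byte[] encodeInt(int value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(value);
        out.flush();
        return bytes.toByteArray();
    }

    // Writes a double and reads it back through the matching input filter.
    static double roundTrip(double value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeDouble(value);
        out.flush();
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()));
        return in.readDouble();
    }

    public static void main(String[] args) throws IOException {
        // The high-order byte (0x01) comes first.
        System.out.println(Arrays.toString(encodeInt(0x01020304)));
        System.out.println(roundTrip(3.14));
    }
}
```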
The DataOutputStream class provides write methods that
The readUTF() and writeUTF() methods of DataInputStream and DataOutputStream read and write a Java String of Unicode characters using the UTF-8 "transformation format" (strictly speaking, a slightly modified variant of it). UTF-8 is an ASCII-compatible encoding of Unicode characters commonly used for the transmission and storage of Unicode text. This
We can use a DataInputStream with any kind of input stream, whether it be from a file, a socket, or standard input. The same applies to using a DataOutputStream , or, for that matter, any other specialized streams in java.io .
The BufferedInputStream, BufferedOutputStream, BufferedReader, and BufferedWriter classes add a data buffer of a specified size to the stream
BufferedInputStream bis = new BufferedInputStream(myInputStream, 4096);
...
bis.read();
In this example, we specify a buffer size of 4096 bytes. If we leave off the size of the buffer in the constructor, a reasonably
A BufferedOutputStream works in a similar way. Calls to write() store the data in a buffer; data is actually written only when the buffer fills up. You can also use the flush() method to wring out the contents of a BufferedOutputStream at any time. The flush() method is actually a method of the OutputStream class itself. It's important because it allows you to be sure that all data in any underlying streams and filter streams has been sent (before, for example, you wait for a response).
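The effect of flush() is easy to see when the underlying stream is an in-memory one whose size we can inspect. The sketch below assumes a 4096-byte buffer and a 100-byte write, so the data fits in the buffer; the class and method names are hypothetical.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

class FlushDemo {
    // Returns {bytes that reached the sink before flush(), bytes after flush()}.
    static int[] sizesAroundFlush() throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        BufferedOutputStream out = new BufferedOutputStream(sink, 4096);

        out.write(new byte[100]);   // small enough to stay in the buffer
        int before = sink.size();   // nothing has been passed along yet

        out.flush();                // push the buffered bytes to the sink
        int after = sink.size();

        return new int[] {before, after};
    }

    public static void main(String[] args) throws IOException {
        int[] sizes = sizesAroundFlush();
        System.out.println("before flush: " + sizes[0] + ", after flush: " + sizes[1]);
    }
}
```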
Some input streams such as BufferedInputStream support the ability to mark a location in the data and later reset the stream to that position. The mark() method sets the return point in the stream. It takes an integer value that specifies the number of bytes that can be read before the stream gives up and forgets about the mark. The reset() method returns the stream to the
This functionality is
BufferedInputStream input;
...
try {
input.mark(MAX_DATA_STRUCTURE_SIZE);
return(parseDataStructure(input));
}
catch (ParseException e) {
input.reset();
...
}
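For a version that can be run as is, here is a small sketch of mark() and reset() on a BufferedInputStream wrapped around an in-memory source. The names MarkResetDemo and peekThenRead and the 16-byte read limit are assumptions chosen for the example.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class MarkResetDemo {
    // Reads two bytes, rewinds to the mark, and reads the first byte again.
    static String peekThenRead(InputStream raw) throws IOException {
        BufferedInputStream in = new BufferedInputStream(raw);
        in.mark(16);               // remember this position; allow 16 bytes of lookahead
        int first = in.read();
        int second = in.read();
        in.reset();                // return to the marked position
        int again = in.read();     // same byte as 'first'
        return "" + (char) first + (char) second + (char) again;
    }

    public static void main(String[] args) throws IOException {
        InputStream raw = new ByteArrayInputStream(new byte[] {'A', 'B', 'C'});
        System.out.println(peekThenRead(raw));
    }
}
```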
The BufferedReader and BufferedWriter classes work just like their byte-based counterparts but
Another useful wrapper stream is java.io.PrintWriter. This class provides a suite of overloaded print() methods that
PrintWriter is the more capable big brother of the older PrintStream byte stream. The System.out and System.err streams are PrintStream objects; you have already seen such streams strewn throughout this book:
System.out.print("Hello world...\n");
System.out.println("Hello world...");
System.out.println("The answer is: " + 17);
System.out.println(3.14);
PrintWriter and PrintStream have a
When you create a PrintWriter object, you can pass an additional boolean value to the constructor. If this value is true, the PrintWriter automatically
boolean autoFlush = true;
PrintWriter p = new PrintWriter(myOutputStream, autoFlush);
When this technique is used with a buffered output stream, it corresponds to the behavior of terminals that send data line by line.
Unlike methods in other stream classes, the methods of PrintWriter and PrintStream do not throw IOException s. This makes life a lot easier for printing text, which is a very common operation. Instead, if we are interested, we can check for errors with the checkError() method:
System.out.println(reallyLongString);
if (System.out.checkError()) {
    // uh oh
}
Normally, our applications are directly involved with one side of a given stream at a time. PipedInputStream and PipedOutputStream (or PipedReader and PipedWriter ), however, let us create two sides of a stream and connect them together, as shown in Figure 11-4. This can be used to provide a stream of communication between threads, for example, or as a "loopback" for testing.
To create a byte-stream pipe, we use both a PipedInputStream and a PipedOutputStream . We can simply choose a side and then construct the other side using the first as an argument:
PipedInputStream pin = new PipedInputStream();
PipedOutputStream pout = new PipedOutputStream(pin);
Alternatively:
PipedOutputStream pout = new PipedOutputStream();
PipedInputStream pin = new PipedInputStream(pout);
In each of these examples, the effect is to produce an input stream, pin , and an output stream, pout , that are connected. Data written to pout can then be read by pin . It is also possible to create the PipedInputStream and the PipedOutputStream separately and then connect them with the connect() method.
We can do exactly the same thing in the character-based world, using PipedReader and PipedWriter in place of PipedInputStream and PipedOutputStream .
Once the two ends of the pipe are connected, use the two streams as you would other input and output streams. You can use read() to read data from the PipedInputStream (or PipedReader ) and write() to write data to the PipedOutputStream (or PipedWriter ). If the internal buffer of the pipe fills up, the writer blocks and waits until space is available. Conversely, if the pipe is empty, the reader blocks and waits until some data is available.
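The blocking behavior is what makes pipes convenient between threads. The following sketch, with hypothetical names PipeDemo and sumThroughPipe, has one thread write small integers into the pipe while the calling thread reads them until the writer closes its end.

```java
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.UncheckedIOException;

class PipeDemo {
    // Sends the values 1..count through a pipe and returns their sum.
    // Assumes count is below 256 so that each value fits in one byte.
    static int sumThroughPipe(int count) throws IOException, InterruptedException {
        PipedInputStream pin = new PipedInputStream();
        PipedOutputStream pout = new PipedOutputStream(pin);

        Thread producer = new Thread(() -> {
            // Closing the writer's end is what lets the reader see end of stream.
            try (PipedOutputStream out = pout) {
                for (int i = 1; i <= count; i++) {
                    out.write(i);
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
        producer.start();

        int sum = 0;
        int val;
        while ((val = pin.read()) != -1) {   // blocks while the pipe is empty
            sum += val;
        }
        producer.join();
        pin.close();
        return sum;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sumThroughPipe(10));
    }
}
```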
One advantage to using piped streams is that they provide stream functionality in our code without compelling us to build new, specialized streams. For example, we can use pipes to create a simple logging or "console" facility for our application. We can send messages to the logging facility through an ordinary PrintWriter , and then it can do whatever processing or buffering is required before sending the messages off to their ultimate destination. Because we are dealing with string messages, we use the character-based PipedReader and PipedWriter classes. The following example shows the skeleton of our logging facility:
//file: LoggerDaemon.java
import java.io.*;

class LoggerDaemon extends Thread {
    PipedReader in = new PipedReader();

    LoggerDaemon() {
        start();
    }

    public void run() {
        BufferedReader bin = new BufferedReader(in);
        String s;
        try {
            while ((s = bin.readLine()) != null) {
                // process line of data
            }
        } catch (IOException e) { }
    }

    PrintWriter getWriter() throws IOException {
        return new PrintWriter(new PipedWriter(in));
    }
}

class myApplication {
    public static void main (String [] args) throws IOException {
        PrintWriter out = new LoggerDaemon().getWriter();
        out.println("Application starting...");
        // ...
        out.println("Warning: does not compute!");
        // ...
    }
}
LoggerDaemon reads strings from its end of the pipe, the PipedReader named in. LoggerDaemon also provides a method, getWriter(), which returns a PipedWriter that is connected to its input stream. To begin sending messages, we create a new LoggerDaemon and fetch the output stream. In order to read strings with the readLine() method, LoggerDaemon wraps a BufferedReader around its PipedReader. For convenience, it also
One advantage of implementing LoggerDaemon with pipes is that we can log messages as easily as we write text to a terminal or any other stream. In other words, we can use all our normal tools and techniques. Another advantage is that the processing happens in another thread, so we can go about our business while the processing takes place.
StringReader is another useful stream class; it
String data = "There once was a man from Nantucket...";
StringReader sr = new StringReader(data);
char T = (char)sr.read();
char h = (char)sr.read();
char e = (char)sr.read();
Note that you will still have to catch IOException s thrown by some of the StringReader 's methods.
The StringReader class is useful when you want to read data in a String as if it were coming from a stream, such as a file, pipe, or socket. For example, suppose you create a parser that expects to read from a stream, but you want to provide an alternative method that also parses a big string. You can easily add one using StringReader .
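A sketch of that pattern follows. The "parser" here is just a word counter, and the names WordCounter and countWords are invented for the example; the point is that the String overload is a one-line wrapper around the Reader version.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class WordCounter {
    // The stream-based method: works with any Reader (file, pipe, socket...).
    static int countWords(Reader source) throws IOException {
        BufferedReader in = new BufferedReader(source);
        int words = 0;
        String line;
        while ((line = in.readLine()) != null) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                words += trimmed.split("\\s+").length;
            }
        }
        return words;
    }

    // The convenience method: treat a String as if it were a stream.
    static int countWords(String text) throws IOException {
        return countWords(new StringReader(text));
    }

    public static void main(String[] args) throws IOException {
        System.out.println(countWords("There once was a man from Nantucket..."));
    }
}
```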
Turning things around, the StringWriter class lets us write to a character buffer via an output stream. The internal buffer grows as necessary to accommodate the data. When we are done we can fetch the contents of the buffer as a String . In the following example, we create a StringWriter and wrap it in a PrintWriter for convenience:
StringWriter buffer = new StringWriter();
PrintWriter out = new PrintWriter(buffer);
out.println("A moose once bit my sister.");
out.println("No, really!");
String results = buffer.toString();
First we print a few lines to the output stream to give it some data, then retrieve the results as a string with the toString() method. Alternatively, we could get the results as a StringBuffer object using the getBuffer() method.
The StringWriter class is useful if you want to capture the output of something that normally sends output to a stream, such as a file or the console. A PrintWriter wrapped around a StringWriter is a
The ByteArrayInputStream and ByteArrayOutputStream work with bytes in the same way the previous examples worked with characters. You can write byte data to a ByteArrayOutputStream and retrieve it later with the toByteArray() method. Conversely, you can construct a ByteArrayInputStream from a byte array, just as StringReader does with a String.
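As a brief sketch of the two classes working together, the hypothetical helper below drains any InputStream into a ByteArrayOutputStream and returns the collected bytes; a ByteArrayInputStream supplies the test input. The 8-byte chunk size is arbitrary.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

class ByteArrayStreamDemo {
    // Copies everything from the input stream into a growable in-memory buffer.
    static byte[] drain(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[8];
        int got;
        while ((got = in.read(chunk)) != -1) {
            out.write(chunk, 0, got);   // write only the bytes actually read
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[20];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        byte[] copy = drain(new ByteArrayInputStream(data));
        System.out.println(Arrays.equals(data, copy));
    }
}
```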
Before we leave streams, let's try our hand at making one of our own. We mentioned earlier that specialized stream wrappers are built on top of the FilterInputStream and FilterOutputStream classes. It's quite easy to create our own subclass of FilterInputStream that can be wrapped around other streams to add new functionality.
The following example, rot13InputStream, performs a rot13 (rotate by 13
//file: rot13InputStream.java
package learningjava.io;

import java.io.*;

public class rot13InputStream extends FilterInputStream {

    public rot13InputStream (InputStream i) {
        super(i);
    }

    public int read() throws IOException {
        return rot13(in.read());
    }

    private int rot13 (int c) {
        if ((c >= 'A') && (c <= 'Z'))
            c=(((c-'A')+13)%26)+'A';
        if ((c >= 'a') && (c <= 'z'))
            c=(((c-'a')+13)%26)+'a';
        return c;
    }
}
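One practical caveat: FilterInputStream passes the array form of read() straight to the underlying stream, so a filter that overrides only the single-byte read() is bypassed by wrappers such as InputStreamReader that read in bulk. The sketch below is a variant under a hypothetical name, Rot13Stream, that overrides both forms; decodeLine is an illustrative helper, not part of the listing above.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

class Rot13Stream extends FilterInputStream {
    Rot13Stream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        return rot13(in.read());
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int got = in.read(buf, off, len);
        // The loop body never runs when got is -1 (end of stream) or 0.
        for (int i = off; i < off + got; i++) {
            buf[i] = (byte) rot13(buf[i]);
        }
        return got;
    }

    private static int rot13(int c) {
        if (c >= 'A' && c <= 'Z') {
            c = ((c - 'A' + 13) % 26) + 'A';
        } else if (c >= 'a' && c <= 'z') {
            c = ((c - 'a' + 13) % 26) + 'a';
        }
        return c;   // anything else, including -1, passes through unchanged
    }

    // Wraps the filter in the usual Reader chain and reads one line through it.
    static String decodeLine(String text) throws IOException {
        InputStream raw = new ByteArrayInputStream(text.getBytes(StandardCharsets.US_ASCII));
        BufferedReader reader = new BufferedReader(
            new InputStreamReader(new Rot13Stream(raw), StandardCharsets.US_ASCII));
        return reader.readLine();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(decodeLine("Hello, World!"));
    }
}
```

Because rot13 is its own inverse, passing the output through decodeLine a second time restores the original text.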
The FilterInputStream needs to be
The primary feature of a FilterInputStream is that it delegates its input
read() is the only InputStream method that FilterInputStream
Strictly speaking, rot13InputStream works only on an ASCII byte stream since the underlying algorithm is based on the Roman alphabet. A more generalized character-
Working with files in Java poses some conceptual problems. The host filesystem lies outside of Java's virtual environment, in the real world, and can therefore still suffer from architecture and implementation differences. Java tries to mask some of these differences by providing information to help an application tailor itself to the local environment; we'll mention these areas as they occur.
The java.io.File class encapsulates access to information about a file or directory entry in the filesystem. It can be used to get attribute information about a file, list the entries in a directory, and perform basic filesystem operations such as removing a file or making a directory. While the File object handles these tasks, it doesn't provide direct access for reading and writing file data; there are specialized streams for that purpose.
You can create an instance of File from a String pathname:
File fooFile = new File("/tmp/foo.txt");
File barDir = new File("/tmp/bar");
You can also create a file with a relative path:
File f = new File("foo");
In this case, Java works relative to the current directory of the Java interpreter. You can determine the current working directory by checking the
System.getProperty("user.dir");
An overloaded version of the File constructor lets you specify the directory path and filename as separate String objects:
File fooFile = new File("/tmp", "foo.txt");
With yet another variation, you can specify the directory with a File object and the filename with a String :
File tmpDir = new File("/tmp");
File fooFile = new File (tmpDir, "foo.txt");
None of the File constructors throw any exceptions. This means the object is created whether or not the file or directory actually exists; it isn't an error to create a File object for a nonexistent file. You can use the object's exists() instance method to find out whether the file or directory exists. The File object simply exists as a handle for getting information about what is (
One of the reasons that working with files in Java is
On some systems, Java can also compensate for differences such as the direction of the file separator slashes in a pathname. For example, in the current implementation on Windows platforms, Java accepts paths with either forward slashes or backslashes. However, under Solaris, Java accepts only paths with forward
Your best bet is to make sure you follow the filename conventions of the host filesystem. If your application has a GUI that is opening and saving files at the user's request, you should be able to handle that functionality with the Swing JFileChooser class. This class encapsulates a graphical
If your application needs to deal with files on its own
You can use this system-dependent information in several ways. Probably the simplest way to localize pathnames is to pick a convention you use internally, for instance the forward slash (/), and do a String replace to substitute for the localized separator character:
// we'll use forward slash as our standard
String path = "mail/1999/june/merle";
path = path.replace('/', File.separatorChar);
File mailbox = new File(path);
Alternately, you could work with the
String [] path = { "mail", "1999", "june", "merle" };
StringBuffer sb = new StringBuffer(path[0]);
for (int i=1; i< path.length; i++)
    sb.append(File.separator + path[i]);
File mailbox = new File(sb.toString());
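Another option, sketched here under the hypothetical names PortablePathDemo and join, is to let the File(File, String) constructor described earlier insert the separator for you, so no separator character appears in your own code at all.

```java
import java.io.File;

class PortablePathDemo {
    // Builds a path one component at a time; File supplies the local separator.
    static File join(String first, String... rest) {
        File result = new File(first);
        for (String part : rest) {
            result = new File(result, part);
        }
        return result;
    }

    public static void main(String[] args) {
        File mailbox = join("mail", "1999", "june", "merle");
        System.out.println(mailbox.getPath());
    }
}
```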
One thing to remember is that Java interprets the backslash character ( \ ) as an escape character when used in a String . To get a backslash in a String , you have to use \\ .
Another issue to grapple with is that some operating systems use special identifiers for the roots of filesystems. For example, Windows uses C:\ . Should you need it, the File class provides the static method listRoots() , which returns an array of File objects corresponding to the filesystem root directories.
Once we have a File object, we can use it to ask for information about the file or directory and to perform standard operations on it. A number of methods let us ask certain questions about the File . For example, isFile() returns true if the File represents a file while isDirectory() returns true if it's a directory. isAbsolute() indicates whether the File has an absolute or relative path specification.
Components of the File pathname are available through the following methods: getName(), getPath(), getAbsolutePath(), and getParent(). getName() returns a String for the filename without any directory information; getPath() returns the pathname as it was given when the File was created, including whatever directory information it contains. If the File has an absolute path specification, getAbsolutePath() returns that path. Otherwise it returns the relative path appended to the current working directory. getParent() returns the parent directory of the File.
The string returned by getPath() or getAbsolutePath() may not follow the same case conventions as the underlying filesystem. You can retrieve the filesystem's own or "canonical" version of the file's path using the method getCanonicalPath(). In Windows, for example, you can create a File object whose getAbsolutePath() is C:\Autoexec.bat but whose getCanonicalPath() is C:\AUTOEXEC.BAT. This is useful for comparing filenames that may have been supplied with different case conventions or for showing them to the user.
You can get or set the modification time of a file or directory with lastModified() and setLastModified() methods. The value is a long that is the number of
Here's a fragment of code that prints some information about a file:
File fooFile = new File("/tmp/boofa");
String type = fooFile.isFile() ? "File " : "Directory ";
String name = fooFile.getName();
long len = fooFile.length();
System.out.println(type + name + ", " + len + " bytes ");
If the File object corresponds to a directory, we can list the files in the directory with the list() method or the listFiles() method:
String [] fileNames = fooFile.list();
File [] files = fooFile.listFiles();
list() returns an array of String objects that contains filenames. listFiles() returns an array of File objects. Note that in
List list = Arrays.asList(fileNames);
Collections.sort(list);
If the File refers to a nonexistent directory, we can create the directory with mkdir() or mkdirs(). The mkdir() method creates a single directory; mkdirs() creates all the
Although we can create a directory using the File object, this isn't the most common way to create a file; that's normally done implicitly with a FileOutputStream or FileWriter, as we'll discuss in a moment. The exception is the createNewFile() method, which can be used to attempt to create a new
You can use this to implement simple file locking from Java. (The NIO package supports true file locks, as we'll see later). This is useful in combination with deleteOnExit() , which flags the file to be automatically removed when the Java Virtual Machine exits. Another file creation method related to the File class itself is the static method createTempFile() , which creates a file in a specified location using an automatically generated unique name. This, too, is useful in combination with deleteOnExit() .
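The locking idea can be sketched as follows. The names LockFileDemo and tryLock are hypothetical, and the lock location is simply a temporary file name freed up for the demonstration. The sketch relies on createNewFile() returning false when the file already exists.

```java
import java.io.File;
import java.io.IOException;

class LockFileDemo {
    // Returns true if this call created the lock file, false if it already existed.
    static boolean tryLock(File lockFile) throws IOException {
        boolean created = lockFile.createNewFile();   // check and create in one step
        if (created) {
            lockFile.deleteOnExit();                  // clean up when the VM exits
        }
        return created;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: borrow a unique name from createTempFile(), then free it.
        File lock = File.createTempFile("demo", ".lock");
        lock.delete();

        System.out.println(tryLock(lock));   // first attempt creates the file
        System.out.println(tryLock(lock));   // second attempt finds it in place
    }
}
```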
The toURL() method converts a file path to a file: URL object. We'll talk about URLs in Chapter 13. They are an abstraction that allows you to point to any kind of object
Table 11-1 summarizes the methods provided by the File class.
| Method | Return type | Description |
|---|---|---|
| canRead() | boolean | Is the file (or directory) readable? |
| canWrite() | boolean | Is the file (or directory) writable? |
| createNewFile() | boolean | Creates a new file |
| createTempFile(String pfx, String sfx) | File | Static method to create a new file, with the specified prefix and suffix, in the default temp file directory |
| delete() | boolean | Deletes the file (or directory) |
| deleteOnExit() | void | The Java runtime system deletes the file when it exits |
| exists() | boolean | Does the file (or directory) exist? |
| getAbsolutePath() | String | Returns the absolute path of the file (or directory) |
| getCanonicalPath() | String | Returns the absolute, case-correct path of the file (or directory) |
| getName() | String | Returns the name of the file (or directory) |
| getParent() | String | Returns the name of the parent directory of the file (or directory) |
| getPath() | String | Returns the path of the file (or directory) |
| isAbsolute() | boolean | Is the filename (or directory name) absolute? |
| isDirectory() | boolean | Is the item a directory? |
| isFile() | boolean | Is the item a file? |
| lastModified() | long | Returns the last modification time of the file (or directory) |
| length() | long | Returns the length of the file |
| list() | String[] | Returns a list of files in the directory |
| listFiles() | File[] | Returns the contents of the directory as an array of File objects |
| mkdir() | boolean | Creates the directory |
| mkdirs() | boolean | Creates all directories in the path |
| renameTo(File dest) | boolean | Renames the file (or directory) |
| setLastModified() | boolean | Sets the last-modified time of the file (or directory) |
| setReadOnly() | boolean | Sets the file to read-only status |
| toURL() | java.net.URL | Generates a URL object for the file (or directory) |
Java provides two specialized streams for reading from and writing to files in the filesystem: FileInputStream and FileOutputStream . These streams provide the basic InputStream and OutputStream functionality applied to reading and writing files. They can be combined with the filter streams described earlier to work with files in the same way we do other stream communications.
Because FileInputStream is a subclass of InputStream , it inherits all standard InputStream functionality for reading a file. FileInputStream provides only a low-level interface to reading data, however, so you'll typically wrap it with another stream, such as a DataInputStream .
You can create a FileInputStream from a String pathname or a File object:
FileInputStream in = new FileInputStream("/etc/passwd");
When you create a FileInputStream, the Java runtime system attempts to
To read characters from a file, you can wrap an InputStreamReader around a FileInputStream . If you want to use the default character-encoding scheme, you can use the FileReader class instead, which is provided as a convenience. FileReader works just like FileInputStream , except that it reads characters instead of bytes and wraps a Reader instead of an InputStream .
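When the file's encoding is known and may differ from the platform default, the InputStreamReader form is the one to reach for. The following is a sketch under that assumption; the names ReadTextFileDemo and firstLine are hypothetical, and UTF-8 is chosen only for the example.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

class ReadTextFileDemo {
    // Reads the first line of a file, decoding its bytes as UTF-8.
    static String firstLine(File file) throws IOException {
        // try-with-resources closes the whole chain, including the file stream.
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(new FileInputStream(file), StandardCharsets.UTF_8))) {
            return in.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: a temporary file stands in for real input.
        File tmp = File.createTempFile("demo", ".txt");
        tmp.deleteOnExit();
        Files.write(tmp.toPath(), "first line\nsecond line\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(firstLine(tmp));
    }
}
```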
The following class, ListIt , is a small utility that sends the contents of a file or directory to standard output:
//file: ListIt.java
import java.io.*;

class ListIt {
    public static void main (String args[]) throws Exception {
        File file = new File(args[0]);

        // bail out if the path is missing or unreadable
        if (!file.exists() || !file.canRead()) {
            System.out.println("Can't read " + file);
            return;
        }

        if (file.isDirectory()) {
            // a directory: print the names of its entries
            String [] files = file.list();
            for (int i=0; i< files.length; i++)
                System.out.println(files[i]);
        } else
            try {
                // a file: print its contents line by line
                FileReader fr = new FileReader (file);
                BufferedReader in = new BufferedReader(fr);
                String line;
                while ((line = in.readLine()) != null)
                    System.out.println(line);
            }
            catch (FileNotFoundException e) {
                System.out.println("File Disappeared");
            }
    }
}
ListIt constructs a File object from its first command-line argument and tests the File to see whether it exists and is readable. If the File is a directory, ListIt outputs the
FileOutputStream is a subclass of OutputStream , so it inherits all the standard OutputStream functionality for writing to a file. Just like FileInputStream though, FileOutputStream provides only a low-level interface to writing data. You'll typically wrap another stream, such as a DataOutputStream or a PrintWriter , around the FileOutputStream to provide higher-level functionality.
You can create a FileOutputStream from a String pathname or a File object. Unlike FileInputStream, however, a FileOutputStream doesn't fail just because the specified file doesn't exist; it creates the file. The constructors can still throw a FileNotFoundException (a kind of IOException) if the file can't be created or opened for writing, for example because the pathname refers to a directory, so you still need to handle this exception.
If the specified file does exist, the FileOutputStream opens it for writing. When you subsequently call the write() method, the new data overwrites the current contents of the file. If you need to append data to an existing file, you can use a form of the constructor that accepts an append flag:
FileOutputStream fooOut = new FileOutputStream(fooFile);
FileOutputStream pwdOut = new FileOutputStream("/etc/passwd", true);
Another way to append data to files is with RandomAccessFile , as we'll discuss shortly.
To write characters (instead of bytes) to a file, you can wrap an OutputStreamWriter around a FileOutputStream. If you want to use the default character-encoding scheme, you can instead use the FileWriter class, which is provided as a convenience. FileWriter works just like FileOutputStream, except that it writes characters instead of bytes and is a Writer instead of an OutputStream.
The following example reads a line of data from standard input and writes it to the file /tmp/foo.txt :
String s = new BufferedReader(new InputStreamReader(System.in)).readLine();
File out = new File("/tmp/foo.txt");
FileWriter fw = new FileWriter (out);
PrintWriter pw = new PrintWriter(fw);
pw.println(s);
fw.close();
Notice how we wrapped a PrintWriter around the FileWriter to facilitate writing the data. Also, to be a good filesystem citizen, we've called the close() method when we're done with the FileWriter .
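The append flag and the character "bridge" classes combine naturally. Here is a small sketch, under our own assumptions, of appending a line of text with an explicit encoding; the class AppendLine and its appendLine() method are hypothetical names, and UTF-8 is simply our choice of encoding.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;

class AppendLine {
    // Append one line of text to the file, creating the file if necessary.
    static void appendLine(File file, String line) throws IOException {
        // The second constructor argument (true) selects append mode.
        try (PrintWriter pw = new PrintWriter(new OutputStreamWriter(
                new FileOutputStream(file, true), StandardCharsets.UTF_8))) {
            pw.println(line);
        }   // closing the PrintWriter flushes and closes the wrapped streams
    }

    public static void main(String[] args) throws IOException {
        File out = new File(args[0]);
        appendLine(out, "first run");
        appendLine(out, "second run");   // lands after the first line
    }
}
```

Closing the outermost writer is enough here, because each wrapper closes the stream it wraps.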
The java.io.RandomAccessFile class provides the ability to read and write data at a specified location in a file. RandomAccessFile implements both the DataInput and DataOutput interfaces, so you can use it to read and write strings and primitive types. In other words, RandomAccessFile defines the same methods for reading and writing data as DataInputStream and DataOutputStream . However, because the class provides random, rather than sequential, access to file data, it's not a subclass of either InputStream or OutputStream .
You can create a RandomAccessFile from a String pathname or a File object. The constructor also takes a second String argument that specifies the mode of the file. Use r for a read-only file or rw for a read-write file. Here's how we would start to create a simple database to keep track of user information:
try {
RandomAccessFile users =
new RandomAccessFile("Users", "rw");
} catch (IOException e) { ... }
When you create a RandomAccessFile in read-only mode, Java tries to open the specified file. If the file doesn't exist, RandomAccessFile throws an IOException . If, however, you're creating a RandomAccessFile in read-write mode, the object creates the file if it doesn't exist. The constructor can still throw an IOException if another I/O error occurs, so you still need to handle this exception.
After you have created a RandomAccessFile , call any of the normal reading and writing methods, just as you would with a DataInputStream or DataOutputStream . If you try to write to a read-only file, the write method throws an IOException .
What makes a RandomAccessFile special is the seek() method. This method takes a long value and uses it to set the location for reading and writing in the file. You can use the getFilePointer() method to get the current location. If you need to append data to the end of the file, use length() to determine that location, then seek() to it. You can write or seek beyond the end of a file, but you can't read beyond the end of a file. If you try, the read() method returns -1, and the typed methods such as readInt() and readUTF() throw an EOFException.
Here's an example of writing some data to a user database:
users.seek(userNum * RECORDSIZE);
users.writeUTF(userName);
users.writeInt(userID);
Of course, in this naive example we assume that the String length for userName, along with any data that comes after it, fits within the specified record size.
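To make the record-size assumption concrete, here is one way it might be enforced. This is only a sketch: the class UserFile, the 64-byte RECORDSIZE, and the layout (ID first, then the name) are our own choices for illustration, not part of the original example. It relies on the fact that writeUTF() writes a 2-byte length followed by at most 3 bytes per char.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

class UserFile {
    // Hypothetical layout: a 4-byte ID, then the name in writeUTF() format.
    static final int RECORDSIZE = 64;
    // writeUTF() needs 2 length bytes plus at most 3 bytes per char.
    static final int MAX_NAME_CHARS = (RECORDSIZE - 4 - 2) / 3;

    static void writeUser(RandomAccessFile users, int userNum,
                          int userID, String userName) throws IOException {
        if (userName.length() > MAX_NAME_CHARS)
            throw new IllegalArgumentException("name too long for record");
        users.seek((long) userNum * RECORDSIZE);   // start of this record
        users.writeInt(userID);
        users.writeUTF(userName);
    }

    static int readUserID(RandomAccessFile users, int userNum) throws IOException {
        users.seek((long) userNum * RECORDSIZE);
        return users.readInt();
    }

    static String readUserName(RandomAccessFile users, int userNum) throws IOException {
        users.seek((long) userNum * RECORDSIZE + 4);   // skip the 4-byte ID
        return users.readUTF();
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("Users", ".db");
        try (RandomAccessFile users = new RandomAccessFile(file, "rw")) {
            writeUser(users, 2, 1002, "pat");   // records can be written in any order
            writeUser(users, 0, 1000, "kim");
            System.out.println(readUserID(users, 2) + " " + readUserName(users, 2));
        }
        file.delete();
    }
}
```

Checking the name length before seeking means an oversized name is rejected without touching the neighboring record.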
Unless otherwise restricted, a Java application can read and write to the host filesystem with the same level of access as the user running the Java interpreter. For security reasons, untrusted applets and applications are not permitted to read from or write to arbitrary places in the filesystem. The ability of untrusted code to read and write files, as with any kind of system resource, is under the control of the system security policy, through a SecurityManager object. A security policy is set by the application that is running the untrusted code, such as appletviewer or a Java-enabled web browser. All filesystem access must first pass the scrutiny of the SecurityManager .
Some web browsers allow untrusted applets to have access to specific files designated by the user. Netscape Navigator and Internet Explorer currently do not allow untrusted applets any access to the filesystem. However, as we'll see in Chapter 22, signed applets can be given arbitrary access to the filesystem, just like a standalone Java application.
It's not unusual to want an applet to maintain some kind of state information on the system on which it's running. But for a Java applet that is restricted from access to the local filesystem, the only option is to store data over the network on its server (or possibly in a client-side cookie). Applets have at their disposal powerful general means for communicating data over networks. The only limitation is that, by convention, an applet's network communication is restricted to the server that launched it. This limits the options for where the data will reside.
Currently, the only way for a Java program to send data to a server is through a network socket or tools such as RMI, which run over sockets. In Chapter 12 we'll take a detailed look at building networked applications with sockets. With the tools described in that chapter, it's possible to build powerful client/server applications. Sun also has a Java extension called WebNFS, which allows applets and applications to work with files on an NFS server in much the same way as the ordinary File API.
We often package data files and other objects with our applications. Java provides many ways to access these resources. In a standalone application, we can simply open files and read the bytes. In both standalone applications and applets, we can construct URLs to well-known locations. The problem with these methods is that we generally have to know where our application lives in order to find our data. This is not always as easy as it seems. What is needed is a universal way to access resources associated with our application and our application's individual classes. The Class class's getResource() method provides just this.
What does getResource() do for us? To construct a URL to a file, we normally have to figure out a home directory for our code and construct a path relative to that. As we'll see in Chapter 22, in an applet, we could use getCodeBase() or getDocumentBase() to find the base URL and then use that base to create the URL for the resource we want. But these methods don't help a standalone application, and there's no reason that a standalone application and an applet shouldn't be written in the same way anyway. To solve this problem, the getResource() method provides a standard way to get objects relative to a given class file or to the system classpath. getResource() returns a special URL that uses the class's class loader. This means that no matter where the class came from—a web server, the local filesystem, or even a JAR file—we can simply ask for an object, get a URL for the object, and use the URL to access the object.
getResource() takes as an argument a slash-separated pathname for the resource and returns a URL. There are two kinds of paths: absolute and relative. An absolute path begins with a slash, for example, /foo/bar/blah.txt. In this case, the search for the object begins at the top of the classpath. If there is a directory foo/bar in the classpath, getResource() searches that directory for the blah.txt file. A relative path does not begin with a slash. In this case, the search begins at the location of the class file, whether it is local, on a remote server, or in a JAR file (either local or remote). So if we were calling getResource() on a class in the foo.bar package, we could refer to the file as blah.txt. In this case, the class itself would be loaded from the directory foo/bar somewhere on the classpath, and we'd expect to find the file in the same directory.
For example, here's an application that looks up some resources:
//file: FindResources.java
package mypackage;
import java.net.URL;
import java.io.IOException;
public class FindResources {
public static void main(String [] args) throws IOException {
// absolute from the classpath
URL url = FindResources.class.getResource("/mypackage/foo.txt");
// relative to the class location
url = FindResources.class.getResource("foo.txt");
// another relative document
url = FindResources.class.getResource("docs/bar.txt");
}
}
The FindResources class belongs to the mypackage package, so its class file will live in a mypackage directory somewhere on the classpath.
For an applet, the search is similar but occurs on the host from which the applet was loaded. getResource() first checks any JAR files loaded with the applet, and then searches the normal remote applet classpath.
getResource() returns a URL for whatever type of object you reference. This could be a text file or properties file that you want to read as a stream, or it might be an image or sound file or some other object. If you want the data as a stream, the Class class also provides a getResourceAsStream() method. In the case of an image, you'd probably hand the URL over to the getImage() method of a Swing component for loading.
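As a rough illustration of the stream variant, the following sketch reads a text resource through getResourceAsStream(). The class ResourceText and its read() helper are hypothetical, and we assume the resource is UTF-8 text; the one behavior we rely on is that getResourceAsStream() returns null when the resource can't be found.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

class ResourceText {
    // Return the text of a classpath resource, or null if it can't be found.
    static String read(String path) throws IOException {
        InputStream raw = ResourceText.class.getResourceAsStream(path);
        if (raw == null)
            return null;   // "not found" is reported as null, not an exception
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(raw, StandardCharsets.UTF_8))) {
            StringBuilder text = new StringBuilder();
            String line;
            while ((line = in.readLine()) != null)
                text.append(line).append('\n');
            return text.toString();
        }
    }

    public static void main(String[] args) throws IOException {
        // An absolute resource path, resolved from the top of the classpath.
        String text = read("/mypackage/foo.txt");
        System.out.println(text == null ? "resource not found" : text);
    }
}
```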
Using a DataOutputStream, you could write an application that saves the data content of your objects as simple types. However, Java provides an even more powerful mechanism called object serialization that does almost all the work for you. In its simplest form, object serialization is an automatic way to save and load the state of an object. Object serialization also has depths that we cannot plumb within the scope of this book, including complete control over the serialization process and interesting conundrums such as class versioning.
Basically, an object of any class that implements the Serializable interface can be saved and restored from a stream. Special stream subclasses, ObjectInputStream and ObjectOutputStream , are used to serialize primitive types and objects. Subclasses of Serializable classes are also serializable. The default serialization mechanism saves the value of an object's nonstatic and nontransient (see the following explanation) member variables.
One of the most important (and tricky) things about serialization is that when an object is serialized, any object references it contains are also serialized. Serialization can capture entire "graphs" of interconnected objects.
In the following example, we create a Hashtable and write it to a disk file called h.ser . The Hashtable object is serializable because it implements the Serializable interface.
//file: Save.java
import java.io.*;
import java.util.*;
public class Save {
public static void main(String[] args) {
Hashtable h = new Hashtable();
h.put("string", "Gabriel Garcia Marquez");
h.put("int", new Integer(26));
h.put("double", new Double(Math.PI));
try {
FileOutputStream fileOut = new FileOutputStream("h.ser");
ObjectOutputStream out = new ObjectOutputStream(fileOut);
out.writeObject(h);
}
catch (Exception e) {
System.out.println(e);
}
}
}
First we construct a Hashtable with a few elements in it. Then, in the three lines of code inside the try block, we write the Hashtable to a file called h.ser , using the writeObject() method of ObjectOutputStream . The ObjectOutputStream class is a lot like the DataOutputStream class, except that it includes the powerful writeObject() method.
The Hashtable we created has internal references to the items it contains. Thus, these components are automatically serialized along with the Hashtable. We'll see this in the next example when we read the Hashtable back in.
//file: Load.java
import java.io.*;
import java.util.*;
public class Load {
public static void main(String[] args) {
try {
FileInputStream fileIn = new FileInputStream("h.ser");
ObjectInputStream in = new ObjectInputStream(fileIn);
Hashtable h = (Hashtable)in.readObject();
System.out.println(h.toString());
}
catch (Exception e) {
System.out.println(e);
}
}
}
In this example, we read the Hashtable from the h.ser file, using the readObject() method of ObjectInputStream . The ObjectInputStream class is a lot like DataInputStream , except that it includes the readObject() method. The return type of readObject() is Object , so we need to cast it to a Hashtable . Finally, we print out the contents of the Hashtable using its toString() method.
Often simple deserialization alone is not enough to reconstruct the full state of an object. For example, the object may have had transient fields representing state that could not be serialized, such as network connections, event registration, or decoded image data. Objects have an opportunity to do their own setup after deserialization by implementing a special method named readObject() .
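Not to be confused with the readObject() method of ObjectInputStream, this method is implemented by the serializable object itself. It must be private and have exactly the following signature: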
private void readObject(ObjectInputStream s)
throws IOException, ClassNotFoundException
{
s.defaultReadObject();
initialize();
if (isRunning)
start();
}
When the readObject() method with this signature exists in an object it is called during the deserialization process. The argument to the method is the ObjectInputStream doing the object construction. We delegate to its defaultReadObject() method to do the normal deserialization and then do our custom setup. In this case we call one of our methods, named initialize(), and then call start() if the isRunning flag was set.
We'll talk more about serialization in Chapter 21 when we discuss JavaBeans. There we'll see that it is even possible to serialize a graphical GUI component in mid-use and bring it back to life later.
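To see transient fields and a custom readObject() working together, consider the following sketch. The Counter class, its log field, and the roundTrip() helper are all invented for illustration; the example serializes to a byte array rather than a file so that it leaves nothing behind.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class Counter implements Serializable {
    private static final long serialVersionUID = 1L;

    private int count;                     // saved by default serialization
    private transient StringBuilder log;   // skipped; rebuilt after loading

    Counter() { initialize(); }

    private void initialize() { log = new StringBuilder(); }

    void increment() { count++; log.append("increment\n"); }
    int getCount() { return count; }
    boolean hasLog() { return log != null; }

    // Called by the serialization machinery, never directly by us.
    private void readObject(ObjectInputStream s)
            throws IOException, ClassNotFoundException {
        s.defaultReadObject();   // restores count
        initialize();            // recreates the transient log
    }

    // Serialize an object to memory and read a copy back.
    static Counter roundTrip(Counter c) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(c);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Counter) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Counter c = new Counter();
        c.increment();
        c.increment();
        Counter copy = roundTrip(c);
        System.out.println(copy.getCount() + " " + copy.hasLog());   // 2 true
    }
}
```

Without the readObject() method, the copy's log field would come back as null, because constructors of a Serializable class are not run during deserialization.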
The java.util.zip package contains classes you can use for data compression. In this section, we'll talk about how to use these classes. We'll also present two useful example programs that build on what you have just learned about streams and files. The classes in the java.util.zip package support two widespread compression formats: GZIP and ZIP.
The java.util.zip package provides two FilterOutputStream subclasses to write compressed data to a stream. To write compressed data in the GZIP format, simply wrap a GZIPOutputStream around an underlying stream and write to it. The following is a complete example that shows how to compress a file using the GZIP format.
//file: GZip.java
import java.io.*;
import java.util.zip.*;
public class GZip {
public static int sChunk = 8192;
public static void main(String[] args) {
if (args.length != 1) {
System.out.println("Usage: GZip source");
return;
}
// create output stream
String zipname = args[0] + ".gz";
GZIPOutputStream zipout;
try {
FileOutputStream out = new FileOutputStream(zipname);
zipout = new GZIPOutputStream(out);
}
catch (IOException e) {
System.out.println("Couldn't create " + zipname + ".");
return;
}
byte[] buffer = new byte[sChunk];
// compress the file
try {
FileInputStream in = new FileInputStream(args[0]);
int length;
while ((length = in.read(buffer, 0, sChunk)) != -1)
zipout.write(buffer, 0, length);
in.close();
}
catch (IOException e) {
System.out.println("Couldn't compress " + args[0] + ".");
}
try { zipout.close(); }
catch (IOException e) {}
}
}
First we check to make sure we have a command-line argument representing a filename. Then we construct a GZIPOutputStream wrapped around a FileOutputStream representing the given filename, with the .gz suffix appended. With this in place, we open the source file. We read chunks of data from the source file and write them into the GZIPOutputStream. Finally, we clean up by closing our open streams.
Writing data to a ZIP archive file is a little more involved but still quite manageable. While a GZIP file contains only one compressed file, a ZIP file is actually a collection of files, some (or all) of which may be compressed. Each item in the ZIP file is represented by a ZipEntry object. When writing to a ZipOutputStream , you'll need to call putNextEntry() before writing the data for each item. The following example shows how to create a ZipOutputStream . You'll notice it's just like creating a GZIPOutputStream :
ZipOutputStream zipout;
try {
FileOutputStream out = new FileOutputStream("archive.zip");
zipout = new ZipOutputStream(out);
}
catch (IOException e) {}
Let's say we have two files we want to write into this archive. Before we begin writing, we need to call putNextEntry() . We'll create a simple entry with just a name. There are other fields in ZipEntry that you can set, but most of the time you won't need to bother with them.
try {
ZipEntry entry = new ZipEntry("First");
zipout.putNextEntry(entry);
entry = new ZipEntry("Second");
zipout.putNextEntry(entry);
. . .
}
catch (IOException e) {}
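Putting the pieces together, a complete write might look like the following sketch. The class ZipWrite and its zip() helper are hypothetical, and we build the archive in a byte array instead of a file purely to keep the example self-contained; wrapping a FileOutputStream works the same way.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

class ZipWrite {
    // Build a ZIP archive in memory with one entry per name/content pair.
    static byte[] zip(String[] names, String[] contents) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zipout = new ZipOutputStream(bytes)) {
            for (int i = 0; i < names.length; i++) {
                zipout.putNextEntry(new ZipEntry(names[i]));   // begin the item
                zipout.write(contents[i].getBytes(StandardCharsets.UTF_8));
                zipout.closeEntry();                            // finish the item
            }
        }   // closing the stream writes the archive's directory
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] archive = zip(new String[] { "First", "Second" },
                             new String[] { "hello", "world" });
        System.out.println(archive.length + " bytes");
    }
}
```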
To decompress data, you can use one of the two FilterInputStream subclasses provided in java.util.zip . To decompress data in the GZIP format, simply wrap a GZIPInputStream around an underlying FileInputStream and read from it. The following is a complete example that shows how to decompress a GZIP file:
//file: GUnzip.java
import java.io.*;
import java.util.zip.*;
public class GUnzip {
public static int sChunk = 8192;
public static void main(String[] args) {
if (args.length != 1) {
System.out.println("Usage: GUnzip source");
return;
}
// create input stream
String zipname, source;
if (args[0].endsWith(".gz")) {
zipname = args[0];
source = args[0].substring(0, args[0].length() - 3);
}
else {
zipname = args[0] + ".gz";
source = args[0];
}
GZIPInputStream zipin;
try {
FileInputStream in = new FileInputStream(zipname);
zipin = new GZIPInputStream(in);
}
catch (IOException e) {
System.out.println("Couldn't open " + zipname + ".");
return;
}
byte[] buffer = new byte[sChunk];
// decompress the file
try {
FileOutputStream out = new FileOutputStream(source);
int length;
while ((length = zipin.read(buffer, 0, sChunk)) != -1)
out.write(buffer, 0, length);
out.close();
}
catch (IOException e) {
System.out.println("Couldn't decompress " + args[0] + ".");
}
try { zipin.close(); }
catch (IOException e) {}
}
}
First we check to make sure we have a command-line argument representing a filename. If the argument ends with .gz , we figure out what the filename for the uncompressed file should be. Otherwise, we use the given argument and assume the compressed file has the .gz suffix. Then we construct a GZIPInputStream wrapped around a FileInputStream , representing the compressed file. With this in place, we open the target file. We read chunks of data from the GZIPInputStream and write them into the target file. Finally, we clean up by closing our open streams.
Again, the ZIP archive presents a little more complexity than the GZIP file. When reading from a ZipInputStream , you should call getNextEntry() before reading each item. When getNextEntry() returns null , there are no more items to read. The following example shows how to create a ZipInputStream . You'll notice it's just like creating a GZIPInputStream :
ZipInputStream zipin;
try {
FileInputStream in = new FileInputStream("archive.zip");
zipin = new ZipInputStream(in);
}
catch (IOException e) {}
Suppose we want to read two files from this archive. Before we begin reading, we need to call getNextEntry() . At the least, the entry will give us a name of the item we are reading from the archive:
try {
ZipEntry first = zipin.getNextEntry();
}
catch (IOException e) {}
At this point, you can read the contents of the first item in the archive. When you come to the end of the item, the read() method will return -1 . Now you can call getNextEntry() again to read the second item from the archive:
try {
ZipEntry second = zipin.getNextEntry();
}
catch (IOException e) {}
If you call getNextEntry(), and it returns null , there are no more items, and you have reached the end of the archive.
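In practice you rarely know how many items an archive holds, so the calls above usually end up in a loop. The following sketch shows one way to write it; the ZipList class and its readAll() helper are our own inventions, and we assume each item is small UTF-8 text so that it can be collected into a String.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

class ZipList {
    // Return one "name=content" string for each item in the archive.
    static List<String> readAll(InputStream source) throws IOException {
        List<String> items = new ArrayList<String>();
        byte[] buffer = new byte[8192];
        try (ZipInputStream zipin = new ZipInputStream(source)) {
            ZipEntry entry;
            while ((entry = zipin.getNextEntry()) != null) {   // null: end of archive
                ByteArrayOutputStream content = new ByteArrayOutputStream();
                int length;
                // read() returns -1 at the end of the current item
                while ((length = zipin.read(buffer, 0, buffer.length)) != -1)
                    content.write(buffer, 0, length);
                items.add(entry.getName() + "=" + content.toString("UTF-8"));
            }
        }
        return items;
    }

    public static void main(String[] args) throws IOException {
        for (String item : readAll(new FileInputStream(args[0])))
            System.out.println(item);
    }
}
```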
The java.nio package is a major new addition in Java 1.4. The name NIO stands for "new I/O," which may seem to imply that it is to be a replacement for the java.io package. In fact, much of the NIO functionality does overlap with existing APIs. NIO was added primarily to address specific issues of scalability for large systems, especially in networked applications. That said, NIO also provides several new features Java lacked in basic I/O, so there are some tools here that you'll want to look at even if you aren't planning to write any large or high-performance services. The primary features of NIO are outlined in the following sections.
Most of the need for the NIO package was driven by the need for nonblocking and selectable I/O in Java.
In addition to nonblocking and selectable I/O, the NIO package enables closing and interrupting I/O operations asynchronously. As discussed in Chapter 8, prior to NIO there was no reliable way to stop or wake up a thread blocked in an I/O operation. With NIO, threads blocked in I/O operations always wake up when interrupted or when the channel is closed by anyone. Additionally, if you interrupt a thread while it is blocked in an NIO operation, its channel is automatically closed. (Closing the channel because the thread is
Channel I/O is designed around the concept of buffers , which are a more sophisticated form of array, tailored to working with communications. The NIO package supports the concept of direct buffers , buffers that maintain their memory outside the Java virtual machine, in the native host operating system. Since all real I/O operations ultimately have to work with the host OS, by maintaining the buffer space there, some operations can be made much more efficient. Data can be transferred without first copying it into Java and back out.
NIO provides two general-purpose file-related features: memory-mapped files and file locking. We'll discuss memory-mapped files later, but suffice it to say that they allow you to work with file data as if it were all magically resident in memory. File locking supports the concept of shared and exclusive locks on regions of files.
While java.io deals with streams, java.nio works with channels. A channel is an endpoint for communication. Although in practice channels are similar to streams, the underlying notion of a channel is a bit more abstract and primitive. Whereas streams in java.io are defined in terms of input or output with methods to read and write bytes, the basic channel interface says nothing about how communications happen. It simply defines whether the channel is open or closed, via the methods isOpen() and close() . Implementations of channels for files, network sockets, or arbitrary devices then add their own methods for operations such as reading, writing, or transferring data. The following channels are provided by NIO:
FileChannel
Pipe.SinkChannel , Pipe.SourceChannel
SocketChannel , ServerSocketChannel , DatagramChannel
We'll cover FileChannel in this chapter. The Pipe channels are simply the channel equivalents of the java.io Pipe facilities. We'll talk about Socket and Datagram channels in Chapter 12.
All these basic channels implement the ByteChannel interface, designed for channels that have read and write methods such as I/O streams. ByteChannel s read and write ByteBuffer s, however, not byte arrays.
In addition to these native channels, you can bridge to channels from java.io I/O streams and readers and writers for interoperability. Know that, if you mix these features, you may not get the full benefits of performance and asynchronous I/O.
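The bridge in question is the java.nio.channels.Channels utility class. As a minimal sketch, here is a stream-to-stream copy performed through channels; the ChannelCopy class and its copy() method are hypothetical names, and the 8192-byte buffer size is an arbitrary choice.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;

class ChannelCopy {
    // Copy everything from in to out using channels; returns bytes copied.
    static long copy(InputStream in, OutputStream out) throws IOException {
        ReadableByteChannel src = Channels.newChannel(in);
        WritableByteChannel dst = Channels.newChannel(out);
        ByteBuffer buff = ByteBuffer.allocate(8192);
        long total = 0;
        while (src.read(buff) != -1) {   // -1 signals end of stream
            buff.flip();                 // switch from filling to draining
            while (buff.hasRemaining())  // a write may be partial
                total += dst.write(buff);
            buff.clear();                // ready for the next read
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long n = copy(new ByteArrayInputStream("hello".getBytes("UTF-8")), out);
        System.out.println(n + " bytes copied");   // 5 bytes copied
    }
}
```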
Most of the utilities of the java.io and java.net packages operate on byte arrays. The corresponding tools of the NIO package are built around ByteBuffer s (with another type of buffer, CharBuffer , serving as a bridge to the text world). Byte arrays are simple, so why are buffers necessary? They serve several purposes.
They formalize the usage patterns for buffered data: they provide for things like read-only buffers and keep track of read/write positions.
They provide additional APIs for working with raw data representing primitive types. You can create buffers that "view" your byte data as a series of larger primitives such as short s, int s, or float s. The most general type of data buffer, ByteBuffer , includes methods that let you read and write all primitive types like DataOutputStream does for streams.
They abstract the underlying storage of the data, allowing for special optimizations by Java. Specifically, buffers may be allocated as direct buffers that use native buffers of the host operating system instead of arrays in Java's memory. The NIO Channel facilities that work with buffers can recognize direct buffers automatically and try to optimize I/O to use them. For example, a read from a file channel into a Java byte array normally requires Java to copy the data for the read from the host operating system into Java's memory. But with a direct buffer the data can be transferred without first copying it into Java's memory.
A buffer is a subclass of the java.nio.Buffer class. The base Buffer is something like an array with state. The base Buffer class does not specify what type of elements it holds (that is for subclasses to decide).
Implementations of Buffer add specific, typed get and put methods that read and write the buffer contents. For example, ByteBuffer is a buffer of bytes and it has get() and put() methods that read and write bytes and arrays of bytes (along with many other useful methods we'll discuss later). Getting from and putting to the Buffer advances its position marker.
The mark, position, limit, and capacity values always obey the formula:
mark <= position <= limit <= capacity
The position for reading and writing the Buffer is always between the mark, which serves as a lower bound, and the limit, which serves as an upper bound. The capacity represents the physical extent of the buffer space.
You can set the position and limit markers explicitly with the position() and limit() methods. But several convenience methods are provided for the common usage patterns. The reset() method sets the position back to the mark. If no mark has been set, an InvalidMarkException is thrown. The clear() method resets the position to zero and makes the limit the capacity, readying the buffer for new data (the mark is discarded). Note that the clear() method does not actually do anything to the data in the buffer; it simply changes the position markers.
The flip() method is used for the common pattern of writing data into the buffer and then reading it back out. flip() makes the current position the limit and then resets the current position to zero (any mark is thrown away). This saves having to keep track of how much data was read. Another method, rewind(), simply resets the position to zero, leaving the limit alone. You might use it to write the same data again. Here is a snippet of code that uses these methods to read data from a channel and write it to two channels:
ByteBuffer buff = ...
while (inChannel.read(buff) > 0) { // position = ?
buff.flip(); // limit = position; position = 0;
outChannel.write(buff);
buff.rewind(); // position = 0
outChannel2.write(buff);
buff.clear(); // position = 0; limit = capacity
}
This might be confusing the first time you look at it because here the read from the Channel is actually a write to the Buffer and vice versa. Because this example writes all the available data up to the limit, the position already equals the limit after the first write, so flip() and rewind() have the same effect in this case.
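If the bookkeeping is hard to picture, it can help to print the markers after each call. The following sketch does just that with a small 8-byte buffer; BufferMarkers and its state() helper are hypothetical names of our own.

```java
import java.nio.ByteBuffer;

class BufferMarkers {
    // Describe the buffer's current markers.
    static String state(ByteBuffer b) {
        return "position=" + b.position() + " limit=" + b.limit()
             + " capacity=" + b.capacity();
    }

    public static void main(String[] args) {
        ByteBuffer buff = ByteBuffer.allocate(8);
        buff.put((byte) 1).put((byte) 2).put((byte) 3);
        System.out.println(state(buff));   // position=3 limit=8 capacity=8

        buff.flip();                       // limit = position; position = 0
        System.out.println(state(buff));   // position=0 limit=3 capacity=8

        buff.get();
        buff.get();                        // each relative get advances the position
        System.out.println(state(buff));   // position=2 limit=3 capacity=8

        buff.rewind();                     // position = 0; limit unchanged
        System.out.println(state(buff));   // position=0 limit=3 capacity=8

        buff.clear();                      // position = 0; limit = capacity
        System.out.println(state(buff));   // position=0 limit=8 capacity=8
    }
}
```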
As stated earlier, various buffer types add get and put methods for reading and writing specific data types. There is a buffer type for each of the Java primitive types except boolean: ByteBuffer, CharBuffer, ShortBuffer, IntBuffer, LongBuffer, FloatBuffer, and DoubleBuffer. Each provides get and put methods for reading and writing its type and arrays of its type. Of these, ByteBuffer is the most flexible. Because it has the "finest grain" of all the buffers, it has been given a full complement of get and put methods for reading and writing all the other data types, as well as byte. Here are some ByteBuffer methods:
byte get()
char getChar()
short getShort()
int getInt()
long getLong()
float getFloat()
double getDouble()
void put(byte b)
void put(ByteBuffer src)
void put(byte[] src, int offset, int length)
void put(byte[] src)
void putChar(char value)
void putShort(short value)
void putInt(int value)
void putLong(long value)
void putFloat(float value)
void putDouble(double value)
As we said, all the standard buffers also support random access. So for each of the aforementioned methods of ByteBuffer , there is an additional form that takes an index:
getLong(int index)
putLong(int index, long value)
But that's not all. ByteBuffer can also provide "views" of itself as any of the larger grained types. For example, you can fetch a ShortBuffer view of a ByteBuffer with the asShortBuffer() method. The ShortBuffer view is backed by the ByteBuffer, which means that the two share the same data.
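A short experiment shows the sharing at work. In this sketch (the ShortView class and its bytesOf() helper are hypothetical), we write two short values through a view and then look at the bytes of the underlying buffer, which uses big endian order by default.

```java
import java.nio.ByteBuffer;
import java.nio.ShortBuffer;
import java.util.Arrays;

class ShortView {
    // Store two shorts through a view and return the underlying bytes.
    static byte[] bytesOf(short a, short b) {
        ByteBuffer bytes = ByteBuffer.allocate(4);     // big endian by default
        ShortBuffer shorts = bytes.asShortBuffer();    // view over the same storage
        shorts.put(a).put(b);                          // moves only the view's position
        return bytes.array();                          // the backing array saw the writes
    }

    public static void main(String[] args) {
        byte[] raw = bytesOf((short) 0x0102, (short) 0x0304);
        System.out.println(Arrays.toString(raw));   // [1, 2, 3, 4]
    }
}
```

Note that the view keeps its own position and limit; writing through it leaves the ByteBuffer's position untouched.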
CharBuffers are interesting as well, primarily because of their integration with Strings. Both CharBuffers and Strings implement the java.lang.CharSequence interface. This is the interface that provides the standard charAt() and length() methods. Because of this, APIs that accept a CharSequence, such as the java.util.regex package, can work with either a CharBuffer or a String.
Now, since we're talking about reading and writing types larger than a byte here, the question arises: in what order do the bytes of multibyte values (e.g., shorts, ints) get written? There are two camps in this world: "big endian" and "little endian." [1] Big endian means that the most significant bytes come first; little endian is the reverse. If you're writing binary data for consumption by some native application, this is important. Intel-compatible computers use little endian, and many workstations that run Unix use big endian. The ByteOrder class encapsulates the choice. You can specify the byte order to use with the ByteBuffer order() method, using the identifiers ByteOrder.BIG_ENDIAN and ByteOrder.LITTLE_ENDIAN like so:
byteBuffer.order(ByteOrder.BIG_ENDIAN);
You can retrieve the native ordering for your platform using the static ByteOrder.nativeOrder() method.
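To see the difference between the two orderings, we can write the same int both ways and inspect the bytes. This is only an illustrative sketch; EndianDemo and its encode() helper are names we made up.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

class EndianDemo {
    // Return the four bytes of value as written in the given byte order.
    static byte[] encode(int value, ByteOrder order) {
        ByteBuffer buff = ByteBuffer.allocate(4);
        buff.order(order);      // affects the multibyte get and put methods
        buff.putInt(value);
        return buff.array();
    }

    public static void main(String[] args) {
        int value = 0x0A0B0C0D;
        System.out.println(Arrays.toString(encode(value, ByteOrder.BIG_ENDIAN)));
        // [10, 11, 12, 13]
        System.out.println(Arrays.toString(encode(value, ByteOrder.LITTLE_ENDIAN)));
        // [13, 12, 11, 10]
        System.out.println("native order: " + ByteOrder.nativeOrder());
    }
}
```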
You can create a buffer either by allocating it explicitly using allocate() or by wrapping an existing array type. Each buffer type has a static allocate() method that takes a capacity (size) and also a wrap() method that takes an existing array:
CharBuffer cbuf = CharBuffer.allocate(64*1024);
A direct buffer is allocated in the same way, with the allocateDirect() method:
ByteBuffer bbuf = ByteBuffer.allocateDirect(64*1024);
As we described earlier, direct buffers can use native host operating-system memory structures that are optimized for use with some kinds of I/O operations.
Character encoders and decoders turn characters into raw bytes and vice versa, mapping from the Unicode standard to particular encoding schemes. Encoders and decoders have always existed in Java for use by Reader and Writer streams and in the methods of the String class that work with byte arrays. However, prior to Java 1.4, there was no API for working with encoding explicitly; you simply referred to encoders and decoders wherever necessary by name as a String . The java.nio.charset package formalizes the idea of a Unicode character set with the Charset class.
The Charset class is a factory for Charset instances, which know how to encode character buffers to byte buffers and decode byte buffers to character buffers. You can look up a character set by name with the static Charset.forName() method and use it in conversions:
Charset charset = Charset.forName("US-ASCII");
// byteBuff is an existing ByteBuffer holding ASCII-encoded data
CharBuffer charBuff = charset.decode(byteBuff); // bytes to characters
byteBuff = charset.encode(charBuff); // and back to bytes
You can also test to see if an encoding is available with the static Charset.isSupported() method.
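The effect of the choice of character set is easy to observe by encoding the same text in different ways. The sketch below is ours, not part of any standard API beyond the Charset calls themselves: CharsetRoundTrip, encodedLength(), and roundTrip() are hypothetical names. It also shows that the convenience encode() method silently substitutes a replacement byte for characters the character set can't represent.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;

class CharsetRoundTrip {
    // Number of bytes needed to encode text in the named character set.
    static int encodedLength(String text, String charsetName) {
        Charset charset = Charset.forName(charsetName);
        ByteBuffer bytes = charset.encode(text);   // position 0, limit = byte count
        return bytes.remaining();
    }

    // Encode text to bytes and decode it again with the same character set.
    static String roundTrip(String text, String charsetName) {
        Charset charset = Charset.forName(charsetName);
        ByteBuffer bytes = charset.encode(CharBuffer.wrap(text));
        return charset.decode(bytes).toString();
    }

    public static void main(String[] args) {
        String text = "caf\u00e9";   // "cafe" with an accented e
        System.out.println(encodedLength(text, "UTF-8"));        // 5
        System.out.println(encodedLength(text, "ISO-8859-1"));   // 4
        System.out.println(roundTrip(text, "UTF-8").equals(text));   // true
        System.out.println(roundTrip(text, "US-ASCII"));         // caf?
    }
}
```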
The following character sets are guaranteed to be supplied:
US-ASCII
ISO-8859-1
UTF-8
UTF-16BE
UTF-16LE
UTF-16
You can list all the character sets available on your platform using the static availableCharsets() method:
Map map = Charset.availableCharsets();
Iterator it = map.keySet().iterator();
while (it.hasNext())
System.out.println(it.next());
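The result of availableCharsets() is a sorted map from each character set's canonical name to its Charset object. A character set may also be known by other names; you can retrieve these with the Charset aliases() method.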
In addition to the buffer-oriented classes of the java.nio package, the InputStreamReader and OutputStreamWriter bridge classes of the java.io package have been updated to work with Charset as well. You can specify the encoding as a Charset object or by name.
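A brief sketch of the bridge classes used with a Charset object follows. BridgeWithCharset and its two helpers are hypothetical names for this illustration, and in-memory byte array streams stand in for a real file or socket.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;

// Hypothetical demo class, not part of the Java API
class BridgeWithCharset {
    // Writes text through an OutputStreamWriter and returns the encoded bytes.
    static byte[] encode(String text, Charset charset) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        Writer writer = new OutputStreamWriter(bytes, charset);
        writer.write(text);
        writer.close();                 // flushes the encoder
        return bytes.toByteArray();
    }

    // Reads bytes through an InputStreamReader and returns the decoded text.
    static String decode(byte[] data, Charset charset) throws IOException {
        Reader reader = new InputStreamReader(new ByteArrayInputStream(data), charset);
        StringBuilder text = new StringBuilder();
        int c;
        while ((c = reader.read()) != -1)
            text.append((char) c);
        reader.close();
        return text.toString();
    }
}
```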
You can get more control over the encoding and decoding process by creating an instance of CharsetEncoder or CharsetDecoder (codec) with the Charset newEncoder() and newDecoder() methods. In our earlier example, we assumed that all the data was available in a single buffer. More often, however, we might have to process data as it arrives in chunks. The encoder/decoder API allows for this by providing more general encode() and decode() methods that take a flag specifying whether more data is expected. The codec needs to know this because it might have been left hanging in the middle of a multibyte character conversion when the data ran out.
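CharsetDecoder decoder = Charset.forName("US-ASCII").newDecoder();
boolean done = false;
while (!done) {
    done = (in.read(bbuff) == -1);       // -1 signals the end of input
    bbuff.flip();
    decoder.decode(bbuff, cbuff, done);  // done tells the decoder no more data follows
    bbuff.compact();                     // keep any bytes of an unfinished character
}
decoder.flush(cbuff);                    // write out anything the decoder still holds
cbuff.flip();
// use cbuff. . .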
Here we look for the end of input condition on the in channel to set the flag done. The encode() and decode() methods also return a special result object, CoderResult, that can determine the progress of encoding. The methods isError(), isUnderflow(), and isOverflow() on the CoderResult specify why encoding stopped: because of an error, a lack of bytes in the input buffer, or a full output buffer, respectively.
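The following sketch shows how these result tests and the end-of-input flag interact. CoderResultDemo and describe() are hypothetical names for this illustration; the decoder is a UTF-8 decoder so that an incomplete multibyte character can be demonstrated.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;

// Hypothetical demo class, not part of the Java API
class CoderResultDemo {
    // Decodes the bytes as UTF-8 into an output buffer of the given
    // capacity and reports why decode() returned.
    static String describe(byte[] input, int outCapacity, boolean endOfInput) {
        CharsetDecoder decoder = Charset.forName("UTF-8").newDecoder();
        CoderResult result = decoder.decode(
            ByteBuffer.wrap(input), CharBuffer.allocate(outCapacity), endOfInput);
        if (result.isError()) return "error";       // malformed or unmappable input
        if (result.isOverflow()) return "overflow"; // output buffer is full
        return "underflow";                         // decoder wants more input
    }

    public static void main(String[] args) {
        byte[] abc = { 'a', 'b', 'c' };
        byte[] half = { (byte) 0xC3 };  // first byte of a two-byte character
        System.out.println(describe(abc, 16, true));   // all input consumed
        System.out.println(describe(abc, 1, true));    // no room for the output
        System.out.println(describe(half, 16, false)); // waits for the second byte
        System.out.println(describe(half, 16, true));  // nothing more is coming
    }
}
```

With the flag set to false, the lone leading byte is simply left in the input buffer for the next call. With the flag set to true, the decoder knows the character can never be completed and reports it as malformed.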
Now that we've covered the basics of channels and buffers, it's time to look at a real channel type. The FileChannel is the NIO equivalent of java.io.RandomAccessFile, but it provides several new features in addition to some performance optimizations. You will want to use a FileChannel in place of a plain java.io file stream if you wish to use file locking or memory-mapped file access, or to perform highly optimized data transfer between files or between file and network channels.
A FileChannel is constructed from a FileInputStream , FileOutputStream , or RandomAccessFile :
FileChannel readOnlyFc = new FileInputStream("file.txt").getChannel();
FileChannel readWriteFc =
new RandomAccessFile("file.txt", "rw").getChannel();
FileChannels for file input and output streams are read-only or write-only, respectively. To get a read-write FileChannel you must construct a RandomAccessFile with read-write permissions, as in the previous example.
Using a FileChannel is just like a RandomAccessFile , but it works with ByteBuffer instead of byte arrays:
bbuf.clear();
readOnlyFc.position(index);
readOnlyFc.read(bbuf);
bbuf.flip();
readWriteFc.write(bbuf);
You can control how much data is read and written either by setting buffer position and limit markers or using another form of read/write that takes a buffer starting position and length. You can also read and write to a random position using:
readWriteFc.read(bbuf, index);
readWriteFc.write(bbuf, index2);
In each case, the actual number of bytes read or written depends on several factors. The operation tries to read or write to the limit of the buffer, and the vast majority of the time that is what happens with local file access. But the operation is only guaranteed to block until at least one byte has been processed. Whatever happens, the number of bytes processed is returned from the operation.
The size of the file is always available with the size() method. It can change if you write past the end of the file. Conversely, you can truncate the file to a specified length with the truncate() method.
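Because a single read or write may process fewer bytes than requested, robust code usually loops until the buffer is exhausted. The following is a minimal sketch under that assumption; FileChannelDemo, writeThenReadAt(), and truncateTo() are hypothetical names for this illustration.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical demo class, not part of the Java API
class FileChannelDemo {
    // Writes data at the start of the file, then reads up to `length`
    // bytes back from the absolute position `offset`.
    static byte[] writeThenReadAt(File file, byte[] data, long offset, int length)
            throws IOException {
        RandomAccessFile raf = new RandomAccessFile(file, "rw");
        try {
            FileChannel fc = raf.getChannel();
            ByteBuffer out = ByteBuffer.wrap(data);
            while (out.hasRemaining())
                fc.write(out);              // each call may write only part
            ByteBuffer in = ByteBuffer.allocate(length);
            long pos = offset;
            while (in.hasRemaining()) {
                int n = fc.read(in, pos);   // positional read; -1 at end of file
                if (n == -1) break;
                pos += n;
            }
            in.flip();
            byte[] result = new byte[in.remaining()];
            in.get(result);
            return result;
        } finally {
            raf.close();                    // also closes the channel
        }
    }

    // Cuts the file down to newLength bytes and returns the resulting size.
    static long truncateTo(File file, long newLength) throws IOException {
        RandomAccessFile raf = new RandomAccessFile(file, "rw");
        try {
            FileChannel fc = raf.getChannel();
            fc.truncate(newLength);
            return fc.size();
        } finally {
            raf.close();
        }
    }
}
```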
FileChannels are safe for use by multiple threads.
As with all Channels, a FileChannel may be closed by any thread. Once closed, all its read/write and position-related methods throw a ClosedChannelException.
FileChannels support exclusive and shared locks on regions of files through the lock() method:
FileLock fileLock = fileChannel.lock();
long start = 0, len = fileChannel2.size(); // size() returns a long
FileLock readLock = fileChannel2.lock(start, len, true);
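Locks may be either shared or exclusive. An exclusive lock prevents others from acquiring a lock of any kind on the specified file or file region. A shared lock allows others to acquire overlapping shared locks but not exclusive locks.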
The simple lock() method in the previous example attempts to acquire an exclusive lock for the whole file. The second form accepts a starting position and a length, as well as a flag indicating whether the lock should be shared (or exclusive). The FileLock object returned by the lock() method can be used to release the lock:
fileLock.release();
Note that file locks are a cooperative API; they do not necessarily prevent another program from reading or writing the locked file contents.
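Since a lock that is never released can block other programs, it is common to pair lock() and release() with a try/finally block. The following is one way to sketch that pattern; FileLockDemo and lockAndRelease() are hypothetical names for this illustration, and the actual locking behavior depends on the underlying operating system and filesystem.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;

// Hypothetical demo class, not part of the Java API
class FileLockDemo {
    // Takes an exclusive lock on the whole file, reports whether we hold
    // it, and always releases it before returning.
    static boolean lockAndRelease(File file) throws IOException {
        RandomAccessFile raf = new RandomAccessFile(file, "rw");
        try {
            FileChannel fc = raf.getChannel();
            FileLock lock = fc.lock();      // blocks until the lock is granted
            try {
                // ... read or modify the file here ...
                return lock.isValid() && !lock.isShared();
            } finally {
                lock.release();
            }
        } finally {
            raf.close();
        }
    }
}
```

If you would rather not block, the tryLock() method returns immediately, yielding null when another program already holds a conflicting lock.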
One of the most interesting new features of FileChannel is the ability to map a file into memory.
This may seem counterintuitive; we're getting a conceptually easier way to access our data and it's also faster and more efficient? What's the catch? There really is no catch. The reason for this is that all modern operating systems are based on the idea of virtual memory.
A good example of where a memory-mapped file would be useful is in a database. Imagine a 100-MB file containing records indexed at various positions. By mapping the file we can work with a standard ByteBuffer, reading and writing data at arbitrary positions, and let the native operating system read and write the underlying data in fine-grained pages, as necessary. We could emulate this behavior with RandomAccessFile or FileChannel, but we would have to explicitly read and write data into buffers first, and the implementation would almost certainly be less efficient.
A mapping is created with the FileChannel map() method. For example:
FileChannel fc = new RandomAccessFile("index.db", "rw").getChannel();
MappedByteBuffer mappedBuff =
fc.map(FileChannel.MapMode.READ_WRITE, 0, fc.size());
The map() method returns a MappedByteBuffer , which is simply the standard ByteBuffer with a few additional methods relating to the mapping. The most important is force() , which ensures that any data written to the buffer is flushed out to permanent storage on the disk. The READ_ONLY and READ_WRITE constant identifiers of the FileChannel.MapMode static inner class specify the type of access. Read-write access is available only when mapping a read-write file channel. Data read through the buffer is always consistent within the same Java VM. It can also be consistent across applications on the same host machine, but this is not guaranteed.
Again, a MappedByteBuffer acts just like a ByteBuffer . Continuing with the previous example, we could decode the buffer with a character decoder and search for a pattern like so:
CharBuffer cbuff = Charset.forName("US-ASCII").decode(mappedBuff);
Matcher matcher = Pattern.compile("abc*").matcher(cbuff);
while (matcher.find())
System.out.println(matcher.start()+": "+matcher.group(0));
Here we have effectively implemented the Unix grep command in about five lines of code (thanks to the fact that the Regular Expression API can work with our CharBuffer as a CharSequence ). Of course in this example, the CharBuffer allocated by the decode() method is as large as the mapped file and must be held in memory. More generally, we can use the CharsetDecoder shown earlier to iterate through a large mapped space.
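Returning to the database scenario, writing through a mapping looks much the same as reading. The following sketch stores an int at an arbitrary position in a mapped file, forces it to disk, and reads it back through a second, read-only mapping. MappedRecordDemo and writeAndReadBack() are hypothetical names for this illustration, and the 1024-byte file length is an arbitrary choice.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical demo class, not part of the Java API
class MappedRecordDemo {
    // Stores value at the given byte index of the file via a read-write
    // mapping, then reads it back via a separate read-only mapping.
    static int writeAndReadBack(File file, int index, int value) throws IOException {
        RandomAccessFile raf = new RandomAccessFile(file, "rw");
        try {
            raf.setLength(1024);            // make sure the region exists
            FileChannel fc = raf.getChannel();
            MappedByteBuffer rw =
                fc.map(FileChannel.MapMode.READ_WRITE, 0, fc.size());
            rw.putInt(index, value);        // absolute put; no read() or write() calls
            rw.force();                     // flush the change to storage
            MappedByteBuffer ro =
                fc.map(FileChannel.MapMode.READ_ONLY, 0, fc.size());
            return ro.getInt(index);
        } finally {
            raf.close();
        }
    }
}
```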
The final feature of FileChannel that we'll look at is performance optimization. FileChannel supports two highly optimized data transfer methods: transferFrom() and transferTo() , which move data between the file channel and another channel. These methods can take advantage of direct buffers internally to move data between the channels as fast as possible, often without copying the bytes into Java's memory space at all. The following example is currently the fastest way to implement a file copy in Java:
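import java.io.*;
import java.nio.channels.*;

public class CopyFile {
    public static void main(String [] args) throws Exception
    {
        String fromFileName = args[0];
        String toFileName = args[1];
        FileChannel in = new FileInputStream(fromFileName).getChannel();
        FileChannel out = new FileOutputStream(toFileName).getChannel();
        // transferTo() may move fewer bytes than requested, so loop until
        // the whole file is copied. Sizes are longs; casting to int would
        // break on files larger than 2 GB.
        long position = 0, size = in.size();
        while (position < size)
            position += in.transferTo(position, size - position, out);
        in.close();
        out.close();
    }
}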
We've laid the
[1] The terms "big endian" and "little endian" come from Jonathan Swift's novel Gulliver's Travels, where they denote two camps of Lilliputians: those who eat their eggs from the big end and those who eat them from the little end.