Part IV: New I/O

Chapter 14: Buffers

Chapter 15: Channels

Chapter 16: Nonblocking I/O




Chapter 14. Buffers

Traditional synchronous I/O is designed for traditional applications. Such applications have the following characteristics:

  • Files may be large but not huge. It's possible to read an entire file into memory.

  • An application reads from or writes to only a few files or network connections at the same time, ideally using only one stream at a time.

  • The application is sequential. It won't be able to do much until it's finished reading or writing a file.

As long as these characteristics hold, stream-based I/O is reasonably quick and operates fairly efficiently. However, if these prerequisites are violated, the standard I/O model begins to show some weaknesses. For example, web servers often need to service hundreds or thousands of connections simultaneously. Scientific, engineering, and multimedia applications often need to manipulate datasets that are gigabytes in size.

Java 1.4 introduced a new model for I/O that is designed more for these sorts of applications and less for the more traditional applications that don't have to do so much I/O. The classes that make up this new I/O library are all found in the java.nio package and its subpackages. The new I/O model does not replace traditional, stream-based I/O. Indeed, several parts of the new I/O API are based on streams. However, the new I/O model is much more efficient for certain types of I/O-bound applications.

Whereas the traditional I/O model is based on streams, the new I/O model is based on buffers and channels. A buffer is like an array (in some implementations it may in fact be an array) that holds the data to be read and written. However, unlike input and output streams (even buffered input and output streams), the same buffer can be used for both reading and writing. Input channels fill a buffer with data that output channels then drain. Rather than being a part of a channel, a buffer is a neutral meeting ground in which channels exchange data. Furthermore, because buffers are objects accessed via methods, they may not really be arrays. They can be implemented directly on top of memory or the disk for extremely fast, random access. For the right kind of application, the performance gains can be dramatic.
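
To make the "may or may not be an array" point concrete, here is a minimal sketch. The class name BufferBackingDemo and the helper isArrayBacked( ) are made up for illustration; wrap( ), allocateDirect( ), hasArray( ), and isDirect( ) are standard ByteBuffer methods. A wrapped buffer shares storage with the array it was given, while a direct buffer asks the runtime for memory outside the ordinary Java heap.

```java
import java.nio.ByteBuffer;

public class BufferBackingDemo {
  // Hypothetical helper: reports whether the buffer exposes a backing array.
  static boolean isArrayBacked(ByteBuffer buffer) {
    return buffer.hasArray();
  }

  public static void main(String[] args) {
    byte[] data = new byte[4];
    ByteBuffer wrapped = ByteBuffer.wrap(data);        // backed by the array passed in
    ByteBuffer direct = ByteBuffer.allocateDirect(4);  // memory managed outside the Java heap

    wrapped.put(0, (byte) 7);  // an absolute put writes through to the array
    System.out.println("data[0] = " + data[0]);
    System.out.println("wrapped is array-backed: " + isArrayBacked(wrapped));
    System.out.println("direct is direct: " + direct.isDirect());
  }
}
```

Either kind of buffer is used through the same methods, so code that fills and drains a buffer generally does not need to know which kind it was handed.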

Different buffers have different element types, just as arrays do. For instance, there are byte buffers, int buffers, float buffers, and char buffers. The class library doesn't contain any string buffers or object buffers, but you could write these classes yourself if you found a need. The same basic operations apply to all these different kinds of buffers:

  • Allocate the buffer.

  • Put values in the buffer.

  • Get values from the buffer.

  • Flip the buffer.

  • Clear the buffer.

  • Rewind the buffer.

  • Mark the buffer.

  • Reset the buffer.

  • Slice the buffer.

  • Compact the buffer.

  • Duplicate the buffer.
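
The operations listed above can be sketched in one short walk-through. This is only an illustration using an IntBuffer; the class name BufferOperationsDemo and the method run( ) are hypothetical, and the comments track the buffer's position and limit after each call.

```java
import java.nio.IntBuffer;
import java.util.Arrays;

public class BufferOperationsDemo {
  // Walks one IntBuffer through the basic operations and returns
  // the values observed along the way.
  static int[] run() {
    IntBuffer buffer = IntBuffer.allocate(8);  // allocate: position 0, limit 8
    buffer.put(10).put(20).put(30).put(40);    // put: position is now 4
    buffer.flip();                             // flip: limit 4, position 0

    int first = buffer.get();                  // get: reads 10, position 1
    buffer.mark();                             // mark: remembers position 1
    int second = buffer.get();                 // reads 20, position 2
    buffer.reset();                            // reset: back to position 1
    int secondAgain = buffer.get();            // reads 20 again, position 2

    IntBuffer slice = buffer.slice();          // slice: a view of the remaining 30, 40
    int sliceFirst = slice.get(0);             // 30

    IntBuffer twin = buffer.duplicate();       // duplicate: shared data, separate position
    twin.rewind();                             // rewind: position 0, limit unchanged
    int twinFirst = twin.get();                // 10; buffer itself is still at position 2

    buffer.compact();                          // compact: 30, 40 move to the front, position 2
    int afterCompact = buffer.get(0);          // 30

    buffer.clear();                            // clear: position 0, limit = capacity
    return new int[] {first, second, secondAgain, sliceFirst,
                      twinFirst, afterCompact, buffer.remaining()};
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(run()));
  }
}
```

Note that slice( ) and duplicate( ) create views over the same underlying data, so a change made through one view is visible through the others.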



14.1. Copying Files with Buffers

I'm going to begin with a simple example, copying one file to another file. The basic interface to the program looks like this:

$ java NIOCopier original copy


Obviously this program could be written in a traditional way with streams, and that's going to be true of almost all the programs you use the new I/O (NIO) model to write. NIO doesn't make anything possible that was impossible before. However, if the files are large and the local operating system is sophisticated enough, the NIO version of FileCopier might just be faster than the traditional version.

The rough outline of the program is typical:

import java.io.*;
import java.nio.*;
import java.nio.channels.*;
public class NIOCopier {
  public static void main(String[] args) throws IOException {
    FileInputStream inFile = new FileInputStream(args[0]);
    FileOutputStream outFile = new FileOutputStream(args[1]);

    // copy files here...
    inFile.close( );
    outFile.close( );
  }
}

However, rather than merely reading from the input stream and writing to the output stream, I'm going to do something a little different. First, I open channels to both files using the getChannel( ) methods in FileInputStream and FileOutputStream:

FileChannel inChannel = inFile.getChannel( );
FileChannel outChannel = outFile.getChannel( );

Next, I create a one-megabyte buffer with the static factory method ByteBuffer.allocate( ):

ByteBuffer buffer = ByteBuffer.allocate(1024*1024);

The input channel will fill this buffer with data from the original file and the output channel will drain data out of this buffer to store into the copy.

To read data, you pass the buffer to the input channel's read( ) method, much as you'd pass a byte array to an input stream's read( ) method:

inChannel.read(buffer);

The read( ) method returns the number of bytes it read. As with input streams, there's no guarantee that the read( ) method completely fills the buffer. It may read fewer bytes or no bytes at all. When the input data is exhausted, the read( ) method returns -1. Thus, you normally do something like this:

int bytesRead = inChannel.read(buffer);
if (bytesRead == -1) break;
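
To see the -1 convention in isolation, here is a minimal sketch that counts the bytes a channel delivers. The names ReadUntilEnd and countBytes( ) are hypothetical, and an in-memory stream wrapped with Channels.newChannel( ) stands in for a file so the example runs without any setup.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;

public class ReadUntilEnd {
  // Reads from the channel until read() reports end of data with -1.
  // bufferSize must be positive, or read() would have no room to make progress.
  static long countBytes(ReadableByteChannel channel, int bufferSize) throws IOException {
    ByteBuffer buffer = ByteBuffer.allocate(bufferSize);
    long total = 0;
    while (true) {
      int bytesRead = channel.read(buffer);
      if (bytesRead == -1) break;   // input exhausted
      total += bytesRead;           // may be less than bufferSize, or even 0
      buffer.clear();               // the data is discarded; only the count matters here
    }
    return total;
  }

  public static void main(String[] args) throws IOException {
    ReadableByteChannel channel = Channels.newChannel(new ByteArrayInputStream(new byte[10]));
    System.out.println(countBytes(channel, 4) + " bytes read");
  }
}
```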

Now the output channel needs to write the data in the buffer into the copy. Before it can do that, though, the buffer must be flipped:

buffer.flip( );

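Flipping a buffer switches it from being filled to being drained: the limit is set to the current position and the position is reset to zero, so the next write( ) sends exactly the bytes that were just read.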
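
A small sketch can show what flip( ) does to a buffer's bookkeeping. The class name FlipDemo and the helper state( ) are invented for this illustration; position( ) and limit( ) are standard Buffer methods.

```java
import java.nio.ByteBuffer;

public class FlipDemo {
  // Hypothetical helper: captures {position, limit} for comparison.
  static int[] state(ByteBuffer buffer) {
    return new int[] {buffer.position(), buffer.limit()};
  }

  public static void main(String[] args) {
    ByteBuffer buffer = ByteBuffer.allocate(8);
    buffer.put((byte) 1).put((byte) 2).put((byte) 3);

    int[] before = state(buffer);  // position 3, limit 8: room for five more bytes
    buffer.flip();
    int[] after = state(buffer);   // position 0, limit 3: three bytes ready to drain

    System.out.println("before flip: position=" + before[0] + " limit=" + before[1]);
    System.out.println("after flip:  position=" + after[0] + " limit=" + after[1]);
  }
}
```

Without the flip, a write would start at position 3 and send the five unused bytes between the position and the limit rather than the three bytes that were just stored.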

To write the data, you pass the buffer to the output channel's write( ) method:

outChannel.write(buffer);

However, this is not like an output stream's write(byte[]) method. That method is guaranteed to write every byte in the array to the target or throw an IOException if it can't. The output channel's write( ) method is more like the read( ) method. It will write some bytes, but perhaps not all, and perhaps even none. It returns the number of bytes written. You could loop repeatedly until all the bytes are written, like this:

long bytesWritten = 0;
while (bytesWritten < bytesRead){
  bytesWritten += outChannel.write(buffer);
}

However, there's a simpler way. The buffer object itself knows whether all the data has been written. The hasRemaining( ) method can check this:

while (buffer.hasRemaining( )) outChannel.write(buffer);
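
The hasRemaining( ) loop can be tried out against a channel that deliberately writes only part of the buffer on each call. The sketch below is an assumption-laden illustration: TrickleChannel is a made-up test channel that accepts at most three bytes per write( ), and writeFully( ) is a hypothetical helper wrapping the loop.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;

public class WriteFully {
  // Hypothetical channel that accepts at most three bytes per write() call,
  // imitating a channel that performs partial writes.
  static class TrickleChannel implements WritableByteChannel {
    final ByteArrayOutputStream sink = new ByteArrayOutputStream();
    int calls = 0;
    private boolean open = true;

    @Override
    public int write(ByteBuffer src) {
      calls++;
      int n = Math.min(3, src.remaining());
      for (int i = 0; i < n; i++) sink.write(src.get());
      return n;
    }

    @Override
    public boolean isOpen() { return open; }

    @Override
    public void close() { open = false; }
  }

  // Keeps writing until the buffer reports that nothing is left to drain.
  static void writeFully(WritableByteChannel channel, ByteBuffer buffer) throws IOException {
    while (buffer.hasRemaining()) channel.write(buffer);
  }

  public static void main(String[] args) throws IOException {
    TrickleChannel channel = new TrickleChannel();
    ByteBuffer buffer = ByteBuffer.wrap(new byte[] {1, 2, 3, 4, 5, 6, 7, 8});
    writeFully(channel, buffer);
    System.out.println(channel.sink.size() + " bytes in " + channel.calls + " calls");
  }
}
```

Because each write( ) advances the buffer's position by the number of bytes actually written, the loop needs no separate byte counter.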

This code reads and writes at most one megabyte. To copy larger files, we have to wrap all this up in a loop:

while (true) {
  ByteBuffer buffer = ByteBuffer.allocate(1024*1024);
  int bytesRead = inChannel.read(buffer);
  if (bytesRead == -1) break;
  buffer.flip( );
  while (buffer.hasRemaining( )) outChannel.write(buffer);
}

Allocating a new buffer for each read is wasteful and inefficient; we should reuse the same buffer. Before we do that, though, we must restore the buffer to a fresh state by invoking its clear( ) method:

ByteBuffer buffer = ByteBuffer.allocate(1024*1024);
while (true) {
  int bytesRead = inChannel.read(buffer);
  if (bytesRead == -1) break;
  buffer.flip( );
  while (buffer.hasRemaining( )) outChannel.write(buffer);
  buffer.clear( );
}
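
It may help to see why the clear( ) call matters. Once a flipped buffer has been drained, its position equals its limit, so there is no room left to read into, and a subsequent read( ) would typically return 0 rather than make progress. The following sketch (ClearDemo and fillAndDrain( ) are hypothetical names) imitates one pass of the loop without any channels.

```java
import java.nio.ByteBuffer;

public class ClearDemo {
  // Imitates one pass of the copy loop: put count bytes, flip, then drain them.
  static void fillAndDrain(ByteBuffer buffer, int count) {
    for (int i = 0; i < count; i++) buffer.put((byte) 'x');
    buffer.flip();
    while (buffer.hasRemaining()) buffer.get();
  }

  public static void main(String[] args) {
    ByteBuffer buffer = ByteBuffer.allocate(4);
    fillAndDrain(buffer, 2);

    // Position and limit are both 2, so nothing more can be put or read in.
    System.out.println("remaining after drain: " + buffer.remaining());

    buffer.clear();  // position 0, limit = capacity; the old bytes are not erased

    System.out.println("remaining after clear: " + buffer.remaining());
  }
}
```

Despite its name, clear( ) only resets the position and limit. The old contents stay in place until new data overwrites them.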

Finally, both the input and output channels should be closed to release any native resources the channel object may be holding onto:

inChannel.close( );
outChannel.close( );

Example 14-1 demonstrates the complete program, after taking a couple of common small shortcuts in the code. Compare this to the equivalent program for copying with streams found in Example 4-2.

Example 14-1. Copying files using NIO
import java.io.*;
import java.nio.*;
import java.nio.channels.*;
public class NIOCopier {
  public static void main(String[] args) throws IOException {
    FileInputStream inFile = new FileInputStream(args[0]);
    FileOutputStream outFile = new FileOutputStream(args[1]);
    FileChannel inChannel = inFile.getChannel( );
    FileChannel outChannel = outFile.getChannel( );
    for (ByteBuffer buffer = ByteBuffer.allocate(1024*1024);
         inChannel.read(buffer) != -1;
         buffer.clear( )) {
      buffer.flip( );
      while (buffer.hasRemaining( )) outChannel.write(buffer);
    }
    inChannel.close( );
    outChannel.close( );
  }
}
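
One thing the program above leaves open is what happens if an exception is thrown midway: the channels are never closed. As a sketch only, and assuming Java 7 or later (which postdates the Java 1.4 API discussed here), the same loop can be wrapped in a try-with-resources statement. The class name SafeNIOCopier and the method copy( ) are hypothetical; FileChannel.open( ) and StandardOpenOption are part of the later java.nio.file additions.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class SafeNIOCopier {
  // Copies source to target; both channels are closed even if an exception is thrown.
  static void copy(Path source, Path target) throws IOException {
    try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ);
         FileChannel out = FileChannel.open(target, StandardOpenOption.WRITE,
             StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING)) {
      ByteBuffer buffer = ByteBuffer.allocate(1024 * 1024);
      while (in.read(buffer) != -1) {
        buffer.flip();                                    // switch from filling to draining
        while (buffer.hasRemaining()) out.write(buffer);  // write() may be partial
        buffer.clear();                                   // make room for the next read
      }
    }
  }

  public static void main(String[] args) throws IOException {
    if (args.length != 2) {
      System.err.println("Usage: java SafeNIOCopier original copy");
      return;
    }
    copy(Paths.get(args[0]), Paths.get(args[1]));
  }
}
```

The options passed for the output channel are chosen to mirror what new FileOutputStream(name) does: create the file if it is missing and truncate it if it exists.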

In a very unscientific test, copying one large (4.3-GB) file on one platform (a dual 2.5-GHz PowerMac G5 running Mac OS X 10.4.1) using traditional I/O with buffered streams and an 8192-byte buffer took 305 seconds. Expanding and reducing the buffer size didn't shift the overall numbers more than 5% and if anything tended to increase the time to copy. (Using a one-megabyte buffer like Example 14-1's actually increased the time to over 23 minutes.) Using new I/O as implemented in Example 14-1 was about 16% faster, at 255 seconds. A straight Finder copy took 197 seconds. Using the Unix cp command actually took 312 seconds, so the Finder is doing some surprising optimizations under the hood.

What this suggests is that new I/O doesn't help a great deal for traditional file operations that move through the file from beginning to end. The new I/O API is clearly not a panacea for all I/O performance issues. You can expect to see the biggest improvements in two other areas:

  • Network servers that talk to many clients simultaneously

  • Repeated random access to parts of large files