14.1. Copying Files with Buffers
I'm going to begin with a simple example,
copying one file to another file. The basic interface to the
program looks like this:
$ java NIOCopier original copy
Obviously, this program could be written in the traditional way with
streams, and that is true of almost every program you write with the
new I/O (NIO) model. NIO doesn't make anything possible that was
impossible before. However, if the files are large and the local
operating system is sophisticated enough, the NIO version of
FileCopier might just be faster than the traditional version.
The rough outline of the program is typical:
import java.io.*;
import java.nio.*;
import java.nio.channels.*;

public class NIOCopier {

  public static void main(String[] args) throws IOException {
    FileInputStream inFile = new FileInputStream(args[0]);
    FileOutputStream outFile = new FileOutputStream(args[1]);
    // copy files here...
    inFile.close( );
    outFile.close( );
  }
}
However, rather than merely reading from the input stream and writing
to the output stream, I'm going to do something a little different.
First, I open channels to both files using the getChannel( ) methods
in FileInputStream and FileOutputStream:
FileChannel inChannel = inFile.getChannel( );
FileChannel outChannel = outFile.getChannel( );
Next, I create a one-megabyte buffer with the static factory method
ByteBuffer.allocate( ):
ByteBuffer buffer = ByteBuffer.allocate(1024*1024);
The input channel will fill this buffer with
data from the original file and the output channel will drain data
out of this buffer to store into the copy.
To read data, you pass the buffer to the input channel's read( )
method, much as you'd pass a byte array to an input stream's read( )
method:
inChannel.read(buffer);
The read( ) method returns the number of bytes it read. As with input
streams, there's no guarantee that the read( ) method completely fills
the buffer. It may read fewer bytes or no bytes at all. When the input
data is exhausted, the read( ) method returns -1. Thus, you normally
do something like this:
int bytesRead = inChannel.read(buffer);
if (bytesRead == -1) break;
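To see these return values in context, here is a minimal sketch that counts the bytes in a file by reading it through a deliberately small buffer. The class name ReadCountDemo and the helper countBytes( ) are made up for this illustration, and the try-with-resources syntax assumes Java 7 or later. The call to clear( ) simply empties the buffer so the next read has room.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

class ReadCountDemo {

    // Hypothetical helper: reads the whole file through a small buffer
    // and returns the total number of bytes the channel reported.
    static long countBytes(File file, int bufferSize) throws IOException {
        long total = 0;
        try (FileInputStream inFile = new FileInputStream(file);
             FileChannel inChannel = inFile.getChannel()) {
            ByteBuffer buffer = ByteBuffer.allocate(bufferSize);
            while (true) {
                int bytesRead = inChannel.read(buffer);
                if (bytesRead == -1) break;  // input is exhausted
                total += bytesRead;          // may be fewer than bufferSize
                buffer.clear();              // make room for the next read
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        File temp = File.createTempFile("readcount", ".bin");
        try (FileOutputStream out = new FileOutputStream(temp)) {
            out.write(new byte[10]);
        }
        // The total is 10 no matter how the individual reads were split.
        System.out.println(countBytes(temp, 4));
        temp.delete();
    }
}
```

The loop never assumes that a single read( ) call fills the buffer; it relies only on the count returned and on -1 as the end-of-input signal.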
Now the output channel needs to write the data in the buffer into the
copy. Before it can do that, though, the buffer must be flipped:
buffer.flip( );
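Flipping a buffer converts it from input to output: it sets the
buffer's limit to the current position and then resets the position to
zero, so the next operation drains exactly the bytes that were just
read.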
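The effect of a flip is easiest to see by printing the buffer's position and limit before and after the call. The following is only an illustrative sketch; FlipDemo and its describe( ) helper are names invented here, and three put( ) calls stand in for a channel read.

```java
import java.nio.ByteBuffer;

class FlipDemo {

    // Hypothetical helper: reports the buffer's position and limit.
    static String describe(ByteBuffer buffer) {
        return "position=" + buffer.position() + " limit=" + buffer.limit();
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocate(8);

        // Stand-in for inChannel.read(buffer): three bytes arrive.
        buffer.put((byte) 1).put((byte) 2).put((byte) 3);
        System.out.println(describe(buffer)); // prints "position=3 limit=8"

        // After the flip, the readable region is exactly those three bytes.
        buffer.flip();
        System.out.println(describe(buffer)); // prints "position=0 limit=3"
    }
}
```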
To write the data, you pass the buffer to the output channel's
write( ) method:
outChannel.write(buffer);
However, this is not like an output stream's write(byte[]) method.
That method is guaranteed to write every byte in the array to the
target or throw an IOException if it can't. The output channel's
write( ) method is more like the read( ) method. It will write some
bytes, but perhaps not all, and perhaps even none. It returns the
number of bytes written. You could loop repeatedly until all the bytes
are written, like this:
long bytesWritten = 0;
while (bytesWritten < bytesRead) {
  bytesWritten += outChannel.write(buffer);
}
However, there's a simpler way. The buffer object itself knows whether
all the data has been written. The hasRemaining( ) method can check
this:
while (buffer.hasRemaining( )) outChannel.write(buffer);
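File channels rarely perform short writes in practice, so the need for this loop is hard to observe directly. The sketch below fakes one: StingyChannel is a hypothetical WritableByteChannel, invented for this illustration, that accepts at most three bytes per call. The hasRemaining( ) loop still delivers every byte.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;

class PartialWriteDemo {

    // Hypothetical channel that accepts at most three bytes per call,
    // standing in for a real channel that performs a short write.
    static class StingyChannel implements WritableByteChannel {
        final ByteArrayOutputStream sink = new ByteArrayOutputStream();
        int calls = 0;

        @Override
        public int write(ByteBuffer src) {
            calls++;
            int n = Math.min(3, src.remaining());
            for (int i = 0; i < n; i++) sink.write(src.get());
            return n; // number of bytes actually consumed
        }

        @Override public boolean isOpen() { return true; }
        @Override public void close() { }
    }

    // Keeps calling write() until the buffer has been fully drained.
    static void writeFully(WritableByteChannel out, ByteBuffer buffer)
            throws IOException {
        while (buffer.hasRemaining()) out.write(buffer);
    }

    public static void main(String[] args) throws IOException {
        StingyChannel out = new StingyChannel();
        ByteBuffer buffer = ByteBuffer.wrap(new byte[10]);
        writeFully(out, buffer);
        System.out.println(out.sink.size() + " bytes in " + out.calls + " calls");
        // prints "10 bytes in 4 calls"
    }
}
```

Each write( ) call advances the buffer's position by the number of bytes consumed, which is why the buffer alone can tell the loop when to stop.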
This code reads and writes at most one megabyte. To copy larger files,
we have to wrap all this up in a loop:
while (true) {
  ByteBuffer buffer = ByteBuffer.allocate(1024*1024);
  int bytesRead = inChannel.read(buffer);
  if (bytesRead == -1) break;
  buffer.flip( );
  while (buffer.hasRemaining( )) outChannel.write(buffer);
}
Allocating a new buffer for each read is wasteful and inefficient; we
should reuse the same buffer. Before we do that, though, we must
restore the buffer to a fresh state by invoking its clear( ) method:
ByteBuffer buffer = ByteBuffer.allocate(1024*1024);
while (true) {
  int bytesRead = inChannel.read(buffer);
  if (bytesRead == -1) break;
  buffer.flip( );
  while (buffer.hasRemaining( )) outChannel.write(buffer);
  buffer.clear( );
}
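Why clear( ) matters can be sketched with a small buffer: once the data has been drained, the position equals the limit, so the buffer has no room left for the next read until it is cleared. ClearDemo and describe( ) are hypothetical names used only here, and the put( ) and get( ) calls stand in for the channel read and write.

```java
import java.nio.ByteBuffer;

class ClearDemo {

    // Hypothetical helper: reports the buffer's position and limit.
    static String describe(ByteBuffer buffer) {
        return "position=" + buffer.position() + " limit=" + buffer.limit();
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocate(8);

        buffer.put(new byte[] {1, 2, 3}); // stand-in for a channel read
        buffer.flip();
        buffer.get(new byte[3]);          // stand-in for a channel write

        // Fully drained: no space remains between position and limit.
        System.out.println(describe(buffer)); // prints "position=3 limit=3"

        // clear() resets the indices; it does not erase the old bytes.
        buffer.clear();
        System.out.println(describe(buffer)); // prints "position=0 limit=8"
    }
}
```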
Finally, both the input and output channels
should be closed to release any native resources the channel object
may be holding onto:
inChannel.close( );
outChannel.close( );
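If an exception is thrown partway through the copy, explicit close( ) calls placed at the end of the method are never reached. One possible alternative, assuming Java 7 or later, is a try-with-resources statement, which closes the streams and channels however the block exits. The class name SafeCloseCopier and its copy( ) method are invented for this sketch.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

class SafeCloseCopier {

    // Hypothetical helper: copies one file to another, closing every
    // resource even if a read or write throws an IOException.
    static void copy(File original, File copy) throws IOException {
        try (FileInputStream inFile = new FileInputStream(original);
             FileOutputStream outFile = new FileOutputStream(copy);
             FileChannel inChannel = inFile.getChannel();
             FileChannel outChannel = outFile.getChannel()) {
            ByteBuffer buffer = ByteBuffer.allocate(1024 * 1024);
            while (inChannel.read(buffer) != -1) {
                buffer.flip();
                while (buffer.hasRemaining()) outChannel.write(buffer);
                buffer.clear();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        copy(new File(args[0]), new File(args[1]));
    }
}
```

Resources are closed in the reverse of the order in which they were declared, so the channels are closed before the streams that produced them.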
Example 14-1 demonstrates the complete program, after taking a couple
of common small shortcuts in the code. Compare this to the equivalent
program for copying with streams found in Example 4-2.
Example 14-1. Copying files using NIO
import java.io.*;
import java.nio.*;
import java.nio.channels.*;

public class NIOCopier {

  public static void main(String[] args) throws IOException {
    FileInputStream inFile = new FileInputStream(args[0]);
    FileOutputStream outFile = new FileOutputStream(args[1]);
    FileChannel inChannel = inFile.getChannel( );
    FileChannel outChannel = outFile.getChannel( );
    for (ByteBuffer buffer = ByteBuffer.allocate(1024*1024);
         inChannel.read(buffer) != -1;
         buffer.clear( )) {
      buffer.flip( );
      while (buffer.hasRemaining( )) outChannel.write(buffer);
    }
    inChannel.close( );
    outChannel.close( );
  }
}
In a very unscientific test, copying one large (4.3-GB) file on one
platform (a dual 2.5-GHz PowerMac G5 running Mac OS X 10.4.1) using
traditional I/O with buffered streams and an 8192-byte buffer took 305
seconds. Expanding and reducing the buffer size didn't shift the
overall numbers more than 5% and, if anything, tended to increase the
time to copy. (Using a one-megabyte buffer like Example 14-1's
actually increased the time to over 23 minutes.) Using new I/O as
implemented in Example 14-1 was about 16% faster, at 255 seconds. A
straight Finder copy took 197 seconds. Using the Unix cp command
actually took 312 seconds, so the Finder is doing some surprising
optimizations under the hood.
What this suggests is that new I/O doesn't help
a great deal for traditional file operations that move through the
file from beginning to end. The new I/O API is clearly not a
panacea for all I/O performance issues. You can expect to see the
biggest improvements in two other areas: