The Complete Stream Zoo

   

Core Java™ 2: Volume I - Fundamentals
By Cay S. Horstmann, Gary Cornell
Chapter 12.  Streams and Files


Unlike C, which gets by just fine with a single type FILE*, Java has a whole zoo of more than 60 (!) different stream types (see Figures 12-1 and 12-2). Library designers claim that there is a good reason to give users a wide choice of stream types: it is supposed to reduce programming errors. For example, in C, some people think it is a common mistake to send output to a file that was open only for reading. (Well, it is not that common, actually.) Naturally, if you do this, the output is ignored at run time. In Java and C++, the compiler catches that kind of mistake because an InputStream (Java) or istream (C++) has no methods for output.

Figure 12-1. Input and Output stream hierarchy


Figure 12-2. Reader and Writer hierarchy


(We would argue that in C++, and even more so in Java, the main tool that the stream interface designers have against programming errors is intimidation. The sheer complexity of the stream libraries keeps programmers on their toes.)

C++ NOTE

ANSI C++ gives you more stream types than you want, such as istream, ostream, iostream, ifstream, ofstream, fstream, wistream, wifstream, istrstream, and so on (18 classes in all). But Java really goes overboard with streams and gives you separate classes for selecting buffering, lookahead, random access, text formatting, or binary data.

Let us divide the animals in the stream class zoo by how they are used. Four abstract classes are at the base of the zoo: InputStream, OutputStream, Reader, and Writer. You do not make objects of these types, but other methods can return them. For example, as you saw in Chapter 10, the URL class has the method openStream that returns an InputStream. You then use this InputStream object to read from the URL. As we mentioned before, the InputStream and OutputStream classes let you read and write only individual bytes and arrays of bytes; they have no methods to read and write strings and numbers. You need more-capable child classes for this. For example, DataInputStream and DataOutputStream let you read and write all the basic Java types.

For Unicode text, on the other hand, as we mentioned before, you use classes that descend from Reader and Writer. The basic methods of the Reader and Writer classes are similar to the ones for InputStream and OutputStream.

 abstract int read()
 abstract void write(int b)

They work just as the comparable methods do in the InputStream and OutputStream classes except, of course, the read method returns either a Unicode character (as an integer between 0 and 65535) or -1 when you have reached the end of the file.
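The end-of-input convention is easy to see in a small loop. The following sketch is ours, not part of the library: the class name ReadCharsDemo and the method countChars are made up for illustration, and a StringReader stands in for a file so that the example is self-contained.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class ReadCharsDemo {
    // Counts the characters a reader delivers before read() reports end of input.
    static int countChars(Reader in) throws IOException {
        int count = 0;
        int ch; // an int, so that -1 can be told apart from every valid character
        while ((ch = in.read()) != -1) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(countChars(new StringReader("zoo"))); // prints 3
    }
}
```

Note that the loop variable is an int rather than a char. A char could not hold the value -1 in addition to all 65536 character codes.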

Finally, there are streams that do useful stuff, for example, the ZipInputStream and ZipOutputStream that let you read and write files in the familiar ZIP compression format.

Layering Stream Filters

FileInputStream and FileOutputStream give you input and output streams attached to a disk file. You give the file name or full path name of the file in the constructor. For example,

 FileInputStream fin = new FileInputStream("employee.dat"); 

looks in the current directory for a file named "employee.dat".

CAUTION

Since the backslash character is the escape character in Java strings, be sure to use \\ for Windows-style path names ("C:\\Windows\\win.ini"). In Windows, you can also use a single forward slash ("C:/Windows/win.ini") since most Windows file handling system calls will interpret forward slashes as file separators. However, this is not recommended: the behavior of the Windows system functions is subject to change, and on other operating systems, the file separator may yet be different. Instead, for portable programs, you should use the correct file separator character. It is stored in the constant string File.separator.

You can also use a File object (see the end of the chapter for more on file objects):

 File f = new File("employee.dat");
 FileInputStream fin = new FileInputStream(f);

Like the abstract InputStream and OutputStream classes, these classes only support reading and writing on the byte level. That is, we can only read bytes and byte arrays from the object fin.

 byte b = (byte)fin.read(); 

TIP

Since all the classes in java.io interpret relative path names as starting with the user's current working directory, you may want to know this directory. You can get at this information via a call to System.getProperty("user.dir").

As you will see in the next section, if we just had a DataInputStream, then we could read numeric types:

 DataInputStream din = . . .;
 double s = din.readDouble();

But just as the FileInputStream has no methods to read numeric types, the DataInputStream has no method to get data from a file.

Java uses a clever mechanism to separate two kinds of responsibilities. Some streams (such as the FileInputStream and the input stream returned by the openStream method of the URL class) can retrieve bytes from files and other more exotic locations. Other streams (such as the DataInputStream and the PrintWriter) can assemble bytes into more useful data types. The Java programmer has to combine the two into what are often called filtered streams by feeding an existing stream to the constructor of another stream. For example, to be able to read numbers from a file, first create a FileInputStream and then pass it to the constructor of a DataInputStream.

 FileInputStream fin = new FileInputStream("employee.dat");
 DataInputStream din = new DataInputStream(fin);
 double s = din.readDouble();

It is important to keep in mind that the data input stream that we created with the above code does not correspond to a new disk file. The newly created stream still accesses the data from the file attached to the file input stream, but the point is that it now has a more capable interface.

If you look at Figure 12-1 again, you can see the classes FilterInputStream and FilterOutputStream. You combine their child classes into a new filtered stream to construct the streams you want. For example, by default, streams are not buffered. That is, every call to read contacts the operating system to ask it to dole out yet another byte. If you want buffering and data input for a file named employee.dat in the current directory, you need to use the following rather monstrous sequence of constructors:

 DataInputStream din = new DataInputStream
    (new BufferedInputStream
       (new FileInputStream("employee.dat")));

Notice that we put the DataInputStream last in the chain of constructors because we want to use the DataInputStream methods, and we want them to use the buffered read method. Regardless of the ugliness of the above code, it is necessary: you must be prepared to continue layering stream constructors until you have access to the functionality you want.
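To see the layering at work in both directions, here is a minimal sketch that writes a double through a matching chain of output filters and reads it back. The class LayeredStreamDemo, the method roundTrip, and the use of a temporary file are assumptions of ours; the sketch also relies on the try-with-resources statement, which requires Java 7 or later.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

class LayeredStreamDemo {
    // Writes one double through Data -> Buffered -> File, then reads it back the same way.
    static double roundTrip(double value) throws IOException {
        File temp = File.createTempFile("employee", ".dat"); // hypothetical scratch file
        try {
            try (DataOutputStream out = new DataOutputStream(
                    new BufferedOutputStream(new FileOutputStream(temp)))) {
                out.writeDouble(value);
            } // closing the outermost stream flushes and closes the whole chain
            try (DataInputStream in = new DataInputStream(
                    new BufferedInputStream(new FileInputStream(temp)))) {
                return in.readDouble();
            }
        } finally {
            temp.delete();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip(75000.0)); // prints 75000.0
    }
}
```

Only the outermost stream needs to be closed; it passes the request down the chain.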

Sometimes you'll need to keep track of the intermediate streams when chaining them together. For example, when reading input, you often need to peek at the next byte to see if it is the value that you expect. Java provides the PushbackInputStream for this purpose.

 PushbackInputStream pbin = new PushbackInputStream
    (new BufferedInputStream
       (new FileInputStream("employee.dat")));

Now you can speculatively read the next byte

 int b = pbin.read(); 

and throw it back if it isn't what you wanted.

 if (b != '<') pbin.unread(b); 

But reading and unreading are the only methods that apply to the pushback input stream. If you want to look ahead and also read numbers, then you need both a pushback input stream and a data input stream reference.

 DataInputStream din = new DataInputStream
    (pbin = new PushbackInputStream
       (new BufferedInputStream
          (new FileInputStream("employee.dat"))));
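The following sketch shows why both references are needed: the pushback reference does the peeking, and the data reference reads the number. It is an illustration under our own assumptions. The class PushbackDemo, the method name, and the idea of an optional '<' marker in front of a 4-byte integer are made up, and a byte array stands in for the file.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;

class PushbackDemo {
    // Skips an optional '<' marker, then reads a 4-byte integer.
    static int readIntAfterOptionalMarker(byte[] data) throws IOException {
        PushbackInputStream pbin = new PushbackInputStream(new ByteArrayInputStream(data));
        DataInputStream din = new DataInputStream(pbin);

        int b = pbin.read();              // speculatively read one byte
        if (b != '<' && b != -1) {
            pbin.unread(b);               // not the marker: give it back
        }
        return din.readInt();             // sees the pushed-back byte first
    }

    public static void main(String[] args) throws IOException {
        byte[] withMarker = {'<', 0, 0, 4, (byte) 0xD2};
        byte[] withoutMarker = {0, 0, 4, (byte) 0xD2};
        System.out.println(readIntAfterOptionalMarker(withMarker));    // prints 1234
        System.out.println(readIntAfterOptionalMarker(withoutMarker)); // prints 1234
    }
}
```

The check for -1 matters: pushing back the end-of-input value would insert a spurious byte into the stream.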

Of course, in the stream libraries of other programming languages, niceties such as buffering and lookahead are automatically taken care of, so it is a bit of a hassle in Java that one has to resort to layering stream filters in these cases. But the ability to mix and match filter classes to construct truly useful sequences of streams does give you an immense amount of flexibility. For example, you can read numbers from a compressed ZIP file by using the following sequence of streams (see Figure 12-3).

 ZipInputStream zin
    = new ZipInputStream(new FileInputStream("employee.zip"));
 DataInputStream din = new DataInputStream(zin);

Figure 12-3. A sequence of filtered streams

(See the section on ZIP file streams later in this chapter for more on Java's ability to handle ZIP files.)
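As a hedged sketch of this combination, the code below writes a double into a single ZIP entry and reads it back. The class ZipDataDemo, its method names, and the entry name are our own choices, and the archive is kept in a byte array rather than in employee.zip so that the example runs without any files. Note the call to getNextEntry: a ZipInputStream delivers data only after it has been positioned at an entry.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class ZipDataDemo {
    // Stores one double in a ZIP archive with a single entry.
    static byte[] writeZip(double value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zout = new ZipOutputStream(bytes)) {
            zout.putNextEntry(new ZipEntry("employee.dat")); // hypothetical entry name
            DataOutputStream dout = new DataOutputStream(zout);
            dout.writeDouble(value);
            dout.flush();
            zout.closeEntry();
        }
        return bytes.toByteArray();
    }

    // Reads the double back from the first entry of the archive.
    static double readZip(byte[] zipped) throws IOException {
        try (ZipInputStream zin = new ZipInputStream(new ByteArrayInputStream(zipped))) {
            if (zin.getNextEntry() == null) {
                throw new EOFException("archive has no entries");
            }
            DataInputStream din = new DataInputStream(zin);
            return din.readDouble();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readZip(writeZip(75000.0))); // prints 75000.0
    }
}
```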

All in all, apart from the rather monstrous constructors that are needed to layer streams, the ability to mix and match streams is a very useful feature of Java!

java.io.FileInputStream 1.0

  • FileInputStream(String name)

    creates a new file input stream, using the file whose path name is specified by the name string.

  • FileInputStream(File f)

    creates a new file input stream, using the information encapsulated in the File object. (The File class is described at the end of this chapter.)

java.io.FileOutputStream 1.0

  • FileOutputStream(String name)

    creates a new file output stream specified by the name string. Path names that are not absolute are resolved relative to the current working directory. Caution: This method automatically deletes any existing file with the same name.

  • FileOutputStream(String name, boolean append)

    creates a new file output stream specified by the name string. Path names that are not absolute are resolved relative to the current working directory. If the append parameter is true, then data is added at the end of the file. An existing file with the same name will not be deleted.

  • FileOutputStream(File f)

    creates a new file output stream using the information encapsulated in the File object. (The File class is described at the end of this chapter.) Caution: This method automatically deletes any existing file with the same name as the name of f.

java.io.BufferedInputStream 1.0

  • BufferedInputStream(InputStream in)

    creates a new buffered stream with a default buffer size. A buffered input stream reads bytes from a stream without causing a device access every time. When the buffer is empty, a new block of data is read into the buffer.

  • BufferedInputStream(InputStream in, int n)

    creates a new buffered stream with a user-defined buffer size.

java.io.BufferedOutputStream 1.0

  • BufferedOutputStream(OutputStream out)

    creates a new buffered stream with a default buffer size. A buffered output stream collects bytes to be written without causing a device access every time. When the buffer fills up, or when the stream is flushed, the data is written.

  • BufferedOutputStream(OutputStream out, int n)

    creates a new buffered stream with a user-defined buffer size.

java.io.PushbackInputStream 1.0

  • PushbackInputStream(InputStream in)

    constructs a stream with one-byte lookahead.

  • PushbackInputStream(InputStream in, int size)

    constructs a stream with a pushback buffer of specified size.

  • void unread(int b)

    pushes back a byte, which is retrieved again by the next call to read. You can push back only one character at a time.

    Parameters:
      b    The byte to be read again

Data Streams

You often need to write the result of a computation or read one back. The data streams support methods for reading back all of the basic Java types. To write a number, character, Boolean value, or string, use one of the following methods of the DataOutput interface:

 writeChars
 writeByte
 writeInt
 writeShort
 writeLong
 writeFloat
 writeDouble
 writeChar
 writeBoolean
 writeUTF

For example, writeInt always writes an integer as a 4-byte binary quantity regardless of the number of digits, and writeDouble always writes a double as an 8-byte binary quantity. The resulting output is not humanly readable but the space needed will be the same for each data type, and reading it back in will be faster. (See the section on the PrintWriter class later in this chapter for how to output numbers as human readable text.)

NOTE

There are two different methods of storing integers and floating-point numbers in memory, depending on the platform you are using. Suppose, for example, you are working with a 4-byte quantity, like an int or a float. This can be stored in such a way that the first of the 4 bytes in memory holds the most significant byte (MSB) of the value, the so-called big-endian method, or it can hold the least significant byte (LSB) first, which is called, naturally enough, the little-endian method. For example, the SPARC uses big-endian; the Pentium, little-endian. This can lead to problems. For example, when saving a file using C or C++, the data is saved exactly as the processor stores it. That makes it challenging to move even the simplest data files from one platform to another. In Java, all values are written in the big-endian fashion, regardless of the processor. That makes Java data files platform independent.
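You can check the byte order for yourself with a few lines of code. In this sketch, the class BigEndianDemo and the helper hexOfInt are our own inventions; the code writes an int into a byte array and prints the bytes in hexadecimal.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class BigEndianDemo {
    // Returns the bytes that writeInt produces, as hexadecimal pairs separated by spaces.
    static String hexOfInt(int value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(value);
        out.flush();

        StringBuilder hex = new StringBuilder();
        for (byte b : bytes.toByteArray()) {
            if (hex.length() > 0) {
                hex.append(' ');
            }
            hex.append(String.format("%02X", b & 0xFF)); // mask to avoid sign extension
        }
        return hex.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(hexOfInt(1234)); // prints 00 00 04 D2
    }
}
```

The most significant byte comes first on every platform, whatever the byte order of the underlying processor.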

The writeUTF method writes string data using Unicode Transformation Format (UTF). UTF format is as follows. A 7-bit ASCII value (that is, a 16-bit Unicode character with the top 9 bits zero) is written as one byte:

 0 a6 a5 a4 a3 a2 a1 a0

A 16-bit Unicode character with the top 5 bits zero is written as a 2-byte sequence:

 110 a10 a9 a8 a7 a6    10 a5 a4 a3 a2 a1 a0

(The top zero bits are not stored.)

All other Unicode characters are written as 3-byte sequences:

 1110 a15 a14 a13 a12    10 a11 a10 a9 a8 a7 a6    10 a5 a4 a3 a2 a1 a0

This is a useful format for text consisting mostly of ASCII characters because ASCII characters still take only a single byte. On the other hand, it is not a good format for Asian languages, for which you are better off directly writing sequences of double-byte Unicode characters. Use the writeChars method for that purpose.

Note that the top bits of a UTF byte determine the nature of the byte in the encoding scheme.

 0xxxxxxx : ASCII
 10xxxxxx : Second or third byte
 110xxxxx : First byte of 2-byte sequence
 1110xxxx : First byte of 3-byte sequence
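The encoding rules can be observed directly by dumping the bytes that writeUTF produces. The sketch below uses names of our own (WriteUtfDemo, hexOfUtf). One detail that the bit patterns do not show: writeUTF puts a 2-byte length in front of the encoded characters, so the first two bytes of each result are the byte count.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class WriteUtfDemo {
    // Returns the bytes that writeUTF produces, as hexadecimal pairs separated by spaces.
    static String hexOfUtf(String s) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeUTF(s);
        out.flush();

        StringBuilder hex = new StringBuilder();
        for (byte b : bytes.toByteArray()) {
            if (hex.length() > 0) {
                hex.append(' ');
            }
            hex.append(String.format("%02X", b & 0xFF));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(hexOfUtf("A"));      // prints 00 01 41        (one byte)
        System.out.println(hexOfUtf("\u00E9")); // prints 00 02 C3 A9     (two bytes)
        System.out.println(hexOfUtf("\u20AC")); // prints 00 03 E2 82 AC  (three bytes)
    }
}
```

In the last line, E2 starts with the bits 1110, and both 82 and AC start with 10, which matches the patterns for a 3-byte sequence.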

To read the data back in, use the following methods:

 readInt
 readDouble
 readShort
 readChar
 readLong
 readBoolean
 readFloat
 readUTF

NOTE

The binary data format is compact and platform independent. Except for the UTF strings, it is also suited to random access. The major drawback is that binary files are not readable by humans.

java.io.DataInput 1.0

  • boolean readBoolean()

    reads in a Boolean value.

  • byte readByte()

    reads an 8-bit byte.

  • char readChar()

    reads a 16-bit Unicode character.

  • double readDouble()

    reads a 64-bit double.

  • float readFloat()

    reads a 32-bit float.

  • void readFully(byte[] b)

    reads bytes into the array b , blocking until all bytes are read.

    Parameters:
      b    The buffer into which the data is read

  • void readFully(byte[] b, int off, int len)

    reads bytes into the array b, blocking until all bytes are read.

    Parameters:
      b      The buffer into which the data is read
      off    The start offset of the data
      len    The number of bytes to read

  • int readInt()

    reads a 32-bit integer.

  • String readLine()

    reads in a line that has been terminated by a \n, \r, \r\n, or EOF. Returns a string containing all bytes in the line converted to Unicode characters.

  • long readLong()

    reads a 64-bit long integer.

  • short readShort()

    reads a 16-bit short integer.

  • String readUTF()

    reads a string of characters in UTF format.

  • int skipBytes(int n)

    skips n bytes, blocking until all bytes are skipped.

    Parameters:
      n    The number of bytes to be skipped

java.io.DataOutput 1.0

  • void writeBoolean(boolean b)

    writes a Boolean value.

  • void writeByte(int b)

    writes an 8-bit byte.

  • void writeChar(int c)

    writes a 16-bit Unicode character.

  • void writeChars(String s)

    writes all characters in the string.

  • void writeDouble(double d)

    writes a 64-bit double.

  • void writeFloat(float f)

    writes a 32-bit float.

  • void writeInt(int i)

    writes a 32-bit integer.

  • void writeLong(long l)

    writes a 64-bit long integer.

  • void writeShort(int s)

    writes a 16-bit short integer.

  • void writeUTF(String s)

    writes a string of characters in UTF format.

Random-Access File Streams

The RandomAccessFile stream class lets you find or write data anywhere in a file. It implements both the DataInput and DataOutput interfaces. Disk files are random access, but streams of data from a network are not. You open a random-access file either for reading only or for both reading and writing. You specify the option by using the string "r" (for read access) or "rw" (for read/write access) as the second argument in the constructor.

 RandomAccessFile in = new RandomAccessFile("employee.dat", "r");
 RandomAccessFile inOut
    = new RandomAccessFile("employee.dat", "rw");

When you open an existing file as a RandomAccessFile, it does not get deleted.

A random-access file also has a file pointer setting that comes with it. The file pointer always indicates the position of the next byte that will be read or written. The seek method sets the file pointer to an arbitrary byte position within the file. The argument to seek is a long integer between zero and the length of the file in bytes.

The getFilePointer method returns the current position of the file pointer.
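Here is a small sketch of seek and getFilePointer in action. It assumes a file of fixed-size records, each holding a single int; the class SeekDemo, the constant RECORD_SIZE, and the scratch file are our own inventions for the sake of the example.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

class SeekDemo {
    static final int RECORD_SIZE = 4; // one int per record, 4 bytes each

    // Jumps straight to the record with the given index and reads it.
    static int readRecord(File file, int index) throws IOException {
        try (RandomAccessFile in = new RandomAccessFile(file, "r")) {
            in.seek((long) index * RECORD_SIZE);
            return in.readInt();
        }
    }

    // Writes the records 0, 10, 20, 30, 40 to a scratch file and fetches record 3.
    static int demo() throws IOException {
        File temp = File.createTempFile("records", ".dat"); // hypothetical scratch file
        try {
            try (RandomAccessFile inOut = new RandomAccessFile(temp, "rw")) {
                for (int i = 0; i < 5; i++) {
                    inOut.writeInt(i * 10);
                }
                // The file pointer now sits just past the last byte written.
                if (inOut.getFilePointer() != inOut.length()) {
                    throw new IllegalStateException("unexpected file pointer");
                }
            }
            return readRecord(temp, 3);
        } finally {
            temp.delete();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo()); // prints 30
    }
}
```

Because every record has the same size, the byte position of record n is simply n times the record size. That is the usual reason for insisting on fixed-size records in random-access files.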

To read from a random-access file, you use the same methods such as readInt and readUTF as for DataInputStream objects. That is no accident. These methods are actually defined in the DataInput interface that both DataInputStream and RandomAccessFile implement.

Similarly, to write a random-access file, you use the same writeInt and writeUTF methods as in the DataOutputStream class. These methods are defined in the DataOutput interface that is common to both classes.

The advantage of having the RandomAccessFile class implement both DataInput and DataOutput is that this lets you use or write methods whose argument types are the DataInput and DataOutput interfaces.

 class Employee
 {
    . . .
    void read(DataInput in) throws IOException { . . . }
    void write(DataOutput out) throws IOException { . . . }
 }

Note that the read method can handle either a DataInputStream or a RandomAccessFile object because both of these classes implement the DataInput interface. The same is true for the write method.
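To make this concrete, the following sketch uses a pared-down record class of our own, SalaryRecord, as a stand-in for Employee. The same read and write methods are used once with data streams over a byte array and once with a RandomAccessFile on a scratch file; all names in the example are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

class SalaryRecord {
    double salary;

    // These two methods know nothing about where the data comes from or goes to.
    void write(DataOutput out) throws IOException { out.writeDouble(salary); }
    void read(DataInput in) throws IOException { salary = in.readDouble(); }

    // Round trip through DataOutputStream and DataInputStream.
    static double viaStreams(double salary) throws IOException {
        SalaryRecord original = new SalaryRecord();
        original.salary = salary;
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        original.write(new DataOutputStream(bytes));

        SalaryRecord copy = new SalaryRecord();
        copy.read(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
        return copy.salary;
    }

    // Round trip through a single RandomAccessFile opened for reading and writing.
    static double viaRandomAccess(double salary) throws IOException {
        File temp = File.createTempFile("salary", ".dat"); // hypothetical scratch file
        try (RandomAccessFile file = new RandomAccessFile(temp, "rw")) {
            SalaryRecord original = new SalaryRecord();
            original.salary = salary;
            original.write(file);
            file.seek(0); // go back to the start before reading
            SalaryRecord copy = new SalaryRecord();
            copy.read(file);
            return copy.salary;
        } finally {
            temp.delete();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(viaStreams(75000.0));      // prints 75000.0
        System.out.println(viaRandomAccess(75000.0)); // prints 75000.0
    }
}
```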

java.io.RandomAccessFile 1.0

  • RandomAccessFile(String name, String mode)

    Parameters:
      name    System-dependent file name
      mode    "r" for reading only, or "rw" for reading and writing

  • RandomAccessFile(File file, String mode)

    Parameters:
      file    A File object encapsulating a system-dependent file name. (The File class is described at the end of this chapter.)
      mode    "r" for reading only, or "rw" for reading and writing

  • long getFilePointer()

    returns the current location of the file pointer.

  • void seek(long pos)

    sets the file pointer to pos bytes from the beginning of the file.

  • long length()

    returns the length of the file in bytes.

Text Streams

In the last section, we discussed binary input and output. While binary I/O is fast and efficient, it is not easily readable by humans. In this section, we will focus on text I/O. For example, if the integer 1234 is saved in binary, it is written as the sequence of bytes 00 00 04 D2 (in hexadecimal notation). In text format, it is saved as the string "1234".

Unfortunately, doing this in Java requires a bit of work, because, as you know, Java uses Unicode characters. That is, the character encoding for the string "1234" really is 00 31 00 32 00 33 00 34 (in hex). However, at the present time most environments where your Java programs will run use their own character encoding. This may be a single-byte, double-byte, or variable-byte scheme. For example, under Windows, the string would need to be written in ASCII, as 31 32 33 34, without the extra zero bytes. If the Unicode encoding were written into a text file, it would be quite unlikely that the resulting file would be humanly readable with the tools of the host environment. To overcome this problem, as we mentioned before, Java has a set of stream filters that bridges the gap between Unicode-encoded text and the character encoding used by the local operating system. All of these classes descend from the abstract Reader and Writer classes, and the names are reminiscent of the ones used for binary data. For example, the InputStreamReader class turns an input stream that contains bytes in a particular character encoding into a reader that emits Unicode characters. Similarly, the OutputStreamWriter class turns a stream of Unicode characters into a stream of bytes in a particular character encoding.

For example, here is how you make an input reader that reads keystrokes from the console and automatically converts them to Unicode.

 InputStreamReader in = new InputStreamReader(System.in); 

This input stream reader assumes the normal character encoding used by the host system. For example, under Windows, it uses the ISO 8859-1 encoding (also known as ISO Latin-1 or, among Windows programmers, as "ANSI code"). You can choose a different encoding by specifying it in the constructor for the InputStreamReader. This takes the form

 InputStreamReader(InputStream, String) 

where the string describes the encoding scheme that you want to use. For example,

 InputStreamReader in = new InputStreamReader(new
    FileInputStream("kremlin.dat"), "8859_5");

Tables 12-1 and 12-2 list the currently supported encoding schemes.

Local encoding schemes cannot represent all Unicode characters. If a character cannot be represented, it is transformed to a question mark (?).
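The sketch below illustrates both directions of the conversion and the question-mark substitution. The class EncodingDemo and its two helpers are our own inventions; byte arrays stand in for files, and we use the ISO8859_1 encoding name, which we assume your Java installation accepts.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;

class EncodingDemo {
    // Converts Unicode text to bytes in the named encoding.
    static byte[] encode(String text, String encoding) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer out = new OutputStreamWriter(bytes, encoding)) {
            out.write(text);
        }
        return bytes.toByteArray();
    }

    // Converts bytes in the named encoding back to Unicode text.
    static String decode(byte[] data, String encoding) throws IOException {
        StringBuilder text = new StringBuilder();
        try (Reader in = new InputStreamReader(new ByteArrayInputStream(data), encoding)) {
            int ch;
            while ((ch = in.read()) != -1) {
                text.append((char) ch);
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        // "1234" becomes the four bytes 31 32 33 34, without zero bytes.
        System.out.println(encode("1234", "ISO8859_1").length);  // prints 4
        // The Euro sign has no ISO 8859-1 code, so it is written as '?'.
        System.out.println((char) encode("\u20AC", "ISO8859_1")[0]); // prints ?
        // The byte E9 is e-acute in ISO 8859-1.
        System.out.println(decode(new byte[] {(byte) 0xE9}, "ISO8859_1").equals("\u00E9")); // prints true
    }
}
```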

Table 12-1. Basic character encodings (in rt.jar)

Name                     Description
ASCII                    American Standard Code for Information Interchange
Cp1252                   Windows Latin-1
ISO8859_1                ISO 8859-1, Latin alphabet No. 1
UnicodeBig               Sixteen-bit Unicode Transformation Format, big-endian byte order, with byte-order mark
UnicodeBigUnmarked       Sixteen-bit Unicode Transformation Format, big-endian byte order
UnicodeLittle            Sixteen-bit Unicode Transformation Format, little-endian byte order, with byte-order mark
UnicodeLittleUnmarked    Sixteen-bit Unicode Transformation Format, little-endian byte order
UTF8                     Eight-bit Unicode Transformation Format
UTF-16                   Sixteen-bit Unicode Transformation Format, byte order specified by a mandatory initial byte-order mark

Table 12-2. Extended Character Encodings (in i18n.jar)

Name              Description
Big5              Big5, Traditional Chinese
Cp037             USA, Canada (Bilingual, French), Netherlands, Portugal, Brazil, Australia
Cp273             IBM Austria, Germany
Cp277             IBM Denmark, Norway
Cp278             IBM Finland, Sweden
Cp280             IBM Italy
Cp284             IBM Catalan/Spain, Spanish Latin America
Cp285             IBM United Kingdom, Ireland
Cp297             IBM France
Cp420             IBM Arabic
Cp424             IBM Hebrew
Cp437             MS-DOS United States, Australia, New Zealand, South Africa
Cp500             EBCDIC 500V1
Cp737             PC Greek
Cp775             PC Baltic
Cp838             IBM Thailand extended SBCS
Cp850             MS-DOS Latin-1
Cp852             MS-DOS Latin-2
Cp855             IBM Cyrillic
Cp856             IBM Hebrew
Cp857             IBM Turkish
Cp858             Variant of Cp850 with Euro character
Cp860             MS-DOS Portuguese
Cp861             MS-DOS Icelandic
Cp862             PC Hebrew
Cp863             MS-DOS Canadian French
Cp864             PC Arabic
Cp865             MS-DOS Nordic
Cp866             MS-DOS Russian
Cp868             MS-DOS Pakistan
Cp869             IBM Modern Greek
Cp870             IBM Multilingual Latin-2
Cp871             IBM Iceland
Cp874             IBM Thai
Cp875             IBM Greek
Cp918             IBM Pakistan (Urdu)
Cp921             IBM Latvia, Lithuania (AIX, DOS)
Cp922             IBM Estonia (AIX, DOS)
Cp930             Japanese Katakana-Kanji mixed with 4370 UDC, superset of 5026
Cp933             Korean Mixed with 1880 UDC, superset of 5029
Cp935             Simplified Chinese Host mixed with 1880 UDC, superset of 5031
Cp937             Traditional Chinese Host mixed with 6204 UDC, superset of 5033
Cp939             Japanese Latin Kanji mixed with 4370 UDC, superset of 5035
Cp942             IBM OS/2 Japanese, superset of Cp932
Cp942C            Variant of Cp942
Cp943             IBM OS/2 Japanese, superset of Cp932 and Shift-JIS
Cp943C            Variant of Cp943
Cp948             OS/2 Chinese (Taiwan) superset of 938
Cp949             PC Korean
Cp949C            Variant of Cp949
Cp950             PC Chinese (Hong Kong, Taiwan)
Cp964             AIX Chinese (Taiwan)
Cp970             AIX Korean
Cp1006            IBM AIX Pakistan (Urdu)
Cp1025            IBM Multilingual Cyrillic: Bulgaria, Bosnia, Herzegovina, Macedonia (FYR)
Cp1026            IBM Latin-5, Turkey
Cp1046            IBM Arabic - Windows
Cp1097            IBM Iran (Farsi)/Persian
Cp1098            IBM Iran (Farsi)/Persian (PC)
Cp1112            IBM Latvia, Lithuania
Cp1122            IBM Estonia
Cp1123            IBM Ukraine
Cp1124            IBM AIX Ukraine
Cp1140            Variant of Cp037 with Euro character
Cp1141            Variant of Cp273 with Euro character
Cp1142            Variant of Cp277 with Euro character
Cp1143            Variant of Cp278 with Euro character
Cp1144            Variant of Cp280 with Euro character
Cp1145            Variant of Cp284 with Euro character
Cp1146            Variant of Cp285 with Euro character
Cp1147            Variant of Cp297 with Euro character
Cp1148            Variant of Cp500 with Euro character
Cp1149            Variant of Cp871 with Euro character
Cp1250            Windows Eastern European
Cp1251            Windows Cyrillic
Cp1253            Windows Greek
Cp1254            Windows Turkish
Cp1255            Windows Hebrew
Cp1256            Windows Arabic
Cp1257            Windows Baltic
Cp1258            Windows Vietnamese
Cp1381            IBM OS/2, DOS People's Republic of China (PRC)
Cp1383            IBM AIX People's Republic of China (PRC)
Cp33722           IBM-eucJP - Japanese (superset of 5050)
EUC_CN            GB2312, EUC encoding, Simplified Chinese
EUC_JP            JIS X 0201, 0208, 0212, EUC encoding, Japanese
EUC_KR            KS C 5601, EUC encoding, Korean
EUC_TW            CNS11643 (Plane 1-3), EUC encoding, Traditional Chinese
GBK               GBK, Simplified Chinese
ISO2022CN         ISO 2022 CN, Chinese (conversion to Unicode only)
ISO2022CN_CNS     CNS 11643 in ISO 2022 CN form, Traditional Chinese (conversion from Unicode only)
ISO2022CN_GB      GB 2312 in ISO 2022 CN form, Simplified Chinese (conversion from Unicode only)
ISO2022JP         JIS X 0201, 0208 in ISO 2022 form, Japanese
ISO2022KR         ISO 2022 KR, Korean
ISO8859_2         ISO 8859-2, Latin alphabet No. 2
ISO8859_3         ISO 8859-3, Latin alphabet No. 3
ISO8859_4         ISO 8859-4, Latin alphabet No. 4
ISO8859_5         ISO 8859-5, Latin/Cyrillic alphabet
ISO8859_6         ISO 8859-6, Latin/Arabic alphabet
ISO8859_7         ISO 8859-7, Latin/Greek alphabet
ISO8859_8         ISO 8859-8, Latin/Hebrew alphabet
ISO8859_9         ISO 8859-9, Latin alphabet No. 5
ISO8859_13        ISO 8859-13, Latin alphabet No. 7
ISO8859_15_FDIS   ISO 8859-15, Latin alphabet No. 9
JIS0201           JIS X 0201, Japanese
JIS0208           JIS X 0208, Japanese
JIS0212           JIS X 0212, Japanese
JISAutoDetect     Detects and converts from Shift-JIS, EUC-JP, ISO 2022 JP (conversion to Unicode only)
Johab             Johab, Korean
KOI8_R            KOI8-R, Russian
MS874             Windows Thai
MS932             Windows Japanese
MS936             Windows Simplified Chinese
MS949             Windows Korean
MS950             Windows Traditional Chinese
MacArabic         Macintosh Arabic
MacCentralEurope  Macintosh Latin-2
MacCroatian       Macintosh Croatian
MacCyrillic       Macintosh Cyrillic
MacDingbat        Macintosh Dingbat
MacGreek          Macintosh Greek
MacHebrew         Macintosh Hebrew
MacIceland        Macintosh Iceland
MacRoman          Macintosh Roman
MacRomania        Macintosh Romania
MacSymbol         Macintosh Symbol
MacThai           Macintosh Thai
MacTurkish        Macintosh Turkish
MacUkraine        Macintosh Ukraine
SJIS              Shift-JIS, Japanese
TIS620            TIS620, Thai

Because it is so common to want to attach a reader or writer to a file, there is a pair of convenience classes, FileReader and FileWriter, for this purpose. For example, the writer definition

 FileWriter out = new FileWriter("output.txt"); 

is equivalent to

 OutputStreamWriter out = new OutputStreamWriter(new
    FileOutputStream("output.txt"));

Writing Text Output

For text output, you want to use a PrintWriter. A print writer can print strings and numbers in text format. Just as a DataOutputStream has useful output methods but no destination, a PrintWriter must be combined with a destination writer.

 PrintWriter out = new PrintWriter(new
    FileWriter("employee.txt"));

You can also combine a print writer with a destination (output) stream.

 PrintWriter out = new PrintWriter(new
    FileOutputStream("employee.txt"));

The PrintWriter(OutputStream) constructor automatically adds an OutputStreamWriter to convert Unicode characters to bytes in the stream.

To write to a print writer, you use the same print and println methods that you used with System.out. You can use these methods to print numbers (int, short, long, float, double), characters, Boolean values, strings, and objects.

NOTE

Java veterans probably wonder whatever happened to the PrintStream class and to System.out. In Java 1.0, the PrintStream class simply truncated all Unicode characters to ASCII characters by dropping the top byte. Conversely, the readLine method of the DataInputStream turned ASCII to Unicode by setting the top byte to 0. Clearly, that was not a clean or portable approach, and it was fixed with the introduction of readers and writers in Java 1.1. For compatibility with existing code, System.in, System.out, and System.err are still streams, not readers and writers. But now the PrintStream class internally converts Unicode characters to the default host encoding in the same way as the PrintWriter. Objects of type PrintStream act exactly like print writers when you use the print and println methods, but unlike print writers, they allow you to send raw bytes to them with the write(int) and write(byte[]) methods.

For example, consider this code:

 String name = "Harry Hacker";
 double salary = 75000;
 out.print(name);
 out.print(' ');
 out.println(salary);

This writes the characters

 Harry Hacker 75000.0

to the stream out. The characters are then converted to bytes and end up in the file employee.txt.

The println method automatically adds the correct end-of-line character for the target system ("\r\n" on Windows, "\n" on UNIX, "\r" on Macs) to the line. This is the string obtained by the call System.getProperty("line.separator").

If the writer is set to autoflush mode, then all characters in the buffer are sent to their destination whenever println is called. (Print writers are always buffered.) By default, autoflushing is not enabled. You can enable or disable autoflushing by using the PrintWriter(Writer, boolean) constructor and passing the appropriate Boolean as the second argument.

 PrintWriter out = new PrintWriter(new
    FileWriter("employee.txt"), true); // autoflush

The print methods don't throw exceptions. You can call the checkError method to see if something went wrong with the stream.
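Here is a minimal sketch that puts print, println, and checkError together. The class PrintWriterDemo and the method format are our own names, and a StringWriter takes the place of the file so that the result can be inspected directly. Note that println applied to a double always prints a decimal point.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

class PrintWriterDemo {
    // Formats a name and a salary as one line of text.
    static String format(String name, double salary) {
        StringWriter buffer = new StringWriter();
        PrintWriter out = new PrintWriter(buffer); // autoflush is off by default
        out.print(name);
        out.print(' ');
        out.println(salary);
        out.flush();
        // The print methods throw no exceptions, so ask explicitly whether all went well.
        if (out.checkError()) {
            throw new IllegalStateException("output failed");
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.print(format("Harry Hacker", 75000)); // prints Harry Hacker 75000.0
    }
}
```

The returned string ends with the line separator of the platform on which the program runs.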

NOTE

You cannot write raw bytes to a PrintWriter. Print writers are designed for text output only.

java.io.PrintWriter 1.1

  • PrintWriter(Writer out)

    creates a new PrintWriter, without automatic line flushing.

    Parameters:
      out    A character-output writer

  • PrintWriter(Writer out, boolean autoFlush)

    creates a new PrintWriter.

    Parameters:
      out          A character-output writer
      autoFlush    If true, the println methods will flush the output buffer

  • PrintWriter(OutputStream out)

    creates a new PrintWriter, without automatic line flushing, from an existing OutputStream by automatically creating the necessary intermediate OutputStreamWriter.

    Parameters:
      out    An output stream

  • PrintWriter(OutputStream out, boolean autoFlush)

    creates a new PrintWriter from an existing OutputStream but allows you to determine whether the writer autoflushes or not.

    Parameters:
      out          An output stream
      autoFlush    If true, the println methods will flush the output buffer

  • void print(Object obj)

    prints an object by printing the string resulting from toString.

    Parameters:
      obj    The object to be printed

  • void print(String s)

    prints a Unicode string.

  • void println(String s)

    prints a string followed by a line terminator. Flushes the stream if the stream is in autoflush mode.

  • void print(char[] s)

    prints an array of Unicode characters.

  • void print(char c)

    prints a Unicode character.

  • void print(int i)

    prints an integer in text format.

  • void print(long l)

    prints a long integer in text format.

  • void print(float f)

    prints a floating-point number in text format.

  • void print(double d)

    prints a double-precision floating-point number in text format.

  • void print(boolean b)

    prints a Boolean value in text format.

  • boolean checkError()

    returns true if a formatting or output error occurred. Once the stream has encountered an error, it is tainted and all calls to checkError return true.

Reading Text Input

As you know:

  • To write data in binary format, you use a DataOutputStream.

  • To write in text format, you use a PrintWriter.

Therefore, you might expect that there is an analog to the DataInputStream that lets you read data in text format. Unfortunately, Java does not provide such a class. (That is why we wrote our own Console class for use in the beginning chapters.) The only game in town for processing text input is the BufferedReader class: it has a method, readLine, that lets you read a line of text. You need to combine a buffered reader with an input source.

 BufferedReader in = new BufferedReader(new
    FileReader("employee.txt"));

The readLine method returns null when no more input is available. A typical input loop, therefore, looks like this:

 String line;
 while ((line = in.readLine()) != null)
 {
    // do something with line
 }

The FileReader class already converts bytes to Unicode characters. For other input sources, you need to use the InputStreamReader. Unlike the PrintWriter, the InputStreamReader has no automatic convenience method to bridge the gap between bytes and Unicode characters.

 BufferedReader in2 = new BufferedReader(new
    InputStreamReader(System.in));
 BufferedReader in3 = new BufferedReader(new
    InputStreamReader(url.openStream()));

To read numbers from text input, you need to read a string first and then convert it.

 String s = in.readLine();
 double x = Double.parseDouble(s);

That works if there is a single number on each line. Otherwise, you must work harder and break up the input string, for example, by using the StringTokenizer utility class. We will see an example of this later in this chapter.
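As a quick preview, here is a sketch that handles several numbers per line by combining the readLine loop with a StringTokenizer. The class TextInputDemo and the method sum are made up for this illustration, the numbers are assumed to be separated by white space, and a StringReader stands in for a file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.StringTokenizer;

class TextInputDemo {
    // Adds up all white-space-separated numbers delivered by the reader.
    static double sum(Reader source) throws IOException {
        BufferedReader in = new BufferedReader(source);
        double total = 0;
        String line;
        while ((line = in.readLine()) != null) {        // null means no more input
            StringTokenizer tokens = new StringTokenizer(line);
            while (tokens.hasMoreTokens()) {
                total += Double.parseDouble(tokens.nextToken());
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(sum(new StringReader("1.5 2.5\n3\n"))); // prints 7.0
    }
}
```

Since sum accepts any Reader, the same code works for a FileReader or for an InputStreamReader wrapped around System.in.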

TIP

Java has StringReader and StringWriter classes that allow you to treat a string as if it were a data stream. This can be quite convenient if you want to use the same code to parse both strings and data from a stream.


       