Java views each file as a sequential stream of bytes (Fig. 14.2). Every operating system provides a mechanism to determine the end of a file, such as an end-of-file marker or a count of the total bytes in the file that is recorded in a system-maintained administrative data structure. A Java program processing a stream of bytes simply receives an indication from the operating system when the program reaches the end of the streamthe program does not need to know how the underlying platform represents files or streams. In some cases, the end-of-file indication occurs as an exception. In other cases, the indication is a return value from a method invoked on a stream-processing object.
Figure 14.2. Java's view of a file of n bytes.
File streams can be used to input and output data as either characters or bytes. Streams that input and output bytes to files are known as byte-based streams, storing data in its binary format. Streams that input and output characters to files are known as character-based streams, storing data as a sequence of characters. For instance, if the value 5 were being stored using a byte-based stream, it would be stored in the binary format of the numeric value 5, or 101. If the value 5 were being stored using a character-based stream, it would be stored in the binary format of the character 5, or 00000000 00110101 (this is the binary for the numeric value 53, which indicates the character 5 in the Unicode character sets). The difference between the numeric value 5 and the character 5 is that the numeric value can be used as an integer, whereas the character 5 is simply a character that can be used in a string of text, as in "Sarah Miller is 15 years old". Files that are created using byte-based streams are referred to as binary files, while files created using character-based streams are referred to as text files. Text files can be read by text editors, while binary files are read by a program that converts the data to a human-readable format.
A Java program opens a file by creating an object and associating a stream of bytes or characters with it. The classes used to create these objects are discussed shortly. Java can also associate streams with different devices. In fact, Java creates three stream objects that are associated with devices when a Java program begins executingSystem.in, System.out and System.err. Object System.in (the standard input stream object) normally enables a program to input bytes from the keyboard; object System.out (the standard output stream object) normally enables a program to output data to the screen; and object System.err (the standard error stream object) normally enables a program to output error messages to the screen. Each of these streams can be redirected. For System.in, this capability enables the program to read bytes from a different source. For System.out and System.err, this capability enables the output to be sent to a different location, such as a file on disk. Class System provides methods setIn, setOut and setErr to redirect the standard input, output and error streams, respectively.
Java programs perform file processing by using classes from package java.io. This package includes definitions for stream classes, such as FileInputStream (for byte-based input from a file), FileOutputStream (for byte-based output to a file), FileReader (for character-based input from a file) and FileWriter (for character-based output to a file). Files are opened by creating objects of these stream classes, which inherit from classes InputStream, OutputStream, Reader and Writer, respectively (these classes will be discussed later in this chapter). Thus, the methods of these stream classes can all be applied to file streams as well.
Java contains classes that enable the programmer to perform input and output of objects or variables of primitive data types. The data will still be stored as bytes or characters behind the scenes, allowing the programmer to read or write data in the form of integers, strings, or other data types without having to worry about the details of converting such values to byte-format. To perform such input and output, objects of classes ObjectInputStream and ObjectOutputStream can be used together with the byte-based file stream classes FileInputStream and FileOutputStream (these classes will be discussed in more detail shortly). The complete hierarchy of classes in package java.io can be viewed in the online documentation at
java.sun.com/j2se/5.0/docs/api/java/io/package-tree.html
Each indentation level in the hierarchy indicates that the indented class extends the class under which it is indented. For example, class InputStream is a subclass of Object. Click a class's name in the hierarchy to view the details of the class.
As you can see in the hierarchy, Java offers many classes for performing input/output operations. We use several of these classes in this chapter to implement file-processing programs that create and manipulate sequential-access files and random-access files (discussed in Section 14.7). We also include a detailed example on class File, which is useful for obtaining information about files and directories. In Chapter 24, Networking, we use stream classes extensively to implement networking applications. Several other classes in the java.io package that we do not use in this chapter are discussed briefly in Section 14.8.
In addition to the classes in this package, character-based input and output can be performed with classes Scanner and Formatter. Class Scanner is used extensively to input data from the keyboard. As we will see, this class can also read data from a file. Class Formatter enables formatted data to be output to the screen or to a file in a manner similar to System.out.printf. Chapter 28, Formatted Output, presents the details of formatted output with System.out.printf. All these features can be used to format text files as well.
Introduction to Computers, the Internet and the World Wide Web
Introduction to Java Applications
Introduction to Classes and Objects
Control Statements: Part I
Control Statements: Part 2
Methods: A Deeper Look
Arrays
Classes and Objects: A Deeper Look
Object-Oriented Programming: Inheritance
Object-Oriented Programming: Polymorphism
GUI Components: Part 1
Graphics and Java 2D™
Exception Handling
Files and Streams
Recursion
Searching and Sorting
Data Structures
Generics
Collections
Introduction to Java Applets
Multimedia: Applets and Applications
GUI Components: Part 2
Multithreading
Networking
Accessing Databases with JDBC
Servlets
JavaServer Pages (JSP)
Formatted Output
Strings, Characters and Regular Expressions
Appendix A. Operator Precedence Chart
Appendix B. ASCII Character Set
Appendix C. Keywords and Reserved Words
Appendix D. Primitive Types
Appendix E. (On CD) Number Systems
Appendix F. (On CD) Unicode®
Appendix G. Using the Java API Documentation
Appendix H. (On CD) Creating Documentation with javadoc
Appendix I. (On CD) Bit Manipulation
Appendix J. (On CD) ATM Case Study Code
Appendix K. (On CD) Labeled break and continue Statements
Appendix L. (On CD) UML 2: Additional Diagram Types
Appendix M. (On CD) Design Patterns
Appendix N. Using the Debugger
Inside Back Cover