9.5 Text File I/O

The system-supplied standard C library also provides functions for input and output from file systems. Linking methods and calling conventions are similar to those for display and keyboard I/O, and are set forth in Table 9-2.

Table 9-2 presents the function interfaces from a register-level programming perspective, using the calling conventions of Unix and Linux, rather than in terms of their C data types. When a function returns a useful value in register ret0, we include a brief description of it.

9.5.1 Directory-Level Access

With standard input and output, no special preparations are necessary because the environment in which a process runs already has stdin and stdout predefined and ready to accept any I/O. File I/O, on the other hand, has an obvious preliminary requirement: specifying which particular file is to be used. In many programming languages, this is known as opening a file. Open files consume resources, because the operating system must establish and maintain file structures for them. A file should therefore be closed when it is no longer in use.

Table 9-2. Calling Conventions for C Functions Related to Text Files

| Function | Argument | Register(s) | Description |
|----------|----------|-------------|-------------|
| fopen    | first    | out0 | Address of a string specifying the file. |
|          | second   | out1 | Address of a string specifying the mode. |
|          | returned | ret0[†] | File pointer (address of more information about the file), or zero if any error occurred. |
| fclose   | first    | out0 | File pointer (address of more information about the file). |
|          | returned | ret0[*] | Zero indicating success, or EOF if any error occurred. |
| fputs    | first    | out0 | Address of a null-terminated string. |
|          | second   | out1 | File pointer. |
|          | returned | ret0[*] | Nonnegative value indicating success, or EOF if any error occurred. |
| fgets    | first    | out0 | Address of an adequate storage area. |
|          | second   | out1 | Size of the storage area. |
|          | third    | out2 | File pointer. |
|          | returned | ret0[†] | Passed-in value in out0 if successful, or zero if any error or end of file occurred. |
| fprintf  | first    | out0 | File pointer. |
|          | second   | out1 | Address of a string containing format information. |
|          | other(s) | out2 to out7 (then stack) | Integer or floating-point (memory-format) quantities to be formatted, or the address of any string quantities to be included in the output. |
|          | returned | ret0[*] | Total number of bytes written, or a negative value if any error occurred. |
| fscanf   | first    | out0 | File pointer. |
|          | second   | out1 | Address of a string containing format information. |
|          | other(s) | out2 to out7 (then stack) | Address(es) of integer, floating-point, or string quantities, interpreted according to information in the format string. |
|          | returned | ret0[*] | Number of input objects successfully processed, or EOF if any error occurred. |

[†] Test this returned value as an address (quad word for Linux and for HP-UX with +DD64 compiler option, or double word for HP-UX without +DD64 compiler option).

[*] Test this returned value as a double word (int in C).

All of the functions in Table 9-2 involve the concept of the file pointer, which is the address of a data structure provided by the operating environment containing certain information about a file while it is open for I/O. The internal composition of that data structure, which may be different for each operating system, is not explored in this text.

The fopen function requires two input arguments, both of which are address pointers to null-terminated character strings. The first of those arguments, the name of the file, must be specified according to the conventions of the particular operating environment. If the system-supplied defaults are not suitable, the file name may need to include specific device and/or directory information. The permitted characters and overall length for a file name are also system-dependent.

The second argument for fopen indicates how the file is to be accessed. This access mode specification has a minimum of three values for any implementation of the standard C library:

r

Access is established to read from the file.

w

Access is established to write into the file.

a

Access is established to append onto the end of the file.

For the r mode, the file must already exist. For the w and a modes, a new file is created if none exists. For the w mode, a Unix system discards the previous contents of any existing file of the same name.
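As a rough illustration of the three access modes at the C level (the register-level view follows the same argument order), the sketch below writes, appends, and then reads back a small file. The function name mode_demo and the file contents are hypothetical, chosen only for this example.

```c
#include <stdio.h>

/* Hypothetical helper: exercises the w, a, and r modes on one file.
   Returns the number of lines read back, or -1 if any fopen fails. */
int mode_demo(const char *path)
{
    FILE *fp;
    char line[64];
    int lines = 0;

    fp = fopen(path, "w");      /* create the file, or discard old contents */
    if (fp == NULL)
        return -1;
    fputs("first\n", fp);
    fclose(fp);

    fp = fopen(path, "a");      /* new output lands after the existing data */
    if (fp == NULL)
        return -1;
    fputs("second\n", fp);
    fclose(fp);

    fp = fopen(path, "r");      /* fails (returns NULL) if the file is absent */
    if (fp == NULL)
        return -1;
    while (fgets(line, sizeof line, fp) != NULL)
        lines++;
    fclose(fp);

    return lines;
}
```

Each fopen result is compared with NULL before use, which corresponds to testing ret0 for zero as an address in Table 9-2.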

The file pointer address returned in register ret0 should be retained as a local variable or in some general register whose contents are preserved across procedure calls (see Appendix D). All of the other file-access functions (Table 9-2) require this file pointer as an input argument to convey which specific file is to be accessed. File pointers are generally an address value within a region of address space shared between the calling process and the operating system, which handles the actual I/O operations.

The fclose function requires only the file pointer as an input argument. When the system closes a text file that was opened for write access (w or a), any intermediary buffers are flushed and all pending physical write operations are permitted to complete before the close operation is considered to have been accomplished.
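Because buffered output may not reach the file until the close, the return value of fclose deserves the same attention as that of the write calls. A minimal sketch, assuming a hypothetical helper named save_text:

```c
#include <stdio.h>

/* Hypothetical helper: writes one string to a new file.
   Returns 0 on success, or -1 if opening, writing, or closing fails. */
int save_text(const char *path, const char *text)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;

    if (fputs(text, fp) == EOF) {
        fclose(fp);
        return -1;
    }

    /* Pending buffered data is flushed here; EOF signals that the
       flush or the close itself failed. */
    if (fclose(fp) == EOF)
        return -1;

    return 0;
}
```

After fclose returns, the file pointer must not be used again, whether or not the call succeeded.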

9.5.2 Unformatted Line I/O

Text files are usually line-oriented, where the conceptual unit of input or output, rather than single bytes, is a line of some n characters. The standard C library provides the fgets and fputs functions, which move an entire line of characters between a file in the external environment and some particular storage region managed by the calling process.

With the fgets function, the second argument conveys the size of the storage area provided by the calling program. Input terminates after the first newline encountered, and that newline is included among the stored characters. Additionally, a NUL is placed at the end of the string in the buffer. If necessary, input will stop short of a line terminator to make room for this character; the remainder of the line is then delivered by subsequent calls.
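The effect of the size argument can be observed by counting how many fgets calls are needed to consume a file. The following sketch assumes a hypothetical helper count_pieces and an arbitrary 128-byte upper limit on the buffer; neither comes from the text.

```c
#include <stdio.h>

/* Hypothetical helper: counts the fgets calls needed to read a whole
   file when fgets is told the buffer holds `size` bytes.
   Returns -1 for an unusable size or if the file cannot be opened. */
int count_pieces(const char *path, int size)
{
    char buf[128];
    FILE *fp;
    int pieces = 0;

    if (size < 2 || size > (int)sizeof buf)
        return -1;

    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    /* Each call stores at most size - 1 characters plus the NUL, and
       stops early once a newline has been stored. */
    while (fgets(buf, size, fp) != NULL)
        pieces++;

    fclose(fp);
    return pieces;
}
```

For a file holding the single line "abcdefghij" followed by a newline, a size of 128 yields one piece, while a size of 5 yields three ("abcd", "efgh", and "ij" plus the newline).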

Unix nomenclature uses the singular newline character, while other operating systems may use their own conventions, and even multiple characters, to represent the end of a line (Table A-1). When fgets and fputs are implemented for a particular system, the null terminator added at the end of the string in memory assures that, from the programmer's perspective, the functions appear to operate the same.

Unlike the puts function, fputs does not automatically append any newline character to its output. The correlated features of fgets and fputs have the result that a program loop repetitively using fgets to obtain data from one file and using fputs to move the same data into a new file would produce an exact copy, regardless of line lengths in the data being copied.
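The copy loop just described might look as follows in C. This is only a sketch: the name copy_text is hypothetical, and the buffer is kept deliberately small to show that long lines are simply copied in several pieces. It assumes ordinary text without embedded NUL bytes.

```c
#include <stdio.h>

/* Hypothetical helper: copies a text file line by line.
   Returns 0 on success, or -1 on any open, write, or close failure. */
int copy_text(const char *src, const char *dst)
{
    char buf[16];               /* smaller than many lines, on purpose */
    FILE *in, *out;
    int status = 0;

    in = fopen(src, "r");
    if (in == NULL)
        return -1;

    out = fopen(dst, "w");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    /* fgets keeps the newline it reads and fputs adds none, so each
       piece is written back exactly as it was read. */
    while (fgets(buf, sizeof buf, in) != NULL) {
        if (fputs(buf, out) == EOF) {
            status = -1;
            break;
        }
    }

    fclose(in);
    if (fclose(out) == EOF)
        status = -1;

    return status;
}
```

A final line that lacks a newline is also preserved, since fgets returns whatever characters precede the end of the file.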

9.5.3 Formatted I/O

The fprintf and fscanf functions (Table 9-2) work with format strings to interpret information they move between the external environment in text files and internal storage (memory). Their operation is analogous to that already described for the printf and scanf functions (Table 9-1). The first argument (out0) for a call to fprintf or fscanf is the file pointer corresponding to the previously opened text file into or from which I/O is to occur.

The fprintf function needs the address of a format string in out1; out2 to out7 (and perhaps more on the stack) contain either the values of integers or floating-point numbers to interpret, or else the addresses of strings to include in the printed output. Note that the fprintf function places newline characters only according to information in the format string, and thus can print more than one line of output.
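A short C-level sketch of such a call is shown here; the helper name write_record and the record layout are assumptions made for illustration. The argument order matches the register assignment in Table 9-2: file pointer, format string, then the quantities to be formatted.

```c
#include <stdio.h>

/* Hypothetical helper: writes one two-line record to an open file.
   Returns whatever fprintf returns: the number of bytes written,
   or a negative value on error. */
int write_record(FILE *fp, const char *name, int count, double ratio)
{
    /* The format string contains two newlines, so this single call
       produces two lines of output. */
    return fprintf(fp, "name: %s\ncount: %d ratio: %.2f\n",
                   name, count, ratio);
}
```

The string argument is passed by address, while the integer and floating-point arguments are passed by value.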

Formatted input from files is provided by fscanf. A single call to fscanf can work past newline markers in order to satisfy the total number of items designated by the format string. Individual words of text (including adjacent punctuation marks) can be read by fscanf with the "%s" parameter, using any space, tab, or newline as a field terminator.
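To illustrate how one call can cross line boundaries, the sketch below reads two integers and one word in a single fscanf call. The helper name read_triple is hypothetical, and the field width of 31 in "%31s" is an added safeguard (not discussed in the text) that keeps a long word from overrunning the 32-byte buffer.

```c
#include <stdio.h>

/* Hypothetical helper: reads two integers and one word with one call.
   Returns the number of items fscanf converted (3 when all succeed),
   or EOF if input ran out before the first conversion. */
int read_triple(FILE *fp, int *a, int *b, char word[32])
{
    /* White space in the input, including newlines, is skipped before
       each of these conversions, so the items may be on separate lines. */
    return fscanf(fp, "%d %d %31s", a, b, word);
}
```

Given input consisting of the lines "10", "20", and "hello, world", the call stores 10, 20, and the word "hello," with its adjacent comma, and leaves "world" unread.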

Most programs using fscanf will perform formatted I/O on an entire file using a loop. Termination of that loop occurs when register ret0 indicates that the end of the file (EOF) has been reached. Each system defines some suitable negative value to represent the EOF condition. Unix and Linux traditionally use negative one (-1). Refer to books or manuals on ANSI C for more information on the format strings used by fprintf and fscanf.
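A loop of this kind might be sketched as follows, here counting the white-space-delimited words in a file. The name count_words is hypothetical. The loop tests for the expected item count of 1 rather than for "not EOF"; with "%s" the two tests behave alike, but with numeric formats a return of 0 (input that does not match) would otherwise repeat forever.

```c
#include <stdio.h>

/* Hypothetical helper: counts white-space-delimited words in a file.
   Returns the count, or -1 if the file cannot be opened. */
long count_words(const char *path)
{
    char word[64];
    long n = 0;
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return -1;

    /* fscanf returns 1 for each word stored; at end of file it
       returns EOF, which ends the loop. */
    while (fscanf(fp, "%63s", word) == 1)
        n++;

    fclose(fp);
    return n;
}
```

Words longer than 63 characters would be split by the field width and counted more than once; that limit is a choice made for this sketch.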



Itanium® Architecture for Programmers: Understanding 64-Bit Processors and EPIC Principles
Year: 2003
Pages: 223
