Adventures in Random Access: fseek() and ftell()

The fseek() function enables you to treat a file like an array and move directly to any particular byte in a file opened by fopen(). To see how it works, let's create a program (see Listing 12.5) that displays a file in reverse order. It prompts the user for the name of the file it will read. Note that fseek() has three arguments and returns an int value. The ftell() function returns the current position in a file as a long value.

Listing 12.5 The reverse.c program.
 /* reverse.c -- displays a file in reverse order */
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #define CNTL_Z '\032'   /* eof marker in DOS text files */
 #define SLEN 50
 int main(void)
 {
    char file[SLEN];
    char ch;
    FILE *fp;
    long count, last;

    puts("Enter the name of the file to be processed:");
    if (fgets(file, SLEN, stdin) == NULL)  /* bounded, unlike gets() */
       exit(1);
    file[strcspn(file, "\n")] = '\0';      /* strip the newline      */
    if ((fp = fopen(file,"rb")) == NULL)
    {                      /* read-only and binary modes */
       printf("reverse can't open %s\n", file);
       exit(1);
    }
    fseek(fp, 0L, SEEK_END);        /* go to end of file */
    last = ftell(fp);
    for (count = 1L; count <= last; count++)
    {
       fseek(fp, -count, SEEK_END);  /* go backward      */
       ch = getc(fp);
       /* for DOS, works with UNIX */
       if (ch != CNTL_Z && ch != '\r')
          putchar(ch);
       /* for Macintosh            */
       /*  if (ch == '\r')
          putchar('\n');
       else
          putchar(ch);           */
    }
    putchar('\n');
    fclose(fp);
    return 0;
 }

Here is the output for a sample file:

 Enter the name of the file to be processed:
 cluv

 .C ni eno naht ylevol erom margorp a ees reven llahs I taht kniht I

Note

If you run the program from a command-line environment, it expects the file to be in the same directory (or folder) as the executable program. If you run the program from an IDE, where the program looks depends on the implementation. For example, Microsoft Visual C++ 5.0 looks in the directory containing the source code, but Metrowerks CodeWarrior looks in the directory containing the executable file.


We need to discuss three topics: how fseek() and ftell() work, how to use a binary stream, and how to make the program portable.

How fseek() and ftell() Work

The first of the three arguments to fseek() is a FILE pointer to the file being searched. The file should have been opened by using fopen().

The second argument to fseek() is called the offset. This argument tells how far to move from the starting point (see the following list of mode starting points). The argument must be a long value. It can be positive (move forward), negative (move backward), or zero (stay put).

The third argument is the mode, and it identifies the starting point. Under ANSI, the stdio.h header file specifies the following manifest constants for the mode:

Mode        Measure Offset From
SEEK_SET    Beginning of file
SEEK_CUR    Current position
SEEK_END    End of file

Older implementations may lack these definitions and, instead, use the numeric values 0L, 1L, and 2L, respectively, for these modes. Recall that the L suffix identifies type long values. Alternatively, the implementation might have the constants defined in a different header file. When in doubt, consult your usage manual or the online manual.

The value returned by fseek() is 0 if everything is okay, and -1 if there is an error, such as attempting to move past the bounds of the file.

The ftell() function is type long, and it returns the current file location. Under ANSI, it is declared in stdio.h. As originally implemented in UNIX, ftell() specifies the file position by returning the number of bytes from the beginning, with the first byte being byte 0, and so on. Under ANSI C, this definition applies to files opened in the binary mode, but not necessarily to files opened in the text mode. That is one reason we used the binary mode for Listing 12.5.
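To make the arguments and return values concrete, here is a minimal sketch of two helper functions built on fseek() and ftell(). The names stream_size() and byte_at() are hypothetical and do not come from Listing 12.5. The sketch assumes the stream was opened in the binary mode on a system that supports SEEK_END for binary files, and it treats any nonzero return from fseek() as failure.

```c
#include <stdio.h>

/* stream_size() is a hypothetical helper: it returns the size in bytes
   of a stream opened in binary mode, or -1L on failure. It assumes the
   implementation supports SEEK_END for binary streams. */
long stream_size(FILE *fp)
{
    long start = ftell(fp);              /* remember the current position */
    long size;

    if (start == -1L)
        return -1L;
    if (fseek(fp, 0L, SEEK_END) != 0)    /* nonzero return means failure  */
        return -1L;
    size = ftell(fp);                    /* bytes from beginning to end   */
    if (fseek(fp, start, SEEK_SET) != 0) /* go back to where we started   */
        return -1L;
    return size;
}

/* byte_at() is a hypothetical helper: it returns the byte located
   offset bytes from the beginning of the file, or EOF if the seek
   or the read fails. */
int byte_at(FILE *fp, long offset)
{
    if (fseek(fp, offset, SEEK_SET) != 0)
        return EOF;
    return getc(fp);
}
```

For a six-byte file, stream_size(fp) would return 6, and byte_at(fp, 0L) would return the first byte, much as if the file were an array indexed from 0.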

Now we can explain the basic elements of Listing 12.5. First, the statement

 fseek(fp, 0L, SEEK_END); 

takes you to an offset of 0 bytes from the file end. That is, it takes you to the end of the file. Next, the statement

 last = ftell(fp); 

assigns to last the number of bytes from the beginning to the end of the file.

Then, you have this loop:

 for (count = 1L; count <= last; count++)
 {
    fseek(fp, -count, SEEK_END);    /* go backward */
    ch = getc(fp);
 }

The first cycle positions the program at the first character before the end of the file, that is, at the file's final character. Then the program prints that character. The next loop positions the program at the preceding character and prints it. This process continues until the first character is reached and printed.
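The same loop can be packaged as a function that checks each call for errors. The following is a sketch only: reverse_copy() is a hypothetical name, the function writes to a second stream instead of the screen, and it omits the Ctrl+Z and \r filtering that Listing 12.5 performs. It assumes the input stream was opened in the binary mode and that SEEK_END is supported for it.

```c
#include <stdio.h>

/* reverse_copy() is a hypothetical helper: it copies every byte of in
   (opened in binary mode) to out in reverse order.
   Returns 0 on success and -1 on failure. */
int reverse_copy(FILE *in, FILE *out)
{
    long count, last;
    int ch;                      /* int, so EOF can be detected */

    if (fseek(in, 0L, SEEK_END) != 0)      /* go to end of file   */
        return -1;
    last = ftell(in);                      /* file length in bytes */
    if (last == -1L)
        return -1;
    for (count = 1L; count <= last; count++)
    {
        if (fseek(in, -count, SEEK_END) != 0)  /* go backward */
            return -1;
        ch = getc(in);
        if (ch == EOF)
            return -1;
        if (putc(ch, out) == EOF)
            return -1;
    }
    return 0;
}
```

Calling reverse_copy(fp, stdout) on an open binary stream would display the file's bytes from last to first.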

Binary Versus Text Mode

We designed Listing 12.5 to work in both the UNIX and the MS-DOS environments. UNIX has only one file format, so no special adjustments are needed. MS-DOS, however, does require extra attention. Many MS-DOS editors mark the end of a text file with the character Ctrl+Z. When such a file is opened in the text mode, C recognizes this character as marking the end of the file. When the same file is opened in the binary mode, however, the Ctrl+Z character is just another character in the file, and the actual end-of-file comes later. It might come immediately after the Ctrl+Z, or the file could be padded with null characters to make the size a multiple of, say, 256. Null characters don't print under DOS, and we included code to prevent the program from trying to print the Ctrl+Z character.

Another difference is one we've mentioned before: MS-DOS represents a text file newline with the \r\n combination. A C program opening the same file in a text mode "sees" \r\n as a simple \n, but, when using the binary mode, the program sees both characters. Therefore, we included coding to suppress printing \r. (Different coding is needed for Macintosh text files because they use the \r as the end-of-line marker. Listing 12.5 shows the Macintosh version as a comment.)

Because a UNIX text file normally contains neither Ctrl+Z nor \r, this extra coding does not affect most UNIX text files.

The ftell() function may work differently in the text mode than in the binary mode. Many systems have text file formats that are different enough from the UNIX model that a byte count from the beginning of the file is not a meaningful quantity. ANSI C states that, for the text mode, ftell() returns a value that can be used as the second argument to fseek(). For MS-DOS, for example, ftell() can return a count that sees \r\n as a single byte.

Portability

Ideally, fseek() and ftell() should conform to the UNIX model. However, differences in real systems sometimes make this impossible. Therefore, ANSI provides lowered expectations for these functions. Here are some limitations:

  • In the binary mode, implementations need not support the SEEK_END mode. (Listing 12.5, then, is not guaranteed to be portable.)

  • In the text mode, the only calls to fseek() that are guaranteed to work are these:

     fseek(file, 0L, SEEK_SET)          Go to beginning of file
     fseek(file, 0L, SEEK_CUR)          Stay at current position
     fseek(file, 0L, SEEK_END)          Go to file end
     fseek(file, ftell-pos, SEEK_SET)   Go to position ftell-pos from the beginning;
                                        ftell-pos is a value returned by ftell()

Fortunately, many common environments allow stronger implementations of these functions.
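The last call in the preceding list is the basis for a bookmark idiom that stays within the text-mode guarantees: save the value returned by ftell(), read ahead, and then hand that same value back to fseek() with SEEK_SET. Here is a minimal sketch. The name peek_line() is hypothetical, and the function treats the ftell() value as an opaque marker, not as a byte count.

```c
#include <stdio.h>

/* peek_line() is a hypothetical helper: it reads the next line into buf
   and then returns to the position it started from, so the same line
   can be read again. Returns 0 on success and -1 on failure. */
int peek_line(FILE *fp, char *buf, int size)
{
    long mark = ftell(fp);      /* bookmark; opaque in the text mode */
    char *got;

    if (mark == -1L)
        return -1;
    got = fgets(buf, size, fp);             /* read ahead one line       */
    if (fseek(fp, mark, SEEK_SET) != 0)     /* return to the bookmark    */
        return -1;
    return (got != NULL) ? 0 : -1;
}
```

Because the only offset passed to fseek() is one that ftell() supplied for the same stream, the sketch does not depend on how the system counts \r\n pairs.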

The fgetpos() and fsetpos() Functions

One potential problem with fseek() and ftell() is that they limit file sizes to values that can be represented by type long. Perhaps two billion bytes seems more than adequate, but the ever-increasing capacities of storage devices make larger files possible. ANSI C introduced two new positioning functions, fgetpos() and fsetpos(), designed to work with larger file sizes. Instead of using a long value to represent a position, they use a new type, called fpos_t (for file position type), for that purpose. The fpos_t type is not a fundamental type; rather, it is defined in terms of other types. A variable or data object of fpos_t type can specify a location within a file, and it is used with fgetpos() and fsetpos(), but its nature is not specified beyond that. Implementors can then design a type to meet the needs of a particular platform; the type could, for example, be implemented as a structure.

ANSI C does define how the fpos_t type is used. The fgetpos() function has this prototype:

 int fgetpos(FILE *stream, fpos_t *pos); 

When called, it places an fpos_t value in the location pointed to by pos; the value describes a location in the file. The function returns zero if successful, and a non-zero value for failure.

The fsetpos() function has this prototype:

 int fsetpos(FILE *stream, const fpos_t *pos); 

When called, it uses the fpos_t value in the location pointed to by pos to set the file pointer to the location indicated by that value. The function returns zero if successful, and a non-zero value for failure. The fpos_t value should have been obtained by a previous call to fgetpos().
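Here is a short sketch of the two functions working together. The name peek_byte() is hypothetical. The function saves the current position with fgetpos(), reads one byte, and then restores the saved position with fsetpos(), never looking inside the fpos_t object.

```c
#include <stdio.h>

/* peek_byte() is a hypothetical helper: it returns the next byte in
   the stream without advancing the file position, or EOF if there is
   nothing to read or a positioning call fails. */
int peek_byte(FILE *fp)
{
    fpos_t pos;                 /* opaque position object */
    int ch;

    if (fgetpos(fp, &pos) != 0) /* save the current position    */
        return EOF;
    ch = getc(fp);              /* read one byte ahead          */
    if (fsetpos(fp, &pos) != 0) /* restore the saved position   */
        return EOF;
    return ch;
}
```

After peek_byte(fp) returns, a call to getc(fp) delivers the same byte again, because the position has been put back.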


C++ Primer Plus
C Primer Plus (5th Edition)
ISBN: 0672326965
EAN: 2147483647
Year: 2000
Pages: 314
Authors: Stephen Prata
