File Synchronization

By default, Linux systems read from and write to a buffer/page cache that is kept in memory. The kernel defers the actual transfer of data to disk until it writes back modified pages, which happens periodically, under memory pressure, or when the application calls a sync function to flush the buffer/page cache. This strategy increases performance by avoiding the relatively slow mechanical process of writing to disk more often than necessary.

Input and output operations are of two types:

  • Asynchronous I/O, which frees the application to perform other tasks while data is written or read

  • Synchronized I/O, which performs the write or read operation and verifies its completion before returning

Synchronized I/O is useful when the integrity of data and files is critical to an application. Synchronized output assures that the data that is written to a device is actually stored there. Synchronized input assures that the data that is read from a device is a current image of data on that device.

Two levels of file synchronization are available:

  • Data integrity

    Write operations: Data in the buffer is transferred to disk, along with file system information necessary to retrieve the data.

    Read operations: Any pending write operations relevant to the data being read complete with data integrity before the read operation is performed.

  • File integrity

    Write operations: Data in the buffer and all file system information related to the operation are transferred to disk.

    Read operations: Any pending write operations relevant to the data being read complete with file integrity before the read operation is performed.

How to Assure Data or File Integrity

You can assure data integrity or file integrity at specific times by using function calls, or you can set file status flags to force automatic file synchronization for each read or write call associated with that file.

Note that using synchronized I/O can degrade system performance, because each synchronized call blocks until the device reports that the data has been stored.

Using Function Calls

You can choose to write to buffer/page cache as usual and call functions explicitly when you want the program to flush the buffer to disk. For instance, you may want to use the buffer/page cache when a significant amount of I/O is occurring and call these functions when activity slows down. Two functions are available:

Function     Description
fdatasync    Flushes all data buffers, providing operation completion with data integrity.
fsync        Flushes all data and file control information from the buffers, providing operation completion with file integrity.


For a complete description of these functions, refer to the man pages for fdatasync and fsync.
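
The explicit-flush approach can be sketched as follows. This is a minimal illustration, not a definitive implementation: the helper name write_and_flush and its data_only parameter are hypothetical, and the sketch assumes a POSIX system where fdatasync is available. It writes through the buffer/page cache as usual and then flushes once with either fdatasync or fsync.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write len bytes to path through the page cache,
 * then flush explicitly. If data_only is nonzero, fdatasync() is used
 * (data integrity); otherwise fsync() is used (file integrity).
 * Returns 0 on success, -1 on error. */
int write_and_flush(const char *path, const void *buf, size_t len, int data_only)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* write() may transfer fewer bytes than requested, so loop. */
    const char *p = buf;
    size_t left = len;
    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            close(fd);
            return -1;
        }
        p += n;
        left -= (size_t)n;
    }

    /* The data sits in the page cache until this call completes. */
    int rc = data_only ? fdatasync(fd) : fsync(fd);
    if (close(fd) < 0)
        rc = -1;
    return rc;
}
```

A caller might use write_and_flush("filea", buf, len, 1) when only the file contents must survive a crash, and pass 0 as the last argument when metadata such as timestamps must be on disk as well.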

Using File Status Flags

If you want to write data to disk in all cases automatically, you can set file status flags to force this behavior instead of making explicit calls to fdatasync or fsync.

To set this behavior, use these flags with the open function:

Flag       Description
O_DSYNC    Forces data synchronization for each write operation. For example:
           fd = open("filea", O_RDWR|O_CREAT|O_DSYNC, 0666);
O_SYNC     Forces file and data synchronization for each write operation. For example:
           fd = open("filea", O_RDWR|O_CREAT|O_SYNC, 0666);
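
As a slightly fuller sketch of the flag-based approach, the following appends one record to a file opened with O_DSYNC, so that each write call returns only after the data has been synchronized. The helper name append_record_dsync is hypothetical, and treating a short write as an error is a simplifying assumption of this example rather than a requirement of the flag.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: append one text record to a file opened with
 * O_DSYNC. No explicit fdatasync() call is needed, because the flag
 * forces data synchronization on every write().
 * Returns 0 on success, -1 on error. */
int append_record_dsync(const char *path, const char *record)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND | O_DSYNC, 0644);
    if (fd < 0)
        return -1;

    size_t len = strlen(record);
    ssize_t n = write(fd, record, len);

    /* Simplification: a short write is reported as a failure. */
    int rc = (n == (ssize_t)len) ? 0 : -1;
    if (close(fd) < 0)
        rc = -1;
    return rc;
}
```

Replacing O_DSYNC with O_SYNC in the open call would request file integrity instead of data integrity for each write.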


Performance Implications of sync/fsync

Forced synchronization of the contents of real memory and disk takes place in several ways:

  • An application program makes an fsync() call for a specified file. This causes all the pages that contain modified data for that file to be written to disk. The writing is complete when the fsync() call returns to the program.

  • An application program makes a sync() call. This causes all the file pages in memory that contain modified data to be written to disk. POSIX allows sync() to return once the writes have been scheduled, but on Linux the call waits until the writes are complete.

  • A user can enter the sync command, which in turn issues a sync() call. On Linux the writes are complete when the user is prompted for input (or the next command in a shell script is processed); on systems where sync() only schedules the writes, some of them might still be pending.

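  • A kernel writeback daemon runs at regular intervals (bdflush and kupdated in 2.4 kernels, pdflush in 2.6 kernels, per-device flusher threads in later kernels). This ensures that the system does not accumulate large amounts of data that exists only in volatile RAM.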
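
One way to observe the cost of forced synchronization is to time the same writes with and without a flush after each one. The sketch below is an assumption-laden illustration, not a benchmark: the helper name timed_writes, the 4 KiB block size, and the use of clock_gettime with CLOCK_MONOTONIC are choices made for this example. Results depend heavily on the file system and storage device, so no expected timings are shown.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: write `count` 4 KiB blocks to path.
 * If sync_each is nonzero, fsync() is called after every write;
 * otherwise the writes stay in the page cache until one final fsync().
 * Returns the elapsed time in seconds, or -1.0 on error. */
double timed_writes(const char *path, int count, int sync_each)
{
    static const char block[4096];   /* zero-filled payload */
    struct timespec t0, t1;

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1.0;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < count; i++) {
        if (write(fd, block, sizeof block) != (ssize_t)sizeof block ||
            (sync_each && fsync(fd) < 0)) {
            close(fd);
            return -1.0;
        }
    }
    /* Final flush so that both modes end with the data on disk. */
    if (fsync(fd) < 0) {
        close(fd);
        return -1.0;
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    if (close(fd) < 0)
        return -1.0;
    return (double)(t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Comparing timed_writes(path, n, 0) with timed_writes(path, n, 1) on a disk-backed file system gives a rough sense of how much each fsync() call adds; on a memory-backed file system the difference may be negligible.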

    Performance Tuning for Linux Servers
    ISBN: 0137136285
    Year: 2006
    Pages: 254
