Chapter 11: Working with Files

All information is stored in files. File management is a vital component of practically any program. Beginners pay excessive attention to the design of windows, toolbars, and nonstandard buttons, and they often neglect file operations. However, files must be given the attention they require; in fact, mastering file operations is a key to mastering programming.

With the arrival of large and fast hard disks, the importance of files has grown considerably. Using API functions for file management can make your program more efficient and ensure its high performance. Most programs presented in this chapter are console applications, because console applications are the most suitable for demonstrating file processing. File processing hides from users and programmers all the events that take place when information is read from or written to the storage media.

Operating systems of the Windows NT family work with two types of file systems: File Allocation Table 32 (FAT32) and the New Technology File System (NTFS). FAT32 is the direct descendant of two earlier file systems, FAT12 and FAT16, and it has inherited most of their drawbacks. Microsoft continues to support this file system for backward compatibility with previous versions of its file systems, although in future versions of Windows, FAT32 support will be reduced to the capability of reading files from FAT volumes. In contrast, NTFS is generally considered one of the most advanced file systems. In the following few sections, I will cover the basic concepts of these file systems, which are required for efficient programming.

File Characteristics

When providing descriptions of file characteristics, I'll base them on the parameters that API functions manipulate. The types of file systems and their structure will be covered later. First, I'll list and briefly describe the main file attributes. Most of them are present in NTFS, and a considerable part is also present in FAT32.

File Attributes

The attribute is an integer of the DWORD type. It defines how the operating system treats the file.

  • FILE_ATTRIBUTE_READONLY equ 1h The "read only" attribute. Applications can only read this file. Accordingly, any attempt at writing to this file will cause an error.

  • FILE_ATTRIBUTE_HIDDEN equ 2h The "hidden file" attribute. The file with this attribute will not be displayed when viewing the directory using "standard" tools (see the File Search section later in this chapter).

  • FILE_ATTRIBUTE_SYSTEM equ 4h The "system file" attribute. If this attribute is set, the file either belongs to the operating system or is used by the system in exclusive mode.

  • FILE_ATTRIBUTE_DIRECTORY equ 10h The "directory" attribute. The operating system treats a file with this attribute in a special way. It considers such files as directories, namely, lists of files composed of 32-byte records. Normal files cannot be converted into directories. To create a directory, use the CreateDirectory function.

  • FILE_ATTRIBUTE_ARCHIVE equ 20h Since the time of MS-DOS, this attribute has been set for files that were not archived using the BACKUP or XCOPY operation. For programming purposes, this attribute is equivalent to the zero value.

  • FILE_ATTRIBUTE_DEVICE equ 40h Reserved for future use.

  • FILE_ATTRIBUTE_NORMAL equ 80h This attribute means that no other attributes are set for the file.

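  • FILE_ATTRIBUTE_TEMPORARY equ 100h This attribute means that the file is intended for temporary data storage. The system tries to keep the main part of such a file in memory instead of writing it to disk. The attribute alone does not delete the file when it is closed; for that, the file must be opened with the FILE_FLAG_DELETE_ON_CLOSE flag.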

  • FILE_ATTRIBUTE_SPARSE_FILE equ 200h This attribute allows you to use distributed (or sparse) files. The logical length of such files can significantly exceed the disk space taken by the file. This attribute first appeared in NTFS 5.0.

  • FILE_ATTRIBUTE_REPARSE_POINT equ 400h This attribute was introduced in Windows 2000 with NTFS 5.0. It is used for extending the functional capabilities of the file system. Reparse points (described later in this chapter) allow NTFS-based hierarchical storage management (HSM) technology to be implemented. This technology allows you to considerably extend the available disk space by using remote storage media, which the system handles automatically.

  • FILE_ATTRIBUTE_COMPRESSED equ 800h If this attribute is set, the file has been compressed by the system. For directories, the presence of this attribute means that all newly created files in such a directory must be compressed by default.

  • FILE_ATTRIBUTE_OFFLINE equ 1000h If this attribute is set, the information stored in this file is not immediately available. Typically, the file resides on a storage device that is offline (not connected).

  • FILE_ATTRIBUTE_NOT_CONTENT_INDEXED equ 2000h The presence of this attribute means that the file cannot be indexed by the Windows indexing service.

  • FILE_ATTRIBUTE_ENCRYPTED equ 4000h The file is encrypted. This attribute first appeared in NTFS 5.0, which provides the possibility of file encryption (more details will be provided later in this chapter).
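Because each attribute occupies its own bit, the attribute value is a mask that can be tested and combined with ordinary bitwise operations. The following is a minimal Python sketch of that arithmetic only; it does not call the Windows API. The table reuses the hexadecimal values listed above, and the helper names decode_attributes and has_attribute are hypothetical, chosen for illustration.

```python
# Attribute bit values taken from the "equ" constants listed above.
FILE_ATTRIBUTES = {
    0x0001: "READONLY",
    0x0002: "HIDDEN",
    0x0004: "SYSTEM",
    0x0010: "DIRECTORY",
    0x0020: "ARCHIVE",
    0x0040: "DEVICE",
    0x0080: "NORMAL",
    0x0100: "TEMPORARY",
    0x0200: "SPARSE_FILE",
    0x0400: "REPARSE_POINT",
    0x0800: "COMPRESSED",
    0x1000: "OFFLINE",
    0x2000: "NOT_CONTENT_INDEXED",
    0x4000: "ENCRYPTED",
}


def decode_attributes(value):
    """Return the names of all attribute bits set in a DWORD value,
    ordered from the lowest bit to the highest."""
    return [name for bit, name in sorted(FILE_ATTRIBUTES.items()) if value & bit]


def has_attribute(value, bit):
    """Test a single attribute bit (the analog of a TEST instruction)."""
    return (value & bit) != 0


if __name__ == "__main__":
    # 21h = ARCHIVE (20h) combined with READONLY (1h)
    print(decode_attributes(0x21))
```

For the value 21h, the sketch prints ['READONLY', 'ARCHIVE']. In an assembly program, the same check is usually a single TEST or AND against the value returned by the attribute-reading function.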

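File attributes can be changed using the SetFileAttributes function. To read file attributes, use the GetFileAttributes function. The following attributes cannot be set using SetFileAttributes. Compression, reparse points, and sparse files are controlled using the DeviceIoControl function; the others are set by other means (for example, a directory is created using CreateDirectory):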

  • FILE_ATTRIBUTE_COMPRESSED

  • FILE_ATTRIBUTE_DEVICE

  • FILE_ATTRIBUTE_DIRECTORY

  • FILE_ATTRIBUTE_ENCRYPTED

  • FILE_ATTRIBUTE_REPARSE_POINT

  • FILE_ATTRIBUTE_SPARSE_FILE

Note that if the operating system imposes no limitations on changing file attributes, then the attributes themselves become meaningless. For example, if the read-only attribute can be removed at any time, then any user will be able to do with the file whatever he or she pleases.

A file has three time characteristics: creation time, last modification time, and last access time. The time is counted in 100-nanosecond intervals, starting from 12:00 a.m. (midnight), January 1, 1601, and is stored in two 32-bit values that can be represented by the following structure:

 FILETIME STRUC
     dwLowDateTime  DD ?
     dwHighDateTime DD ?
 FILETIME ENDS
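To make the meaning of the two fields concrete, here is a small Python sketch of the arithmetic. It is an illustration, not the way an assembly program would obtain the time. It assumes the documented Win32 convention that the combined 64-bit value counts 100-nanosecond intervals since midnight, January 1, 1601 (UTC). The name filetime_to_datetime is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Starting point of the FILETIME count (assumed: January 1, 1601, UTC).
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(low, high):
    """Combine the two 32-bit halves into one 64-bit tick count and
    convert it to a UTC datetime (one tick = 100 ns = 0.1 microsecond)."""
    ticks = (high << 32) | low
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)


if __name__ == "__main__":
    # 019DB1DE:D53E8000h is the tick count for January 1, 1970 (UTC).
    print(filetime_to_datetime(0xD53E8000, 0x019DB1DE))
```

Python's datetime keeps only microseconds, so the last decimal digit of the tick count is dropped. Calling astimezone() on the result gives local time, which is the step that the FileTimeToLocalFileTime function performs on the Windows side.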

Note that the time is stored as Coordinated Universal Time (UTC) and must be converted to the local time (this is achieved using the FileTimeToLocalFileTime function). To get all three time values, use the GetFileTime function. For output and for manipulating the file time, a structure more convenient than two 32-bit values is desirable. Such a structure exists; here it is named SYSTIME (it corresponds to the SYSTEMTIME structure of the Windows API). It has the following format:

 SYSTIME STRUC
     wYear         DW ?
     wMonth        DW ?
     wDayOfWeek    DW ?
     wDay          DW ?
     wHour         DW ?
     wMinute       DW ?
     wSecond       DW ?
     wMilliseconds DW ?
 SYSTIME ENDS

To convert the structure obtained using the GetFileTime function into the SYSTIME structure, use the FileTimeToSystemTime API function.

To set the time characteristics of a file, use the SetFileTime function. For defining the time, it is convenient to fill in the SYSTIME structure and then convert it into the FILETIME format expected by SetFileTime using the SystemTimeToFileTime function. Later in this chapter, I'll provide an example illustrating how to obtain the time characteristics of a file (see Listing 11.6).
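The two conversions can be sketched in Python to show what they compute. This is an illustrative model under stated assumptions, not the API itself. The field names follow the Win32 SYSTEMTIME layout, wDayOfWeek is assumed to use Sunday = 0, and the tick count is assumed to be in 100-nanosecond units since January 1, 1601 (UTC). The names SysTime, filetime_to_systime, and systime_to_filetime are hypothetical.

```python
from collections import namedtuple
from datetime import datetime, timedelta, timezone

# Field order mirrors the Win32 SYSTEMTIME layout (eight 16-bit words).
SysTime = namedtuple(
    "SysTime",
    "wYear wMonth wDayOfWeek wDay wHour wMinute wSecond wMilliseconds",
)

EPOCH_1601 = datetime(1601, 1, 1, tzinfo=timezone.utc)
TICKS_PER_SECOND = 10_000_000  # 100-ns ticks in one second


def filetime_to_systime(ticks):
    """Model of FileTimeToSystemTime: 64-bit tick count -> calendar fields."""
    dt = EPOCH_1601 + timedelta(microseconds=ticks // 10)
    day_of_week = dt.isoweekday() % 7  # Sunday = 0, Monday = 1, ...
    return SysTime(dt.year, dt.month, day_of_week, dt.day,
                   dt.hour, dt.minute, dt.second, dt.microsecond // 1000)


def systime_to_filetime(st):
    """Model of SystemTimeToFileTime: calendar fields -> 64-bit tick count.
    wDayOfWeek is ignored, because the date already determines it."""
    dt = datetime(st.wYear, st.wMonth, st.wDay, st.wHour, st.wMinute,
                  st.wSecond, st.wMilliseconds * 1000, tzinfo=timezone.utc)
    delta = dt - EPOCH_1601
    seconds = delta.days * 86400 + delta.seconds
    return seconds * TICKS_PER_SECOND + delta.microseconds * 10


if __name__ == "__main__":
    st = filetime_to_systime(116444736000000000)
    print(st)                       # fields for January 1, 1970
    print(systime_to_filetime(st))  # back to the same tick count
```

The round trip is exact only for tick counts that are whole milliseconds, because the calendar structure has no field finer than wMilliseconds.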

File length in bytes is usually stored in two 32-bit values or in one 64-bit value. If the 32-bit values are designated as l1 (least significant part) and l2 (most significant part), then the 64-bit value is obtained according to the following formula: l2*100000000H+l1. The file size can be obtained using the GetFileSize function. This function returns the least significant part of the file length, which is enough for files smaller than 4 GB. The second argument of this function is the pointer to the most significant part of the file length. However, the GetFileSizeEx function is more convenient for determining the file length. The second argument of this function is the address of the following structure:

 FSIZE STRUC
     LOWPART  DD ?
     HIGHPART DD ?
 FSIZE ENDS

The function stores the file length in this structure.
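The relationship between the two 32-bit parts and the full 64-bit length can be checked with a few lines of Python. This sketch assumes the usual convention that the most significant part is weighted by 2^32 (100000000H). The function names combine_size and split_size are hypothetical.

```python
def combine_size(low, high):
    """Build the 64-bit length from its halves: high * 2**32 + low."""
    return (high << 32) | low


def split_size(size):
    """Split a 64-bit length into (low, high) 32-bit parts."""
    return size & 0xFFFFFFFF, size >> 32


if __name__ == "__main__":
    # A file of exactly 4 GB: the low part is 0, the high part is 1.
    print(combine_size(0, 1))
    print(split_size(5 * 2**32 + 7))
```

The first call prints 4294967296, and the second prints (7, 5). A program that reads only the least significant part would report such a 4 GB file as empty, which is why the most significant part matters for large files.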

Note that the functions used for obtaining file characteristics receive the file descriptor (handle) as their first argument. In other words, to get these characteristics, it is first necessary to open the file (see the description of the CreateFile function later in this chapter). An alternative method of obtaining these parameters is to use the FindFirstFile function (see the File Search section later in this chapter).

In addition to the previously described characteristics, a file, naturally, has a name. It is necessary to distinguish between the long name and the short name of the file, and likewise between the fully qualified pathname (with all long names) and the abbreviated pathname (in which all long names are replaced by short ones). An abbreviated pathname is needed because some older programs interpret spaces as parameter delimiters. To convert the long name into the short one, use the GetShortPathName function, which is capable of converting both names and paths. The inverse operation can be carried out using the GetLongPathName function; GetFullPathName, by contrast, expands a relative name into a fully qualified path.

In this book, I do not cover direct disk access. However, you may still ask about the structure of the directory records. This is quite natural because, with the migration from FAT (MS-DOS) to FAT32 (Windows 95), it became possible to store long filenames. In addition to the time and date of modification, characteristics such as the time and date of creation and of last access appeared. Where are all these characteristics stored? This question will be answered in the next section.



The Assembly Programming Master Book
ISBN: 8170088178
Year: 2004
Pages: 140
Authors: Vlad Pirogov
