Techniques of Working with Binary Files

Operations on external files [i] rely on several API functions, the most important and sophisticated of which is CreateFile. I will cover this function in detail in Chapter 13; here I provide only the practical information needed to start using it in your programs. Note, however, that this function can do more than create or open files: it also gives access to such objects as pipes, consoles, disk devices, and communication resources (see Chapter 13 for more details). The function distinguishes the type of device by the structure of the name. For example, a name like C:\CONFIG.SYS defines a file, and a name like CONOUT$ defines the output buffer of the current console.

Listings 11.3 and 11.4 present two simple but important programs. Both output to the current console the contents of the text file [ii] whose name is passed as a command-line parameter. In the first case, the descriptor of the current console is obtained in the standard way. In the second case, the console is opened as a file, and information is written to it just as it would be written to a file. Pay attention to the role of the buffer into which the file contents are read. Experiment with the buffer size; for example, try to output a large text file. Note that neither program takes the structure of the text file into account, because this kind of input and output does not require it. Later in this chapter, I will describe the text file structure.
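To make experiments with the buffer size easier, the read loop can be written so that the size is a single named constant. The following is only a sketch in the style of the listings, not a replacement for them. The names BUFSIZE and FNAME and the file name test.txt are my assumptions for illustration, and the file is opened with GENERIC_READ alone, which is sufficient for reading. The loop stops when ReadFile reports a failure or returns zero bytes, so it does not depend on the buffer size.

```asm
; A sketch: print a file to the console, buffer size set in one place
.586P
.MODEL FLAT, stdcall
STD_OUTPUT_HANDLE equ -11
GENERIC_READ      equ 80000000h
OPEN_EXISTING     equ 3
BUFSIZE           equ 300          ; Change this value to experiment
EXTERN GetStdHandle@4:NEAR
EXTERN CreateFileA@28:NEAR
EXTERN ReadFile@20:NEAR
EXTERN WriteFile@20:NEAR
EXTERN CloseHandle@4:NEAR
EXTERN ExitProcess@4:NEAR
includelib c:\masm32\lib\kernel32.lib
_DATA SEGMENT
HANDL DWORD ?
HFILE DWORD ?
NUMB  DWORD ?
NUMW  DWORD ?
FNAME DB "test.txt", 0             ; Hypothetical file name
BUFER DB BUFSIZE DUP(0)
_DATA ENDS
_TEXT SEGMENT
START:
     PUSH STD_OUTPUT_HANDLE
     CALL GetStdHandle@4
     MOV  HANDL, EAX
     PUSH 0
     PUSH 0
     PUSH OPEN_EXISTING
     PUSH 0
     PUSH 0
     PUSH GENERIC_READ
     PUSH OFFSET FNAME
     CALL CreateFileA@28
     CMP  EAX, -1
     JE   EXI
     MOV  HFILE, EAX
LOO:
     PUSH 0
     PUSH OFFSET NUMB
     PUSH BUFSIZE
     PUSH OFFSET BUFER
     PUSH HFILE
     CALL ReadFile@20
     TEST EAX, EAX                 ; Zero means that the call failed
     JZ   FIN
     CMP  NUMB, 0                  ; Zero bytes read means the end of the file
     JE   FIN
     PUSH 0
     PUSH OFFSET NUMW
     PUSH NUMB
     PUSH OFFSET BUFER
     PUSH HANDL
     CALL WriteFile@20
     JMP  LOO
FIN:
     PUSH HFILE
     CALL CloseHandle@4
EXI:
     PUSH 0
     CALL ExitProcess@4
_TEXT ENDS
END START
```

With BUFSIZE set to 1, the program still works, only with one API call per byte; with a large value, the whole file may arrive in a single read.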

Listing 11.3: Text output from a file to a console (first method)
; The FILES1.ASM file
.586P
; Flat memory model
.MODEL FLAT, stdcall
; Constants
STD_OUTPUT_HANDLE equ -11
GENERIC_READ      equ 80000000h
GENERIC_WRITE     equ 40000000h
GEN = GENERIC_READ or GENERIC_WRITE
SHARE = 0
OPEN_EXISTING equ 3
; Prototypes of external procedures
EXTERN GetStdHandle@4:NEAR
EXTERN WriteConsoleA@20:NEAR
EXTERN ExitProcess@4:NEAR
EXTERN GetCommandLineA@0:NEAR
EXTERN CreateFileA@28:NEAR
EXTERN CloseHandle@4:NEAR
EXTERN ReadFile@20:NEAR
;--------------------------
; INCLUDELIB directives for the linker
includelib c:\masm32\lib\user32.lib
includelib c:\masm32\lib\kernel32.lib
;--------------------------
; Data segment
_DATA SEGMENT
HANDL DWORD ?
HFILE DWORD ?
BUF   DB 100 DUP(0)
BUFER DB 300 DUP(0)
NUMB  DWORD ?
NUMW  DWORD ?
_DATA ENDS
; Code segment
_TEXT SEGMENT
START:
; Get the output handle
     PUSH STD_OUTPUT_HANDLE
     CALL GetStdHandle@4
     MOV  HANDL, EAX
; Get the number of parameters
     CALL NUMPAR
     CMP  EAX, 1
     JE   NO_PAR
; Get the parameter with the EDI number
     MOV  EDI, 2
     LEA  EBX, BUF
     CALL GETPAR
; Open file
     PUSH 0             ; Must be 0
     PUSH 0             ; File attribute (if creating a file)
     PUSH OPEN_EXISTING ; How to open
     PUSH 0             ; The pointer to the security attribute
     PUSH 0             ; Sharing mode
     PUSH GEN           ; Access mode
     PUSH OFFSET BUF    ; Filename
     CALL CreateFileA@28
     CMP  EAX, -1
     JE   NO_PAR
     MOV  HFILE, EAX
LOO:
; Read into buffer
     PUSH 0
     PUSH OFFSET NUMB
     PUSH 300
     PUSH OFFSET BUFER
     PUSH HFILE
     CALL ReadFile@20
; Output the buffer contents to the console
     PUSH 0
     PUSH OFFSET NUMW
     PUSH NUMB
     PUSH OFFSET BUFER
     PUSH HANDL
     CALL WriteConsoleA@20
; Check whether the last bytes have been read
     CMP  NUMB, 300
     JE   LOO
; Close the file
     PUSH HFILE
     CALL CloseHandle@4
; End of program
NO_PAR:
     PUSH 0
     CALL ExitProcess@4
; Procedures
; Procedure for defining the number of parameters in the string
; Determine the number of parameters (->EAX)
NUMPAR PROC
     CALL GetCommandLineA@0
     MOV  ESI, EAX ; Pointer to the string
     XOR  ECX, ECX ; Counter
     MOV  EDX, 1   ; Indicator
L1:
     CMP  BYTE PTR [ESI], 0
     JE   L4
     CMP  BYTE PTR [ESI], 32
     JE   L3
     ADD  ECX, EDX ; Parameter number
     MOV  EDX, 0
     JMP  L2
L3:
     OR   EDX, 1
L2:
     INC  ESI
     JMP  L1
L4:
     MOV  EAX, ECX
     RET
NUMPAR ENDP
; Get the parameter from the command line
; EBX - Points to the buffer, in which to load the parameter
; Zero-terminated string will be placed into the buffer
; EDI --- Parameter number
GETPAR PROC
     CALL GetCommandLineA@0
     MOV  ESI, EAX ; Pointer to the string
     XOR  ECX, ECX ; Counter
     MOV  EDX, 1   ; Indicator
L1:
     CMP  BYTE PTR [ESI], 0
     JE   L4
     CMP  BYTE PTR [ESI], 32
     JE   L3
     ADD  ECX, EDX ; Parameter number
     MOV  EDX, 0
     JMP  L2
L3:
     OR   EDX, 1
L2:
     CMP  ECX, EDI
     JNE  L5
     MOV  AL, BYTE PTR [ESI]
     MOV  BYTE PTR [EBX], AL
     INC  EBX
L5:
     INC  ESI
     JMP  L1
L4:
     MOV  BYTE PTR [EBX], 0
     RET
GETPAR ENDP
_TEXT ENDS
END START
 
Listing 11.4: Output of the contents of a text file into a console (second method)
; The FILES2.ASM file
.586P
; Flat memory model
.MODEL FLAT, stdcall
; Constants
STD_OUTPUT_HANDLE equ -11
GENERIC_READ   equ 80000000h
GENERIC_WRITE  equ 40000000h
GEN = GENERIC_READ or GENERIC_WRITE
SHARE = 0
OPEN_EXISTING  equ 3
; Prototypes of external procedures
EXTERN ExitProcess@4:NEAR
EXTERN GetCommandLineA@0:NEAR
EXTERN CreateFileA@28:NEAR
EXTERN CloseHandle@4:NEAR
EXTERN ReadFile@20:NEAR
EXTERN WriteFile@20:NEAR
;---------------------------
; INCLUDELIB directives for the linker
includelib c:\masm32\lib\user32.lib
includelib c:\masm32\lib\kernel32.lib
;---------------------------
; Data segment
_DATA SEGMENT
HANDL   DWORD ?
HFILE   DWORD ?
BUF     DB 100 DUP(0)
BUFER   DB 300 DUP(0)
NUMB    DWORD ?
NUMW    DWORD ?
NAMEOUT DB "CONOUT$", 0 ; The name must be zero-terminated
_DATA ENDS
; Code segment
_TEXT SEGMENT
START:
; Get the output handle (console as a file)
     PUSH 0
     PUSH 0
     PUSH OPEN_EXISTING
     PUSH 0
     PUSH 0
     PUSH GEN
     PUSH OFFSET NAMEOUT
     CALL CreateFileA@28
     MOV  HANDL, EAX
; Get the number of parameters
     CALL NUMPAR
     CMP  EAX, 1
     JE   NO_PAR
;-------------------------------------
; Get the parameter with the EDI number
     MOV  EDI, 2
     LEA  EBX, BUF
     CALL GETPAR
; Open the file
     PUSH 0
     PUSH 0
     PUSH OPEN_EXISTING
     PUSH 0
     PUSH 0
     PUSH GEN
     PUSH OFFSET BUF
     CALL CreateFileA@28
     CMP  EAX, -1
     JE   NO_PAR
     MOV  HFILE, EAX
LOO:
; Read into the buffer
     PUSH 0
     PUSH OFFSET NUMB
     PUSH 300
     PUSH OFFSET BUFER
     PUSH HFILE
     CALL ReadFile@20
; Output to the console as to a file
     PUSH 0
     PUSH OFFSET NUMW
     PUSH NUMB
     PUSH OFFSET BUFER
     PUSH HANDL
     CALL WriteFile@20
     CMP  NUMB, 300
     JE   LOO
; Close the file
     PUSH HFILE
     CALL CloseHandle@4
; End of program operation
NO_PAR:
     PUSH 0
     CALL ExitProcess@4
; Procedures
; Procedure for determining the number of parameters in the string
; Determine the number of parameters (->EAX)
NUMPAR PROC
     CALL GetCommandLineA@0
     MOV  ESI, EAX ; Pointer to the string
     XOR  ECX, ECX ; Counter
     MOV  EDX, 1   ; Indicator
L1:
     CMP  BYTE PTR [ESI], 0
     JE   L4
     CMP  BYTE PTR [ESI], 32
     JE   L3
     ADD  ECX, EDX ; Parameter number
     MOV  EDX, 0
     JMP  L2
L3:
     OR   EDX, 1
L2:
     INC  ESI
     JMP  L1
L4:
     MOV  EAX, ECX
     RET
NUMPAR ENDP
; Get the parameter from the command line
; EBX - Points to the buffer, in which the parameter will be loaded
; Zero-terminated string is loaded into the buffer
; EDI --- Parameter number
GETPAR PROC
     CALL GetCommandLineA@0
     MOV  ESI, EAX ; Pointer to the string
     XOR  ECX, ECX ; Counter
     MOV  EDX, 1   ; Indicator
L1:
     CMP  BYTE PTR [ESI], 0
     JE   L4
     CMP  BYTE PTR [ESI], 32
     JE   L3
     ADD  ECX, EDX ; Parameter number
     MOV  EDX, 0
     JMP  L2
L3:
     OR   EDX, 1
L2:
     CMP  ECX, EDI
     JNE  L5
     MOV  AL, BYTE PTR [ESI]
     MOV  BYTE PTR [EBX], AL
     INC  EBX
L5:
     INC  ESI
     JMP  L1
L4:
     MOV  BYTE PTR [EBX], 0
     RET
GETPAR ENDP
_TEXT ENDS
END START
 

Now, consider the text file structure in more detail. When working with a high-level programming language, certain algorithmic skills are lost. In particular, this relates to working with text files. Assembly language, on the contrary, doesn't allow you to relax. Consider possible variants of working with text files.

The main characteristic of a text file is that it consists of strings of different lengths. Strings are separated by delimiters. Most frequently, the delimiter is a sequence of two codes, 13 and 10 (carriage return and line feed). Other variants are also possible. For example, some MS-DOS text editors separated strings by the single code 13.

Reading data from text files string by string can be implemented using four approaches:

  • Byte-by-byte reading from a file. When a delimiter is encountered, carry out an operation on the string just read, and proceed to read the next string. Naturally, you must account for the possibility that the last string in the file has no delimiter. If you think that this method is too slow, note that Windows implements an efficient disk-caching algorithm, so the situation is not as bad as it seems.

  • Reading data into a small buffer that can hold at least one text line. Having read the data, find the end-of-line character and carry out some operation on the string. Then move the file pointer to the start of the next line, and repeat the operation.

  • Reading data into a buffer of arbitrary length. After completing the read operation, find all strings loaded into the buffer and carry out some operations on them. In this case, it is highly probable that one of the strings won't fit entirely into the buffer. You must take this situation into account.

  • Reading data into a buffer capable of holding the entire file. This is a particular case of the third approach. From the programming point of view, it is the easiest to implement.
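The first approach in the list can be sketched as follows. This is a minimal illustration rather than a finished program: the READLN procedure, the constant MAXLINE, and the file name test.txt are assumptions of mine, and I assume that lines end with code 13 or with the pair 13, 10 (code 10 is simply skipped, so a file that uses code 10 alone would be treated as one long line). Characters beyond MAXLINE are dropped.

```asm
; A sketch of the first approach: one ReadFile call per byte
.586P
.MODEL FLAT, stdcall
STD_OUTPUT_HANDLE equ -11
GENERIC_READ      equ 80000000h
OPEN_EXISTING     equ 3
MAXLINE           equ 300          ; Longest line kept (an assumption)
EXTERN GetStdHandle@4:NEAR
EXTERN CreateFileA@28:NEAR
EXTERN ReadFile@20:NEAR
EXTERN WriteFile@20:NEAR
EXTERN CloseHandle@4:NEAR
EXTERN ExitProcess@4:NEAR
includelib c:\masm32\lib\kernel32.lib
_DATA SEGMENT
HANDL  DWORD ?
HFILE  DWORD ?
NUMB   DWORD ?
NUMW   DWORD ?
ONEB   DB 0                        ; The byte just read
FNAME  DB "test.txt", 0            ; Hypothetical file name
STROKA DB (MAXLINE+2) DUP(0)       ; The line plus room for 13, 10
_DATA ENDS
_TEXT SEGMENT
START:
     PUSH STD_OUTPUT_HANDLE
     CALL GetStdHandle@4
     MOV  HANDL, EAX
     PUSH 0
     PUSH 0
     PUSH OPEN_EXISTING
     PUSH 0
     PUSH 0
     PUSH GENERIC_READ
     PUSH OFFSET FNAME
     CALL CreateFileA@28
     CMP  EAX, -1
     JE   EXI
     MOV  HFILE, EAX
NEXT:
     CALL READLN                   ; EDI = length, EAX = 0 at the end of the file
     MOV  ESI, EAX                 ; ESI survives the API call below
     TEST EDI, EDI
     JNZ  SHOW
     TEST ESI, ESI                 ; Nothing read and the file has ended
     JZ   FIN
SHOW:
     MOV  WORD PTR STROKA[EDI], 0A0Dh ; Append codes 13, 10
     ADD  EDI, 2
     PUSH 0
     PUSH OFFSET NUMW
     PUSH EDI
     PUSH OFFSET STROKA
     PUSH HANDL
     CALL WriteFile@20
     TEST ESI, ESI
     JNZ  NEXT
FIN:
     PUSH HFILE
     CALL CloseHandle@4
EXI:
     PUSH 0
     CALL ExitProcess@4
; Read one line into STROKA (->EDI length, ->EAX 0 if the file has ended)
READLN PROC
     XOR  EDI, EDI
R1:
     PUSH 0
     PUSH OFFSET NUMB
     PUSH 1
     PUSH OFFSET ONEB
     PUSH HFILE
     CALL ReadFile@20
     TEST EAX, EAX                 ; Read error: treat as the end of the file
     JZ   R3
     CMP  NUMB, 0                  ; No byte delivered: end of the file
     JE   R3
     MOV  AL, ONEB
     CMP  AL, 10                   ; Skip the line feed
     JE   R1
     CMP  AL, 13                   ; Delimiter found
     JE   R2
     CMP  EDI, MAXLINE             ; Drop characters that do not fit
     JAE  R1
     MOV  BYTE PTR STROKA[EDI], AL
     INC  EDI
     JMP  R1
R2:
     MOV  EAX, 1
     RET
R3:
     XOR  EAX, EAX
     RET
READLN ENDP
_TEXT ENDS
END START
```

The last line of the file is handled by the same path as the others: if the file ends without a delimiter, READLN returns the collected characters together with the end-of-file indicator, and the main loop outputs them before closing the file.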

The program presented in Listing 11.5 implements the third approach.

Listing 11.5: An example illustrating the processing of a text file
; The FILES2.ASM file
.586P
; Flat memory model
.MODEL FLAT, stdcall
; Constants
STD_OUTPUT_HANDLE equ -11
GENERIC_READ      equ 80000000h
GENERIC_WRITE     equ 40000000h
GEN = GENERIC_READ or GENERIC_WRITE
SHARE = 0
OPEN_EXISTING  equ 3
; Prototypes of external procedures
EXTERN ExitProcess@4:NEAR
EXTERN GetCommandLineA@0:NEAR
EXTERN CreateFileA@28:NEAR
EXTERN CloseHandle@4:NEAR
EXTERN ReadFile@20:NEAR
EXTERN WriteFile@20:NEAR
EXTERN CharToOemA@8:NEAR
;------------------
; INCLUDELIB directives for the linker
includelib c:\masm32\lib\user32.lib
includelib c:\masm32\lib\kernel32.lib
;------------------------
; Data segment
_DATA SEGMENT
HANDL   DWORD ? ; Console descriptor
HFILE   DWORD ? ; File descriptor
BUF     DB 100 DUP(0) ; Buffer for parameters
BUFER   DB 1000 DUP(0) ; Buffer for the file
NAMEOUT DB "CONOUT$", 0 ; The name must be zero-terminated
INDS    DD 0 ; Number of the character in the string
INDB    DD 0 ; Number of the character in the buffer
NUMB    DD ?
NUMC    DD ?
PRIZN   DD 0
STROKA  DB 300 DUP(0)
_DATA ENDS
; Code segment
_TEXT SEGMENT
START:
; Get the output handle HANDLE (console as file)
     PUSH 0
     PUSH 0
     PUSH OPEN_EXISTING
     PUSH 0
     PUSH 0
     PUSH GEN
     PUSH OFFSET NAMEOUT
     CALL CreateFileA@28
     MOV  HANDL, EAX
; Get the number of parameters
     CALL NUMPAR
     CMP  EAX, 1
     JE   NO_PAR
;-------------------------------------
; Get the parameter with the EDI number
     MOV  EDI, 2
     LEA  EBX, BUF
     CALL GETPAR
; Open the file
     PUSH 0
     PUSH 0
     PUSH OPEN_EXISTING
     PUSH 0
     PUSH 0
     PUSH GEN
     PUSH OFFSET BUF
     CALL CreateFileA@28
     CMP  EAX, -1
     JE   NO_PAR
     MOV  HFILE, EAX
;+++++++++++++++++++++++++++++
LOO:
; Read 1000 bytes
     PUSH 0
     PUSH OFFSET NUMB
     PUSH 1000
     PUSH OFFSET BUFER
     PUSH HFILE
     CALL ReadFile@20
     MOV  INDB, 0
; Check whether the bytes are in the buffer
     CMP  NUMB, 0
     JZ   _CLOSE
; Fill the string
LOO1:
     MOV  EDI, INDS
     MOV  ESI, INDB
     MOV  AL, BYTE PTR BUFER[ESI]
     CMP  AL, 13 ; Check whether the end-of-line character has been encountered
     JE   _ENDSTR
     MOV  BYTE PTR STROKA[EDI], AL
     INC  ESI
     INC  EDI
     MOV  INDS, EDI
     MOV  INDB, ESI
     CMP  NUMB, ESI ; Check for the buffer end
     JNBE LOO1
; Buffer end
     MOV  INDS, EDI
     MOV  INDB, ESI
     JMP  LOO
_ENDSTR:
; Carry out some operation with the string
     CALL OUTST
; Reset the string to zero
     MOV  INDS, 0
; Go to another string in the buffer
     ADD  INDB, 2
; Has the buffer end been reached?
     MOV  ESI, INDB
     CMP  NUMB, ESI
     JA   LOO1 ; Continue only while unread bytes remain in the buffer
     JMP  LOO
;++++++++++++++++++++++++++++++++
_CLOSE:
; Check whether the string is empty
     CMP  INDS, 0
     JZ   CONT
; Carry out some operation over the string
     CALL OUTST
CONT:
; Close the files
     PUSH HFILE
     CALL CloseHandle@4
; End of program operation
NO_PAR:
     PUSH 0
     CALL ExitProcess@4
; Procedures
; The procedure for determining the number of parameters in the string
; Determine the number of parameters (->EAX)
NUMPAR PROC
     CALL GetCommandLineA@0
     MOV  ESI, EAX ; Pointer to the string
     XOR  ECX, ECX ; Counter
     MOV  EDX, 1   ; Indicator
L1:
     CMP  BYTE PTR [ESI], 0
     JE   L4
     CMP  BYTE PTR [ESI], 32
     JE   L3
     ADD  ECX, EDX ; Parameter number
     MOV  EDX, 0
     JMP  L2
L3:
     OR   EDX, 1
L2:
     INC  ESI
     JMP  L1
L4:
     MOV  EAX, ECX
     RET
NUMPAR ENDP
; Get the parameter from the command line
; EBX - Points to the buffer, in which the parameter will be loaded
; Zero-terminated string is loaded into the buffer
; EDI --- Parameter number
GETPAR PROC
     CALL GetCommandLineA@0
     MOV  ESI, EAX ; Pointer to the string
     XOR  ECX, ECX ; Counter
     MOV  EDX, 1   ; Indicator
L1:
     CMP  BYTE PTR [ESI], 0
     JE   L4
     CMP  BYTE PTR [ESI], 32
     JE   L3
     ADD  ECX, EDX ; Parameter number
     MOV  EDX, 0
     JMP  L2
L3:
     OR   EDX, 1
L2:
     CMP  ECX, EDI
     JNE  L5
     MOV  AL, BYTE PTR [ESI]
     MOV  BYTE PTR [EBX], AL
     INC  EBX
L5:
     INC  ESI
     JMP  L1
L4:
     MOV  BYTE PTR [EBX], 0
     RET
GETPAR ENDP
; Output the string with the delimiter into the console
OUTST PROC
     MOV  EBX, INDS
     MOV  BYTE PTR STROKA[EBX], 0
     PUSH OFFSET STROKA
     PUSH OFFSET STROKA
     CALL CharToOemA@8
; String is terminated by the delimiter
     MOV  BYTE PTR STROKA[EBX], 6
     INC  INDS
; Output string
     PUSH 0
     PUSH OFFSET NUMC
     PUSH INDS
     PUSH OFFSET STROKA
     PUSH HANDL
     CALL WriteFile@20
     RET
OUTST ENDP
_TEXT ENDS
END START
 

The program presented in Listing 11.5 demonstrates one possible algorithm for processing a text file: reading it line by line. The program fragment responsible for reading and analyzing the text file lies between the LOO label and the CONT label. Study the algorithm carefully, and you will see that a high-level language would hardly prompt you to write such code. In this sense, Assembly language exercises your intellectual capabilities.
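One detail of Listing 11.5 deserves attention: the delimiter is skipped with ADD INDB, 2, which presumes that codes 13 and 10 are both in the same buffer. If code 13 happens to be the last byte read, code 10 arrives with the next portion and ends up at the start of the next string. The following sketch shows one possible way around this, under my own assumptions: the SCAN procedure (not part of the listing) treats code 13 as the end of a line and simply ignores code 10, so it does not matter where the buffer boundary falls. To keep the example short, two portions of data are defined in memory instead of being read from a file, and the names CHUNK1, CHUNK2, LEN1, and LEN2 are hypothetical.

```asm
; A sketch: line scanning that tolerates a delimiter split between buffers
.586P
.MODEL FLAT, stdcall
STD_OUTPUT_HANDLE equ -11
EXTERN GetStdHandle@4:NEAR
EXTERN WriteFile@20:NEAR
EXTERN ExitProcess@4:NEAR
includelib c:\masm32\lib\kernel32.lib
_DATA SEGMENT
HANDL  DWORD ?
NUMW   DWORD ?
INDS   DWORD 0                     ; Characters collected in STROKA so far
CHUNK1 DB "first", 13              ; The pair 13, 10 is split...
LEN1   = $ - CHUNK1
CHUNK2 DB 10, "second", 13, 10     ; ...between the two portions
LEN2   = $ - CHUNK2
STROKA DB 302 DUP(0)               ; 300 characters plus 13, 10
_DATA ENDS
_TEXT SEGMENT
START:
     PUSH STD_OUTPUT_HANDLE
     CALL GetStdHandle@4
     MOV  HANDL, EAX
     MOV  ESI, OFFSET CHUNK1
     MOV  ECX, LEN1
     CALL SCAN
     MOV  ESI, OFFSET CHUNK2
     MOV  ECX, LEN2
     CALL SCAN
     CMP  INDS, 0                  ; Is there a last line without a delimiter?
     JE   EXI
     CALL OUTST
EXI:
     PUSH 0
     CALL ExitProcess@4
; Process one portion of data
; ESI - Points to the data, ECX - Number of bytes
SCAN PROC
     JECXZ S3
S1:
     MOV  AL, BYTE PTR [ESI]
     INC  ESI
     CMP  AL, 10                   ; The line feed is ignored wherever it is
     JE   S2
     CMP  AL, 13
     JNE  S4
     PUSH ECX                      ; API calls do not preserve ECX
     CALL OUTST
     POP  ECX
     JMP  S2
S4:
     MOV  EDI, INDS
     CMP  EDI, 300                 ; Drop characters that do not fit
     JAE  S2
     MOV  BYTE PTR STROKA[EDI], AL
     INC  INDS
S2:
     DEC  ECX
     JNZ  S1
S3:
     RET
SCAN ENDP
; Output the collected line followed by codes 13, 10
OUTST PROC
     MOV  EDI, INDS
     MOV  WORD PTR STROKA[EDI], 0A0Dh
     ADD  EDI, 2
     PUSH 0
     PUSH OFFSET NUMW
     PUSH EDI
     PUSH OFFSET STROKA
     PUSH HANDL
     CALL WriteFile@20
     MOV  INDS, 0
     RET
OUTST ENDP
_TEXT ENDS
END START
```

The program should print the words first and second on separate lines, although the delimiter after the first word is divided between the two portions. The price of this simplicity is that a file using code 10 alone as the delimiter would not be split into lines at all.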

How To Obtain Time Attributes of a File

Now I will demonstrate how to obtain time attributes of a file using the GetFileTime function mentioned earlier in this chapter.

Listing 11.6: Obtaining time attributes of a file
.586P
; Flat memory model
.MODEL FLAT, stdcall
; Constants
STD_OUTPUT_HANDLE equ -11
GENERIC_READ    equ 80000000h
GENERIC_WRITE   equ 40000000h
GEN = GENERIC_READ or GENERIC_WRITE
OPEN_EXISTING  equ 3
; Prototypes of external procedures
EXTERN CreateFileA@28:NEAR
EXTERN lstrlenA@4:NEAR
EXTERN GetStdHandle@4:NEAR
EXTERN WriteConsoleA@20:NEAR
EXTERN ExitProcess@4:NEAR
EXTERN CloseHandle@4:NEAR
EXTERN GetFileTime@16:NEAR
EXTERN FileTimeToLocalFileTime@8:NEAR
EXTERN FileTimeToSystemTime@8:NEAR
EXTERN wsprintfA:NEAR
; Structures
; The FILETIME structure
FILETIME STRUC
  LOTIME DD 0
  HITIME DD 0
FILETIME ENDS
SYSTIME STRUC
  Y   DW 0 ; Year
  M   DW 0 ; Month
  DWE DW 0 ; Day of week
  D   DW 0 ; Day of month
  H   DW 0 ; Hour
  MI  DW 0 ; Minute
  S   DW 0 ; Second
  MS  DW 0 ; Thousandths of a second
SYSTIME ENDS
; INCLUDELIB directives for the linker
includelib c:\masm32\lib\user32.lib
includelib c:\masm32\lib\kernel32.lib
;------------------------------------------------
; Data segment
_DATA SEGMENT
LENS    DD ? ; String length will be placed here
HANDL   DD ? ; Descriptor of the output console
HFILE   DD ? ; Descriptor of the file being opened
ERRS    DB 'Error!', 0 ; Error message
PATH    DB 'e:\backup3.pst', 0 ; Path to the file
FTMCR   FILETIME <0> ; For creation time
FTMAC   FILETIME <0> ; For access time
FTMWR   FILETIME <0> ; For modification time
LOCALS1 FILETIME <0> ; For local time
FORM    DB "Write time: Sec %lu Min %lu Hou %lu Day %lu Mon %lu Yea %lu", 0
SST     SYSTIME <0> ; System time format
BUF     DB 60 DUP(0) ; Buffer for the formatted string
TEXT1   DB 'File: ', 0
_DATA ENDS
; Code segment
_TEXT SEGMENT
START:
; Get the output handle
     PUSH  STD_OUTPUT_HANDLE
     CALL  GetStdHandle@4
     MOV   HANDL, EAX
; Open the file
     PUSH  0             ; Must be 0
     PUSH  0             ; File attribute (if creating)
     PUSH  OPEN_EXISTING ; How to open
     PUSH  0             ; Pointer to the security attribute
     PUSH  0             ; Sharing mode
     PUSH  GEN           ; Access mode
     PUSH  OFFSET PATH   ; Filename
     CALL  CreateFileA@28
; Check whether the file has been opened
     CMP   EAX, -1
     JNE   CONT
; Error message and exit
     LEA   EAX, ERRS
     MOV   EDI, 1
     CALL  WRITE
     JMP   EXI
CONT:
     MOV   HFILE, EAX
; Get the file time
     PUSH  OFFSET FTMWR
     PUSH  OFFSET FTMAC
     PUSH  OFFSET FTMCR
     PUSH  HFILE
     CALL  GetFileTime@16
; Convert the file time to local time
     PUSH  OFFSET LOCALS1
     PUSH  OFFSET FTMWR
     CALL  FileTimeToLocalFileTime@8
; Format to the system time format
     PUSH  OFFSET SST
     PUSH  OFFSET LOCALS1
     CALL  FileTimeToSystemTime@8
; Convert to the string
     MOV   AX, SST.Y
     MOVZX EAX, AX
     PUSH  EAX
     MOV   AX, SST.M
     MOVZX EAX, AX
     PUSH  EAX
     MOV   AX, SST.D
     MOVZX EAX, AX
     PUSH  EAX
     MOV   AX, SST.H
     MOVZX EAX, AX
     PUSH  EAX
     MOV   AX, SST.MI
     MOVZX EAX, AX
     PUSH  EAX
     MOV   AX, SST.S
     MOVZX EAX, AX
     PUSH  EAX
     PUSH  OFFSET FORM
     PUSH  OFFSET BUF
     CALL  wsprintfA
; Release the stack
     ADD   ESP, 32
; Output information
     LEA   EAX, TEXT1
     MOV   EDI, 0
     CALL  WRITE
     LEA   EAX, PATH
     MOV   EDI, 1
     CALL  WRITE
     LEA   EAX, BUF
     MOV   EDI, 1
     CALL  WRITE
; Close the file
CLOSE:
     PUSH  HFILE
     CALL  CloseHandle@4
; Exit
EXI:
     PUSH  0
     CALL  ExitProcess@4
; Output the string (terminated by the line feed)
; EAX - Points to the start of the string
; EDI --- With or without line feed
WRITE PROC
; Get the parameter length
     PUSH  EAX
     PUSH  EAX
     CALL  lstrlenA@4
     MOV   ESI, EAX
     POP   EBX
     CMP   EDI, 1
     JNE   NO_ENT
; Line feed in the end
     MOV   BYTE PTR [EBX+ESI], 13
     MOV   BYTE PTR [EBX+ESI+1], 10
     MOV   BYTE PTR [EBX+ESI+2], 0
     ADD   EAX, 2
NO_ENT:
; String output
     PUSH  0
     PUSH  OFFSET LENS
     PUSH  EAX
     PUSH  EBX
     PUSH  HANDL
     CALL  WriteConsoleA@20
     RET
WRITE ENDP
_TEXT ENDS
END START
 

When executed, this program sends to the current console a string with the filename and a string containing the last modification time of the file in the "second minute hour day month year" format. The program is simple and needs little commentary. The only point worth noting is the wsprintfA function, which is the one thing here that may cause difficulty if you have read the book up to this chapter. Unlike the other API functions in the listing, wsprintfA takes a variable number of parameters, so its name carries no @n suffix and the calling code must release the stack itself after the call (the ADD ESP, 32 instruction removes the eight parameters that were pushed).
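The calling pattern for wsprintfA can be seen in isolation in the short sketch below. It is only an illustration: the format string and the numbers 12 and 30 are arbitrary values of mine, not taken from Listing 11.6. The parameters are pushed from right to left, the function returns in EAX the number of characters placed into the buffer, and the stack is released by the caller.

```asm
; A sketch: calling wsprintfA and releasing the stack afterwards
.586P
.MODEL FLAT, stdcall
STD_OUTPUT_HANDLE equ -11
EXTERN GetStdHandle@4:NEAR
EXTERN WriteConsoleA@20:NEAR
EXTERN ExitProcess@4:NEAR
EXTERN wsprintfA:NEAR              ; No @n suffix: the caller releases the stack
includelib c:\masm32\lib\user32.lib
includelib c:\masm32\lib\kernel32.lib
_DATA SEGMENT
HANDL DWORD ?
LENS  DWORD ?
FORM  DB "Hou %lu Min %lu", 13, 10, 0
BUF   DB 40 DUP(0)
_DATA ENDS
_TEXT SEGMENT
START:
     PUSH STD_OUTPUT_HANDLE
     CALL GetStdHandle@4
     MOV  HANDL, EAX
; Parameters are pushed from right to left
     PUSH 30                       ; Value for the second %lu
     PUSH 12                       ; Value for the first %lu
     PUSH OFFSET FORM
     PUSH OFFSET BUF
     CALL wsprintfA                ; EAX = number of characters stored
; Release the stack: 4 parameters, 4 bytes each
     ADD  ESP, 16
; Output the formatted string
     PUSH 0
     PUSH OFFSET LENS
     PUSH EAX
     PUSH OFFSET BUF
     PUSH HANDL
     CALL WriteConsoleA@20
     PUSH 0
     CALL ExitProcess@4
_TEXT ENDS
END START
```

The value added to ESP must always match the number of parameters actually pushed. In Listing 11.6, eight parameters are pushed, which gives 32 bytes.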

To translate this program using TASM32, proceed in a normal way:

  1. Remove all @n suffixes.

  2. Replace wsprintfA with _wsprintfA .

  3. Include the IMPORT32.LIB library.

[i] By external files I mean files located on an external device.

[ii] Actually, they output the contents of any file, but this method of file output into the console makes sense only for text files.



The Assembly Programming Master Book
ISBN: 8170088178
EAN: 2147483647
Year: 2004
Pages: 140
Authors: Vlad Pirogov
