Now we will consider the process of reading records. In this program, we will read each record and display the first name listed with each record.
Since each person's name is a different length, we need a function that counts the number of characters to write. Because each field is padded with null characters, we can simply count characters until we reach a null character.[2] Note that this only works if the field contains at least one null character: a name that filled its field completely would make the count run on into the next field.
Here is the code. Put it in a file called count-chars.s:

    #PURPOSE:  Count the characters until a null byte is reached.
    #
    #INPUT:    The address of the character string
    #
    #OUTPUT:   Returns the count in %eax
    #
    #PROCESS:
    #  Registers used:
    #    %ecx - character count
    #    %al  - current character
    #    %edx - current character address

    .type count_chars, @function
    .globl count_chars

    #This is where our one parameter is on the stack
    .equ ST_STRING_START_ADDRESS, 8
    count_chars:
        pushl %ebp
        movl  %esp, %ebp

        #Counter starts at zero
        movl  $0, %ecx

        #Starting address of data
        movl  ST_STRING_START_ADDRESS(%ebp), %edx

    count_loop_begin:
        #Grab the current character
        movb  (%edx), %al
        #Is it null?
        cmpb  $0, %al
        #If yes, we're done
        je    count_loop_end
        #Otherwise, increment the counter and the pointer
        incl  %ecx
        incl  %edx
        #Go back to the beginning of the loop
        jmp   count_loop_begin

    count_loop_end:
        #We're done.  Move the count into %eax
        #and return.
        movl  %ecx, %eax

        popl  %ebp
        ret

As you can see, it's a fairly straightforward function. It simply loops through the bytes, counting as it goes, until it hits a null character. Then it returns the count.
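
For readers who know C, the same loop can be sketched as follows. This is only an illustration of the logic, not part of the program. The C name count_chars simply mirrors the assembly label, and the comments map each step to the instruction it stands in for.

```c
#include <stddef.h>

/* Sketch of count_chars in C: walk the bytes until a null byte
 * is found and return how many bytes came before it.
 * Like the assembly version, it assumes a null byte is present. */
size_t count_chars(const char *string_start)
{
    size_t count = 0;                    /* plays the role of %ecx */
    const char *current = string_start;  /* plays the role of %edx */

    while (*current != '\0') {           /* cmpb $0, %al and je count_loop_end */
        count++;                         /* incl %ecx */
        current++;                       /* incl %edx */
    }
    return count;                        /* movl %ecx, %eax */
}
```

Called on a 40-byte field that holds "Fredrick" followed by null padding, this sketch returns 8, the same value strlen would give.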
Our record-reading program will be fairly straightforward, too. It will do the following:
Open the file
Attempt to read a record
If we are at the end of the file, exit
Otherwise, count the characters of the first name
Write the first name to STDOUT
Write a newline to STDOUT
Go back to read another record
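
The steps above can be sketched in C as follows. This is a hedged illustration, not the program itself: the 324-byte record size and the first name sitting at offset 0 are assumptions standing in for the values defined in record-def.s, the function name print_first_names is hypothetical, and the caller is expected to have opened the file already.

```c
#include <string.h>
#include <unistd.h>

/* Assumed values; the real ones come from record-def.s. */
#define RECORD_SIZE      324
#define RECORD_FIRSTNAME 0

/* Read fixed-size records from in_fd and write each first name,
 * followed by a newline, to out_fd. Returns the number of records
 * printed. Hypothetical helper, named for illustration only. */
int print_first_names(int in_fd, int out_fd)
{
    char record_buffer[RECORD_SIZE];
    int records = 0;

    for (;;) {
        /* Attempt to read a record. */
        ssize_t bytes_read = read(in_fd, record_buffer, RECORD_SIZE);

        /* Anything other than a full record means end of file or
         * an error, so stop. */
        if (bytes_read != RECORD_SIZE)
            break;

        /* Count the characters of the first name. This relies on the
         * field containing at least one null byte. */
        const char *first_name = record_buffer + RECORD_FIRSTNAME;
        size_t length = strlen(first_name);

        /* Write the first name, then a newline. Unlike the assembly
         * program, this sketch also stops if a write fails. */
        if (write(out_fd, first_name, length) < 0 ||
            write(out_fd, "\n", 1) < 0)
            break;

        /* Go back to read another record. */
        records++;
    }
    return records;
}
```

A caller would open test.dat read-only and pass the resulting descriptor together with the descriptor for STDOUT.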
To write this program, we need one more simple function: one that writes a newline to STDOUT. Put the following code into write-newline.s:

    .include "linux.s"
    .globl write_newline
    .type write_newline, @function

    .section .data
    newline:
        .ascii "\n"

    .section .text
    .equ ST_FILEDES, 8
    write_newline:
        pushl %ebp
        movl  %esp, %ebp

        movl  $SYS_WRITE, %eax
        movl  ST_FILEDES(%ebp), %ebx
        movl  $newline, %ecx
        movl  $1, %edx
        int   $LINUX_SYSCALL

        movl  %ebp, %esp
        popl  %ebp
        ret

Now we are ready to write the main program. Here is the code for read-records.s:

    .include "linux.s"
    .include "record-def.s"

    .section .data
    file_name:
        .ascii "test.dat\0"

    .section .bss
    .lcomm record_buffer, RECORD_SIZE

    .section .text
    #Main program
    .globl _start
    _start:
        #These are the locations on the stack where
        #we will store the input and output descriptors
        #(FYI - we could have used memory addresses in
        #a .data section instead)
        .equ ST_INPUT_DESCRIPTOR, -4
        .equ ST_OUTPUT_DESCRIPTOR, -8

        #Copy the stack pointer to %ebp
        movl  %esp, %ebp
        #Allocate space to hold the file descriptors
        subl  $8, %esp

        #Open the file
        movl  $SYS_OPEN, %eax
        movl  $file_name, %ebx
        movl  $0, %ecx         #This says to open read-only
        movl  $0666, %edx
        int   $LINUX_SYSCALL

        #Save file descriptor
        movl  %eax, ST_INPUT_DESCRIPTOR(%ebp)

        #Even though it's a constant, we are
        #saving the output file descriptor in
        #a local variable so that if we later
        #decide that it isn't always going to
        #be STDOUT, we can change it easily.
        movl  $STDOUT, ST_OUTPUT_DESCRIPTOR(%ebp)

    record_read_loop:
        pushl ST_INPUT_DESCRIPTOR(%ebp)
        pushl $record_buffer
        call  read_record
        addl  $8, %esp

        #Returns the number of bytes read.
        #If it isn't the same number we
        #requested, then it's either an
        #end-of-file, or an error, so we're
        #quitting
        cmpl  $RECORD_SIZE, %eax
        jne   finished_reading

        #Otherwise, print out the first name
        #but first, we must know its size
        pushl $RECORD_FIRSTNAME + record_buffer
        call  count_chars
        addl  $4, %esp

        movl  %eax, %edx
        movl  ST_OUTPUT_DESCRIPTOR(%ebp), %ebx
        movl  $SYS_WRITE, %eax
        movl  $RECORD_FIRSTNAME + record_buffer, %ecx
        int   $LINUX_SYSCALL

        pushl ST_OUTPUT_DESCRIPTOR(%ebp)
        call  write_newline
        addl  $4, %esp

        jmp   record_read_loop

    finished_reading:
        movl  $SYS_EXIT, %eax
        movl  $0, %ebx
        int   $LINUX_SYSCALL

To build this program, we need to assemble all of the parts and link them together:

    as read-record.s -o read-record.o
    as count-chars.s -o count-chars.o
    as write-newline.s -o write-newline.o
    as read-records.s -o read-records.o
    ld read-record.o count-chars.o write-newline.o \
        read-records.o -o read-records

The backslash in the ld command simply means that the command continues on the next line. You can run the program by typing ./read-records.
As you can see, this program opens the file and then runs a loop of reading, checking for the end of file, and writing the first name. The one construct that might be new is the line that says:
    pushl $RECORD_FIRSTNAME + record_buffer
It looks like we are combining an add instruction with a push instruction, but we are not. Both RECORD_FIRSTNAME and record_buffer are constants. The first is a direct constant, created with a .equ directive, while the second is defined automatically by the assembler through its use as a label (its value is the address at which the data that follows it starts). Since both are constants that the assembler knows, it can add them together while it is assembling your program, so the whole instruction is a single immediate-mode push of a single constant.
The RECORD_FIRSTNAME constant is the number of bytes after the beginning of a record before we hit the first name. record_buffer is the name of our buffer for holding records. Adding them together gets us the address of the first name member of the record stored in record_buffer.
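
C offers a rough analogy for this kind of addition done ahead of time. In the sketch below, the initializer of a file-scope pointer must be a constant expression, so the sum of the buffer's address and the field offset is resolved by the compiler and linker, not by an add instruction at run time. The offsets 0 and 40 and the 324-byte size are assumptions standing in for record-def.s, and the pointer names are hypothetical.

```c
/* Assumed values; the real ones come from record-def.s. */
#define RECORD_SIZE      324
#define RECORD_FIRSTNAME 0
#define RECORD_LASTNAME  40

/* Zero-filled buffer with static storage, similar in spirit to
 * the .lcomm buffer in the assembly program. */
char record_buffer[RECORD_SIZE];

/* Each initializer is an address constant: the buffer address plus
 * a fixed offset, known before the program runs. Hypothetical names. */
char *const first_name_field = record_buffer + RECORD_FIRSTNAME;
char *const last_name_field  = record_buffer + RECORD_LASTNAME;
```

Passing first_name_field to a function then corresponds to pushing the single precomputed constant in the assembly version.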
[2] If you have used C, this is what the strlen function does.