Figure 9-5 continues the progression of SCANTEXT and SCANTERM by presenting SCANFILE, a program that loops through an input text file and places each isolated word (with any trailing punctuation) on a separate line of an output text file. SCANFILE also counts the words, a total that equals the number of lines in the output file, and displays that value on the screen.

Figure 9-5 SCANFILE: Using C-like input and output with files

    // SCANFILE     Demonstrate I/O for files

    // This program does lexical analysis by reading "words"
    // from a text file using fscanf. These separate words are
    // then written out, one per line, using fprintf.

            BUFL    = 80                    // Allowance for a "word"
            FLEN    = 40                    // File name allowance
            .global gets, printf, fopen, fclose, fscanf, fprintf
            .data                           // Declare storage
            .align  8                       // Desired alignment
    BUF:    .skip   BUFL                    // Input/output buffer
    IFILE:  .skip   FLEN                    // Input file name
    OFILE:  .skip   FLEN                    // Output file name
    IPRMT:  stringz "Input from? "
    OPRMT:  stringz "Output to? "
    IMODE:  stringz "r"                     // Read from input file
    OMODE:  stringz "w"                     // Create output file
    TELL:   stringz "The program has processed %d words.\n"
    PFORM:                                  // Prompts are strings
    IFORM:  stringz "%s"                    // Scan for a "word"
    OFORM:  stringz "%s\n"                  // Print "word" & newline

            .text                           // Section for code
            .align  32                      // Desired alignment
            .global main                    // These three lines
            .proc   main                    //  mark the mandatory
    main:                                   //  'main' program entry
            .prologue 12,r32                // Mask for rp, ar.pfs only
            alloc   loc0 = ar.pfs,0,7,3,0   // ins, locals, outs
            .save   rp,loc1                 // Must save return address
            mov     loc1 = b0;;             //  to our caller
            .body
    first:  add     out0 = @gprel(IPRMT),gp // out0 -> format
            mov     loc2 = gp               // Save gp
            br.call.sptk.many b0 = printf   // Ask about input
            mov     gp = loc2               // Restore gp
            cmp4.lt p6,p0 = ret0,r0         // If any error,
     (p6)   br.cond.sptk.few stop0;;        //  go to handler (null)
            add     out0 = @gprel(IFILE),gp // out0 -> filename
            br.call.sptk.many b0 = gets     // Unformatted input
            mov     gp = loc2               // Restore gp
            cmp.eq  p6,p0 = ret0,r0         // If any error,
     (p6)   br.cond.sptk.few stop1;;        //  go to handler (null)
            add     out0 = @gprel(IFILE),gp // out0 -> filename
            add     out1 = @gprel(IMODE),gp // out1 -> mode
            br.call.sptk.many b0 = fopen    // Find input file
            mov     gp = loc2               // Restore gp
            cmp.eq  p6,p0 = ret0,r0         // If any error,
     (p6)   br.cond.sptk.few stop2;;        //  go to handler (null)
            mov     loc3 = ret0             // loc3 = file pointer
            add     out0 = @gprel(OPRMT),gp // out0 -> format
            br.call.sptk.many b0 = printf   // Ask about output
            mov     gp = loc2               // Restore gp
            cmp4.lt p6,p0 = ret0,r0         // If any error,
     (p6)   br.cond.sptk.few stop0;;        //  go to handler (null)
            add     out0 = @gprel(OFILE),gp // out0 -> filename
            br.call.sptk.many b0 = gets     // Unformatted input
            mov     gp = loc2               // Restore gp
            cmp.eq  p6,p0 = ret0,r0         // If any error,
     (p6)   br.cond.sptk.few stop1;;        //  go to handler (null)
            add     out0 = @gprel(OFILE),gp // out0 -> filename
            add     out1 = @gprel(OMODE),gp // out1 -> mode
            br.call.sptk.many b0 = fopen    // Find output file
            mov     gp = loc2               // Restore gp
            cmp.eq  p6,p0 = r8,r0           // If any error,
     (p6)   br.cond.sptk.few stop4;;        //  go to handler (null)
            mov     loc4 = ret0             // loc4 = file pointer
            mov     loc5 = 0                // loc5 = word count
            add     loc6 = @gprel(BUF),gp;; // loc6 -> buffer
    loop:   mov     out0 = loc3             // out0 = IPTR
            add     out1 = @gprel(IFORM),gp // out1 -> format
            mov     out2 = loc6             // out2 -> BUF
            br.call.sptk.many b0 = fscanf   // Read a "word"
            mov     gp = loc2               // Restore gp
            cmp4.ne p6,p0 = 1,ret0          // Expect one %s item
     (p6)   br.cond.sptk.few eof;;          // No - maybe it's EOF
            mov     out0 = loc4             // out0 = OPTR
            add     out1 = @gprel(OFORM),gp // out1 -> format
            mov     out2 = loc6             // out2 -> BUF
            br.call.sptk.many b0 = fprintf  // Write a "word"
            mov     gp = loc2               // Restore gp
            cmp4.lt p6,p0 = ret0,r0         // If any error,
     (p6)   br.cond.sptk.few stop7          //  go to handler (null)
            add     loc5 = 1,loc5           // Count one word
            br.cond.sptk.few loop           // Go back for more
    eof:    cmp4.ne p6,p0 = -1,ret0         // If not EOF,
     (p6)   br.cond.sptk.few stop6;;        //  then exit
            mov     out0 = loc3             // out0 = IPTR
            br.call.sptk.many b0 = fclose   // Close input
            mov     gp = loc2               // Restore gp
            cmp4.ne p6,p0 = 0,ret0          // If not successful,
     (p6)   br.cond.sptk.few stop3;;        //  then exit
            mov     out0 = loc4             // out0 = OPTR
            br.call.sptk.many b0 = fclose   // Close output
            mov     gp = loc2               // Restore gp
            cmp4.ne p6,p0 = 0,ret0          // If not successful,
     (p6)   br.cond.sptk.few stop5;;        //  then exit
            add     out0 = @gprel(TELL),gp  // out0 -> format
            mov     out1 = loc5             // out1 = number of words
            br.call.sptk.many b0 = printf   // C print function
            mov     gp = loc2               // Restore gp
            cmp4.lt p6,p0 = ret0,r0         // If any error,
     (p6)   br.cond.sptk.few stop0;;        //  go to handler (null)
            br.cond.sptk.many done          // That is all
    stop0:                                  // Terminal output error
    stop1:                                  // Terminal input error
    stop2:                                  // Problem opening IFILE
    stop3:                                  // Problem closing IFILE
    stop4:                                  // Problem opening OFILE
    stop5:                                  // Problem closing OFILE
    stop6:                                  // Problem getting input
    stop7:                                  // Problem doing output
    done:   mov     ret0 = 0                // Signal all is normal
            mov     b0 = loc1               // Restore return address
            mov     ar.pfs = loc0           // Restore caller's ar.pfs
            br.ret.sptk.many b0;;           // Back to command line
            .endp   main                    // Mark end of procedure

The top portion of SCANFILE begins by prompting the user for the file names. Those names are then passed to the fopen function in order to establish access to the input file and create the output file. For this program, as for SCANTERM, we have inserted branch instructions that ensure that the program proceeds only if all necessary conditions are met. As we do not wish to detail system-specific error codes, those branch instructions lead to stop labels that simply fall through to the code at the label done.

The functional core of SCANFILE consists of a concise loop beginning at the label loop. The fscanf function with the format "%s" reads one "word" from the input file, i.e., a string of characters, possibly including some adjacent punctuation marks, up to a terminator (space, tab, or newline). The terminating character is not stored as part of the string, which is instead terminated with a null in memory. Each such string is then sent to the output file using the fprintf function with a C-style "%s\n" format string.

Exit from this program loop occurs when a call to fscanf returns the EOF condition (a negative value) in register ret0. After the loop branches to the label eof, both the input and output files are closed, and a summary message with the number of words processed is displayed with the printf function:

    Input from? lincoln.txt
    Output to? abc.txt
    The program has processed 274 words.

The "words" could be more carefully examined to prune away punctuation marks and other characters such as dashes; the above example uses a text file containing the Gettysburg Address, which has many hyphenated words.

We would also have our readers appreciate that the internal coding in the system-supplied fscanf function has to perform character-by-character testing. Essential work cannot be avoided, but it can be encapsulated and made reusable.
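For readers who find it easier to follow the logic in C, the loop at the label loop in SCANFILE corresponds roughly to the sketch below. It is only an illustration under stated assumptions: the function name scan_words and its return convention are hypothetical, the file handling and prompting done with gets and fopen are left to the caller, and the format is written as "%79s" rather than the plain "%s" of the listing so that a long word cannot overrun the 80-byte buffer.

```c
#include <stdio.h>

/* Hypothetical helper (not part of SCANFILE): copy whitespace-delimited
   "words" from in to out, one per line.  Returns the number of words
   written, or -1 if a write fails or input stops for a reason other
   than end of file. */
long scan_words(FILE *in, FILE *out)
{
    char buf[80];               /* same allowance as BUFL in the listing */
    long count = 0;             /* plays the role of loc5 */
    int got;

    /* Assumption: "%79s" bounds the read; the listing uses plain "%s". */
    while ((got = fscanf(in, "%79s", buf)) == 1) {
        if (fprintf(out, "%s\n", buf) < 0)
            return -1;          /* the case the listing sends to stop7 */
        count++;
    }

    /* The listing compares ret0 with -1; EOF is the portable spelling. */
    return (got == EOF) ? count : -1;
}
```

A caller would open the two files with fopen, pass the resulting pointers to scan_words, and then close both with fclose, mirroring the order of calls in Figure 9-5. One consequence of the width limit is that a run of more than 79 non-blank characters would be split across two output lines and counted as two words.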