9.6 SCANFILE: Input and Output with Files

Figure 9-5 continues the progression of SCANTEXT and SCANTERM by presenting SCANFILE. We have designed a program that loops through an input text file and places into an output text file each isolated word (with any trailing punctuation) on a separate line. The SCANFILE program also counts the number of words, which will be the number of lines in the output file, and displays that value on the screen.

Figure 9-5 SCANFILE: Using C-like input and output with files
    // SCANFILE      Demonstrate I/O for files
    // This program does lexical analysis by reading "words"
    // from a text file using fscanf. These separate words are
    // then written out, one per line, using fprintf.
             BUFL    = 80             // Allowance for a "word"
             FLEN    = 40             // File name allowance
             .global gets, printf, fopen, fclose, fscanf, fprintf
             .data                    // Declare storage
             .align  8                // Desired alignment
    BUF:     .skip   BUFL             // Input/output buffer
    IFILE:   .skip   FLEN             // Input file name
    OFILE:   .skip   FLEN             // Output file name
    IPRMT:   stringz "Input from? "
    OPRMT:   stringz "Output to? "
    IMODE:   stringz "r"              // Read from input file
    OMODE:   stringz "w"              // Create output file
    TELL:    stringz "The program has processed %d words.\n"
    PFORM:                            // Prompts are strings
    IFORM:   stringz "%s"             // Scan for a "word"
    OFORM:   stringz "%s\n"           // Print "word" & newline
             .text                    // Section for code
             .align  32               // Desired alignment
             .global main             // These three lines
             .proc   main             //  mark the mandatory
    main:                             //   'main' program entry
             .prologue 12,r32         // Mask for rp, ar.pfs only
             alloc   loc0 = ar.pfs,0,7,3,0  // ins, locals, outs
             .save   rp,loc1          // Must save return address
             mov     loc1 = b0;;      //  to our caller
             .body
    first:   add     out0 = @gprel(IPRMT),gp  // out0 -> format
             mov     loc2 = gp        // Save gp
             br.call.sptk.many b0 = printf   // Ask about input
             mov     gp = loc2        // Restore gp
             cmp4.lt p6,p0 = ret0,r0  // If any error,
       (p6)  br.cond.sptk.few stop0;; //  go to handler (null)
             add     out0 = @gprel(IFILE),gp  // out0 -> filename
             br.call.sptk.many b0 = gets     // Unformatted input
             mov     gp = loc2        // Restore gp
             cmp.eq  p6,p0 = ret0,r0  // If any error,
       (p6)  br.cond.sptk.few stop1;; //  go to handler (null)
             add     out0 = @gprel(IFILE),gp  // out0 -> filename
             add     out1 = @gprel(IMODE),gp  // out1 -> mode
             br.call.sptk.many b0 = fopen    // Find input file
             mov     gp = loc2        // Restore gp
             cmp.eq  p6,p0 = ret0,r0  // If any error,
       (p6)  br.cond.sptk.few stop2;; //  go to handler (null)
             mov     loc3 = ret0      // loc3 = file pointer
             add     out0 = @gprel(OPRMT),gp  // out0 -> format
             br.call.sptk.many b0 = printf   // Ask about output
             mov     gp = loc2        // Restore gp
             cmp4.lt p6,p0 = ret0,r0  // If any error,
       (p6)  br.cond.sptk.few stop0;; //  go to handler (null)
             add     out0 = @gprel(OFILE),gp  // out0 -> filename
             br.call.sptk.many b0 = gets     // Unformatted input
             mov     gp = loc2        // Restore gp
             cmp.eq  p6,p0 = ret0,r0  // If any error,
       (p6)  br.cond.sptk.few stop1;; //  go to handler (null)
             add     out0 = @gprel(OFILE),gp  // out0 -> filename
             add     out1 = @gprel(OMODE),gp  // out1 -> mode
             br.call.sptk.many b0 = fopen    // Find output file
             mov     gp = loc2        // Restore gp
             cmp.eq  p6,p0 = r8,r0    // If any error,
       (p6)  br.cond.sptk.few stop4;; //  go to handler (null)
             mov     loc4 = ret0      // loc4 = file pointer
             mov     loc5 = 0         // loc5 = word count
             add     loc6 = @gprel(BUF),gp;;  // loc6 -> buffer
    loop:    mov     out0 = loc3      // out0 = IPTR
             add     out1 = @gprel(IFORM),gp  // out1 -> format
             mov     out2 = loc6      // out2 -> BUF
             br.call.sptk.many b0 = fscanf   // Read a "word"
             mov     gp = loc2        // Restore gp
             cmp4.ne p6,p0 = 1,ret0   // Expect one %s item
       (p6)  br.cond.sptk.few eof;;   // No - maybe it's EOF
             mov     out0 = loc4      // out0 = OPTR
             add     out1 = @gprel(OFORM),gp  // out1 -> format
             mov     out2 = loc6      // out2 -> BUF
             br.call.sptk.many b0 = fprintf  // Write a "word"
             mov     gp = loc2        // Restore gp
             cmp4.lt p6,p0 = ret0,r0  // If any error,
       (p6)  br.cond.sptk.few stop7   //  go to handler (null)
             add     loc5 = 1,loc5    // Count one word
             br.cond.sptk.few loop    // Go back for more
    eof:     cmp4.ne p6,p0 = -1,ret0  // If not EOF,
       (p6)  br.cond.sptk.few stop6;; //  then exit
             mov     out0 = loc3      // out0 = IPTR
             br.call.sptk.many b0 = fclose   // Close input
             mov     gp = loc2        // Restore gp
             cmp4.ne p6,p0 = 0,ret0   // If not successful,
       (p6)  br.cond.sptk.few stop3;; //  then exit
             mov     out0 = loc4      // out0 = OPTR
             br.call.sptk.many b0 = fclose   // Close output
             mov     gp = loc2        // Restore gp
             cmp4.ne p6,p0 = 0,ret0   // If not successful,
       (p6)  br.cond.sptk.few stop5;; //  then exit
             add     out0 = @gprel(TELL),gp   // out0 -> format
             mov     out1 = loc5      // out1 = number of words
             br.call.sptk.many b0 = printf   // C print function
             mov     gp = loc2        // Restore gp
             cmp4.lt p6,p0 = ret0,r0  // If any error,
       (p6)  br.cond.sptk.few stop0;; //  go to handler (null)
             br.cond.sptk.many done   // That is all
    stop0:                            // Terminal output error
    stop1:                            // Terminal input error
    stop2:                            // Problem opening IFILE
    stop3:                            // Problem closing IFILE
    stop4:                            // Problem opening OFILE
    stop5:                            // Problem closing OFILE
    stop6:                            // Problem getting input
    stop7:                            // Problem doing output
    done:    mov     ret0 = 0         // Signal all is normal
             mov     b0 = loc1        // Restore return address
             mov     ar.pfs = loc0    // Restore caller's ar.pfs
             br.ret.sptk.many b0;;    // Back to command line
             .endp   main             // Mark end of procedure

The top portion of SCANFILE begins by prompting the user for the file names. Those names are then passed to the fopen function in order to establish access to the input file and create the output file. For this program, as for SCANTERM, we have inserted appropriate branch instructions that ensure that the program proceeds only if all necessary conditions are met. As we do not wish to detail system-specific error codes, those branch instructions lead to stop labels that simply fall through to code at the label done.
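For readers who think more readily in C, the open-and-check sequence might be sketched as follows. This is only an illustration of the same control flow, not code from the book: the function name open_files and its numeric return codes are hypothetical, chosen to echo the stop2 and stop4 labels.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: open the input file for reading and the output
 * file for writing, as SCANFILE does with fopen and the modes "r" and "w".
 * Returns 0 on success, 2 if the input cannot be opened, and 4 if the
 * output cannot be created (numbers chosen to echo stop2 and stop4). */
int open_files(const char *in_name, const char *out_name,
               FILE **in, FILE **out)
{
    assert(in != NULL && out != NULL);
    *in = NULL;
    *out = NULL;

    *in = fopen(in_name, "r");
    if (*in == NULL)
        return 2;                 /* like branching to stop2 */

    *out = fopen(out_name, "w");
    if (*out == NULL) {
        fclose(*in);              /* do not leak the input stream */
        *in = NULL;
        return 4;                 /* like branching to stop4 */
    }
    return 0;
}
```

As in the assembly version, fopen signals failure with a null pointer, so each call is followed by a comparison against zero before the program continues.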

The functional core of SCANFILE consists of a concise loop beginning at the label loop. The fscanf function with the format "%s" reads one "word" from the input file, i.e., a string of characters up to a terminator (space, tab, or newline), possibly including some adjacent punctuation marks. The terminating character is not stored as part of the string, which is instead terminated with a null in memory. Each such string is then sent to the output file using the fprintf function with a C-style "%s\n" format string.
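The same loop can be sketched in C, assuming both files are already open. The function name copy_words is hypothetical, and the field width in "%79s" is an added safeguard against overlong words; the listing itself uses a bare "%s" with an 80-byte buffer.

```c
#include <assert.h>
#include <stdio.h>

#define BUFL 80   /* same "word" allowance as the BUFL symbol in SCANFILE */

/* Hypothetical helper: copy whitespace-separated words from `in` to `out`,
 * one per line. Returns the number of words written, or -1 if a write
 * fails. */
long copy_words(FILE *in, FILE *out)
{
    char buf[BUFL];
    long count = 0;

    assert(in != NULL && out != NULL);

    /* "%79s" skips leading whitespace, then stores at most 79 characters
     * followed by a terminating null. fscanf returns 1 for each word
     * stored and EOF once the input is exhausted. */
    while (fscanf(in, "%79s", buf) == 1) {
        if (fprintf(out, "%s\n", buf) < 0)
            return -1;            /* output error, like stop7 */
        count++;                  /* count one word */
    }
    return count;
}
```

The three arguments to each call (stream, format, buffer) correspond to out0, out1, and out2 in the assembly loop.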

Exit from this program loop occurs when a call to fscanf returns the EOF condition (a negative value) in register ret0. After the loop branches to the label eof, both input and output files are closed, and a summary message with the number of words processed is displayed with the printf function:

    Input from? lincoln.txt
    Output to? abc.txt
    The program has processed 274 words.

The "words" could be more carefully examined to prune away punctuation marks and other characters such as dashes; the above example uses a text file containing the Gettysburg Address, which has many hyphenated words.
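One possible form of such pruning is sketched below in C. The function trim_punct is hypothetical and makes a simplifying assumption: it removes only leading and trailing characters that are not letters or digits, so interior hyphens and apostrophes survive.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: strip leading and trailing characters that are
 * not letters or digits, editing `word` in place. Interior characters,
 * such as the apostrophe in "can't", are left alone. Returns `word`. */
char *trim_punct(char *word)
{
    size_t start = 0;
    size_t end;

    assert(word != NULL);
    end = strlen(word);

    while (start < end && !isalnum((unsigned char)word[start]))
        start++;                  /* skip leading punctuation */
    while (end > start && !isalnum((unsigned char)word[end - 1]))
        end--;                    /* drop trailing punctuation */

    memmove(word, word + start, end - start);
    word[end - start] = '\0';
    return word;
}
```

Applied to each string before it is written out, this would turn "nation," into "nation" and "--that" into "that". A string consisting only of punctuation becomes empty, so a caller would probably want to skip it rather than count it as a word.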

We would also have our readers appreciate that the internal coding in the system-supplied fscanf function has to perform character-by-character testing. Essential work cannot be avoided, but it can be encapsulated and made reusable.
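To make that hidden work concrete, here is a rough C sketch of the character-by-character testing that a "%s" conversion implies. It is not the library's actual implementation: read_word is a hypothetical stand-in that, unlike a bare "%s", also bounds the number of characters stored.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Hypothetical stand-in for fscanf(in, "%s", buf): skip leading
 * whitespace, then copy characters into buf until whitespace or end of
 * file. Returns 1 if a word was stored, EOF if the input ended first.
 * At most size - 1 characters are stored; any excess characters of an
 * overlong word are read and discarded. */
int read_word(FILE *in, char *buf, size_t size)
{
    int c;
    size_t n = 0;

    assert(in != NULL && buf != NULL && size > 0);

    do {                              /* test each character: skip blanks */
        c = fgetc(in);
    } while (c != EOF && isspace(c));

    if (c == EOF)
        return EOF;

    while (c != EOF && !isspace(c)) { /* test each character: collect */
        if (n + 1 < size)
            buf[n++] = (char)c;
        c = fgetc(in);
    }
    if (c != EOF)
        ungetc(c, in);                /* leave the terminator unread */

    buf[n] = '\0';                    /* null replaces the terminator */
    return 1;
}
```

Every input character passes through at least one isspace test, which is the essential work the text refers to; calling fscanf simply keeps that work out of sight.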



Itanium® Architecture for Programmers: Understanding 64-Bit Processors and EPIC Principles
Year: 2003
Pages: 223
