To Load or Not to Load | Embedded Systems Firmware Demystified (With CD-ROM)

TFS supports two major types of executable files: binary and script. Scripts are simple ASCII files containing CLI commands that are executed one line at a time. Scripts are executed directly from the flash memory in which they are stored. As of this writing, the binary executables can be in ELF, COFF, or AOUT format. These three formats are very similar; each consists of a file header, section headers, and a bunch of sections. Unlike the script executable, a binary file is executed after copying sections of the formatted file from TFS storage to space in RAM. The program then executes out of RAM. This process is commonly called loading a file. Loading is a very common practice on workstation/server-type operating systems because, in those environments, it isn't physically possible for the CPU to fetch instructions from disk. In an embedded systems application, there is no disk. Instead, the applications reside in flash memory, which appears as local memory to the CPU. This means that the embedded systems processor can fetch applications directly out of flash memory. (As a matter of fact, MicroMonitor does execute directly out of flash.) Thus, whether an embedded system monitor loads an application or executes it directly from the file system is a design choice. There are tradeoffs on both sides of the decision.

The advantages of loading include:

No need to re-invent a file format;
You can run multiple applications on the same target by building them with non-overlapping absolute address maps;
RAM space on the target is likely to be wider and therefore more efficient for instruction fetching;
With the instructions in writable memory space, you can easily insert traps (or breakpoints) into the instruction stream;
The executable program can be compressed and therefore can require less flash space.

On the other hand, loading has a significant disadvantage: the transfer of text and initialized data sections from flash memory to RAM consumes time and memory.
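The second advantage in the list above depends on the applications' absolute address maps not overlapping. As a minimal sketch of that check (the struct demo_region type and the function name are hypothetical and are not part of TFS or MicroMonitor), two load regions can be compared like this:

```c
#include <stdint.h>

/* Hypothetical description of one application's load region. */
struct demo_region {
    uint32_t base;  /* first address occupied */
    uint32_t size;  /* number of bytes occupied */
};

/* Return 1 if the half-open ranges [base, base+size) intersect, else 0.
 * End addresses are computed in 64 bits so that a region ending at the
 * top of a 32-bit address space does not wrap around. */
int demo_regions_overlap(struct demo_region a, struct demo_region b)
{
    uint64_t a_end = (uint64_t)a.base + a.size;
    uint64_t b_end = (uint64_t)b.base + b.size;

    if (a.size == 0 || b.size == 0)
        return 0;               /* an empty region overlaps nothing */
    return (a.base < b_end) && (b.base < a_end);
}
```

Two regions that merely touch (one ends exactly where the next begins) are reported as non-overlapping, which is the case you want when packing applications back to back in RAM.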

The optimum approach depends on the nature of your application and how much memory you have. Since I am presenting a platform that provides a buffer between the application and the flash memory (allowing the application to treat flash memory as namespace instead of address space), I have already accepted the fact that I am adding overhead in order to provide a platform that is easier to maintain. I have therefore designed TFS to support loading.

In general, I consider it reasonable to sacrifice efficiency for the sake of maintainability. Throughout the design of TFS, when faced with this trade-off, I've opted for maintainability.

Loader Implementation

Listing 7.15 shows a snippet of the loader for the ELF file format. The parameters shown in Listing 7.15 include the following:

fp is a pointer to the TFS file header structure associated with this ELF file;
verbose enables extra reporting;
entrypoint is a pointer to a long; if the pointer is non-zero, the long is loaded with the address of the entry point for this ELF file;
verifyonly is a flag that tells this function it should only compare what is already loaded to what would be loaded.

Note that Listing 7.15 deals with multiple levels of headers. The TFS header has nothing to do with ELF itself. The TFS header is simply the header used for managing the storage of the files in the flash memory. Each file in TFS has two parts: the header and the data. In this case, the data is an ELF file, and the ELF file has another set of headers. This function has to deal with both header types, which can sometimes cause confusion.
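To make the two levels concrete, here is a minimal sketch of a storage header followed by its payload. The struct demo_store_hdr layout and the demo_payload() helper are hypothetical and chosen only for illustration; they are not the real TFS header or the TFS_BASE() macro, and the sketch assumes the payload starts immediately after the header.

```c
#include <stdint.h>

/* Hypothetical storage header for illustration; NOT the real TFS layout. */
struct demo_store_hdr {
    uint32_t hdrsize;   /* size of this header, in bytes */
    uint32_t datasize;  /* size of the payload, in bytes */
    char     name[24];  /* file name */
};

/* Assumption: the payload (for example, an ELF image with its own
 * headers) starts immediately after the storage header. */
const unsigned char *demo_payload(const struct demo_store_hdr *h)
{
    return (const unsigned char *)h + h->hdrsize;
}
```

The storage layer only needs the outer header to find the payload; everything from the returned pointer onward is interpreted by the format-specific loader.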

Listing 7.15: ELF File Format Loader.

/* tfsloadelf():
 *  The file pointed to by fp has been determined to be an ELF file.
 *  This function loads the sections of that file into the designated
 *  locations.
 *  Caches are flushed after loading each loadable section.
 */
int
tfsloadebin(TFILE *fp,int verbose,long *entrypoint,int verifyonly)
{
    Elf32_Word  size, notproctot;
    int     i, err;
    char        *shname_strings;
    ELFFHDR *ehdr;
    ELFSHDR *shdr;

    /* Establish file header pointers... */
    ehdr = (ELFFHDR *)(TFS_BASE(fp));
    shdr = (ELFSHDR *)((int)ehdr + ehdr->e_shoff);
    err = 0;

    /* Verify basic file sanity... */
    if ((ehdr->e_ident[0] != 0x7f) || (ehdr->e_ident[1] != 'E') ||
        (ehdr->e_ident[2] != 'L') || (ehdr->e_ident[3] != 'F'))
        return(TFSERR_BADHDR);

    /* Store the section name string table base: */
    shname_strings = (char *)ehdr + shdr[ehdr->e_shstrndx].sh_offset;
    notproctot = 0;

    /* For each section header, relocate or clear if necessary... */
    for (i=0;!err && i<ehdr->e_shnum;i++,shdr++) {
        if ((size = shdr->sh_size) == 0)
            continue;

        if ((verbose) && (ehdr->e_shstrndx != SHN_UNDEF))
            printf("%-10s: ", shname_strings + shdr->sh_name);

        /* Sections without SHF_ALLOC occupy no memory at run time. */
        if (!(shdr->sh_flags & SHF_ALLOC)) {
            notproctot += size;
            if (verbose)
                printf("     %7ld bytes not processed (tot=%ld)\n",
                    size,notproctot);
            continue;
        }

        if (shdr->sh_type == SHT_NOBITS) {
            /* No data in the file (e.g. .bss): clear the target space. */
            if (tfsmemset((char *)(shdr->sh_addr),0,size,
                verbose,verifyonly) != 0)
                err++;
        }
        else {
            if (TFS_ISCPRS(fp)) {
                int     outsize;

                outsize = decompress((char *)(ehdr)+shdr->sh_offset,
                          size,(char *)shdr->sh_addr);
                if (outsize == -1) {
                    err++;
                    continue;
                }
                if (verbose)
                    printf("dcmp %7d bytes from 0x%08lx to 0x%08lx\n",
                         outsize,(ulong)(ehdr)+shdr->sh_offset,
                         shdr->sh_addr);
            }
            else {
                if (tfsmemcpy((char *)(shdr->sh_addr),
                    (char *)((int)ehdr+shdr->sh_offset),
                    size,verbose,verifyonly) != 0)
                    err++;
            }
            /* Flush caches for each loadable section... */
            flushDcache((char *)shdr->sh_addr,size);
            invalidateIcache((char *)shdr->sh_addr,size);
        }
    }

    if (err)
        return(TFSERR_MEMFAIL);

    if (verbose && !verifyonly)
        printf("entrypoint: 0x%lx\n",ehdr->e_entry);

    /* Store entry point: */
    if (entrypoint)
        *entrypoint = (long)(ehdr->e_entry);

    return(TFS_OKAY);
}

Listing 7.15 begins by setting up structure overlays for the ELF file header structure (ehdr). The section header pointer (shdr) is established based on an offset (e_shoff) from the location of the file header.

The test following these pointer initializations performs some basic format validation by confirming that the header contains the required ELF signatures. (Without getting into a lot of detail on ELF itself, shname_strings is another offset further into the file that points to a table of strings. For my purposes, this pointer helps display different section names in more detail as I parse through the various section headers.)
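The two ideas in this paragraph, the signature check and the section-name lookup, can be sketched on their own in portable C. This is an illustration under stated assumptions, not code from MicroMonitor: demo_is_elf() and demo_section_name() are hypothetical names, and the bounds checks are an addition that the listing itself does not perform.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if buf begins with the ELF signature 0x7f 'E' 'L' 'F'. */
int demo_is_elf(const unsigned char *buf, size_t len)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };

    return len >= sizeof(magic) && memcmp(buf, magic, sizeof(magic)) == 0;
}

/* Look up a name in a section-name string table.
 * strtab and strtab_len describe the table; name_off plays the role of
 * a section header's sh_name field (a byte offset into the table).
 * Returns NULL if the offset is out of range or the name is not
 * NUL-terminated within the table. */
const char *demo_section_name(const char *strtab, size_t strtab_len,
                              size_t name_off)
{
    if (name_off >= strtab_len)
        return NULL;
    if (memchr(strtab + name_off, '\0', strtab_len - name_off) == NULL)
        return NULL;
    return strtab + name_off;
}
```

In the listing, shname_strings + shdr->sh_name performs the same lookup without the range checks, which is reasonable when the file is already trusted to be well formed.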

The loop that follows drives the processing of the ELF file sections. Sections of size zero are skipped. For each remaining section, the section name is printed (when verbose is set) and the SHF_ALLOC flag is tested. This flag indicates whether the section occupies memory in the executable image. For example, a .text section has this flag set because .text contains instructions for the CPU. A symbol table section does not have this flag set because it has nothing to do with the actual process image. A section that passes these checks is either copied to RAM or, if sh_type is set to SHT_NOBITS, its memory space is cleared.
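The per-section decision made inside the loop can be isolated as a small function. The sketch below is illustrative only: the demo_ names are hypothetical, and the two constants simply mirror the values that the ELF specification assigns to SHT_NOBITS and SHF_ALLOC.

```c
#include <stdint.h>

#define DEMO_SHT_NOBITS 8u    /* ELF value of SHT_NOBITS */
#define DEMO_SHF_ALLOC  0x2u  /* ELF value of SHF_ALLOC */

enum demo_action {
    DEMO_SKIP,   /* nothing to place in memory */
    DEMO_CLEAR,  /* zero-fill the destination (e.g. .bss) */
    DEMO_COPY    /* copy or decompress file contents to the destination */
};

/* Decide what a loader should do with one section, given the
 * sh_type, sh_flags and sh_size fields of its section header. */
enum demo_action demo_classify(uint32_t sh_type, uint32_t sh_flags,
                               uint32_t sh_size)
{
    if (sh_size == 0)
        return DEMO_SKIP;
    if (!(sh_flags & DEMO_SHF_ALLOC))
        return DEMO_SKIP;
    if (sh_type == DEMO_SHT_NOBITS)
        return DEMO_CLEAR;
    return DEMO_COPY;
}
```

For typical sections, .text (allocated, with file contents) yields DEMO_COPY, .bss (allocated, SHT_NOBITS) yields DEMO_CLEAR, and a symbol table (not allocated) yields DEMO_SKIP.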

Notice the TFS_ISCPRS(fp) macro. TFS_ISCPRS(fp) is an extension to the ELF file format that is unique to TFS. MicroMonitor allows each section within the ELF file to be compressed. Hence, when this macro returns true, the loader decompresses out of flash memory and into the destination memory space. (The next section covers this feature in greater detail.)

Each time a copy operation completes, the loader forces the D-cache to be flushed and the I-cache to be invalidated. These cache operations are necessary because, when the loader copies a .text section, it is copying instructions from one point in memory to another. During the copy, the instructions just look like data to the CPU, so if the CPU has separate data and instruction caches, the copied instructions may still be sitting in the data cache. If the caches are not flushed and invalidated before the CPU attempts to execute the code that was just copied, a drastic error can occur: the CPU may fetch stale contents, because the new instructions are in the data cache and not yet in physical memory.
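The flushDcache() and invalidateIcache() calls in the listing are target-specific. As a rough analogue for experimenting on a hosted toolchain, GCC and Clang provide the __builtin___clear_cache() builtin, which makes a modified address range coherent for instruction fetch on the current CPU. The sketch below assumes one of those compilers; demo_copy_code() is a hypothetical name and is not part of MicroMonitor.

```c
#include <stddef.h>
#include <string.h>

/* Copy len bytes of code to dst, then synchronize the caches over the
 * destination range so that a later instruction fetch sees the new
 * bytes. On some CPUs (x86, for example) the builtin expands to
 * nothing because the hardware keeps the caches coherent. */
void *demo_copy_code(void *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);
    __builtin___clear_cache((char *)dst, (char *)dst + len);
    return dst;
}
```

As in the listing, the synchronization happens after every copy and before any attempt to jump to the destination; doing it once per section keeps the range arguments simple.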

Once the copy loop completes, the loader stores the entry point and returns.