5.3. Kernel Command Line Processing

Following the architecture setup, main.c performs generic early kernel initialization and then displays the kernel command line. Line 10 of Listing 5-3 is reproduced here for convenience.

Kernel command line: console=ttyS0,115200 ip=bootp root=/dev/nfs

In this simple example, the kernel being booted is instructed to open a console device on serial port device ttyS0 (usually the first serial port) at 115,200 baud. It is also instructed to obtain its initial IP address information from a BOOTP server and to mount a root file system via the NFS protocol. (We cover BOOTP later in Chapter 12, "Embedded Development Environment," and NFS in Chapters 9, "File Systems," and 12. For now, we limit the discussion to the kernel command line mechanism.)

Linux is typically launched by a bootloader (or bootstrap loader) with a series of parameters that have come to be called the kernel command line. Although we don't actually invoke the kernel using a command prompt from a shell, many bootloaders can pass parameters to the kernel in a fashion that resembles this well-known model. On some platforms whose bootloaders are not Linux aware, the kernel command line can be defined at compile time and becomes hard coded as part of the kernel binary image. On other platforms (such as a desktop PC running Red Hat Linux), the command line can be modified by the user without having to recompile the kernel. The bootstrap loader (Grub or Lilo in the desktop PC case) builds the kernel command line from a configuration file and passes it to the kernel during the boot process. These command line parameters are a boot mechanism to set initial configuration necessary for proper boot on a given machine.

Numerous command line parameters are defined throughout the kernel. The .../Documentation subdirectory in the kernel source contains a file called kernel-parameters.txt containing a list of kernel command line parameters in dictionary order. Remember the previous warning about kernel documentation: The kernel changes far faster than the documentation. Use this file as a guide, but not a definitive reference. More than 400 distinct kernel command line parameters are documented in this file, and it cannot be considered a comprehensive list. For that, you must refer directly to the source code.

The basic syntax for kernel command line parameters is fairly simple and mostly evident from the example in line 10 of Listing 5-3. A kernel command line parameter can be a single text word, a key=value pair, or a key=value1,value2,… pair in which a single key carries multiple values. It is up to the consumer of this information to process the data as delivered. The command line is available globally and is processed by many modules as needed. As noted earlier, the call to setup_arch() in main.c passes the kernel command line as its only argument. This is to pass architecture-specific parameters and configuration directives to the relevant portions of architecture- and machine-specific code.
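To make the three parameter forms concrete, here is a minimal user-space sketch that classifies a single whitespace-delimited token. It is not kernel code: the names param_kind, classify_param(), and param_key_len() are hypothetical, and "ro" is used only as an example of a single-word parameter.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical classification of one command line token. */
enum param_kind {
    PARAM_FLAG,       /* single word, for example "ro" */
    PARAM_KEY_VALUE,  /* key=value, for example "root=/dev/nfs" */
    PARAM_KEY_LIST    /* key=value1,value2, for example "console=ttyS0,115200" */
};

/* Decide which of the three syntactic forms a token uses. */
enum param_kind classify_param(const char *token)
{
    assert(token != NULL);

    const char *eq = strchr(token, '=');
    if (eq == NULL)
        return PARAM_FLAG;              /* no '=' at all */
    if (strchr(eq + 1, ',') != NULL)
        return PARAM_KEY_LIST;          /* comma after the '=' */
    return PARAM_KEY_VALUE;
}

/* Length of the key: everything before '=', or the whole token if none. */
size_t param_key_len(const char *token)
{
    assert(token != NULL);
    return strcspn(token, "=");
}
```

Applied to the example command line, classify_param() reports console=ttyS0,115200 as a multivalue parameter and root=/dev/nfs as a plain key=value pair. Interpreting the values themselves is left to the consumer, as the text describes.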

Device driver writers and kernel developers can add additional kernel command-line parameters for their own specific needs. Let's take a look at the mechanism. Unfortunately, some complications are involved in using and processing kernel command line parameters. The first of these is that the original mechanism is being deprecated in favor of a much more robust implementation. The second complication is that we need to comprehend the complexities of a linker script file to fully understand the mechanism.^[7]

^[7] It's not necessarily all that complex, but most of us never need to understand a linker script file. The embedded engineer does. It is well documented in the GNU LD manual referenced at the end of this chapter.

5.3.1. The __setup Macro

As an example of the use of kernel command line parameters, consider the specification of the console device. We want this device to be initialized early in the boot cycle so that we have a destination for console messages during boot. This initialization takes place in a kernel object called printk.o. The C source file for this module is found in .../kernel/printk.c. The console initialization routine is called console_setup() and takes the kernel command line parameter string as its only argument.

The challenge is to communicate the console parameters specified on the kernel command line to the setup and device driver routines that require this data in a modular and general fashion. Further complicating the issue is that typically the command line parameters are required early, before (or in time for) those modules that need them. The startup code in main.c, where the main processing of the kernel command line takes place, cannot possibly know the destination functions for each of hundreds of kernel command line parameters without being hopelessly polluted with knowledge from every consumer of these parameters. What is needed is a flexible and generic way to pass these kernel command line parameters to their consumers.

In Linux 2.4 and earlier kernels, developers used a simple macro to generate a not-so-simple sequence of code. Although it is being deprecated, the __setup macro is still in widespread use throughout the kernel. We next use the kernel command line from Listing 5-3 to demonstrate how the __setup macro works.

From the previous kernel command line (line 10 of Listing 5-3), this is the first complete command line parameter passed to the kernel:

 console=ttyS0,115200

For the purposes of this example, the actual meaning of the parameters is irrelevant. Our goal here is to illustrate the mechanism, so don't be concerned if you don't understand the argument or its values.

Listing 5-4 is a snippet of code from .../kernel/printk.c. The body of the function has been stripped because it is not relevant to the discussion. The most relevant part of Listing 5-4 is the last line, the invocation of the __setup macro. This macro expects two arguments; in this case, it is passed a string literal and a function pointer. It is no coincidence that the string literal passed to the __setup macro is the same as the first eight characters of the kernel command line related to the console: console=.

Listing 5-4. Console Setup Code Snippet

/*
 *   Setup a list of consoles. Called from init/main.c
 */
static int __init console_setup(char *str)
{
    char name[sizeof(console_cmdline[0].name)];
    char *s, *options;
    int idx;

    /*
     * Decode str into name, index, options.
     */

    return 1;
}

__setup("console=", console_setup);

You can think of this macro as a registration function for the kernel command-line console parameter. In effect, it says: When the console= string is encountered on the kernel command line, invoke the function represented by the second __setup macro argument, in this case the console_setup() function. But how is this information communicated to the early setup code, outside this module, which has no knowledge of the console functions? The mechanism is both clever and somewhat complicated, and relies on lists built by the linker.

The details are hidden in a set of macros designed to conceal the syntactical tedium of adding section attributes (and other attributes) to a portion of object code. The objective is to build a static list of string literals associated with function pointers. This list is emitted by the compiler in a separately named ELF section in the final vmlinux ELF image. It is important to understand this technique; it is used in several places within the kernel for special-purpose processing.

Let's now examine how this is done for the __setup macro case. Listing 5-5 is a portion of code from the header file .../include/linux/init.h defining the __setup family of macros.

Listing 5-5. Family of __setup Macro Definitions from init.h

...
#define __setup_param(str, unique_id, fn, early)                \
       static char __setup_str_##unique_id[] __initdata = str;  \
       static struct obs_kernel_param __setup_##unique_id       \
              __attribute_used__                                \
              __attribute__((__section__(".init.setup")))       \
              __attribute__((aligned((sizeof(long)))))          \
              = { __setup_str_##unique_id, fn, early }

#define __setup_null_param(str, unique_id)                      \
        __setup_param(str, unique_id, NULL, 0)

#define __setup(str, fn)                                        \
        __setup_param(str, fn, fn, 0)
...

Listing 5-5 is the author's definition of syntactical tedium! Recall from Listing 5-4 that our invocation of the original __setup macro looked like this:

  __setup("console=", console_setup);

With some slight simplification, here is what the compiler's preprocessor produces after macro expansion:

  static char __setup_str_console_setup[] __initdata = "console=";
  static struct obs_kernel_param __setup_console_setup  \
  __attribute__((__section__(".init.setup")))=
     {__setup_str_console_setup, console_setup, 0};

To make this more readable, we have split the second and third lines, as indicated by the UNIX line-continuation character \.

We have intentionally left out two compiler attributes whose description does not add any insight to this discussion. Briefly, the __attribute_used__ (itself a macro hiding further syntactical tedium) tells the compiler to emit the function or variable, even if the optimizer determines that it is unused.^[8] The __attribute__ (aligned) tells the compiler to align the structures on a specific boundary, in this case sizeof(long).

^[8] Normally, the compiler will complain if a variable is defined static and never referenced in the compilation unit. Because these variables are not explicitly referenced, the warning would be emitted without this directive.

What we have left after simplification is the heart of the mechanism. First, the compiler generates an array of characters called __setup_str_console_setup[] initialized to contain the string console=. Next, the compiler generates a structure that contains three members: a pointer to the kernel command line string (the array just declared), the pointer to the setup function itself, and a simple flag. The key to the magic here is the section attribute attached to the structure. This attribute instructs the compiler to emit this structure into a special section within the ELF object module, called .init.setup. During the link stage, all the structures defined using the __setup macro are collected and placed into this .init.setup section, in effect creating an array of these structures. Listing 5-6, a snippet from .../init/main.c, shows how this data is accessed and used.

Listing 5-6. Kernel Command Line Processing

extern struct obs_kernel_param __setup_start[], __setup_end[];

static int __init obsolete_checksetup(char *line)
{
        struct obs_kernel_param *p;

        p = __setup_start;
        do {
                int n = strlen(p->str);
                if (!strncmp(line, p->str, n)) {
                        if (p->early) {
                                /* Already done in parse_early_param? (Needs
                                 * exact match on param part) */
                                if (line[n] == '\0' || line[n] == '=')
                                        return 1;
                        } else if (!p->setup_func) {
                                printk(KERN_WARNING "Parameter %s is obsolete,"
                                       " ignored\n", p->str);
                                return 1;
                        } else if (p->setup_func(line + n))
                                return 1;
                }
                p++;
        } while (p < __setup_end);
        return 0;
}

Examination of this code should be fairly straightforward, with a couple of explanations. The function is called with a single command line argument, parsed elsewhere within main.c. In the example we've been discussing, line would point to the string console=ttyS0,115200, which is one component from the kernel command line. The two external structure pointers __setup_start and __setup_end are defined in a linker script file, not in a C source or header file. These labels mark the start and end of the array of obs_kernel_param structures that were placed in the .init.setup section of the object file.

The code in Listing 5-6 scans all these structures via the pointer p to find a match for this particular kernel command line parameter. In this case, the code is searching for the string console= and finds a match. From the relevant structure, the function pointer element returns a pointer to the console_setup() function, which is called with the balance of the parameter (the string ttyS0,115200) as its only argument. This process is repeated for every element in the kernel command line until the kernel command line has been completely exhausted.
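The same idea can be tried outside the kernel. The following user-space sketch is an approximation under stated assumptions, not the kernel's implementation: it assumes GCC or Clang on an ELF target linked with GNU ld (or a compatible linker), and every demo_ name is hypothetical. Where the kernel takes __setup_start and __setup_end from its linker script, the sketch relies on the __start_ and __stop_ symbols that GNU ld generates automatically for a section whose name is a valid C identifier.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* User-space stand-in for obs_kernel_param (hypothetical). */
struct demo_param {
    const char *str;             /* prefix to match, for example "console=" */
    int (*setup_func)(char *);   /* receives the text after the prefix */
    int early;                   /* unused in this sketch */
};

/* Simplified analogue of __setup: place one entry in section "demo_setup". */
#define DEMO_SETUP(s, fn)                                   \
    static struct demo_param demo_setup_##fn                \
        __attribute__((used, section("demo_setup"),         \
                       aligned(sizeof(long))))              \
        = { s, fn, 0 }

/* Assumption: the linker provides these bounds for section "demo_setup". */
extern struct demo_param __start_demo_setup[], __stop_demo_setup[];

char demo_console_args[64];      /* records what the console handler saw */

static int demo_console_setup(char *str)
{
    snprintf(demo_console_args, sizeof(demo_console_args), "%s", str);
    return 1;                    /* nonzero: parameter consumed */
}
DEMO_SETUP("console=", demo_console_setup);

static int demo_root_setup(char *str)
{
    (void)str;
    return 1;
}
DEMO_SETUP("root=", demo_root_setup);

/* Walk the linker-built array; return 1 if a handler consumed the token. */
int demo_checksetup(char *line)
{
    assert(line != NULL);
    for (struct demo_param *p = __start_demo_setup;
         p < __stop_demo_setup; p++) {
        size_t n = strlen(p->str);
        if (strncmp(line, p->str, n) == 0 && p->setup_func(line + n))
            return 1;
    }
    return 0;
}
```

Note that demo_checksetup() contains no reference to either handler. Adding another parameter requires only one more DEMO_SETUP line in whichever source file defines the handler, which is the decoupling the __setup macro provides to main.c.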

The technique just described, collecting objects into lists in uniquely named ELF sections, is used in many places in the kernel. Another example of this technique is the use of the __init family of macros to place one-time initialization routines into a common section in the object file. Its cousin __initdata, used to mark one-time-use data items, is used by the __setup macro. Functions and data marked as initialization using these macros are collected into a specially named ELF section. Later, after these one-time initialization functions and data objects have been used, the kernel frees the memory occupied by these items. You might have seen the familiar kernel message near the final part of the boot process saying, "Freeing init memory: 296K." Your mileage may vary, but a third of a megabyte is well worth the effort of using the __init family of macros. This is exactly the purpose of the __initdata macro in the earlier declaration of __setup_str_console_setup[].

You might have been wondering about the use of symbol names preceded with obsolete_. This is because the kernel developers are replacing the kernel command line processing mechanism with a more generic mechanism for registering both boot time and loadable module parameters. At the present time, hundreds of parameters are declared with the __setup macro. However, new development is expected to use the family of functions defined by the kernel header file .../include/linux/moduleparam.h, most notably, the family of module_param* macros. These are explained in more detail in Chapter 8, "Device Driver Basics," when we introduce device drivers.

The new mechanism maintains backward compatibility by including a function pointer argument, named unknown, in the parsing routine. Parameters that the module_param* infrastructure does not recognize are handed to this function, and the processing falls back to the old mechanism under control of the developer. This is easily understood by examining the well-written code in .../kernel/params.c and the parse_args() calls in .../init/main.c.
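The fallback arrangement can be illustrated with a small user-space sketch. This is not the code from params.c: the table layout, the demo_ names, and the return conventions are assumptions chosen only to show the pattern of trying a table of known parameters first and then deferring to a caller-supplied handler.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry for parameters known to the new mechanism. */
struct demo_kparam {
    const char *name;
    int (*set)(const char *val);
};

/* Fallback invoked for names that the table does not contain. */
typedef int (*demo_unknown_fn)(const char *param, const char *val);

int demo_debug_level;            /* state set by a "new style" parameter */
int demo_legacy_hits;            /* counts calls into the fallback path */

int demo_set_debug(const char *val)
{
    demo_debug_level = (val != NULL && val[0] == '1');
    return 0;
}

/* Stand-in for the old __setup-style processing. */
int demo_legacy_handler(const char *param, const char *val)
{
    (void)param;
    (void)val;
    demo_legacy_hits++;
    return 0;
}

/* Try the table first; hand anything unrecognized to the fallback. */
int demo_parse_one(const char *param, const char *val,
                   const struct demo_kparam *table, size_t count,
                   demo_unknown_fn unknown)
{
    assert(param != NULL && table != NULL);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(param, table[i].name) == 0)
            return table[i].set(val);
    }
    return unknown ? unknown(param, val) : -1;   /* -1: nobody took it */
}
```

Passing the fallback as an argument keeps the parsing routine ignorant of the legacy mechanism. The caller decides what happens to unrecognized names, which is how we understand the phrase "under control of the developer."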

The last point worth mentioning is the purpose of the flag member of the obs_kernel_param structure created by the __setup macro. Examination of the code in Listing 5-6 should make it clear. The flag in the structure, called early, is used to indicate whether this particular command line parameter was already consumed earlier in the boot process. Some command line parameters are intended for consumption very early in the boot process, and this flag provides a mechanism for an early parsing algorithm. You will find a function in main.c called do_early_param() that traverses the linker-generated array of __setup-generated structures and processes each one marked for early consumption. This gives the developer some control over when in the boot process this processing is done.
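A two-pass arrangement of this kind can be sketched in user space as follows. This is only an illustration of the idea, not the body of do_early_param(): it uses an ordinary array instead of a linker-built one, and the parameter names earlyopt= and lateopt=, along with every early_demo_ and late_demo_ identifier, are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same three-member shape as the structure described in the text. */
struct early_demo_param {
    const char *str;
    int (*setup_func)(char *);
    int early;                   /* nonzero: consume during the early pass */
};

int early_demo_calls;            /* how often the early handler ran */
int late_demo_calls;             /* how often the late handler ran */

int early_demo_handler(char *val) { (void)val; early_demo_calls++; return 0; }
int late_demo_handler(char *val)  { (void)val; late_demo_calls++;  return 0; }

/* Ordinary array standing in for the linker-generated one. */
struct early_demo_param early_demo_table[] = {
    { "earlyopt=", early_demo_handler, 1 },
    { "lateopt=",  late_demo_handler,  0 },
};

/* Early pass: run only the handlers flagged early; return how many ran. */
int early_demo_pass(char *token)
{
    assert(token != NULL);

    int handled = 0;
    size_t count = sizeof(early_demo_table) / sizeof(early_demo_table[0]);

    for (size_t i = 0; i < count; i++) {
        const struct early_demo_param *p = &early_demo_table[i];
        size_t n = strlen(p->str);
        if (p->early && strncmp(token, p->str, n) == 0) {
            p->setup_func(token + n);
            handled++;
        }
    }
    return handled;
}
```

A later pass, such as the one in Listing 5-6, can then treat entries flagged early as already handled and skip them rather than invoke their handlers a second time.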
