2.7. Kernel Speak: Listening to Kernel Messages

When your Linux system is up and running, the kernel logs messages and reports its status throughout its operation. This section covers a few of the most common ways the Linux kernel speaks to an end user.

2.7.1. printk()

One of the most basic kernel messaging systems is the printk() function. The kernel uses printk() as opposed to printf() because the standard C library is not linked to the kernel. printk() uses the same interface as printf() does and displays up to 1,024 characters to the console. The printk() function operates by trying to grab the console semaphore, placing the output into the console's log buffer, and then calling the console driver to flush the buffer. If printk() cannot grab the console semaphore, it places the output into the log buffer and relies on the process that holds the console semaphore to flush the buffer. The log-buffer lock is taken before printk() places any data into the log buffer, so concurrent calls to printk() do not trample each other. If the console semaphore is being held, numerous calls to printk() can occur before the log buffer is flushed. For this reason, do not rely on the moment a printk() message appears to indicate program timing.
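The deferred-flush behavior described above can be sketched in user space. This is only a single-threaded model, not kernel code: model_printk(), model_release_console(), the console_held flag (standing in for the console semaphore), and the buffer sizes are all hypothetical names and values chosen for illustration, and the log-buffer lock is omitted.

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define MSG_MAX 1024       /* per-call limit mentioned in the text */
#define LOG_BUF_LEN 4096   /* hypothetical log-buffer size */

static char log_buf[LOG_BUF_LEN];      /* pending, not yet flushed output */
static size_t log_len;
static bool console_held;              /* stands in for the console semaphore */
static char console_out[LOG_BUF_LEN];  /* stands in for the console device */
static size_t console_len;

/* Move everything pending in the log buffer to the "console". */
static void flush_log(void)
{
    size_t room = sizeof(console_out) - 1 - console_len;
    size_t n = log_len < room ? log_len : room;

    memcpy(console_out + console_len, log_buf, n);
    console_len += n;
    console_out[console_len] = '\0';
    log_len = 0;
}

/* Model of printk(): format, append to the log buffer, flush if possible. */
int model_printk(const char *fmt, ...)
{
    char msg[MSG_MAX];
    va_list args;
    int n;

    va_start(args, fmt);
    n = vsnprintf(msg, sizeof(msg), fmt, args);  /* truncates long output */
    va_end(args);
    if (n < 0)
        return n;
    if ((size_t)n >= sizeof(msg))
        n = (int)sizeof(msg) - 1;

    /* This model drops the message when the buffer is full. */
    if (log_len + (size_t)n > sizeof(log_buf))
        return 0;
    memcpy(log_buf + log_len, msg, (size_t)n);
    log_len += (size_t)n;

    if (!console_held)   /* the "semaphore" is free: flush right away */
        flush_log();
    return n;
}

/* The holder of the console releases it and flushes what accumulated. */
void model_release_console(void)
{
    console_held = false;
    flush_log();
}
```

With console_held set, several model_printk() calls only accumulate in log_buf; nothing reaches console_out until model_release_console() runs, which is why the moment a message appears says little about when it was issued.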

2.7.2. dmesg

The Linux kernel's logs, or messages, are handled in a variety of ways. sysklogd is a package that combines two daemons, syslogd and klogd. (More in-depth information can be found in the man pages of these programs, but we can quickly summarize the system.) The kernel places messages of all levels in its log buffer, which is exposed through /proc/kmsg. klogd reads the messages from there, tags them with the appropriate priority levels, and passes them on to syslogd. dmesg is a command-line tool that displays the contents of the kernel log buffer and, optionally, filters them based on the message level.
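To make level-based filtering concrete, here is a small user-space sketch. It assumes the record format used by kernels of this era, in which each message starts with a "<N>" prefix where N runs from 0 (most severe) to 7 (debug). The function names parse_kmsg_level() and kmsg_passes_filter() are hypothetical and are not part of dmesg or klogd.

```c
#include <stddef.h>

/*
 * Parse a "<N>" level prefix. On success, store N in *level and return a
 * pointer to the message text that follows; otherwise return NULL.
 */
const char *parse_kmsg_level(const char *line, int *level)
{
    if (line == NULL || line[0] != '<' ||
        line[1] < '0' || line[1] > '7' || line[2] != '>')
        return NULL;
    *level = line[1] - '0';
    return line + 3;
}

/*
 * Return 1 if the line carries a level that is at least as severe as
 * max_level (a lower number means a more severe message), else 0.
 */
int kmsg_passes_filter(const char *line, int max_level)
{
    int level;

    return parse_kmsg_level(line, &level) != NULL && level <= max_level;
}
```

For example, with max_level set to 4, a line such as "<3>disk error" passes the filter while "<6>info" does not.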

2.7.3. /var/log/messages

This file is where the majority of logged system messages reside on a Linux system. The syslogd program reads /etc/syslog.conf to determine where to store the messages it receives. Depending on the entries in syslog.conf, which can vary among Linux distributions, log messages can be stored in numerous files. However, /var/log/messages is usually the standard location.



2.8. Miscellaneous Quirks

This section serves as a catch-all for quirks that plagued the authors when they began to traipse through the kernel code. We include them here to give you an edge on Linux internals.

2.8.1. __init

The __init macro tells the compiler that the associated function or variable is used only during initialization. The compiler places all code marked with __init into a special memory section that is freed after the initialization phase ends:

-----------------------------------------------------------------------
drivers/char/random.c
static int __init batch_entropy_init(int size, struct entropy_store *r)
-----------------------------------------------------------------------

As an example, the random device driver initializes a pool of entropy upon being loaded. While the driver is loaded, different functions are used to increase or decrease the size of the entropy pool. This practice of device driver initialization being marked with __init is common, if not a standard.

Similarly, data that is used only during initialization needs to be marked with __initdata. Here, we can see how __initdata is used in the ESP device driver:

-----------------------------------------------------------------------
drivers/char/esp.c
static char serial_name[] __initdata = "ESP serial driver";
static char serial_version[] __initdata = "2.2";
-----------------------------------------------------------------------

Likewise, the __exit and __exitdata macros mark code and data that are used only in exit or shutdown routines. These are commonly used when a device driver is unregistered.
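The section placement behind these macros can be imitated in user space with GCC's section attribute. The sketch below assumes GCC or Clang on an ELF platform such as Linux. The macro names demo_init and demo_initdata, the section names, and demo_entropy_init() are made up for illustration. Unlike the kernel, nothing here frees the sections after use; the example only shows how a function and a variable are tagged.

```c
/* User-space stand-ins for the kernel macros; the section names are made up. */
#define demo_init     __attribute__((__section__(".demo.init.text")))
#define demo_initdata __attribute__((__section__(".demo.init.data")))

/* Data needed only by the initialization routine. */
static int demo_pool_size demo_initdata = 512;

/* Initialization-only routine, analogous to a driver's __init function. */
static int demo_init demo_entropy_init(int *pool_size)
{
    *pool_size = demo_pool_size;
    return 0;
}
```

After compiling a program that calls demo_entropy_init(), running objdump -h on the resulting binary should list the two extra sections alongside the usual .text and .data.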

2.8.2. likely() and unlikely()

likely() and unlikely() are macros that Linux kernel developers use to give hints to the compiler and, through it, to the processor. Modern CPUs have extensive branch-prediction heuristics that attempt to predict upcoming instructions in order to optimize speed. The likely() and unlikely() macros allow the developer to tell the CPU, through the compiler, which outcome of a condition is expected, so that the common path is favored and the rare path is not.

The importance of branch prediction can be seen with some understanding of instruction pipelining. Modern processors do anticipatory fetching; that is, they anticipate the next few instructions that will be executed and load them into the processor. Within the processor, these instructions are examined and dispatched to the various units within the processor (integer, floating point, and so on) depending on how they can best be executed. Some instructions might be stalled in the processor, waiting for an intermediate result from a previous instruction. Now, imagine that a branch instruction is loaded from the instruction stream. The processor now has two instruction streams from which to continue its prefetching. If the processor often chooses poorly, it spends too much time reloading the pipeline with the instructions that actually need execution. What if the processor had a hint of which way the branch was going to go? A simple method of branch prediction, in some architectures, is to examine the target address of the branch. If the target is lower than the current address (a backward branch), there's a good chance that this branch is at the end of a loop construct, where it loops back many times and falls through only once.

Software is allowed to override the architectural branch prediction with special mnemonics. The compiler exposes this ability through the __builtin_expect() function, which is the foundation of the likely() and unlikely() macros.

As previously mentioned, branch prediction and processor pipelining are complicated and beyond the scope of this book, but the ability to "tune" the code where we think we can make a difference is always a performance plus. Consider the following code block:

-----------------------------------------------------------------------
kernel/time.c
  90 asmlinkage long sys_gettimeofday(struct timeval __user *tv, struct timezone __user *tz)
  91 {
  92   if (likely(tv != NULL)) {
  93     struct timeval ktv;
  94     do_gettimeofday(&ktv);
  95     if (copy_to_user(tv, &ktv, sizeof(ktv)))
  96       return -EFAULT;
  97   }
  98   if (unlikely(tz != NULL)) {
  99     if (copy_to_user(tz, &sys_tz, sizeof(sys_tz)))
 100       return -EFAULT;
 101   }
 102   return 0;
 103 }
-----------------------------------------------------------------------

In this code, we see that a syscall to get the time of day is likely to have a timeval structure that is not null (lines 92-96). If it were null, we couldn't fill in the requested time of day! It is also unlikely that the timezone is not null (lines 98-100). To put it another way, the caller rarely asks for the timezone and usually asks for the time.

The implementations of likely() and unlikely() are as follows: [4]

[4] __builtin_expect(), as seen in the code excerpt, is nulled before GCC 2.96, because there was no way to influence branch prediction before that release of GCC.

-----------------------------------------------------------------------
include/linux/compiler.h
#define likely(x)  __builtin_expect(!!(x), 1)
#define unlikely(x)  __builtin_expect(!!(x), 0)
-----------------------------------------------------------------------
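The same two definitions also compile in user space with GCC or Clang, which makes it easy to try the macros outside the kernel. The following is a minimal sketch. sum_values() is a hypothetical function, and treating a NULL array as the rare case is an assumption made for the example.

```c
#include <stddef.h>

#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Sum an array; a NULL array is treated as the rare error path. */
long sum_values(const int *values, size_t count)
{
    long total = 0;
    size_t i;

    if (unlikely(values == NULL))
        return -1;   /* rarely taken: the caller passed no data */

    for (i = 0; i < count; i++)
        total += values[i];
    return total;
}
```

The hints influence only how the compiler lays out and optimizes the branches. The value of the condition, and therefore the result of the function, is the same with or without them.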

2.8.3. IS_ERR and PTR_ERR

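Kernel functions that return a pointer often report failure by encoding a negative error number into the pointer itself; the ERR_PTR helper performs this encoding. IS_ERR tests whether a returned pointer holds such an encoded error, while PTR_ERR retrieves the error number from the pointer.

All of these helpers are defined in include/linux/err.h.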
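The error-in-pointer convention can be modeled in user space as follows. This is a sketch and not the contents of err.h: the demo_ prefix marks every name as hypothetical, and the bound of 4095 on error numbers is an assumption of this example. The idea is that the topmost addresses are never valid pointers, so they can carry small negative error codes.

```c
#include <errno.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095   /* assumed upper bound on error numbers */

/* Encode a negative error number as a pointer value. */
static inline void *demo_err_ptr(long error)
{
    return (void *)(intptr_t)error;
}

/* Recover the error number from an encoded pointer. */
static inline long demo_ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* True if the pointer value lies in the top DEMO_MAX_ERRNO addresses. */
static inline int demo_is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-DEMO_MAX_ERRNO;
}

static int demo_table[4] = {10, 20, 30, 40};

/* Hypothetical lookup that reports failure through its return pointer. */
int *demo_lookup(int index)
{
    if (index < 0 || index >= 4)
        return demo_err_ptr(-EINVAL);
    return &demo_table[index];
}
```

A caller checks the result with demo_is_err() before dereferencing it and, on failure, obtains the error number with demo_ptr_err(). This lets one return value carry either a valid pointer or an error code, with no separate out-parameter.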

2.8.4. Notifier Chains

The notifier-chain mechanism lets kernel code register its interest in being informed about the occurrence of various asynchronous events. This generic interface is available to all subsystems and components of the kernel.

A notifier chain is a singly linked list of notifier_block objects:

-----------------------------------------------------------------------
include/linux/notifier.h
struct notifier_block
{
    int (*notifier_call)(struct notifier_block *self, unsigned long, void *);
    struct notifier_block *next;
    int priority;
};
-----------------------------------------------------------------------

notifier_block contains a pointer to a function ( notifier_call ) to be called when the event comes to pass. This function's parameters include a pointer to the notifier_block holding the information, a value corresponding to event codes or flags, and a pointer to a datatype specific to the subsystem.

The notifier_block struct also contains a pointer to the next notifier_block in the chain and a priority declaration.

The routines notifier_chain_register() and notifier_chain_unregister() register or unregister a notifier_block object in a specific notifier chain.
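To illustrate how such a chain behaves, here is a user-space model of registration, unregistration, and event delivery. It is a sketch under stated assumptions and not the kernel's implementation. All demo_ names and the two return codes are hypothetical, locking is omitted, and the model assumes that blocks with a higher priority value are called first.

```c
#include <stddef.h>

#define DEMO_NOTIFY_OK   0x0001   /* hypothetical: continue down the chain */
#define DEMO_NOTIFY_STOP 0x8000   /* hypothetical: stop calling further blocks */

struct demo_notifier_block {
    int (*notifier_call)(struct demo_notifier_block *self,
                         unsigned long event, void *data);
    struct demo_notifier_block *next;
    int priority;
};

/* Insert n so that higher-priority blocks come first in the list. */
void demo_chain_register(struct demo_notifier_block **list,
                         struct demo_notifier_block *n)
{
    while (*list != NULL && (*list)->priority >= n->priority)
        list = &(*list)->next;
    n->next = *list;
    *list = n;
}

/* Unlink n from the list; return 0 on success, -1 if it was not found. */
int demo_chain_unregister(struct demo_notifier_block **list,
                          struct demo_notifier_block *n)
{
    while (*list != NULL) {
        if (*list == n) {
            *list = n->next;
            return 0;
        }
        list = &(*list)->next;
    }
    return -1;
}

/* Call each block in list order; stop early if a callback asks to. */
int demo_call_chain(struct demo_notifier_block *list,
                    unsigned long event, void *data)
{
    int ret = DEMO_NOTIFY_OK;

    while (list != NULL) {
        ret = list->notifier_call(list, event, data);
        if (ret & DEMO_NOTIFY_STOP)
            break;
        list = list->next;
    }
    return ret;
}

/* Example callback: records the priority of each block as it is called. */
static int demo_order[8];
static int demo_calls;

static int demo_record(struct demo_notifier_block *self,
                       unsigned long event, void *data)
{
    (void)event;
    (void)data;
    if (demo_calls < 8)
        demo_order[demo_calls++] = self->priority;
    return DEMO_NOTIFY_OK;
}
```

Registering one block with priority 1 and another with priority 5 and then calling demo_call_chain() invokes the priority-5 callback first. Each callback receives its own block through self, so it can locate any per-subscriber state stored around that block.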