2.8. Miscellaneous Quirks
This section serves as a catch-all for quirks that plagued the authors when they began to traipse through the kernel code. We include them here to give you an edge on Linux internals.
2.8.1. __init
The __init macro tells the compiler that the associated function or variable is used only during initialization. The compiler places all code marked with __init into a special memory section that is freed after the initialization phase ends:
-----------------------------------------------------------------------
drivers/char/random.c
static int __init batch_entropy_init(int size, struct entropy_store *r)
-----------------------------------------------------------------------
As an example, the random device driver initializes a pool of entropy upon being loaded. While the driver is loaded, different functions are used to increase or decrease the size of the entropy pool. Marking device driver initialization routines with __init is common practice, if not a standard.
Similarly, data that is used only during initialization needs to be marked with __initdata. Here, we can see how __initdata is used in the ESP device driver:
-----------------------------------------------------------------------
drivers/char/esp.c
static char serial_name[] __initdata = "ESP serial driver";
static char serial_version[] __initdata = "2.2";
-----------------------------------------------------------------------
Also, the __exit and __exitdata macros mark code and data that are used only in exit or shutdown routines. These are commonly used when a device driver is unregistered.
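To make the section mechanism more concrete, here is a minimal userspace sketch, assuming GCC or Clang on an ELF platform such as Linux. The DEMO_INIT and DEMO_INITDATA macros, the section names, and demo_pool_init() are hypothetical stand-ins rather than kernel definitions. They only show how a section attribute groups initialization-only code and data; nothing in the sketch frees those sections afterward, which is the part the kernel takes care of for real __init code.

```c
#include <string.h>

/* Hypothetical stand-ins for __init and __initdata: each one asks the
 * compiler to emit the marked object into a named section. */
#define DEMO_INIT     __attribute__((section("demo_init_text")))
#define DEMO_INITDATA __attribute__((section("demo_init_data")))

/* Data needed only while initializing. */
static char demo_banner[] DEMO_INITDATA = "demo driver";

/* Code needed only while initializing. Returns the banner length on
 * success, or -1 if the requested pool size is not positive. */
static int DEMO_INIT demo_pool_init(int size)
{
    if (size <= 0)
        return -1;
    return (int)strlen(demo_banner);
}
```

Running objdump -h on the resulting object file should list demo_init_text and demo_init_data as separate sections, which is what lets a loader treat them as a unit.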
2.8.2. likely() and unlikely()
likely() and unlikely() are macros that Linux kernel developers use to give hints to the compiler and, through it, to the processor. Modern CPUs have extensive branch-prediction heuristics that attempt to predict upcoming instructions in order to optimize speed. The likely() and unlikely() macros allow the developer to tell the CPU, through the compiler, whether a condition is expected to be true, so the code it guards should be treated as the predicted path, or expected to be false, so that code should be kept off the predicted path.
The importance of branch prediction can be seen with some understanding of instruction pipelining. Modern processors do anticipatory fetching; that is, they anticipate the next few instructions that will be executed and load them into the processor. Within the processor, these instructions are examined and dispatched to the various units within the processor (integer, floating point, and so on) depending on how they can best be executed. Some instructions might be stalled in the processor, waiting for an intermediate result from a previous instruction. Now, imagine that a branch instruction is loaded from the instruction stream. The processor now has two instruction streams from which to continue its prefetching. If the processor often chooses poorly, it spends too much time reloading the pipeline with the instructions that actually need execution. What if the processor had a hint of which way the branch was going to go? A simple method of branch prediction, in some architectures, is to examine the target address of the branch. If the target is lower than the current address (a backward branch), there's a good chance that this branch is at the end of a loop construct, where it loops back many times and falls through only once.
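The backward-branch heuristic just described can be written down in a few lines. This is only an illustration of the rule itself in C, using a hypothetical predict_taken() helper; real processors implement prediction in hardware and typically combine a static guess like this with dynamic history.

```c
#include <stdbool.h>
#include <stdint.h>

/* Static "backward taken, forward not taken" guess: a branch whose
 * target lies below its own address is assumed to close a loop. */
static bool predict_taken(uintptr_t branch_addr, uintptr_t target_addr)
{
    return target_addr < branch_addr;
}
```

For a loop that runs many iterations, this guess is wrong only on the final pass, when the branch falls through.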
Software is allowed to override the architectural branch prediction with special mnemonics. The compiler exposes this ability through the __builtin_expect() function, which is the foundation of the likely() and unlikely() macros.
As previously mentioned, branch prediction and processor pipelining are complicated and beyond the scope of this book, but the ability to "tune" the code where we think we can make a difference is always a performance plus. Consider the following code block:
-----------------------------------------------------------------------
kernel/time.c
90 asmlinkage long sys_gettimeofday(struct timeval __user *tv, struct timezone __user *tz)
91 {
92   if (likely(tv != NULL)) {
93     struct timeval ktv;
94     do_gettimeofday(&ktv);
95     if (copy_to_user(tv, &ktv, sizeof(ktv)))
96       return -EFAULT;
97   }
98   if (unlikely(tz != NULL)) {
99     if (copy_to_user(tz, &sys_tz, sizeof(sys_tz)))
100       return -EFAULT;
101   }
102   return 0;
103 }
-----------------------------------------------------------------------
In this code, we see that a syscall to get the time of day is likely to have a timeval structure that is not null (lines 92-96). If it were null, we couldn't fill in the requested time of day! It is also unlikely that the timezone is not null (lines 98-100). To put it another way, the caller rarely asks for the timezone and usually asks for the time.
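From the caller's side, the common case looks like the sketch below: a userspace program passes a valid timeval pointer and NULL for the timezone, which is exactly the combination that the likely() and unlikely() hints favor. The read_wall_clock() wrapper is a hypothetical name used only for this illustration; gettimeofday() is the standard C library entry point.

```c
#include <stddef.h>
#include <sys/time.h>

/* Fetch the current time the way most callers do: ask for the time
 * and skip the timezone. Returns 0 on success, -1 on failure. */
static int read_wall_clock(struct timeval *out)
{
    return gettimeofday(out, NULL);
}
```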
likely() and unlikely() are implemented as follows:
-----------------------------------------------------------------------
include/linux/compiler.h
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)
-----------------------------------------------------------------------
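The same two definitions also work outside the kernel with GCC or Clang, which makes them easy to experiment with. The following is a small sketch under that assumption; sum_nonnegative() is a hypothetical function invented for the example, with the error check marked unlikely() and the common loop case marked likely().

```c
#include <stddef.h>

/* Same definitions as in the kernel header; they rely on the
 * GCC/Clang builtin __builtin_expect(). */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical helper: sums the non-negative entries of an array.
 * Returns -1 if no array was supplied. */
static long sum_nonnegative(const int *values, size_t count)
{
    long total = 0;
    size_t i;

    if (unlikely(values == NULL))   /* error path, rarely taken */
        return -1;

    for (i = 0; i < count; i++) {
        if (likely(values[i] >= 0)) /* common case */
            total += values[i];
    }
    return total;
}
```

The double negation !!(x) normalizes any nonzero value to 1 so that it can be compared against the expected constant. The hints affect only how the compiler lays out the code; the function returns the same result with or without them.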
2.8.3. IS_ERR and PTR_ERR
The IS_ERR macro tests whether a pointer value actually carries an encoded negative error number, while the PTR_ERR macro retrieves that error number from the pointer. (The encoding itself is done by the companion ERR_PTR.) All of these are defined in include/linux/err.h.
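The idea of carrying an error number inside a pointer can be sketched in userspace as follows. The demo_ prefixed helpers are hypothetical stand-ins modeled on the kernel's ERR_PTR, PTR_ERR, and IS_ERR, not copies of them, and the DEMO_MAX_ERRNO bound of 4095 is an assumption made for this sketch. The trick relies on the top few kilobytes of the address space never holding a valid object.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed upper bound on errno values, for this sketch only. */
#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *demo_err_ptr(long error)
{
    return (void *)(intptr_t)error;
}

/* Recover the errno value from an encoded pointer. */
static inline long demo_ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* True if the pointer lies in the top DEMO_MAX_ERRNO addresses,
 * which is where encoded error values land. */
static inline int demo_is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)(-(intptr_t)DEMO_MAX_ERRNO);
}

static const char *demo_names[] = { "zero", "one", "two" };

/* Hypothetical lookup: returns a name, or an encoded -ENOENT. */
static const char *demo_lookup(int id)
{
    if (id < 0 || id > 2)
        return demo_err_ptr(-ENOENT);
    return demo_names[id];
}
```

A caller checks the result with demo_is_err() before dereferencing it and, on failure, extracts the reason with demo_ptr_err(). This lets a single return value report either a valid object or a specific error.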
2.8.4. Notifier Chains
The notifier-chain mechanism lets kernel code register its interest in being informed of various asynchronous events. This generic interface is available to all subsystems and components of the kernel.
A notifier chain is a singly linked list of notifier_block objects:
-----------------------------------------------------------------------
include/linux/notifier.h
struct notifier_block
{
  int (*notifier_call)(struct notifier_block *self, unsigned long, void *);
  struct notifier_block *next;
  int priority;
};
-----------------------------------------------------------------------
notifier_block contains a pointer to a function (notifier_call) to be called when the event comes to pass. This function's parameters include a pointer to the notifier_block holding the information, a value corresponding to event codes or flags, and a pointer to a datatype specific to the subsystem.
The notifier_block struct also contains a pointer to the next notifier_block in the chain and a priority declaration.
The routines notifier_chain_register() and notifier_chain_unregister() register and unregister, respectively, a notifier_block object in a specific notifier chain.
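To show how such a chain behaves, here is a simplified userspace sketch built around the same structure layout. The demo_chain_register(), demo_chain_unregister(), and demo_chain_call() functions, the demo_log type, and the demo_record() callback are all hypothetical names for this illustration, not the kernel routines. The sketch assumes that higher-priority blocks are placed first, and it omits the locking that concurrent use would require.

```c
#include <stddef.h>

/* Same layout as the kernel structure shown above. */
struct notifier_block {
    int (*notifier_call)(struct notifier_block *self, unsigned long, void *);
    struct notifier_block *next;
    int priority;
};

/* Insert nb so that higher-priority blocks come first (an assumed
 * ordering policy for this sketch). */
static void demo_chain_register(struct notifier_block **head,
                                struct notifier_block *nb)
{
    while (*head != NULL && (*head)->priority >= nb->priority)
        head = &(*head)->next;
    nb->next = *head;
    *head = nb;
}

/* Unlink nb. Returns 0 on success, -1 if it was not on the chain. */
static int demo_chain_unregister(struct notifier_block **head,
                                 struct notifier_block *nb)
{
    for (; *head != NULL; head = &(*head)->next) {
        if (*head == nb) {
            *head = nb->next;
            return 0;
        }
    }
    return -1;
}

/* Walk the chain, passing the event to every callback in order.
 * Returns the number of callbacks invoked. */
static int demo_chain_call(struct notifier_block *head,
                           unsigned long event, void *data)
{
    int calls = 0;

    for (; head != NULL; head = head->next) {
        head->notifier_call(head, event, data);
        calls++;
    }
    return calls;
}

/* Subsystem-specific data for the example callback. */
struct demo_log {
    int seen[4];
    int count;
};

/* Example callback: records the priority of the block that ran. */
static int demo_record(struct notifier_block *self,
                       unsigned long event, void *data)
{
    struct demo_log *log = data;

    (void)event; /* unused in this sketch */
    if (log->count < 4)
        log->seen[log->count++] = self->priority;
    return 0;
}
```

Registering a block with priority 1 and then one with priority 5 leaves the priority-5 block at the head of the chain, so its callback runs first when demo_chain_call() delivers an event.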