22.3 Using the KIDS Example to Extend the Linux Network Architecture

   


Now that we have given a brief overview of the elements in the KIDS framework, this section will discuss its implementation in the Linux kernel as an example of how the functionality of the Linux network architecture can be extended. We focus our discussion on the design and management of the components: how and why they were designed, and how they are introduced to the kernel at runtime. In addition, we will see how hooks are implemented on the basis of different existing kernel interfaces, which means that we don't have to change the kernel to be able to use KIDS. Finally, we use the kidsd daemon as an example to show how components and hooks are configured and how they interact between the kernel and the user level.

22.3.1 Components and Their Instances

The KIDS framework offers different types of components that can be used to implement different QoS mechanisms (e.g., token buckets; see Section 18.6.1). A component can occur more than once within a component chain, and each of these occurrences can have different parameters. This means that we should be able to create an arbitrary number of instances from a component, but still try to keep the memory required by these instances low. This principle reminds us strongly of the object-orientation concept that lets you create an arbitrary number of object instances from a class. Although all of these instances exist independently, they have the same behavior, because they use the same methods.

This means that the component concept of Linux KIDS has an object-oriented character, though it was written in C, a programming language that doesn't support object orientation. The component concept of Linux KIDS consists of the following two parts:

  • Components are QoS mechanisms implementing a specific behavior. They are managed in the bhvr_type structure of Linux KIDS. This structure contains all properties of a component (e.g., its behavior in the form of pointers to corresponding methods shown below). These methods are used by several instances of that component concurrently, so they have to be reentrant. Components correspond to the principle of classes in the object-oriented model.

  • Component instances are created when we need an instance of a component. To this end, we create a data structure of the type bhvr. It stores all information about this component instance, mainly its individual parameter configuration. The instance should have the component's behavior, so reference is made to the information stored in the bhvr_type structure of the component. Component instances correspond to objects (or object instances) in the object-oriented model.
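The class/instance split described above can be sketched in plain C. The following is a minimal user-space illustration, not KIDS code: the names comp_type, comp, limit_func, and limiter_type are hypothetical and were chosen only for this example. One structure holds the shared method, and each instance holds only its own parameters plus a pointer to that structure.

```c
struct comp;                        /* an instance ("object") */

/* Hypothetical stand-in for a component ("class"): shared behavior. */
struct comp_type {
    const char *name;
    int (*func)(struct comp *self, int pkt_len);    /* shared method */
};

/* Hypothetical stand-in for a component instance: own parameters only. */
struct comp {
    const char *name;
    const struct comp_type *type;   /* behavior comes from the type */
    int limit;                      /* per-instance parameter */
};

/* One method body serves every instance; all state is reached via self,
 * which is what keeps the method reentrant. */
static int limit_func(struct comp *self, int pkt_len)
{
    return pkt_len <= self->limit;  /* 1 = packet fits, 0 = too long */
}

static const struct comp_type limiter_type = { "Limiter", limit_func };
```

Two instances created from limiter_type with different limit values share the single limit_func body, so each additional instance costs only the size of one struct comp.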

The following discussion introduces how these two structures are built and what the parameters mean. Subsequently, we will see how components can be registered or unregistered dynamically.

struct bhvr_type

kids/kids_bhvr.h


Figure 22-4 shows how components and their instances interact. The bhvr_type structure of the token bucket stores general component information.

Figure 22-4. The bhvr_type and bhvr structures manage components and their instances.



    struct bhvr_type {
        char              name[STRLEN];
        unsigned int      bhvr_class_id;
        unsigned long     private_data_size;
        unsigned int      instances;
        struct bhvr_type  *next;
        int               (*func) (struct bhvr *, struct sk_buff *);
        struct sk_buff*   (*deq_func) (struct bhvr *);
        int               (*constructor) (struct bhvr *bhvr, char * data, int flag);
        int               (*destructor) (struct bhvr *bhvr);
        struct bhvr*      (*get_bhvr) (struct bhvr *bhvr, char * port);
        int               (*append_bhvr) (struct bhvr *new_bhvr, struct bhvr *old_bhvr,
                                          char *port);
        int               (*proc) (struct bhvr *bhvr, char *ptr, int layer);
        int               (*get_config) (struct bhvr *bhvr, char *ptr);
    };

The fields have the following meaning:

  • name is the name of the component (e.g., Token_Bucket).

  • bhvr_class_id contains the component's class. (See Section 22.2.1.) Possible values are BHVR_ID, ENQ_BHVR_ID, DEQ_BHVR_ID, DEQ_DISC_ID, and QUEUE_ID.

  • private_data_size specifies the size of the private data structure used in the bhvr structure for each instance of a component. Preferably, a separate private structure should be defined for this purpose (e.g., tb_data; see below), and a sizeof expression yielding the size of this structure should be inserted in this position.

  • instances manages the number of instances created from a component. This variable is managed by Linux KIDS. It should show a value of 0 when a component is removed from the kernel.

  • next is also used internally, namely to link bhvr_type structures in the bhvr_type_list. (See Figure 22-4.)

The following elements of the bhvr_type structure are function pointers that specify the behavior of a component and are used to manage it.

  • func(bhvr, skb) refers to a function that is invoked when a packet (or a socket buffer, skb) is passed to an instance (bhvr) of this component. It implements the functionality of this component type. A socket buffer is passed when func() is invoked. This means that this function corresponds to the implementation of a packet interface and is used only for operative components and enqueuing components. Section 22.3.5 uses an example introducing the func() method of the Token-Bucket component.

    The bhvr parameter contains a pointer to the bhvr structure of the component instance to which the socket buffer, skb, is passed when the func() function is invoked. Because the same func() method is used for all instances of the Token_Bucket component, the pointer to the instance-specific information also has to be passed. Otherwise, it would be impossible to see which instance, with what parameter or variable assignments, is meant.

  • deq_func(bhvr) is used for dequeuing and strategic components. It corresponds to the implementation of a message interface and is invoked when a packet is requested from an instance (bhvr) of this component. A component implements only one of the two functions, either func() or deq_func(), depending on whether its input has a packet interface or a message interface.

  • constructor(bhvr, data, flag) is invoked when a bhvr instance of this component is initialized or when its configuration changes. This method takes the character string data, which holds the component's private data to be configured, as a parameter. The flag parameter shows whether this is the first initialization of this instance (INIT_BHVR) or a change to its parameters at runtime, where only the information passed should be altered.

  • destructor(bhvr) is invoked to destroy the bhvr instance of the component. All cleanup work required (e.g., free memory or deactivate timer) should be done at this point.

  • get_bhvr(bhvr, port) is invoked by KIDS to obtain a pointer to the bhvr structure of the component instance appended to the output, port. The number and names of a component's outputs are individual, so we have to implement a component-specific function.

  • append_bhvr(new_bhvr, old_bhvr, port) connects the new_bhvr component instance to the output, port, of the existing component instance, old_bhvr. Again, we have to implement separate functions for the individual outputs of a component.

  • proc(bhvr, ptr, layer) creates information about the bhvr component instance. This information can be output from proc files. The layer parameter specifies the distance from the component instance to the hook; this is required for indenting within the output. The ptr pointer specifies the buffer space this output should be written to. (See Section 2.8.)

  • get_config(bhvr, ptr) is invoked by KIDS to write the configuration of the bhvr component instance to the ptr buffer space, based on the KIDS configuration syntax (see Section 22.3.6).

struct bhvr

kids/kids_bhvr.h


Each of the bhvr structures represents one specific instance of a component and manages the information of that instance (e.g., its name or its number of references). The bhvr data structure is built as follows:

    struct bhvr {
        char                   name[STRLEN];
        unsigned int           use_counter;
        struct bhvr            *next_bhvr;
        struct bhvr_type       *bhvr_type;
        char                   bhvr_data[0];
    };

The fields have the following meaning:

  • name is the name of this instance (e.g., tb0 or marker1).

  • use_counter specifies the number of direct predecessors of this component instance (i.e., the number of references to this bhvr structure).

  • next_bhvr is used to link the individual bhvr data structures in the bhvr_list. This list is managed by KIDS and used to search for a component by its name (get_bhvr_by_name()).

  • bhvr_type points to the relevant bhvr_type structure, representing the type of this component instance. This means that this pointer specifies the behavior of the component instance, which is registered in the bhvr_type structure.

  • bhvr_data is a placeholder for the private information of this component instance (as is shown later). No type can be specified, because the structure of each component's information is individual. A type cast is required before each access; for example:

    struct tb_data *data = (struct tb_data *) &(tb_bhvr->bhvr_data);

    The private information space is directly adjacent to the bhvr structure. The length of private information is taken into account for reserving the memory of the bhvr structure. As was mentioned earlier, it is managed in the bhvr_type structure.
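The layout just described, a fixed header followed directly by component-specific private data, can be sketched in user space as follows. This is a simplified model under stated assumptions: bhvr_type_model, bhvr_model, tb_params, and alloc_instance are hypothetical names, the value of STRLEN is assumed, calloc() stands in for the kernel's allocator, and a C99 flexible array member replaces the zero-length array of the original.

```c
#include <stdlib.h>
#include <string.h>

#define STRLEN 32   /* assumed value; the real constant is defined by KIDS */

struct bhvr_type_model {            /* reduced model of bhvr_type */
    unsigned long private_data_size;
};

struct bhvr_model {                 /* reduced model of bhvr */
    char name[STRLEN];
    unsigned int use_counter;
    struct bhvr_type_model *bhvr_type;
    char bhvr_data[];               /* private data starts here */
};

struct tb_params {                  /* example private data */
    unsigned int rate, bucket_size;
};

/* Allocate the header and the private data as one zeroed block. The type
 * supplies the private size, so this code works for any component. */
static struct bhvr_model *alloc_instance(struct bhvr_type_model *type,
                                         const char *name)
{
    struct bhvr_model *b = calloc(1, sizeof(*b) + type->private_data_size);

    if (!b)
        return NULL;
    strncpy(b->name, name, STRLEN - 1);   /* block is zeroed: stays terminated */
    b->bhvr_type = type;
    return b;
}
```

A caller would obtain the private part with a cast such as (struct tb_params *) b->bhvr_data and release the whole instance with a single free(b).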

Using the Token-Bucket Component as an Example for a Private Data Structure

The data structure containing private information (bhvr_data) is of particular importance. Its structure depends on the respective component, because it stores that component's parameters and runtime variables. Because all instances of a component have the same variables, though with different assignments, this data structure is stored in the instances (i.e., in the bhvr structure), and its length (which is identical for all instances of a component) is stored in the bhvr_type structure.

This tells us clearly that all information concerning the state or configuration of a specific component instance is managed in the instance itself, namely in the private data structure of the bhvr structure.

The following example represents the private data structure of the Token_Bucket component:

    struct tb_data {
        unsigned int     rate, bucket_size;
        unsigned long    token, packets_arvd, packets_in, packets_out;
        CPU_STAMP        last_arvl, cycles_per_byte;
        struct bhvr      *enough_token_bhvr;
        struct bhvr      *not_enough_token_bhvr;
        int              (*enough_token_func) (struct bhvr *, struct sk_buff *);
        int              (*not_enough_token_func) (struct bhvr *, struct sk_buff *);
    };

The meaning of each of the variables in such a private data structure can be divided into three groups:

  • The parameter and runtime variables of a component are specific to that component, because each component implements its own algorithm. This is the reason why they are managed in a private data structure of the component, which exists separately in each of that component's instances. Examples of parameter and runtime variables include the rate and bucket_size variables in the Token_Bucket component.

  • In addition, private information manages the following two elements for each component output, because the number of outputs is also individual to the respective component and so it cannot be accommodated in the bhvr_type structure:

    • The first element is a function pointer to the func() function (for a packet interface) or deq_func() (for a message interface) in the subsequent component instance. This means that a component instance stores a reference to the handling routine for the component instance appended to this output.

    • The second element is a reference to the bhvr structure of the subsequent component instance at this output. This pointer is what eventually links the component instances.

    The reference to the handling routine of the subsequent component instance is actually not required, because it can be found through the bhvr_type pointer in the successor's bhvr structure. However, this double dereferencing is avoided at the cost of an additional pointer, for performance reasons. If no component instance is appended to an output, then the two variables take the value NULL, and a packet to be forwarded is returned recursively to the hook. (See Section 22.3.5.)
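The pair of variables kept per output can be illustrated with a small user-space sketch of an append operation. It is a simplified model, not the KIDS implementation: struct pkt stands in for struct sk_buff, and node, node_type, append_node, and accept_all are hypothetical names. The output names Conform and Non_Conform are the ones this chapter uses for the Token_Bucket component; the return codes are assumptions.

```c
#include <stddef.h>
#include <string.h>

struct pkt { unsigned int len; };           /* stand-in for struct sk_buff */
struct node;

struct node_type {                          /* reduced bhvr_type */
    int (*func)(struct node *, struct pkt *);
};

struct node {                               /* reduced bhvr plus private data */
    struct node_type *type;
    unsigned int use_counter;
    /* one (instance, handler) pair per output */
    struct node *conform_bhvr;
    int (*conform_func)(struct node *, struct pkt *);
    struct node *non_conform_bhvr;
    int (*non_conform_func)(struct node *, struct pkt *);
};

/* Trivial handler, used only to have something to attach. */
static int accept_all(struct node *n, struct pkt *p)
{
    (void) n;
    (void) p;
    return 1;
}

/* Attach new_node to the named output of old_node.
 * Returns 0 on success and -1 for an unknown output name. */
static int append_node(struct node *new_node, struct node *old_node,
                       const char *port)
{
    if (strcmp(port, "Conform") == 0) {
        old_node->conform_bhvr = new_node;
        old_node->conform_func = new_node->type->func;      /* cached */
    } else if (strcmp(port, "Non_Conform") == 0) {
        old_node->non_conform_bhvr = new_node;
        old_node->non_conform_func = new_node->type->func;  /* cached */
    } else {
        return -1;
    }
    new_node->use_counter++;                /* one more direct predecessor */
    return 0;
}
```

Caching new_node->type->func at connect time means that forwarding a packet later needs only one pointer load instead of two.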

22.3.2 Registering and Managing Components

Before we can use Linux KIDS to implement the desired QoS mechanisms, we have to tell the kernel which components are currently available. To this end, Linux KIDS maintains a list, bhvr_type_list, to manage all registered components. This list is a simple linked list of the respective bhvr_type data structures, which store all information about the components. (See Figure 22-4.) Linking data structures into a list is the usual approach to managing functionalities in the Linux kernel. (See Section 22.1.)

We can use the function register_bhvr_type(bhvr_type) to register a component represented by a bhvr_type structure. (See Figure 22-5.) More specifically, the bhvr_type structure is entered in the bhvr_type_list. (See Figure 22-4.) From then on, this component is known in the kernel, and we can create instances of that component. To remove a component from the list, we can invoke unregister_bhvr_type(bhvr_type). Of course, we have to ensure that there are no instances of the component left before we remove it, which is the reason why the instances variable has to be checked first.

Figure 22-5.
    struct bhvr_type token_bucket_element = {
        "Token_Bucket",                     /* name                  */
        BHVR_ID,                            /* class                 */
        sizeof(struct token_bucket_data),   /* private data size     */
        0,                                  /* instances             */
        NULL,                               /* next                  */
        token_bucket_func,                  /* packet interface      */
        NULL,                               /* message interface     */
        token_bucket_init,                  /* constructor           */
        NULL,                               /* destructor            */
        token_bucket_get,                   /* get bhvr of a port    */
        token_bucket_append,                /* append bhvr on a port */
        token_bucket_proc,                  /* proc output routine   */
        token_bucket_config                 /* get config of a bhvr  */
    };

    int init_module(void)
    {
        register_bhvr_type(&token_bucket_element);
        return 0;   /* report successful module initialization */
    }

    void cleanup_module(void)
    {
        unregister_bhvr_type(&token_bucket_element);
    }
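What registration and unregistration might do internally can be sketched with a singly linked list. The following user-space model is an assumption about the mechanism, not the code from kids/kids_bhvr.c: ctype, ctype_list, find_ctype, register_ctype, and unregister_ctype are hypothetical names, and the return codes and the duplicate-name check are choices made for this example.

```c
#include <stddef.h>
#include <string.h>

struct ctype {                          /* reduced bhvr_type */
    const char *name;
    unsigned int instances;
    struct ctype *next;
};

static struct ctype *ctype_list;        /* models bhvr_type_list */

static struct ctype *find_ctype(const char *name)
{
    struct ctype *t;

    for (t = ctype_list; t; t = t->next)
        if (strcmp(t->name, name) == 0)
            return t;
    return NULL;
}

/* Insert at the list head; refuse a second type with the same name. */
static int register_ctype(struct ctype *t)
{
    if (find_ctype(t->name))
        return -1;
    t->next = ctype_list;
    ctype_list = t;
    return 0;
}

/* Unlink the type, but only if no instances of it are left. */
static int unregister_ctype(struct ctype *t)
{
    struct ctype **pp;

    if (t->instances != 0)
        return -1;
    for (pp = &ctype_list; *pp; pp = &(*pp)->next) {
        if (*pp == t) {
            *pp = t->next;
            t->next = NULL;
            return 0;
        }
    }
    return -1;                          /* was never registered */
}
```

The instances check mirrors the rule stated above: a component whose code is about to leave the kernel must not have live instances that still point to its methods.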

In addition to the list of component categories, Linux KIDS has two other elements that can be used to register or unregister functionalities dynamically. To prevent this chapter from getting too long, we will discuss these two elements only briefly. They are managed similarly to the previous elements:

  • Hooks are represented by the hook data structure; they are registered by register_hook(hook) and unregistered by unregister_hook(hook). If a protocol instance wants to supply a hook, it simulates a packet interface or message interface, builds an appropriate hook data structure, and registers the hook. Subsequently, components can be appended to this hook. The files kids/layer2_hooks.c and kids/nf_hooks.c include examples for hooks based on the TC or netfilter interface.

  • Different queue categories are managed by the kids_queue_type data structure; we can use register_queue_type() to register or unregister_queue_type() to unregister them. An instance of a queue variant is represented by a kids_queue structure. The management of queues is almost identical to that of component categories, but components and queues are different, so it was found necessary to manage them separately.

22.3.3 Managing Component Instances

The previous section described how we can register and manage components in Linux KIDS; this section discusses how we can manage instances of components, that is, how component instances are created, deleted, and linked. A special syntax was developed to keep the management of the QoS mechanisms as simple as possible. Section 22.3.6 will introduce this syntax. A character-oriented device, /dev/kids, is used to pass configuration commands to Linux KIDS and to invoke one of the methods introduced below.

create_bhvr()

kids/kids_bhvr.c


create_bhvr(type, name, data, id) creates an instance of the type component designated by name. For creating this instance, that component has to be present in the list of registered components (bhvr_type_list).

Initially, storage space is reserved for the data of the new component instance. This memory space consists of a bhvr structure that is identical for all components and a private data structure that is individual to each component. Subsequently, the bhvr structure is initialized, and the component's constructor is invoked to fill the private data with the instance's configuration parameters. These configuration parameters were extracted from the CREATE command and passed in the data character string. Once it has been created, the instance is not yet connected to any other component instance. Finally, it is added to the bhvr_list.

remove_bhvr()

kids/kids_bhvr.c


remove_bhvr(name, force) deletes the component instance designated by name and removes it from the bhvr_list. The force parameter states that the use_counter of that instance should be ignored; normally it should not be set, because there could still be references to this data structure. Before the data structure is released, the component's destructor, if present, is invoked to free resources.
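The interplay of use_counter, force, and the destructor can be sketched as follows. This is a user-space model under assumptions: inst, inst_list, add_inst, and remove_inst are hypothetical names, the return codes are invented for the example, and the destructor pointer is stored in the instance for brevity, whereas in KIDS it lives in the bhvr_type structure.

```c
#include <stdlib.h>
#include <string.h>

struct inst {                           /* reduced bhvr */
    char name[32];
    unsigned int use_counter;
    struct inst *next;
    void (*destructor)(struct inst *);  /* in bhvr_type in the real design */
};

static struct inst *inst_list;          /* models bhvr_list */

/* Helper for the example: create a zeroed instance and link it in. */
static struct inst *add_inst(const char *name)
{
    struct inst *b = calloc(1, sizeof(*b));

    if (!b)
        return NULL;
    strncpy(b->name, name, sizeof(b->name) - 1);
    b->next = inst_list;
    inst_list = b;
    return b;
}

/* Returns 0 on success, -1 if the instance is still referenced and force
 * is not set, and -2 if no instance with that name exists. */
static int remove_inst(const char *name, int force)
{
    struct inst **pp;

    for (pp = &inst_list; *pp; pp = &(*pp)->next) {
        struct inst *b = *pp;

        if (strcmp(b->name, name) != 0)
            continue;
        if (b->use_counter != 0 && !force)
            return -1;
        *pp = b->next;                  /* unlink from the list */
        if (b->destructor)
            b->destructor(b);           /* component-specific cleanup */
        free(b);
        return 0;
    }
    return -2;
}
```

Forcing removal while use_counter is nonzero leaves predecessors with dangling pointers, which is why the text recommends against it in normal operation.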

change_bhvr()

kids/kids_bhvr.c


change_bhvr(name, data) can be used to alter the private data of a component instance at runtime. All that happens here, however, is that the constructor is invoked with the data character string holding the information to be changed. The INIT_BHVR flag is not set; thus, the constructor knows that only the parameters specified have to be altered. If the flag were set, the entire component instance would be reset.
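How one constructor can serve both initial creation and later changes is easiest to see in code. The sketch below is a user-space illustration under assumptions: the value of INIT_BHVR, the structure tb_cfg, the function tb_constructor, and the return codes are hypothetical, and the key names Rate and Bucket_Size are taken from the CREATE example in Section 22.3.6.

```c
#include <stdio.h>
#include <string.h>

#define INIT_BHVR 1     /* flag value assumed for this sketch */

struct tb_cfg {
    unsigned int rate, bucket_size;
};

/* Parse "Key value Key value ..." and update only the keys present.
 * Returns 0 on success and -1 on an unknown key or a missing value. */
static int tb_constructor(struct tb_cfg *cfg, const char *data, int flag)
{
    char key[32];
    unsigned int val;
    int used, n;

    if (flag & INIT_BHVR)
        memset(cfg, 0, sizeof(*cfg));   /* first initialization: reset all */

    while ((n = sscanf(data, " %31s %u%n", key, &val, &used)) == 2) {
        if (strcmp(key, "Rate") == 0)
            cfg->rate = val;
        else if (strcmp(key, "Bucket_Size") == 0)
            cfg->bucket_size = val;
        else
            return -1;
        data += used;                   /* continue after this pair */
    }
    return (n == 1) ? -1 : 0;           /* key without a numeric value */
}
```

Calling the function first with INIT_BHVR and "Rate 64000 Bucket_Size 5000" and then without the flag and "Rate 128000" changes the rate while leaving the bucket size at 5000.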

22.3.4 Implementing Hooks

Hooks are extensions of existing protocol instances allowing us to easily embed QoS components based on the rules of the KIDS framework [Wehr01a]. One of the most important factors is the position we want to extend by a hook and thus by QoS mechanisms within the process of a protocol instance. The reason is that we can always address a certain number of packets at specific positions (e.g., all packets to be forwarded, at the IP_FORWARD hook, or all packets of the IP instance to be delivered locally, at the IP_LOCAL_DELIVER hook).

Thanks to its set of different interfaces, the Linux network architecture offers an inherent way to extend a protocol instance by a functionality. These interfaces have been utilized in the KIDS framework, and so the hooks shown in Figure 22-3 could be implemented without the need to change the source code of the Linux kernel. The hooks for the IP instance are based on the netfilter interface (see Section 19.3); the data-link layer hooks are based on the Traffic Control interface.

The following example represents the netfilter handling method of the IP_FORWARD hook. It merely checks whether a component instance is appended and invokes that instance, if present:

    unsigned int ip_forward_hook_fn(unsigned int hooknum, struct sk_buff **skb, ...)
    {
        if (ip_forward_hook && ip_forward_hook->bhvr && ip_forward_hook->func)
            return ip_forward_hook->func(ip_forward_hook->bhvr, skb[0]);
        else
            return NF_ACCEPT;
    }

Additional hooks can be integrated easily, even at runtime. To integrate a hook, we have to store the information required about the hook in a hook data structure and use the register_hook() method to register it. The protocol instance we want to extend is then simply extended by a function call, structured similarly to the above example with the IP_FORWARD hook. You can find additional information about the concept of hooks in [Wehr01a].

22.3.5 How a Component Works

Once we have registered all components of the KIDS framework with the kernel and created a component chain and appended it to a hook, we need a description of how such a component should operate. The following example uses a packet in the Token_Bucket component to describe how this component operates:

token_bucket_func()

kids/std_bhvr.c


    int token_bucket_func(struct bhvr *tb_bhvr, struct sk_buff *skb)
    {
        struct tb_data  *data = (struct tb_data *) &(tb_bhvr->bhvr_data);
        CPU_STAMP       now;

        data->packets_arvd++;
        TAKE_TIME(now);

        /* calculate the tokens produced since the last packet arrival */
        data->token += (unsigned long) (now - data->last_arvl) /
                       (unsigned long) data->cycles_per_byte;

        /* check whether the bucket has overflowed */
        if (data->token > data->bucket_size)
            data->token = data->bucket_size;
        data->last_arvl = now;

        /* check whether there are enough tokens to send the packet */
        if (data->token < skb->len)
        {   /* not enough tokens -> out of profile */
            data->packets_out++;
            /* forward the packet to the next behavior (out-of-profile) */
            if ((data->not_enough_token_bhvr) && (data->not_enough_token_func))
                return data->not_enough_token_func(data->not_enough_token_bhvr, skb);
        }
        else
        {   /* enough tokens -> in profile */
            data->token -= skb->len;
            data->packets_in++;
            /* forward the packet to the next behavior (in-profile) */
            if ((data->enough_token_bhvr) && (data->enough_token_func))
                return data->enough_token_func(data->enough_token_bhvr, skb);
        }
        return KIDS_ACCEPT; /* do not discard the packet when no behavior is attached */
    }
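To make the token arithmetic of the listing above concrete, the same refill, cap, and compare steps can be run in user space with explicit time stamps. This is a reduced sketch: tb_state and tb_conforms are hypothetical names, plain unsigned long values replace CPU_STAMP, and the caller supplies the current time instead of TAKE_TIME().

```c
struct tb_state {
    unsigned long token;            /* bytes currently in the bucket */
    unsigned long bucket_size;      /* bucket capacity in bytes */
    unsigned long last_arvl;        /* time stamp of the previous packet */
    unsigned long cycles_per_byte;  /* cycles needed to earn one byte */
};

/* Returns 1 if a packet of len bytes is in profile (its tokens are then
 * consumed) and 0 if it is out of profile (the bucket is left as is). */
static int tb_conforms(struct tb_state *tb, unsigned long now,
                       unsigned long len)
{
    /* refill: tokens earned since the last arrival */
    tb->token += (now - tb->last_arvl) / tb->cycles_per_byte;

    /* cap at the bucket size */
    if (tb->token > tb->bucket_size)
        tb->token = tb->bucket_size;
    tb->last_arvl = now;

    if (tb->token < len)
        return 0;
    tb->token -= len;
    return 1;
}
```

With an empty bucket of size 5000 and cycles_per_byte set to 10, a 1500-byte packet arriving at time 10000 finds only 1000 tokens and is out of profile; the same packet at time 20000 finds 2000 tokens, is in profile, and leaves 500 behind.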

The Token_Bucket component belongs to the operative component class, which means that it has a packet input and up to n packet outputs. In this example, these are the Conform and Non_Conform outputs.

When an instance of the Token_Bucket component receives a packet, the corresponding func() handling routine is invoked, as shown in the example. The parameters it gets include the socket buffer, skb, and a pointer to the data in the component instance (bhvr). At the very beginning, a pointer to the instance's private data is derived from this pointer, which requires a type cast. Since the function has to be fully reentrant, it must operate only on its own private data or local variables.

Subsequently, computations according to the token bucket algorithm (see Section 18.6.1) are done, and statistics variables are updated. The result from these computations determines the output that should be used to forward the socket buffer (i.e., the kind of handling to follow). If no instance is appended to the corresponding output, the function returns the KIDS_ACCEPT result, which means that the packet was processed and should now be forwarded to the hook. The counterpart is KIDS_DROP; it tells the hook that it should drop the socket buffer.

However, if a component instance follows at the desired output (data->..._bhvr != NULL), then that instance's handling routine is invoked (data->..._func()), and a pointer to the data of the subsequent instance is passed to this handling routine, in addition to the socket buffer. The return value of the subsequent component instance is used immediately for the return value of the token-bucket component.[1]

[1] Of course, as an alternative, the token bucket could evaluate the call's result; however, this is not implemented yet. In the case of a packet marked to be dropped (KIDS_DROP), for example, it could put the tokens used by this packet back into the bucket.

Linux KIDS handles packets in component chains via these nested function calls. Figure 22-6 shows this once more schematically. A description of the enqueuing and dequeuing hooks would go beyond the scope and volume of this chapter. We refer our readers to [Wehr01a].
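The nested calls can be followed in a small user-space model of a packet hook with a two-element chain. Everything in it is a stand-in: SK_ACCEPT and SK_DROP replace KIDS_ACCEPT and KIDS_DROP with assumed values, struct pkt replaces struct sk_buff, and stage, forward, marker_fn, filter_fn, and hook_entry are hypothetical names.

```c
#include <stddef.h>

enum { SK_DROP = 0, SK_ACCEPT = 1 };    /* assumed values */

struct pkt { unsigned int len; int mark; };
struct stage;
typedef int (*stage_fn)(struct stage *, struct pkt *);

struct stage {
    stage_fn func;          /* this stage's handler */
    struct stage *next;     /* successor instance, or NULL */
    unsigned int max_len;   /* used only by the length filter */
};

/* Hand the packet to the successor; with no successor, the verdict
 * SK_ACCEPT travels back up the call chain to the hook. */
static int forward(struct stage *s, struct pkt *p)
{
    if (s->next && s->next->func)
        return s->next->func(s->next, p);
    return SK_ACCEPT;
}

static int marker_fn(struct stage *s, struct pkt *p)
{
    p->mark = 1;
    return forward(s, p);   /* successor's verdict becomes our own */
}

static int filter_fn(struct stage *s, struct pkt *p)
{
    if (p->len > s->max_len)
        return SK_DROP;
    return forward(s, p);
}

/* The hook: invoke the first instance if one is appended. */
static int hook_entry(struct stage *first, struct pkt *p)
{
    if (first && first->func)
        return first->func(first, p);
    return SK_ACCEPT;
}
```

For a chain of a marker followed by a filter with max_len 1000, a 200-byte packet is marked and accepted, while a 1500-byte packet is marked and then dropped; in both cases the hook sees only the final verdict.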

Figure 22-6. Component instances interacting at a packet hook.



22.3.6 Configuring KIDS Components

This section briefly introduces how component chains are configured in Linux KIDS. The first question we have to answer is how configuration information gets from the user to the appropriate positions in the kernel. Various methods are available, including new system calls and special socket interfaces. Linux KIDS uses a different and very simple method, one based on a character-oriented device. (See Section 2.5.)

First, we create a character-oriented device, /dev/kids. On the user level, this device is explained in [RuCo01].

To configure Linux KIDS, we use the following commands, which can either be written directly into /dev/kids or sent to the KIDS daemon, kidsd, over a TCP connection. This daemon will then pass our commands on to the kernel.

  • CREATE bhvr_class bhvr_type bhvr_name DATA private data * END

    creates an instance of the bhvr_type component (e.g., Token_Bucket) and initializes this instance with the private data specified in the command. Possible component classes (bhvr_class) include BHVR, QUEUE, ENQ_BHVR, DEQ_BHVR, and DEQ_DISC; for example:

     CREATE BHVR Token_Bucket tb1 DATA Rate 64000 Bucket_Size 5000 END 

  • REMOVE bhvr_class bhvr_name END

    removes the specified component, if it has no successors.

  • CHANGE bhvr_class bhvr_name DATA private data END

    changes the specified data of the component instance, bhvr_name. This command changes nothing in the existing structure; it merely accesses the private data of the specified instance.

  • CONNECT bhvr_class bhvr_name TO bhvr_class bhvr_name port | HOOK hook_name END

    connects the input of the first bhvr_name instance to the output, port, of the instance specified after TO, or to the specified hook, hook_name. One single instance can be connected to several outputs or hooks. However, only one single component instance can be appended to each output or hook.

  • DISCONNECT bhvr_class bhvr_name port END

    deletes the connection on the specified output. The REMOVE command can delete a component instance only if all of its links have been deleted.
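From a user-space program, issuing such a command amounts to building the command string and writing it to the device. The sketch below shows one possible way to do this with standard POSIX calls; kids_format_create and kids_send are hypothetical helper names, and whether the device expects each command in a single write() call or with a terminating newline is not stated in this chapter, so the sketch simply writes the string as shown in the CREATE example.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Build a CREATE command for a Token_Bucket-style component.
 * Returns the command length, or -1 if the buffer is too small. */
static int kids_format_create(char *buf, size_t size, const char *type,
                              const char *name, unsigned int rate,
                              unsigned int bucket_size)
{
    int n = snprintf(buf, size,
                     "CREATE BHVR %s %s DATA Rate %u Bucket_Size %u END",
                     type, name, rate, bucket_size);

    return (n < 0 || (size_t) n >= size) ? -1 : n;
}

/* Write one command to the configuration device (e.g., /dev/kids).
 * Returns 0 on success and -1 if the device cannot be opened or the
 * command was not written completely. */
static int kids_send(const char *dev_path, const char *cmd)
{
    size_t len = strlen(cmd);
    int fd = open(dev_path, O_WRONLY);
    int ok;

    if (fd < 0)
        return -1;
    ok = (write(fd, cmd, len) == (ssize_t) len);
    close(fd);
    return ok ? 0 : -1;
}
```

A caller would format the command into a local buffer and then pass it to kids_send("/dev/kids", buf); sending commands to kidsd over TCP would replace open() with a connected socket but leave the command text unchanged.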


       

