Chapter 4: Protocol Software

Protocols define the common language used by communicating systems. They form the core of system functionality and can be defined at each layer of the OSI reference model. In a typical communications system, individual software modules implement these protocols. This chapter focuses on protocol implementation, including state machines, interfaces, and management information.

4.1 Protocol Implementation

Protocols can be defined via standard specifications from bodies such as the ITU-T, IETF, IEEE, ANSI, and so on. Protocols can also be proprietary, defined by a vendor for use in communicating with its own equipment. Proprietary protocols may be used when a vendor finds existing standards-based protocols insufficient for its application, or when it believes a proprietary protocol offers a competitive advantage. In either case, protocols need to be defined in a specification. There are several protocol specification languages and tools, such as the Specification and Description Language (SDL). Independent of the tool or language used, a protocol specification involves:

  • The architectural relationship of the communicating entities—for example, a master–slave mode or a peer-to-peer mode

  • A set of valid states for each of the communicating entities—for example, initializing, active, or disconnected

  • A set of messages called Protocol Data Units (PDUs) used by the communicating entities

  • Timers used by the communicating entities and their values

  • Actions to be taken on receipt of various messages and events

The communicating entities defined in a protocol specification assume various modes based on the nature of the communication. The specification defines which messages are sent by the communicating entities and in which mode the messages are valid. Master–slave communication, as in IBM’s Synchronous Data Link Control (SDLC) protocol, uses two modes—master and slave. Telecom and WAN equipment often have a user mode and a network mode. In a protocol like Frame Relay LMI (Local Management Interface), equipment located at the customer premises (termed Customer Premises Equipment, or CPE) plays the role of the user, while the Frame Relay switch interfacing to it operates in the network mode. Here, the user node queries the network equipment for the status of the network connections on the link through a Status Enquiry message. The network equipment responds with a Status message containing this information.

4.1.1 State Machines

Protocols can be either stateful or stateless. A stateful protocol depends on historical context: the current state of the protocol depends upon the previous state and the sequence of actions that caused the transition to the current state. TCP is an example of a stateful protocol. A stateless protocol does not need to maintain history. An example of a stateless implementation is IP forwarding, in which the forwarding operation is performed independently of previous actions or packets.

Stateful protocols use constructs called state machines (sometimes called Finite State Machines or FSMs) to specify the various states that the protocol can assume, which events are valid for those states, and the action to be taken on specific events. Consider a protocol having two states—Disconnected and Connected (see Figure 4.1). In the Disconnected state, an Initialization event enables the transition to the Connected state. Similarly, valid events in the Connected state are protocol messages and timer events. A Disable event causes the protocol to move from the Connected state to the Disconnected state. The states and transitions described in Figure 4.1 are very simple. A real protocol implementation has many more states and events.

Figure 4.1: A Simple Protocol State Machine.

The implementation of the simplified Connect/Disconnect state machine can be done with a switch and case statement, as shown in Listing 4.1.

Listing 4.1: A simple state machine implementation via a switch-case construct.

start example
switch (event) {
    case E1: /* Initialize */
        if (current_state == DISCONNECTED) {
            InitializeProtocol ();
            current_state = CONNECTED;
        }
        break;
    case E2: /* Protocol Messages */
        if (current_state == CONNECTED) {
            ProcessMessages ();
        }
        break;
    case E3: /* Timer Event(s) */
        if (current_state == CONNECTED) {
            ProcessTimers ();
        }
        break;
    case E4: /* Disconnect Event */
        if (current_state == CONNECTED) {
            ShutdownProtocol ();
            current_state = DISCONNECTED;
        }
        break;
    default:
        logError ("Invalid Event", current_state, event);
        break;
}
/* Perform other processing */
end example

The previous example is a simple way to implement a State Machine, but it is not very scalable. With several states and events, a switch and case statement would be extremely complex, becoming difficult to implement and maintain.

An alternate method is to use a State Event Table (SET). The concept is quite simple—we construct a matrix with M rows and N columns, in which N represents the number of states and M represents the number of events. Each column in the table represents a state, while each row represents an event that could occur in any of the states. Each entry, at the intersection of a state and an event, is a tuple—{Action, Next State}, as shown in Table 4.1. For example, the entry at the intersection of S1 and E1 specifies:

  • The action to be performed on the occurrence of the event E1 while in state S1.

  • The next state to transition to on completion of the action—note that it could be the same state, S1 itself.

Using the State Event Table shown in Table 4.1, a typical state machine access function would use the logic in Listing 4.2.

Listing 4.2: Logic for an access function.

start example
/* Entry for current state and event is SET[Event][CurrentState] */
Perform Action (SET[Event][CurrentState]);
CurrentState = Next State (SET[Event][CurrentState]);
end example
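For concreteness, the table and its access function might be realized in C as sketched below. The type and routine names (set_entry_t, set_dispatch, action_fn_t) and the table dimensions are illustrative only; an actual implementation would use its own types and sizing.

start example
#include <stddef.h>

#define NUM_STATES  4   /* S1..S4 */
#define NUM_EVENTS  5   /* E1..E5 */

typedef void (*action_fn_t)(void *ctx);   /* protocol-specific context */

typedef struct {
    action_fn_t action;      /* routine to invoke for this state/event    */
    int         next_state;  /* state to enter after the action completes */
} set_entry_t;

/* SET[event][state] holds the {Action, Next State} tuple */
static set_entry_t SET[NUM_EVENTS][NUM_STATES];

static int current_state;

/* State machine access function: look up the entry, perform the
 * action, then transition to the next state. */
void set_dispatch(int event, void *ctx)
{
    const set_entry_t *entry = &SET[event][current_state];

    if (entry->action != NULL)
        entry->action(ctx);

    current_state = entry->next_state;
}
end example

The dispatch routine is the single entry point into the state machine; every event, however generated, passes through it.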

Typical events are specific, pre-defined types of messages, timer events, maximum retransmission attempts, a port going up or down, and error conditions like invalid messages, or user intervention conditions like protocol enabling and disabling.

States depend upon the type of protocol module implemented. Certain events will not be valid in some states; in those cases, the SET entry action would indicate an error. Action routines can be shared, since two entries may specify the same action.

Table 4.2 depicts the SET for the simple state diagram provided in Figure 4.1.

Table 4.1: State event table. Each cell holds the {Action, Next State} tuple for the corresponding event/state pair.

              State S1                State S2                State S3                State S4
  Event E1    {Action, Next State}    {Action, Next State}    {Action, Next State}    {Action, Next State}
  Event E2    {Action, Next State}    {Action, Next State}    {Action, Next State}    {Action, Next State}
  Event E3    {Action, Next State}    {Action, Next State}    {Action, Next State}    {Action, Next State}
  Event E4    {Action, Next State}    {Action, Next State}    {Action, Next State}    {Action, Next State}
  Event E5    {Action, Next State}    {Action, Next State}    {Action, Next State}    {Action, Next State}

Table 4.2: SET for the simple state machine in Figure 4.1.

                                State S1 (Disconnected)                  State S2 (Connected)
  Event E1 (Initialize)         {SendStartupMessage, Start Timers}, S2   {LogError}, S2
  Event E2 (Protocol Messages)  {LogError}, S1                           {ProcessMessages}, S2
  Event E3 (Timer Events)       {LogError}, S1                           {ProcessTimers}, S2
  Event E4 (Disconnect)         {LogError}, S1                           {SendShutdownMessage, Stop Timers}, S1
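Continuing the earlier C sketch (the set_entry_t type and set_dispatch routine), Table 4.2 could be encoded as a statically initialized array, with the dimensions reduced to the four events and two states of this machine. The enum values and action routine bodies below are illustrative placeholders.

start example
/* States and events for the machine of Figure 4.1 / Table 4.2 */
enum { S1_DISCONNECTED = 0, S2_CONNECTED };
enum { E1_INITIALIZE = 0, E2_PROTO_MSG, E3_TIMER, E4_DISCONNECT };

/* Action routines named after Table 4.2; bodies are placeholders */
static void StartupAction(void *ctx)   { /* SendStartupMessage(); start timers */ }
static void ShutdownAction(void *ctx)  { /* SendShutdownMessage(); stop timers */ }
static void ProcessMessages(void *ctx) { /* handle received protocol PDUs      */ }
static void ProcessTimers(void *ctx)   { /* handle expired protocol timers     */ }
static void LogError(void *ctx)        { /* record the unexpected event        */ }

/* simple_set[event][state] mirrors Table 4.2 row by row */
static set_entry_t simple_set[4][2] = {
    /* E1: Initialize        */ { { StartupAction,   S2_CONNECTED    },
                                  { LogError,        S2_CONNECTED    } },
    /* E2: Protocol messages */ { { LogError,        S1_DISCONNECTED },
                                  { ProcessMessages, S2_CONNECTED    } },
    /* E3: Timer events      */ { { LogError,        S1_DISCONNECTED },
                                  { ProcessTimers,   S2_CONNECTED    } },
    /* E4: Disconnect        */ { { LogError,        S1_DISCONNECTED },
                                  { ShutdownAction,  S1_DISCONNECTED } },
};
end example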

Actions

If an event is not valid in a state, the protocol implementation can perform one of two actions—a No-Op (i.e., no operation or do nothing) or a call to an error routine (as in the table above). For example, protocol messages may be received from an external entity even when the protocol is in the disconnected state, which may or may not be considered an error. Good defensive programming identifies all abnormal behavior up front, before the system is deployed. In that context, it is a good idea to log errors when unexpected events occur.

Several protocol specifications identify states and events, so it is relatively easy for the communications designer to construct a SET from the specification. Action routines are invoked on specific triggers such as timer expiration and/or received messages. The action routine can cause the construction of the relevant message, after which it can schedule the message for transmission.

Using Predicates

In addition to the two fields in the SET entry, there could be a third field, commonly called a predicate, which can be used as a parameter to the action and its value used to decide among various alternatives within the action routine. With predicates, a third entry can be added to {Action, Next State} so that it becomes {Action, Next State, Predicate}. The predicate may also be altered by the state machine. Using the State Event Table (SET) of Table 4.1, a typical state machine access function with a predicate would use the logic in Listing 4.3.

Listing 4.3: Logic for an access function with a predicate.

start example
/* Entry for current state and event is SET[Event][CurrentState] */
Perform Action (SET[Event][CurrentState],
                SET[Event][CurrentState].Predicate);
CurrentState = Next State (SET[Event][CurrentState]);
end example
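A minimal way to carry the predicate, again building on the earlier C sketch (which supplies current_state), is to widen the table entry and hand the stored value to the action routine; the names below are illustrative.

start example
typedef void (*pred_action_fn_t)(void *ctx, int predicate);

typedef struct {
    pred_action_fn_t action;      /* routine to invoke                         */
    int              next_state;  /* state to enter after the action completes */
    int              predicate;   /* parameter interpreted (and possibly       */
                                  /* updated) by the action routine            */
} set_entry_pred_t;

void set_dispatch_pred(set_entry_pred_t *entry, void *ctx)
{
    if (entry->action != NULL)
        entry->action(ctx, entry->predicate);  /* predicate passed to action */

    current_state = entry->next_state;
}
end example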

State Machine Processing

In Chapter 2, we outlined the format of the main loop for a typical communications task. With state machines, it would now look like Listing 4.4.

Listing 4.4: Main loop for a typical communications task with state machines.

start example
while (1) {
    Wait for any of the events;      /* break out of the hard wait loop */
    if (Message Queuing event)
        ProcessMessageQueue ();
    if (Timer event)
        ProcessTimers ();
    Perform Housekeeping functions;  /* e.g., release transmit buffers */
}

ProcessMessageQueue ()
{
    Determine type of message;
    Classify the message and set the event variable;
    Pass event through the SET;      /* state machine access function */
}

ProcessTimers ()
{
    Determine attributes of expired timers;
    Classify the timer type and set the event variable;
    Pass event through the SET;      /* state machine access function */
}
end example

The pseudocode above for the protocol task provides the basis for event determination: messages and timeouts received by the protocol task are translated into events for the state machine.
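A sketch of that translation in C might look like the following. It reuses set_dispatch and the event names from the earlier sketches; the message types and their mapping to events are hypothetical and protocol-specific.

start example
/* Hypothetical message types for the Connect/Disconnect protocol */
enum { MSG_STARTUP, MSG_STATUS_ENQUIRY, MSG_STATUS, MSG_SHUTDOWN };

static int classify_message(int msg_type)
{
    switch (msg_type) {
    case MSG_STARTUP:        return E1_INITIALIZE;
    case MSG_STATUS_ENQUIRY:
    case MSG_STATUS:         return E2_PROTO_MSG;
    case MSG_SHUTDOWN:       return E4_DISCONNECT;
    default:                 return -1;            /* unknown message type */
    }
}

void ProcessMessageQueue(int msg_type, void *pdu)
{
    int event = classify_message(msg_type);

    if (event >= 0)
        set_dispatch(event, pdu);  /* state machine access function */
    /* else: log the invalid message and release its buffer */
}
end example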

Multiple State Machines

Protocols do not need to be implemented with a single state machine—in fact, there are often multiple SETs in a protocol implementation. There could be a separate received message state machine, so the only events are incoming messages. There could be a separate state machine for timers, a separate one for port events, and so on. Often, the protocol specification indicates this separation. The OSPF specification in RFC 2328 from the IETF, for example, specifies a neighbor state machine and an interface state machine.

Each of the state machines can be implemented with its own SET. The advantage of this separation is that each SET needs to specify only its relevant set of events and appropriate actions for those events. This modular and distributed approach to SET design contributes to a cleaner system implementation.

SET versus Switch–Case Constructs

The SET implementation is easier to understand than the switch–case statement, since it replaces code complexity with a matrix data structure and ensures that all states and events are considered up front. In an implementation with only a few valid events in some states, the SET will have a number of entries in which a call is made to an error routine or a No-op routine and there is no state change. In these cases, the matrix will typically look like a “sparse matrix,” with just a few valid entries. If this had been implemented with a switch–case construct, all invalid events would have been caught with the default case.

The system designer will need to choose between the two approaches for state machine implementation using the constraints for the system being designed. If the state machine is simple and the SET is a sparse matrix, use a switch–case construct. Otherwise, implement the SET with the events and action routines.

4.1.2 Protocol Data Unit (PDU) Processing

Protocol Data Units (PDUs) are generated and received by the protocol implementation. Received PDUs are decoded and classified according to the type of PDU, usually via a type field in the PDU header. Often, it is the lower layer which makes this determination. Consider a TCP packet encapsulated inside an IP packet in an implementation with separate IP and TCP tasks. A TCP packet uses protocol type 6 in the IP header, so the IP task can pass the PDUs with this protocol type to the TCP task. Subsequently, the TCP task looks only at the TCP header for its own processing.
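The demultiplexing step might look like the sketch below. The header layout is trimmed to what is needed here, and the hand-off to the TCP or UDP task is left as comments because it depends on the IPC mechanism chosen; the names are illustrative.

start example
#include <stdint.h>

#define PROTO_TCP  6    /* protocol field value for TCP */
#define PROTO_UDP 17    /* protocol field value for UDP */

struct ipv4_hdr {                /* minimal IPv4 header for demultiplexing */
    uint8_t  ver_ihl;
    uint8_t  tos;
    uint16_t total_len;
    uint16_t id;
    uint16_t frag_off;
    uint8_t  ttl;
    uint8_t  protocol;           /* 6 = TCP, 17 = UDP, ...                 */
    uint16_t checksum;
    uint32_t src_addr;
    uint32_t dst_addr;
};

/* Called by the IP task after its own header processing */
void ip_demux(const struct ipv4_hdr *hdr, void *pdu)
{
    switch (hdr->protocol) {
    case PROTO_TCP: /* queue the PDU to the TCP task */ break;
    case PROTO_UDP: /* queue the PDU to the UDP task */ break;
    default:        /* unrecognized protocol: drop   */ break;
    }
}
end example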

PDU Preprocessing

A PDU is typically preprocessed before it is passed to the SET. Preprocessing performs actions such as packet syntax verification and checksum validation. In the TCP example, the checksum is a ones’ complement checksum calculated over the TCP header and payload (along with a pseudo-header derived from the IP header). The sender of the PDU calculates and inserts this checksum in the TCP packet.

The receiver TCP implementation recalculates the checksum based on the received PDU and compares it with the checksum inserted by the sender. If the values do not match, the packet is dropped. Checksum calculation is CPU and memory intensive, since it accesses each byte of each packet. Since high I/O throughput is often a real-time requirement, special I/O processors may be added to the hardware configuration to perform TCP checksum calculations.
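The checksum algorithm itself is the standard Internet ones’ complement sum; a straightforward, non-optimized C version is sketched below. The pseudo-header portion of the TCP checksum is omitted here for brevity.

start example
#include <stdint.h>
#include <stddef.h>

/* Ones' complement sum of 16-bit words, as used by IP, TCP, and UDP. */
uint16_t ones_complement_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;

    while (len > 1) {                     /* sum 16-bit words */
        sum += (uint32_t)(data[0] << 8 | data[1]);
        data += 2;
        len  -= 2;
    }
    if (len == 1)                         /* pad an odd trailing byte */
        sum += (uint32_t)(data[0] << 8);

    while (sum >> 16)                     /* fold carries back in */
        sum = (sum & 0xFFFF) + (sum >> 16);

    return (uint16_t)~sum;                /* ones' complement of the sum */
}
end example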

Events to State Machine

Using preprocessing, the packet type is determined and the appropriate event passed to the SET. Normally, the PDU is not retained once processing is complete. An exception is a protocol like OSPF, which retains a copy of each Link State Advertisement (LSA) PDU in its LSA database. This is a requirement of the protocol—the OSPF implementation may need to resend one or more received LSAs under specific conditions, and it needs a copy of the PDU for this purpose. In this situation, the OSPF protocol copies the PDU into its own buffers. Alternately, it can retain the PDU buffer and add it to a linked list. This approach avoids copying the PDU but ties the protocol to the linked buffer management scheme, which may not always be an efficient data structure for the protocol to maintain.

PDU Transmission

PDUs are transmitted by the action routines of the SET. For example, timer expiration can cause the appropriate SET action routine to generate a PDU. Similarly, a received message such as a Frame Relay LMI Status Enquiry (another event to the protocol SET) can cause the generation and transmission of an LMI Status response message. PDU construction is done by the protocol, which allocates buffers, fills in the protocol header fields and contents, calculates the checksum, and queues the PDU for transmission or passes it to the lower layer.
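An action routine that builds and queues a PDU might follow the outline below. The buffer and queueing primitives (buf_alloc, tx_enqueue) are placeholders for whatever the system’s buffer management and driver interfaces actually provide.

start example
#include <stddef.h>
#include <stdint.h>

extern uint8_t *buf_alloc(size_t size);                /* hypothetical buffer call */
extern void     tx_enqueue(uint8_t *pdu, size_t len);  /* hand off to lower layer  */

/* SET action routine, e.g. invoked when a Status Enquiry is received */
void SendStatusResponse(void *ctx)
{
    size_t   len = 64;                 /* illustrative PDU size */
    uint8_t *pdu = buf_alloc(len);

    if (pdu == NULL)
        return;                        /* count and report the allocation failure */

    /* Fill in the protocol header fields and message contents, then
     * compute the checksum over the assembled PDU. */

    tx_enqueue(pdu, len);              /* queue for transmission */
}
end example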

4.1.3 Protocol Interfaces

Protocol tasks do not exist or execute in isolation. They need to interface and interact with other components in the system environment. These include:

  • Real Time Operating System

  • Memory Management

  • Buffer Management

  • Timer Management

  • Event Management

  • Inter-Process Communication (IPC)

  • Driver Components

  • Configuration and Control

The RTOS functions as the base platform on which the protocols execute. It is responsible for initializing the protocol task, establishing its context, scheduling it based on priority and readiness, and providing services via system call routines. Each task requires its own stack, which is usually specified at the time of task creation. The RTOS allocates memory for the stack and sets the stack pointer in the protocol task context to point to this allocated area.

Buffer management, timer management, and IPC functions may be provided as libraries within the RTOS, but for this discussion they are treated as separate functional entities.

Memory Management

Memory management functions are required for allocating and releasing memory for individual applications by maintaining the memory blocks in the system heap. Calls such as malloc and free are examples of common memory management functions.

Unlike desktop systems, real-time systems can have multiple memory partitions. Packet buffers can be maintained in DRAM while tables could be maintained in SRAM, and each of these are viewed as separate partitions with their own memory management functions (see Figure 4.2). In the VxWorks™ RTOS, partitions can be created with the memPartCreate call. Individual blocks can be created out of these partitions with the routine memPartAlloc and released with memPartFree. The system calls malloc and free are actually special cases of memPartAlloc and memPartFree acting on the system partition, which is the memory partition belonging to the RTOS itself.
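As an illustration, packet buffers could be carved out of a dedicated SRAM partition as sketched below. The partition calls are from the VxWorks memPartLib API mentioned above; the header names, base address, and size are assumptions and will differ on a real target.

start example
#include <vxWorks.h>
#include <memPartLib.h>

#define SRAM_POOL_BASE ((char *)0x60000000)  /* assumed SRAM location  */
#define SRAM_POOL_SIZE (256 * 1024)          /* assumed partition size */

static PART_ID sramPartId;

void sramPartInit(void)
{
    /* Create a partition over the SRAM region */
    sramPartId = memPartCreate(SRAM_POOL_BASE, SRAM_POOL_SIZE);
}

void *sramAlloc(unsigned nBytes)
{
    return memPartAlloc(sramPartId, nBytes);   /* per-partition malloc */
}

void sramFree(void *pBlock)
{
    memPartFree(sramPartId, (char *)pBlock);   /* per-partition free   */
}
end example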

Buffer Management

Buffer management includes initialization, allocation, maintenance, and release of buffers used for receiving and transmitting frames to or from physical ports. There can be multiple buffer pools, each consisting of buffers of a specific size.

Figure 4.2: Multiple memory partitions in a communications system.

Memory for buffer pool(s) can be allocated using memory management functions. Protocol tasks use the buffer management interface functions to obtain, process, and release buffers needed for their operation. Often, buffer management libraries are provided along with the RTOS—like the mbuf and zbuf libraries available in VxWorks. In some systems, the buffer management library has been developed internally by the software team. This library is considered an "infrastructure" library which can be utilized by all tasks—protocol and otherwise.

Timer Management

Timer management includes the initialization, allocation, management, and use of timers. These functions are provided by a timer management library. As with buffer management, the timer library can either be provided as part of the RTOS or independently developed. Protocol tasks make calls to the timer management library to start and stop timers and are signaled by the timer management subsystem by an event when a timer expires. As indicated in Section 4.1.1, the tasks can use timer expiration as events to the appropriate SET.
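The interface to such a timer library often reduces to a handful of calls of the following shape; the names and the exact expiry notification mechanism are illustrative.

start example
typedef int timer_id_t;

/* Start a timer; on expiry the timer subsystem posts 'event' to the
 * task identified by 'task_id', whose main loop feeds it to the SET. */
timer_id_t timer_start(unsigned int msec, int task_id, int event);

/* Stop a running timer, e.g. when the awaited PDU arrives in time. */
void timer_stop(timer_id_t id);

/* Typical use inside a protocol action routine:
 *   keepalive_tmr = timer_start(10000, proto_task_id, E3_TIMER);
 */
end example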

Buffer and timer management are discussed in greater detail in Chapter 6.

Event Management

The main loop of the protocol task waits on events. Event management involves an event library, which is used to signal occurrences such as timer expiration, buffer enqueuing, and so on. The event library also ensures that a task waiting on events can selectively determine which signals it will receive. This is usually accomplished using a bit mask that indicates the events that will signal the task. A variation of this is the select call used in several RTOSes to wait on a specific set of socket or file descriptors.

Main loop processing of events has the advantage of a single entry point for all events. The main loop can pass control to the appropriate SET based on the type of event. Alternately, if we permit “lateral” calls into the SET, as can happen from an ISR, there would be two issues. The first is that the action routine called from the SET would take a much longer time than is permissible in an ISR. Second, since the SET is a global structure, it would need to use some form of mutual exclusion since there would then be two entry points—one from the main loop and one from an ISR. The preferred approach is for the ISR to send an event to the protocol task, so that event processing and subsequent action routine calls take place only in the main loop.
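The preferred pattern can be sketched as follows; the event-posting call stands in for whatever primitive the RTOS provides, and the task identifier and event bit are assumptions.

start example
#define PROTO_TASK_ID  1        /* assumed task identifier */
#define EVT_RX_READY   0x01     /* assumed event bit       */

extern void rtos_event_post(int task_id, unsigned int event_bits);

void port_rx_isr(void)
{
    /* Acknowledge the hardware interrupt and capture minimal state here. */

    /* Defer all protocol work: just signal the task and return quickly.
     * The SET is entered only from the task's main loop. */
    rtos_event_post(PROTO_TASK_ID, EVT_RX_READY);
}
end example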

Inter-Process Communication (IPC)

Tasks use multiple means of communicating with other tasks. These communication mechanisms may be provided by the RTOS or via separate libraries. The mechanisms include:

  • Message Queues

  • Semaphores for mutual exclusion

  • Shared Memory

  • Mailboxes (a combination of a message queue and a semaphore)

  • Signals/Events

These mechanisms are discussed in the literature and will not be detailed here. Selection of one or more forms of IPC depends upon the type of interchange and its availability in the RTOS. Most RTOSes offer these mechanisms—the application developer can choose the appropriate mechanism depending upon the application.

Driver Interfaces

Tasks interface with drivers at the lowest level of the OSI model. For reusability and modularity, a common method for implementing a driver is to segment the driver into two layers:

  • An adaptation layer providing a uniform interface to higher layer protocols

  • A device-specific layer

The advantage of this layering is that the higher layer tasks need not change with newer versions of the driver. The driver’s adaptation layer will provide the same interface to the higher layer/protocol tasks. The device-specific layer and its interface to the adaptation layer will be the only modules which need to be modified. If the next generation of the device and the device driver support more ports, the interfacing tasks will see no difference as long as they deal only with the adaptation layer of the driver.
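One common way to realize this split in C is a table of function pointers exported by the adaptation layer and filled in by the device-specific layer; the operation names below are illustrative.

start example
#include <stddef.h>

struct drv_ops {
    int  (*open)(int port);
    int  (*close)(int port);
    int  (*send)(int port, void *buf, size_t len);
    void (*set_rx_callback)(int port,
                            void (*cb)(int port, void *buf, size_t len));
    int  (*get_stats)(int port, void *stats);
};

/* The device-specific layer registers its drv_ops instance; higher layer
 * tasks only ever call through the returned structure, so a new device
 * (or one with more ports) does not change their code. */
const struct drv_ops *drv_attach(const char *device_name);
end example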

4.1.4 Configuration and Control

A protocol task communicates with an external manager for configuration, control, status, and statistics. However, the protocol does not talk to the manager directly. It typically interfaces to an agent resident on the same embedded communications device. This agent acts on behalf of the external manager (see Figure 4.3) and translates the requests and responses between the protocols and the manager. The manager-to-agent communication is typically through a standard protocol like Simple Network Management Protocol (SNMP), CORBA or TL1. This permits the individual protocols to stay independent of the management protocol and mechanism.

Figure 4.3: A manager–agent model.

A special case of an agent–manager interaction is the Command Line Interface (CLI). Using the manager–agent model, a CLI can be considered an embedded agent with the user being the manager. Almost every embedded communications device will have a CLI irrespective of whether it has an SNMP agent. The CLI is typically accessible through a serial (console) port or by a remote mechanism such as telnet (when the device implements an end node TCP/IP stack). The user inputs are translated by the CLI task to requests to the individual protocol tasks—very similar to an SNMP agent.

The agent talks to its manager through an external interface, but uses an internal interface to talk to individual protocol tasks. The set of variables to be configured and the status and statistics to be observed are typically defined in a protocol Management Information Base (MIB). The MIB is a database or repository of the management information. Protocol MIBs are usually specified by standards bodies like the IETF.
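Internally, the agent-to-protocol interface often amounts to a small get/set request structure carried over the system’s IPC; a hypothetical form is sketched below.

start example
typedef enum { MGMT_GET, MGMT_SET } mgmt_op_t;

struct mgmt_request {
    mgmt_op_t op;          /* get or set                          */
    int       var_id;      /* MIB variable being addressed        */
    int       port;        /* interface index, where applicable   */
    long      value;       /* value to set, or filled in on a get */
};

/* The agent (SNMP, CLI, ...) builds the request from the manager's input
 * and sends it to the owning protocol task, typically via a message
 * queue; the protocol task replies with status and any returned value. */
int mgmt_send_request(int proto_task_id, struct mgmt_request *req);
end example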

start sidebar
Management Types

There are four common types of management found in communications architectures:

  1. Device or Element Management

  2. Network Management

  3. Service Management

  4. Business Management

The above list details a hierarchy of management functions required in a communications system. Device management feeds into network management, which feeds into service management. A simple way to understand this is with a DSL service example. The service provider can install a DSL modem at the subscriber’s location and use SNMP to manage it from a remote location (device management). The information from all modems on the network tells the manager about the status of the network (network management). Service management is one more level of aggregation which helps control the DSL service (downtime, traffic measurement, peering details). Business management determines if the service is making money.

end sidebar

Protocol Management

The following provides a list of some of the operations used in configuring a protocol:

  • Enabling and disabling the protocol

  • Enabling and disabling the protocol on a specific port

  • Addressing a specific interface (e.g., the IP address on a port)

  • Setting maximum frame size

  • Managing protocol message timeouts

  • Timing out peer entities

  • Authenticating security information (e.g., passwords, security keys)

  • Managing traffic parameters

  • Specifying encapsulation information

The set of configuration information is quite extensive, but many parameters have default values specified in the MIBs. Some parameters such as IP addresses do not have default values and need to be set manually. The set of parameters without default values and which require manual configuration are called basic parameters in our discussion. These need to be set before the protocol can function. The same consideration applies to individual ports—before a protocol can be enabled on an interface, a set of basic parameters for an individual port needs to be configured. For example, before IP can be enabled on an Ethernet port, the port’s IP address needs to be set. While designing protocol software, be sure to identify basic parameters up front—both at the global protocol level and at the port level.

start sidebar
Debugging Protocols

The ability to enable and disable protocols is important for debugging a communications system before it goes into service. It is useful to isolate the source of error conditions on a network by selectively disabling protocols. A network administrator may also want to turn off a specific protocol due to a change in the composition of the network. For example, a network with both IP and IPX end stations can transition to become an IP-only network. Instead of replacing the routing equipment on the network, the administrator can simply disable IPX forwarding and the related IPX RIP (Routing Information Protocol) and IPX SAP (Service Advertisement Protocol) functions on the router.

end sidebar

Once basic parameters are set and the protocol has been enabled, the manager can configure any of the parameters used by the protocol. The manager can also view status of the parameters and tables such as connection tables as well as statistics information.

4.1.5 System Startup

When a communications system starts up, it follows a sequence of steps involving memory allocation, initialization of the various system facilities, and protocols. These steps are performed after the hardware diagnostics are completed and the RTOS is initialized. The latter is required so that the communications software can utilize the OS facilities for its initialization.

The following is the sequence of operations used to initialize a communications system with multiple protocols:

  • Initialize memory area and allocate task/heap in various partitions

  • Initialize the buffer and timer management modules

  • Initialize the driver tasks/modules

  • Initialize and start the individual protocol tasks based on the specified priority

  • Pass control to the RTOS (which, in turn, passes control to the highest priority task)

Note that there is no shutdown or termination step above, as is typical of embedded communications systems. There is rarely a reason to shut down the system. Unlike a desktop system, embedded communications devices are in the “always on” mode. Where required, a system reset is done to load a new configuration or image, but this is not the same as a shutdown. This philosophy of “always on” is also seen in the individual protocol tasks, where the main loop is an infinite loop. Tasks wait on events and continue processing without having to break out of the loop for any type of event.

Protocol Initialization

When the protocol task obtains control, it performs initialization according to the following steps (a code sketch follows the list):

  1. Initialize sizing parameters for the tables

  2. Allocate memory for dynamic data structures and state table(s)

  3. Initialize state table variables

  4. Initialize buffer and timer interfaces

  5. Read configuration from local source and initialize configuration

  6. Initialize lower and higher layer interfaces—including registration with higher and/or lower layers

  7. Create and spawn off additional protocol tasks, if required

  8. Wait on infinite loop
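A protocol task entry point following these steps might be organized as in the sketch below; every routine name is a placeholder for functionality the implementation would provide, and error handling is omitted for brevity.

start example
void proto_task_entry(void)
{
    read_sizing_parameters();            /* 1. table sizes from EEPROM/flash */
    allocate_tables_and_set();           /* 2. dynamic structures, SET       */
    init_state_variables();              /* 3. initial state, counters       */
    init_buffer_and_timer_interfaces();  /* 4. buffer pools, timer library   */
    restore_configuration();             /* 5. from local storage or host    */
    register_with_adjacent_layers();     /* 6. higher/lower layer interfaces */
    spawn_additional_tasks_if_needed();  /* 7. e.g., per-protocol subtasks   */

    for (;;) {                           /* 8. infinite event loop           */
        /* wait for events; process messages, timers, housekeeping */
    }
}
end example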

Setting Sizing Parameters at Startup

Setting sizing parameters at startup is preferred for protocol subsystems over using compile-time constants. A compile-time constant requires recompilation of the source code when moving to a target with a larger (or smaller) amount of memory, so variables are used instead.

These variables can be set by a startup routine, which can read the values from an external entity such as an EEPROM or flash. The sizing variables are then used by the protocol task(s) for allocating the various tables required for operation.

Initialization

The SET initialization is then performed, after which the buffer and timer management interfaces are initialized. This could involve interfacing to the buffer management module for allocating the buffer pool(s) required for the protocol operation. Timers are started as required by interfacing to the timer management module.

Restoring Configuration

Following this is the restoration of configuration—this could be done from local non-volatile storage such as flash or from a remote host. This is a critical step in the operation of communications devices in general and protocol tasks in particular. Most protocols require an extensive amount of configuration—so when a system restarts due to maintenance, upgrades, or bug fixes, the network manager does not have to reconfigure all the parameters. This is done by saving the system configuration, including protocol configurations, on a local or remote device and having the new image read this configuration at startup.

Application Interface and Task Initialization

The next steps in protocol initialization deal with the registration of modules and callback routines and the initialization of interfaces at the higher and lower layers. Subsequently, more protocol tasks may also need to be created. In a router, there could be a root task for the TCP/IP subsystem responsible for all the protocols in the suite. The root task would initialize and spawn off the additional tasks (IP, TCP, UDP). Each of these tasks can perform its own initialization using the sequence outlined above.

4.1.6 Protocol Upgrades

Communications equipment is critical to the functioning of the network. This means that it should not be taken out of commission during upgrades, although this is not always possible. In certain environments like the core of the Internet, routers cannot go down for system upgrades. There are several schemes for handling this, including redundancy, protocol task isolation, and control and data plane separation. In particular, protocol task isolation is becoming common in new designs.

Instead of a monolithic design, in which all protocol tasks are linked in with the RTOS and provided as a single image, some vendors are using a user mode—kernel mode design to isolate tasks. The key enablers for this approach are:

  • Memory protection between tasks or processes

  • Ability to start and stop tasks or processes

  • Plane separation between forwarding and routing

Consider a case where an IS–IS routing task runs as a user mode process in UNIX with the actual IP forwarding done in the kernel (with hardware support, as appropriate). If the IS–IS task needs to be upgraded, we can configure the current IS–IS task to save its configuration in a common area (say, as a file on the same UNIX system), kill the task, and start the new (upgraded) IS–IS task. The upgraded task picks up the configuration from the file and reestablishes its connection with the IP forwarding code in the kernel. Forwarding continues uninterrupted, and so do the other protocol tasks in the system.

UNIX, Linux, and BSD variants are being used as the operating systems of choice in some of the new communications equipment. In multi-board configurations, several equipment vendors run modified versions of Linux or BSD UNIX on the control blade. These modifications are usually in the area of process scheduling and code optimization in the kernel.


