26.2 BSD Sockets

   


The Linux kernel offers exactly one socket-related system call, and all socket calls made by applications are mapped to it. The function asmlinkage long sys_socketcall(int call, unsigned long *args) is defined in net/socket.c. Its system-call number is assigned in include/asm/unistd.h (#define __NR_socketcall 102), and the function is entered in the system-call table in arch/i386/kernel/entry.S. The call parameter selects the socket function to be executed. The admissible values are defined in include/linux/net.h: SYS_SOCKET, SYS_BIND, SYS_CONNECT, SYS_LISTEN, SYS_ACCEPT, SYS_GETSOCKNAME, SYS_GETPEERNAME, SYS_SOCKETPAIR, SYS_SEND, SYS_RECV, SYS_SENDTO, SYS_RECVFROM, SYS_SHUTDOWN, SYS_SETSOCKOPT, SYS_GETSOCKOPT, SYS_SENDMSG, SYS_RECVMSG. In user space, libraries map each sys_socketcall() invocation with a specific call value to a separate function (e.g., sys_socketcall(SYS_SOCKET, ...) becomes the call socket(...)).

sys_socketcall()

net/socket.c


The function to be called is selected in the kernel by a switch statement in sys_socketcall(). Before that, copy_from_user() copies the function's arguments from user space into a vector, unsigned long a[6]:

    ...
    if (copy_from_user(a, args, nargs[call]))
            return -EFAULT;
    a0 = a[0];
    a1 = a[1];
    switch (call)
    {
            case SYS_SOCKET:
                    err = sys_socket(a0, a1, a[2]);
                    break;
            case SYS_BIND:
                    err = sys_bind(a0, (struct sockaddr *)a1, a[2]);
                    break;
            case SYS_CONNECT:
                    err = sys_connect(a0, (struct sockaddr *)a1, a[2]);
                    break;
            case SYS_LISTEN:
                    err = sys_listen(a0, a1);
                    break;
            ...
    }
    ...
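The dispatch described above can be tried out in user space. The following is only a sketch of the idea, not the kernel code: all model_ and MODEL_ names are hypothetical, memcpy() stands in for copy_from_user(), and the handler functions merely encode their arguments so that the dispatch can be observed.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Call numbers for this model; the kernel's own are in include/linux/net.h. */
enum { MODEL_SYS_SOCKET = 1, MODEL_SYS_BIND, MODEL_SYS_CONNECT, MODEL_SYS_LISTEN };

/* Number of argument bytes each call expects, indexed by call number. */
#define AL(x) ((x) * sizeof(unsigned long))
static const size_t model_nargs[] = { AL(0), AL(3), AL(3), AL(3), AL(2) };

/* Hypothetical stand-ins for sys_socket() and sys_listen(). */
static long model_socket(unsigned long family, unsigned long type,
                         unsigned long protocol)
{
    return (long)(family * 100 + type * 10 + protocol);
}

static long model_listen(unsigned long fd, unsigned long backlog)
{
    return (long)(fd + backlog);
}

/* One entry point for all socket functions, selected by call. */
long model_socketcall(int call, const unsigned long *args)
{
    unsigned long a[6];

    if (call < MODEL_SYS_SOCKET || call > MODEL_SYS_LISTEN)
        return -EINVAL;
    if (args == NULL)
        return -EFAULT;

    /* The kernel uses copy_from_user() here. */
    memcpy(a, args, model_nargs[call]);

    switch (call) {
    case MODEL_SYS_SOCKET:
        return model_socket(a[0], a[1], a[2]);
    case MODEL_SYS_LISTEN:
        return model_listen(a[0], a[1]);
    default:
        return -ENOSYS;   /* bind and connect are left out of this model */
    }
}

/* What a library wrapper for socket() does: pack the arguments, dispatch. */
long model_user_socket(int family, int type, int protocol)
{
    unsigned long args[3] = { (unsigned long)family, (unsigned long)type,
                              (unsigned long)protocol };
    return model_socketcall(MODEL_SYS_SOCKET, args);
}
```

Calling model_user_socket(2, 1, 0) packs three values into an array and reaches model_socket() through the switch, which mirrors the path from socket() to sys_socket().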

The most important structure within the BSD socket support is struct socket. It is defined in include/linux/net.h:

    struct socket
    {
            socket_state            state;
            unsigned long           flags;
            struct proto_ops        *ops;
            struct inode            *inode;
            struct fasync_struct    *fasync_list; /* Asynchronous wakeup list */
            struct file             *file;        /* File back pointer for gc */
            struct sock             *sk;
            wait_queue_head_t       wait;
            short                   type;
            unsigned char           passcred;
    };

This structure is slightly reduced, compared to that in earlier kernel versions. The socket state stored in state can take the following values (include/linux/net.h): SS_FREE (not busy), SS_UNCONNECTED (not connected), SS_CONNECTING (currently being connected), SS_CONNECTED (connected), SS_DISCONNECTING (currently being disconnected). The flags are required to synchronize accesses. After initialization, the ops pointer references the operations of the attached protocol (e.g., TCP or UDP). Just as there is an inode for each file in Linux, an inode is assigned to each BSD socket. The file pointer references the file structure that is connected to the socket and can also be used to address it. Processes waiting for asynchronous notification of events on this file can be reached over fasync_list.

The matching sock structure can be reached via the sk pointer. However, this structure is initialized by the protocol-specific sockets underneath the BSD sockets (e.g., PF_INET sockets), which also attach it to this pointer. The wait entry serves to implement synchronous (blocking) receipt. The type field stores the second parameter, of the same name, of the socket() call in user space. The admissible values are defined in include/asm/socket.h: SOCK_STREAM, SOCK_DGRAM, SOCK_RAW, SOCK_RDM, SOCK_SEQPACKET, and SOCK_PACKET. (The last should no longer be used.)
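From user space, the stored socket type can be read back with the standard SO_TYPE socket option. The sketch below assumes a Linux or other POSIX system; the helper name stored_socket_type() is made up for illustration, and AF_UNIX is used only because it needs no network configuration.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Create a socket of the given type and ask the kernel which type it
   recorded for it. Returns that type, or -1 on error. */
int stored_socket_type(int type)
{
    int fd = socket(AF_UNIX, type, 0);
    if (fd < 0)
        return -1;

    int stored = -1;
    socklen_t len = sizeof(stored);
    if (getsockopt(fd, SOL_SOCKET, SO_TYPE, &stored, &len) < 0)
        stored = -1;

    close(fd);
    return stored;
}
```

For a socket created with SOCK_DGRAM, the function is expected to return SOCK_DGRAM again, that is, the value passed as the second parameter of socket().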

Now let's see how all of this works in the Linux kernel when socket() is invoked by an application. As was mentioned, sys_socketcall() in net/socket.c handles this call by invoking the function sys_socket().

sys_socket()

net/socket.c


This function initializes the socket structure and allocates an inode and a file descriptor by calling the functions sock_create() and sock_map_fd().

sock_create()

net/socket.c


The first step in this function checks whether the protocol family specified in the family parameter is available; if it is not, an attempt may be made to load the corresponding module. Subsequently, the type field of the socket structure is set, and two additional functions are invoked: sock_alloc(), to provide a socket, and net_families[family]->create(). The function sock_alloc() is described further below. net_families[family]->create() executes the create() function of the lower-layer socket. For this purpose, the respective protocols register themselves in the vector static struct net_proto_family *net_families[NPROTO] by calling the sock_register() function (net/socket.c) when the system starts or when the appropriate module is loaded. In doing so, they pass the protocol family and a pointer to their create() function. In the case of the PF_INET socket, for example, this is done by the call sock_register(&inet_family_ops) (net/ipv4/af_inet.c), which makes the inet_create() function available. At this point, the control flow leaves the BSD socket area and passes to the implementation of the lower-layer socket.
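The registration mechanism can be illustrated with a small user-space model. This is a sketch under assumptions, not the kernel implementation: every model_ and MODEL_ name is hypothetical, the table size and error codes are choices made for this example, and module loading is left out.

```c
#include <errno.h>
#include <stddef.h>

#define MODEL_NPROTO  32
#define MODEL_PF_INET 2

enum model_state { MODEL_SS_FREE, MODEL_SS_UNCONNECTED };

struct model_socket {
    enum model_state state;
    short            type;
    const char      *owner;   /* filled in by the family's create() */
};

/* What a protocol family hands over when it registers. */
struct model_proto_family {
    int family;
    int (*create)(struct model_socket *sock, int protocol);
};

static const struct model_proto_family *model_families[MODEL_NPROTO];

/* Enter a protocol family into the table. */
int model_sock_register(const struct model_proto_family *ops)
{
    if (ops == NULL || ops->family < 0 || ops->family >= MODEL_NPROTO)
        return -ENOBUFS;
    model_families[ops->family] = ops;
    return 0;
}

/* Lower-layer create(), standing in for inet_create(). */
static int model_inet_create(struct model_socket *sock, int protocol)
{
    (void)protocol;
    sock->owner = "inet";
    return 0;
}

const struct model_proto_family model_inet_family_ops = {
    MODEL_PF_INET, model_inet_create
};

/* Generic part: check the family, prepare the socket, then hand over. */
int model_sock_create(int family, int type, int protocol,
                      struct model_socket *sock)
{
    if (family < 0 || family >= MODEL_NPROTO || model_families[family] == NULL)
        return -EAFNOSUPPORT;

    sock->state = MODEL_SS_UNCONNECTED;   /* the job of sock_alloc() */
    sock->owner = NULL;
    sock->type  = (short)type;

    return model_families[family]->create(sock, protocol);
}
```

Before model_sock_register(&model_inet_family_ops) has been called, model_sock_create() fails for MODEL_PF_INET; afterwards it succeeds and the socket carries the marks of both the generic layer (state, type) and the lower layer (owner).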

sock_alloc()

net/socket.c


This function is invoked by sock_create(); it reserves a new inode and allocates a socket structure. The fields of the socket structure are initialized to null, except for state, which is initialized to SS_UNCONNECTED.

sock_map_fd()

net/socket.c


This function uses a number of helper functions to allocate a file descriptor to the socket, which is used to address this socket. It is also called a socket descriptor, but there is no difference from other file descriptors. This is the reason why you can also use the read() and write() I/O calls to read or write over a socket. The file entry for the socket structure is also set in the sock_map_fd() function.
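That a socket descriptor behaves like any other file descriptor can be checked with plain read() and write(). The following sketch assumes a POSIX system; the function name echo_over_socketpair() is invented for this example, and socketpair() with AF_UNIX is used so that no network setup is required.

```c
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Write msg into one end of a connected socket pair with write() and
   fetch it from the other end with read(). Returns the number of bytes
   read into out, or -1 on error. */
ssize_t echo_over_socketpair(const char *msg, char *out, size_t outlen)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    ssize_t n = -1;
    size_t len = strlen(msg);

    /* Ordinary file I/O calls, applied to socket descriptors. */
    if (write(sv[0], msg, len) == (ssize_t)len)
        n = read(sv[1], out, outlen);

    close(sv[0]);
    close(sv[1]);
    return n;
}
```

No send() or recv() call appears here; the descriptors returned by socketpair() are passed directly to the generic I/O calls and closed with close().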

The create() function of the lower-layer socket, which is invoked by net_families[family]->create() as described above, now has to fill in the other fields of the socket structure, in particular the ops pointer, which references the proto_ops structure (include/linux/net.h). This structure serves to supply the BSD socket with the functions of a lower-layer socket. It contains a variable that stores the socket family and a number of function pointers:

    struct proto_ops {
            int          family;
            int          (*release)          (...);
            int          (*bind)             (...);
            int          (*connect)          (...);
            int          (*socketpair)       (...);
            int          (*accept)           (...);
            int          (*getname)          (...);
            unsigned int (*poll)             (...);
            int          (*ioctl)            (...);
            int          (*listen)           (...);
            int          (*shutdown)         (...);
            int          (*setsockopt)       (...);
            int          (*getsockopt)       (...);
            int          (*sendmsg)          (...);
            int          (*recvmsg)          (...);
            int          (*mmap)             (...);
            ssize_t      (*sendpage)         (...);
    };

Notice that a protocol does not have to fully implement all of these functions; a function that is not supported should, however, return an error.

This also makes the sending of data over a BSD socket easily understandable: For example, when an application sends data over the sendto() socket call, then the function sys_socketcall() executes sys_sendto() in net/socket.c. There, a message consisting of the transmit data, the address, and control fields is composed. Finally, the function sock_sendmsg() uses sock->ops->sendmsg() to invoke the transmit function of the respective lower-layer protocol-specific socket.
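The indirection through the ops pointer can be modeled in a few lines. This is a simplified sketch, not the kernel's data structures: all demo_ names are hypothetical, the "transmit function" only copies the data into a buffer, and listen serves as the example of an operation that a datagram protocol does not support.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

struct demo_sock;

/* Message as composed by the generic layer (address and control omitted). */
struct demo_msg {
    const void *data;
    size_t      len;
};

/* Reduced operations table in the spirit of struct proto_ops. */
struct demo_ops {
    int family;
    int (*sendmsg)(struct demo_sock *sock, const struct demo_msg *msg);
    int (*listen)(struct demo_sock *sock, int backlog);
};

struct demo_sock {
    const struct demo_ops *ops;
    char                   buf[64];
    size_t                 used;
};

/* Protocol-specific transmit function: here it only stores the data. */
static int demo_dgram_sendmsg(struct demo_sock *sock, const struct demo_msg *msg)
{
    if (msg->len > sizeof(sock->buf))
        return -EMSGSIZE;
    memcpy(sock->buf, msg->data, msg->len);
    sock->used = msg->len;
    return (int)msg->len;
}

/* Stub for an unsupported operation: it exists, but reports an error. */
static int demo_no_listen(struct demo_sock *sock, int backlog)
{
    (void)sock;
    (void)backlog;
    return -EOPNOTSUPP;
}

const struct demo_ops demo_dgram_ops = { 2, demo_dgram_sendmsg, demo_no_listen };

/* Generic layer: knows nothing about the protocol, only follows ops. */
int demo_sock_sendmsg(struct demo_sock *sock, const struct demo_msg *msg)
{
    return sock->ops->sendmsg(sock, msg);
}

/* Entry point corresponding to sendto(): compose a message, pass it down. */
int demo_sendto(struct demo_sock *sock, const void *data, size_t len)
{
    struct demo_msg msg = { data, len };
    return demo_sock_sendmsg(sock, &msg);
}
```

A socket whose ops field points to demo_dgram_ops accepts demo_sendto() and rejects listen with an error, without the generic functions containing any protocol-specific code.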

BSD sockets support many different protocols, so a general address structure, sockaddr (include/linux/socket.h), was defined. It consists of a protocol-family identifier and the corresponding address:

    struct sockaddr {
            sa_family_t       sa_family;       /* address family, AF_xxx */
            char              sa_data[14];     /* 14 bytes of protocol address */
    };
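In application code, the general structure typically appears as a cast: a protocol-specific address is filled in and then passed on as struct sockaddr. The sketch below assumes a POSIX system and IPv4; the helper names make_inet_addr() and address_family_of() are invented for illustration.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Fill an IPv4 address structure from a dotted-decimal string and a port.
   Returns 0 on success, -1 if the string is not a valid IPv4 address. */
int make_inet_addr(const char *ip, unsigned short port, struct sockaddr_in *out)
{
    memset(out, 0, sizeof(*out));
    out->sin_family = AF_INET;
    out->sin_port = htons(port);
    return inet_pton(AF_INET, ip, &out->sin_addr) == 1 ? 0 : -1;
}

/* Protocol-independent code sees only struct sockaddr and uses the
   family identifier to decide how to interpret the rest. */
int address_family_of(const struct sockaddr *sa)
{
    return sa->sa_family;
}
```

Passing (const struct sockaddr *)&sin to address_family_of() yields AF_INET for an address built with make_inet_addr(); calls such as bind() and connect() take their address argument in the same way.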


       


    Linux Network Architecture
    ISBN: 131777203
    Year: 2004
    Pages: 187
