Linux Threads


No discussion of process fundamentals is complete without an explanation of Linux threads because an understanding of threads is crucial for troubleshooting processes. As mentioned earlier, the implementation of threads in Linux differs from that of UNIX because Linux threads are not contained within the proc structure. However, Linux does support multithreaded applications. "Multithreading" just means two or more threads working in parallel with each other while sharing the same address space. Multithreaded applications in Linux just use more than one task. Following this logic in the source, include/linux/sched.h shows that the task_struct structure maintains a one-to-one relationship with the task's thread through the use of a pointer to the thread_info structure, and this structure just points back to the task structure.

Excerpts from the source illustrate the one-to-one relationship between a Linux task and thread.

include/linux/sched.h
...
struct task_struct {
    volatile long state;    /* -1 unrunnable, 0 runnable, >0 stopped */
    struct thread_info *thread_info;
...


To see the thread_info structure point back to the task, we review include/asm-i386/thread_info.h.

...
struct thread_info {
    struct task_struct     *task;         /* main task structure */
...


Using multithreaded processes has its advantages. Threading allows for better processor loading and memory utilization. A drawback is that it also significantly increases the program's complexity. On a single-CPU machine, a multithreaded program for the most part performs no better than a single-threaded program. However, well-designed multithreaded applications executed on a Symmetric Multi-Processor (SMP) machine can have each thread executing in parallel, thereby significantly increasing application performance.

Threaded application performance is enhanced by the fact that threads share resources. Different types of processes share resources in different ways. The initial process is referred to as the heavyweight process (HWP), which is a prerequisite for lightweight processes. Traditionally, a thread of a process is referred to as a lightweight process (LWP), as mentioned earlier. The main difference between the two is how they share their resources. Simply stated, when an HWP forks a new process, the only thing that is shared is the parent's text. If an HWP must share information with another HWP, it uses techniques such as pipes, PF_UNIX (UNIX sockets), signals, or the interprocess communication (IPC) mechanisms: shared memory, message queues, and semaphores. On the other hand, when an HWP creates an LWP, the two share the same address space (except for the LWP's private stack), which makes more efficient use of system resources.

Note that although several forms of threads exist, such as user space GNU Portable Threads (PTH) and DCE threads, in this chapter we cover only POSIX threads because they are the most commonly used threads in the industry. POSIX threads are implemented by the pthread library. The use of POSIX threads ensures that programs will be compatible with other distributions, platforms, and OSs that support POSIX threads. These threads are initiated by the pthread_create() library call; underneath, the Linux kernel uses the clone() system call to create the threads. As implied by its name, it clones the task. Just as fork() creates a separate process structure, clone() creates a new task/thread structure by cloning the parent; however, unlike fork(), flags are set that determine which structures are shared with the parent rather than copied. Only a select few of the many flags available are required to make the thread POSIX compliant.

The Linux kernel treats each thread as an individual task that can be displayed with the ps command. At first, this approach might seem like a large waste of system resources, given that a process could have a great number of threads, each of which would be a clone of the parent. However, it's quite trivial because most task structures are kernel objects, which enables the individual threads to just reference the address space. An example is the HWP's file descriptor table. With clone(), all threads just reference the kernel structure by using the flag CLONE_FILES.

With help from developers from around the world, the Linux kernel is developing at an extraordinary rate. A prime example is the fork() call. With the IA-64 Linux kernel, the fork() call actually calls clone2(). In addition, pthread_create() also calls clone2(). The clone2() system call adds a third argument, ustack_size. Otherwise, it is the same as clone(). With the IA-32 2.6 kernel release, the fork() call has been replaced with the clone() call. The kernel clone() call mimics fork() by adjusting clone() flags.

Detailed next are examples of tasks and threads being created on different versions and distributions of Linux:

  • IA-32 (2.4.19) Fork call

    2970  fork()                    = 3057 <-- The PID for the new HWP

  • IA-32 (2.4.19) Thread creation

    3188  clone(child_stack=0x804b8e8, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND) = 3189 <-- LWP

  • IA-32 (2.6.3) Fork call

    12383 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x4002cba8) = 12499 <-- HWP

  • IA-32 (2.6.3) Thread creation

    12440 <... clone resumed> child_stack=0x42184b08, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID|CLONE_DETACHED, parent_tidptr=0x42184bf8, {entry_number:6, base_addr:0x42184bb0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}, child_tidptr=0x42184bf8) = 12444 <-- LWP

  • IA-64 (2.4.21) Fork call

    24195 clone2(child_stack=0, stack_size=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x200000000002cdc0) = 24324 <--HWP

  • IA-64 (2.4.21) Thread creation

    24359 clone2(child_stack=0x20000000034f4000, stack_size=0x9ff240, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID|CLONE_DETACHED, parent_tidptr=0x2000000003ef3960, tls=0x2000000003ef3f60, child_tidptr=0x2000000003ef3960) = 24365 <-- LWP

As the previous examples show, on the IA-32 2.4 kernel the clone() call creates only threads, whereas clone() on the IA-32 2.6 kernel and clone2() on IA-64 create both threads and new processes, depending on the flags passed. In addition, the previous traces reveal the creation of threads and the flags needed to make them POSIX compliant, as defined in the next listing.

clone(child_stack=0x804b8e8, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND)

child_stack:       Unique process stack
CLONE_VM:          Parent and child run in the same address space
CLONE_FS:          Parent and child share file system info
CLONE_FILES:       Parent and child share open file table
CLONE_SIGHAND:     Parent and child share signal handlers


Identifying Threads

As previously discussed, the ps command lists all tasks in Linux, preventing the user from distinguishing the HWP from the LWP. At approximately the 2.4.9 kernel release, the Task Group ID (tgid) was added to fs/proc/array.c. This placed a task's tgid in the /proc/<pid>/status file. A key point is that the tgid is equal to the HWP's PID. This new feature enables users to identify threads of a multithreaded process with ease.

Reviewing the source, we see:

# ./fs/proc/array.c
...
static inline char * task_state(struct task_struct *p, char *buffer)
{
        int g;
        read_lock(&tasklist_lock);
        buffer += sprintf(buffer,
                "State:\t%s\n"
                "Tgid:\t%d\n"
                "Pid:\t%d\n"
                "PPid:\t%d\n"
                "TracerPid:\t%d\n"
                "Uid:\t%d\t%d\t%d\t%d\n"
                "Gid:\t%d\t%d\t%d\t%d\n",
                get_task_state(p), p->tgid,
...


Linux commands, such as ps, were modified to make use of this new value, enabling them to display only the parent HWP task (tgid), or all threads of a task by passing the -m or -eLf flag.

In Listing 8-1, we have included a small example of a threaded program that demonstrates how threads appear in Linux. Note that this code makes no attempt either to lock threads with mutex locks or semaphores or to perform any special signal masking. This code just creates threads that perform sequential counts to exercise the CPU(s).

Listing 8-1. Example of a Threaded Program

#include <pthread.h>      /* POSIX threads */
#include <signal.h>
#include <stdio.h>        /* printf() */
#include <stdlib.h>
#include <unistd.h>       /* getpid(), getppid(), sleep(), syscall() */
#include <sys/syscall.h>  /* SYS_gettid */
#include <sys/types.h>
#include <errno.h>

#define num_threads 8

void *print_func(void *);
void stop_thread(int sig);

/* gettid() is not portable. If compiling on other operating systems,
   remove references to get_tid(). The original listing declared it with
   _syscall0(pid_t,gettid), a macro that later kernel headers no longer
   provide; syscall(SYS_gettid) makes the same system call. */
static pid_t get_tid(void)
{
        return (pid_t) syscall(SYS_gettid);
}

int main(void)
{
        int x;
        pthread_t threadid[num_threads];

        (void) signal(SIGALRM, stop_thread);    /* signal handler */
        printf("Main process has PID= %d PPID= %d and TID= %d\n",
               (int) getpid(), (int) getppid(), (int) get_tid());

        /* Now to create pthreads; valid indexes are 0..num_threads-1 */
        for (x = 0; x < num_threads; ++x)
                pthread_create(&threadid[x], NULL, print_func, NULL);

        sleep(60);      /* Let the threads warm the cpus up!!! :) */

        /* Signal every thread so that each one exits in stop_thread() */
        for (x = 0; x < num_threads; ++x)
                pthread_kill(threadid[x], SIGALRM);

        /* wait for termination of threads before main continues */
        for (x = 0; x < num_threads; ++x) {
                printf("%d\n", x);
                pthread_join(threadid[x], NULL);
                printf("Main() PID %d joined with thread %lu\n",
                       (int) getpid(), (unsigned long) threadid[x]);
        }
        return 0;
}

void *print_func(void *arg)
{
        (void) arg;
        printf("PID %d PPID = %d Thread value of pthread_self = %lu and "
               "TID= %d\n", (int) getpid(), (int) getppid(),
               (unsigned long) pthread_self(), (int) get_tid());
        for (;;)
                ;       /* nothing but spinning */
}

/* SIGALRM handler: runs in the signalled thread and ends it */
void stop_thread(int sig)
{
        (void) sig;
        pthread_exit(NULL);
}

Using Listing 8-1, create a binary by compiling on any UNIX/Linux system that supports POSIX threads. Reference the following demonstration:

1. Compile the source.

    # gcc -o thread_test thread_test.c -pthread


Next, execute thread_test and observe the tasks with pstree. Note that we have trimmed the output of pstree to save space.

2. Execute the object.

    # ./thread_test


3. In a different shell, execute:

    # pstree -p
    init(1)-+-apmd(1177)
    ~~~~~~Saving space~~~~~
            |-kdeinit(1904)-
    ~~~~~~Saving space~~~~~
            |               |-kdeinit(2872)-+-bash(2874)---thread_test(3194)-+-thread_test(3195)
            |               |               |                                |-thread_test(3196)
            |               |               |                                |-thread_test(3197)
            |               |               |                                |-thread_test(3198)
            |               |               |                                |-thread_test(3199)
            |               |               |                                |-thread_test(3200)
            |               |               |                                |-thread_test(3201)
            |               |               |                                `-thread_test(3202)
    ~~~~~~Saving space~~~~~
            |               |               `-bash(3204)---pstree(3250)
    ~~~~~~Saving space~~~~~


4. We can display more details with the ps command. (Note that the PIDs would have matched if we had run these examples at the same time.)

    # ps -eo pid,ppid,state,comm,time,pri,size,wchan | grep test
    28807 28275 S thread_test      00:00:12  18 82272 schedule_timeout


Display threads with -m.

    # ps -emo pid,ppid,state,comm,time,pri,size,wchan | grep test
    28807 28275 S thread_test      00:00:00  18 82272 schedule_timeout
    28808 28807 R thread_test      00:00:03  14 82272 -
    28809 28807 R thread_test      00:00:03  14 82272 ia64_leave_kernel
    28810 28807 R thread_test      00:00:03  14 82272 ia64_leave_kernel
    28811 28807 R thread_test      00:00:03  14 82272 ia64_leave_kernel
    28812 28807 R thread_test      00:00:03  14 82272 -
    28813 28807 R thread_test      00:00:02  14 82272 ia64_leave_kernel
    28814 28807 R thread_test      00:00:02  14 82272 ia64_leave_kernel
    28815 28807 R thread_test      00:00:03  14 82272 -


Even though some UNIX distributions have modified commands such as ps or top to display a process with all its threads by including special options such as -m or -L, HPUX has not. Therefore, the HPUX ps command only shows the HWP process and not the underlying threads that build the process. On the other hand, Solaris can display the LWP of a process by using the -L option with its ps command.

Other vendors have created their own tools for displaying threads of a process. HPUX's glance is a good example. Using the same procedures as earlier, we demonstrate multithreads in HPUX to show the main difference between UNIX threads and Linux's implementation of threads.

HPUX 11.11:

# cc -o thread_test thread_test.c -lpthread
hpux_11.11 # glance
Process Name   PID   PPID Pri Name   ( 700% max)   CPU   IO Rate   RSS   Cnt
------------------------------------------------------------------------
thread_test  14689  14579 233 root     698/ 588   57.3  0.0/ 0.2  560kb    9


Thus, using HPUX's glance, we can see that the thread count is nine, with one thread representing the main HWP and eight additional threads that were created by the program as shown in the source. Unlike Linux threads, these threads do not each have their own PID. In addition, Linux tools such as top do not show the threads of a process consuming CPU cycles. This can be tested by executing the thread_test program in one tty and the top program in another tty.



Linux Troubleshooting for System Administrators and Power Users
ISBN: 131855158
