
A Short Introduction to POSIX Threads

Keith Gaughan March 22, 2003

Contents
1 A bit of background
    1.1 What's a thread?
    1.2 Is there a downside to using threads?
2 POSIX threads
    2.1 Working with threads
    2.2 Breakdown of the example
        2.2.1 pthread_t
        2.2.2 pthread_create()
        2.2.3 pthread_join()
    2.3 Other calls you need to know about
        2.3.1 pthread_exit()
        2.3.2 pthread_self()
        2.3.3 pthread_yield()
3 Synchronisation
    3.1 Critical sections
    3.2 Mutexes
    3.3 Breakdown of the example
        3.3.1 pthread_mutex_t
        3.3.2 PTHREAD_MUTEX_INITIALIZER
        3.3.3 pthread_mutex_lock()
        3.3.4 pthread_mutex_unlock()
    3.4 Other calls you need to know about
        3.4.1 pthread_mutex_init()
        3.4.2 pthread_mutex_destroy()
        3.4.3 pthread_mutex_trylock()
    3.5 Condition variables
    3.6 Synchronisation problems
        3.6.1 Deadlocks
        3.6.2 Race conditions
4 Le Fin

1 A bit of background

Many moons ago, when dinosaurs roamed the earth and we were all still living in caves (around 1990, as it happens), most computers were only capable of running one program at a time. This, as you might guess, became rather painful at times. It wasn't too nice having to quit WordPerfect 5.1 just so that you could check some numbers in Lotus 1-2-3. Systems like these were known as Single Tasking, or Single Processing.

It wasn't the same everywhere though. Back when fish decided it might be fun to try that whole whacky walking-on-land thing (the late 60s and early 70s), there were already Operating Systems capable of running more than one program at once. These were known as Timesharing Operating Systems, because they ran on large Mainframes that served a whole bunch of people who were running programs on them. These systems were the precursors to the early Multitasking OSs, such as UNIX.

With OSs such as UNIX, you were now able to run multiple programs on the one machine simultaneously, each executing instance being known as a process. Now you could spend all day playing Patience whilst the webserver you were running on your machine was serving pictures of your beloved Bonzo the Wonderdog to an ever-so-interested world.

Even with multiprocessing, you're still left with a problem: processes can only do one thing at a time. Say you're browsing the web and you decide to go to Happy Tree Friends [1] to look at all the happy woodland animals being mercilessly slaughtered. If your browser was only capable of doing one thing at a time, you'd have to wait until the whole page was downloaded, then wait for the browser to work out how to render the page, and then, and only then, could you watch ants inflict some serious GBH on an anteater.

One way of getting past this difficulty is multithreading. Normal processes have only one thread of control, and so can only do one thing at a time. With multithreading, a process can do more than one thing simultaneously, for instance one thread could be taking care of the GUI whilst another is doing some I/O, and yet another is doing some rather heavy calculations.

1.1 What's a thread?

I'm presuming you already know what a process is. Well, a thread is kind of a lightweight process, but unlike a regular heavyweight process, a thread has no memory or resources of its own besides a stack and its own set of registers. Each heavyweight process consists of at least one thread, and all the other resources it needs such as memory, file descriptors, and so on. Any threads within a process share all these resources with one another. This makes creating and destroying threads a lot less time-consuming than processes.

There are some really good reasons for writing multithreaded programs:

- Increased application responsiveness, e.g. one thread can be handling the application's UI whilst others are doing all the legwork in the background.
- Increased application throughput.
- More efficient use of system resources such as memory and CPU time.
- The ability to run well on uniprocessor and multiprocessor systems.
- Being able to structure your program in such a way that things that can run in parallel can be properly modularised, making for better structured programs.

And there's far more besides that.
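To make the "shared memory, private stack" idea a bit more concrete, here's a minimal sketch. It borrows pthread_create() and pthread_join(), which are explained properly in section 2; the names writer() and shared_after_thread() are made up for illustration and aren't part of any library.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Lives in the process's memory, so every thread sees the same copy. */
static int shared_value = 0;

/* Hypothetical thread function: writes to the shared global and to a
   local variable that lives on this thread's own stack. */
static void* writer(void* arg)
{
    int local = 7;          /* private: nobody else can see this one */

    (void) arg;
    (void) local;
    shared_value = 42;      /* visible to the whole process */
    return NULL;
}

/* Runs writer() in a second thread and returns what the calling
   thread then sees in the shared global. */
int shared_after_thread(void)
{
    pthread_t id;
    int rc;

    shared_value = 0;
    rc = pthread_create(&id, NULL, writer, NULL);
    assert(rc == 0);
    rc = pthread_join(id, NULL);    /* wait for the write to happen */
    assert(rc == 0);
    return shared_value;
}
```

Call shared_after_thread() from your own main() and it hands back the value the other thread wrote, which is exactly what wouldn't happen with a child process.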

1.2 Is there a downside to using threads?

There's one really nasty side-effect one can run into when writing multithreaded applications: quite horrid things can happen when two threads attempt to access the same shared resources. In fact, horrid is an understatement; grotesque might be a more apt choice, and after you've tried debugging a misbehaving multithreaded application, you'll know exactly what I mean. Then there's the horror of race conditions, but we'll deal with them later.

On Linux at least, there's little benefit to using threads over child processes. Unlike operating systems like Solaris where it takes about thirty times longer to spawn a new process than to spawn a new thread, Linux processes spawn almost as quickly as threads, and that's fast. In fact, the only reason why the Apache webserver was rewritten to support threads rather than a process pool was to make it stable under Windows.

Finally, it's worth bearing in mind that multithreaded applications are a royal pain to debug. If you want to know why, read a book on Chaos Theory [2]. About 99% of applications aren't worth multithreading, either because they don't lend themselves to it without a modicum of difficulty, or because they just won't benefit from any extra concurrency. Think carefully before you decide to make a program multithreaded [3].

[1] Me? Evil? Naw!

2 POSIX threads

The standard API [4] used for implementing multithreaded applications is POSIX Threads, and that's the library this article deals with. As far as threading libraries go, Pthreads is quite a simple one. You can get by without ever touching its more weird and wonderful features, such as spin locks or priority-inheriting mutexes. There aren't many calls you need to know either. In fact, there's only about ten or so that you'd use on a regular basis.

To use Pthreads, you'll have to include the following:

    #include <pthread.h>

...and when you're linking, don't forget -lpthread so it's linked to the library, e.g.

    gcc foo.c -o foo -Wall -lpthread

2.1 Working with threads

Each process has at least one thread: the one created when the main() function is called. This thread is free to create additional threads as the programmer sees fit. Here's a simple example.

    /* Simple Pthreads example. */
    #include <stdint.h>
    #include <stdlib.h>
    #include <stdio.h>
    #include <pthread.h>

    void* ThreadFunc(void* arg);

    int main(int argc, char* argv[])
    {
        pthread_t idThread;

        puts("Let's create a thread!");
        /* Smuggle the integer 5 through the void* argument. */
        pthread_create(&idThread, NULL, ThreadFunc, (void*) (intptr_t) 5);
        pthread_join(idThread, NULL);
        return 0;
    }

    void* ThreadFunc(void* arg)
    {
        int i, n;

        /* Get the value of the argument passed in. */
        n = (int) (intptr_t) arg;

        /* Do stuff! */
        for (i = 0; i < n; i++)
            printf("Loop %d: La la la!\n", i + 1);
        return NULL;
    }

This will produce the following output:

    Let's create a thread!
    Loop 1: La la la!
    Loop 2: La la la!
    Loop 3: La la la!
    Loop 4: La la la!
    Loop 5: La la la!

[2] In a nutshell, the interactions of simple entities within a simple system can lead to complex behaviour, or misbehaviour as the case may be.
[3] It's worth bearing in mind that GUI toolkits are something that are regularly multithreaded, AWT being an example for those familiar with Java. This does not mean that the programs using such libraries should be considered multithreaded.
[4] That's Application Programming Interface, now wake up you at the back!

2.2 Breakdown of the example

This example covers quite a bit. Here's a synopsis of the functions and types used.

2.2.1 pthread_t

The Pthreads library includes the type pthread_t for holding a thread's id. Any threads that want to manipulate another need to know its id before they can fiddle with it. Needless to say, it could contain anything, so don't go poking at its contents.

    pthread_t idThread;

2.2.2 pthread_create()

This spawns a new thread.

    int pthread_create(pthread_t* id, const pthread_attr_t* attr,
                       void* (*ThreadFunc)(void*), void* arg);

id is used for returning the id of the created thread. With this, the spawning thread (usually the main thread) can manipulate its settings.

attr is a pointer to a pthread_attr_t structure. What this does and contains is rather esoteric, so passing NULL will cause the default values to be used, which are virtually always what you want anyway.

ThreadFunc is the function the thread should execute. This function's prototype should look like the function ThreadFunc in the example.

arg is the value to pass as an argument to the thread function. In this example, we want to pass the value 5 into it, so we have to cast it into a pointer to void. When the thread function is called, it can cast this value back into whatever it should be. On machines where pointers are wider than int, do both casts via intptr_t from <stdint.h> to keep the compiler quiet. If you want to pass more than one argument to a thread, you'll have to create a structure to hold them.
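Since passing more than one argument comes up almost immediately, here's a rough sketch of the structure trick. The names SumArgs, SumThread() and SumInThread() are invented for this illustration; only pthread_create() and pthread_join() come from the library.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical argument bundle: everything the thread needs,
   plus a slot for its answer. */
struct SumArgs {
    int first;
    int last;
    long total;     /* filled in by the thread */
};

/* Thread function: adds up the integers first..last. */
static void* SumThread(void* arg)
{
    struct SumArgs* args = arg;     /* void* converts back implicitly in C */
    int i;

    args->total = 0;
    for (i = args->first; i <= args->last; i++)
        args->total += i;
    return NULL;
}

/* Spawns SumThread with two inputs and returns its result. */
long SumInThread(int first, int last)
{
    struct SumArgs args;
    pthread_t idThread;
    int rc;

    args.first = first;
    args.last = last;
    args.total = 0;

    rc = pthread_create(&idThread, NULL, SumThread, &args);
    assert(rc == 0);
    /* args lives on this function's stack, so don't return until
       the thread has finished with it. */
    rc = pthread_join(idThread, NULL);
    assert(rc == 0);
    return args.total;
}
```

The thing to watch is the lifetime of the structure: here it's a local variable, which is only safe because the spawning function joins the thread before returning. If the thread might outlive the function, allocate the structure with malloc() instead.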

2.2.3 pthread_join()

Here, we're joining a thread. Joining makes the caller wait for the thread to finish executing; in the example, it stops main() from returning, and taking the whole process with it, before the thread is done. This is quite similar to the wait() call used for child processes. In practice, you should rarely need to do this.

    int pthread_join(pthread_t id, void** status);

id is the id of the thread we wish to join, as returned by pthread_create().

status will hold the value the thread returned, or passed to pthread_exit() if it called that instead. If this is NULL, the value is ignored.
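The example ignores the status, so here's a small sketch of actually collecting one. SquareThread() and SquareInThread() are names I've made up; the assumption is that the thread hands back a pointer to memory it got from malloc(), which the joiner then frees.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Thread function: returns a heap-allocated int holding n squared. */
static void* SquareThread(void* arg)
{
    int n = *(int*) arg;
    int* result = malloc(sizeof *result);

    if (result != NULL)
        *result = n * n;
    return result;      /* ends up in pthread_join()'s status */
}

/* Returns n squared as computed in another thread, or -1 if the
   thread couldn't allocate room for its answer. */
int SquareInThread(int n)
{
    pthread_t idThread;
    void* status = NULL;
    int value = -1;
    int rc;

    rc = pthread_create(&idThread, NULL, SquareThread, &n);
    assert(rc == 0);
    rc = pthread_join(idThread, &status);   /* blocks until it's done */
    assert(rc == 0);

    if (status != NULL) {
        value = *(int*) status;
        free(status);   /* the joiner owns the result now */
    }
    return value;
}
```

Never return a pointer to one of the thread function's own local variables this way: its stack is gone by the time the joiner looks at it.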

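2.3 Other calls you need to know about

2.3.1 pthread_exit()

Terminates the calling thread, returning status. If there's nothing you want to pass back, just pass in NULL. The example didn't need this because the return NULL at the end does much the same thing.

    void pthread_exit(void* status);

2.3.2 pthread_self()

Returns the id of the current thread.

    pthread_t pthread_self(void);

2.3.3 pthread_yield()

Causes the thread to yield execution in favour of another thread with the same priority. Be warned that pthread_yield() didn't make it into the final POSIX standard, so it isn't available everywhere and its prototype varies between the implementations that have it. The portable equivalent is sched_yield() from <sched.h>.

    void pthread_yield(void);
    int sched_yield(void);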
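Here's a quick sketch that pokes at all three at once. Worker() and RunWorker() are made-up names, and I've used sched_yield() rather than pthread_yield() on the assumption that you want something that builds everywhere. It also uses pthread_equal(), the library's way of comparing two thread ids, since pthread_t is opaque and == isn't guaranteed to work on it.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>
#include <string.h>

static pthread_t idMain;    /* id of the thread that called RunWorker() */
static int sameAsMain;      /* set by the worker */

static void* Worker(void* arg)
{
    (void) arg;
    /* Compare ids with pthread_equal(), never with ==. */
    sameAsMain = pthread_equal(pthread_self(), idMain);
    sched_yield();                      /* politely let somebody else run */
    pthread_exit((void*) "done");       /* status for whoever joins us */
    return NULL;                        /* never reached */
}

/* Returns 1 if the worker saw an id different from its creator's
   and handed back the expected status, 0 otherwise. */
int RunWorker(void)
{
    pthread_t idThread;
    void* status = NULL;
    int rc;

    idMain = pthread_self();
    rc = pthread_create(&idThread, NULL, Worker, NULL);
    assert(rc == 0);
    rc = pthread_join(idThread, &status);
    assert(rc == 0);
    return !sameAsMain && status != NULL && strcmp(status, "done") == 0;
}
```

Handing back a string literal is safe here because it lives for the whole run of the program, unlike anything on the worker's stack.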

3 Synchronisation
Think back to when you were younger. They were more innocent days; days when all you had to worry about was counting how many cornflakes had ended up in your Operating Systems lecturer's beard. You concentrate harder and vague recollections about odd things such as critical sections, mutexes, monitors, and semaphores bubble up to the surface. If you didn't think knowing about them would be of much use to you, be prepared for a terrible shock. [5]

3.1 Critical sections

When you're dealing with multithreaded code, your threads share virtually everything. It's like living in a house where you're the only person who buys milk and everybody else drinks it on you [6]. Critical sections are like when you're heading to the fridge to get the milk and you want to make sure nobody else goes and nicks it on you in the meantime.

Critical sections are the parts of your code where a thread accesses a shared resource. If two or more threads attempt to access the same resource or set of resources, things can get a bit like a crowd trying to get through the same door at the same time. They must therefore be treated atomically: any other threads trying to execute a critical section will be blocked until the lock on that critical section is released.

Critical sections should be kept as short as possible, and carefully optimised, because they have a significant effect on our program's performance.

[5] I think the pertinent question here would be which one of The Young Ones do you associate yourself with most?
[6] I should know, I do! More fool me.

    Thread 1                      Thread 2
    bal = GetBalance(acc);        bal = GetBalance(acc);
    bal += bal * rate;            bal += deposit;
    /* La la la... */             SetBalance(acc, bal);
    SetBalance(acc, bal);         /* Doh! */

    Figure 1: Why you need synchronisation
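If it's not obvious what goes wrong in Figure 1, the sketch below replays that interleaving by hand, one step at a time, in a single thread, so the result is the same every run. GetBalance() and SetBalance() are hypothetical stand-ins for whatever the real account code would be, and I've assumed whole-number balances with the rate given as a percentage.

```c
#include <assert.h>

/* Hypothetical stand-ins for the account calls in Figure 1. */
static long balance;
static long GetBalance(void)     { return balance; }
static void SetBalance(long bal) { balance = bal; }

/* Replays the interleaving of Figure 1 step by step and returns the
   balance the account ends up with. */
long ReplayLostUpdate(long start, long ratePercent, long deposit)
{
    long bal1, bal2;    /* each "thread" has its own local bal */

    assert(start >= 0 && deposit >= 0);
    balance = start;

    bal1 = GetBalance();                /* thread 1 reads */
    bal2 = GetBalance();                /* thread 2 reads the same value */
    bal1 += bal1 * ratePercent / 100;   /* thread 1 adds interest */
    bal2 += deposit;                    /* thread 2 adds the deposit */
    SetBalance(bal2);                   /* thread 2 writes */
    SetBalance(bal1);                   /* thread 1 overwrites it: doh! */

    return GetBalance();
}
```

Starting from 100 with 10% interest and a deposit of 50, the account ends up at 110: thread 1's write clobbers thread 2's, and the deposit vanishes without trace. Run either thread's three steps as one uninterrupted unit and you'd get 160 or 165, depending on who went first.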

3.2 Mutexes

Threads are like being given an Uzi: part of the trick is not to shoot yourself in the foot, or anywhere else for that matter, while you have it. Mutual exclusion locks, or mutexes, are the simplest and most primitive way of delimiting critical sections so that threads behave nicely to one another, and Pthreads supplies a family of calls for using them.

The two most important calls are pthread_mutex_lock(), which locks a mutex, and the cryptically titled pthread_mutex_unlock(), which I won't bother explaining [7]. The first thread to lock the mutex in question gets ownership and all other threads that try are forced to go to sleep. When the owner unlocks it, one of the sleeping threads will be reactivated and given a chance to get ownership. Of course, this could all go rather Nelson Muntz [8] and another thread could have managed to get it first. Though normally such behaviour is fine, this might be a problem if you're dealing with realtime systems... [9]

Always acquire mutexes in the same order. It's also a good idea to release them in the reverse of the order you acquired them in.

I think an example's in order!

    /* Pthreads mutex example. */
    #include <stdint.h>
    #include <stdlib.h>
    #include <stdio.h>
    #include <pthread.h>

    void* ThreadFunc(void* arg);

    /* Create and initialise the mutex for use. */
    pthread_mutex_t cntrMutex = PTHREAD_MUTEX_INITIALIZER;

    /* The global resource our mutex is to protect. */
    int cntr;

    int main(int argc, char* argv[])
    {
        pthread_t idThread1;
        pthread_t idThread2;

        puts("Let's create some threads!");
        pthread_create(&idThread1, NULL, ThreadFunc, (void*) (intptr_t) 21);
        pthread_create(&idThread2, NULL, ThreadFunc, (void*) (intptr_t) 14);
        pthread_join(idThread1, NULL);
        pthread_join(idThread2, NULL);
        return 0;
    }

    void* ThreadFunc(void* arg)
    {
        int i, nMax, n;

        /* Get the value of the argument passed in. */
        nMax = (int) (intptr_t) arg;

        /* Do stuff! */
        for (i = 0; i < nMax; i++)
        {
            n = rand() % nMax;
            pthread_mutex_lock(&cntrMutex);
            /* cntr is shared, so only touch it while holding the mutex. */
            for (cntr = 0; cntr < n; cntr++)
                printf("Loop %d: La la la!\n", cntr + 1);
            pthread_mutex_unlock(&cntrMutex);
        }
        return NULL;
    }

[7] Ok, ok! It unlocks the mutex. Happy now?
[8] Haw haw!
[9] ...which happens to be what I'm going to be doing for my work placement.

3.3 Breakdown of the example

In this example, we spawn two threads, each of which competes over a single resource: a global integer called cntr. Each receives a single integer representing the maximum number of times it can execute its loops, 21 for the first thread and 14 for the second. They both compete for access to cntr so that they can run an inner loop that requires it.

3.3.1 pthread_mutex_t

This represents a single mutex used for protecting a single resource. Before this can be used, it must be initialised, either with PTHREAD_MUTEX_INITIALIZER or with pthread_mutex_init(). In the example, the resource being protected is the global variable cntr, but it could be anything the threads share, e.g. a file descriptor, access to the screen, etc.

    pthread_mutex_t cntrMutex;

3.3.2 PTHREAD_MUTEX_INITIALIZER

Using this macro as the initialiser in a mutex's declaration sets the mutex up for use with the default settings.

    pthread_mutex_t cntrMutex = PTHREAD_MUTEX_INITIALIZER;

3.3.3 pthread_mutex_lock()

Locks the mutex. If it's already locked, the calling thread is suspended until the mutex is unlocked.

    int pthread_mutex_lock(pthread_mutex_t* mutex);

3.3.4 pthread_mutex_unlock()

Unlocks the mutex and wakes one of the threads sleeping on it. Which one gets woken is up to the scheduler.

    int pthread_mutex_unlock(pthread_mutex_t* mutex);
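For something whose result you can actually check, here's a stripped-down sketch of the same lock/unlock pattern guarding a shared counter. The names Adder() and CountWithMutex(), and the figures of four threads doing 100000 increments each, are arbitrary choices of mine.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NUM_THREADS 4
#define INCREMENTS  100000

static pthread_mutex_t totalMutex = PTHREAD_MUTEX_INITIALIZER;
static long total;      /* the shared resource guarded by totalMutex */

/* Thread function: bumps the shared total INCREMENTS times. */
static void* Adder(void* arg)
{
    int i;

    (void) arg;
    for (i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&totalMutex);
        total++;                            /* the critical section */
        pthread_mutex_unlock(&totalMutex);
    }
    return NULL;
}

/* Runs NUM_THREADS adders to completion and returns the total. */
long CountWithMutex(void)
{
    pthread_t ids[NUM_THREADS];
    int i, rc;

    total = 0;
    for (i = 0; i < NUM_THREADS; i++) {
        rc = pthread_create(&ids[i], NULL, Adder, NULL);
        assert(rc == 0);
    }
    for (i = 0; i < NUM_THREADS; i++) {
        rc = pthread_join(ids[i], NULL);
        assert(rc == 0);
    }
    return total;   /* safe to read: every adder has finished */
}
```

With the mutex in place the answer is always NUM_THREADS * INCREMENTS. Take the lock and unlock calls out and you'll most likely come up short on a multiprocessor machine, and by a different amount each run. Note how little sits between lock and unlock: just the one statement that touches the shared data.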

3.4 Other calls you need to know about

3.4.1 pthread_mutex_init()

Initialises a mutex for use. You must do this before you can make any other calls on the mutex. This is only really useful for mutexes allocated on the heap with malloc().

    int pthread_mutex_init(pthread_mutex_t* mutex,
                           const pthread_mutexattr_t* attr);

mutex is a pointer to the mutex to initialise.

attr points to some extra configuration data you need to pass for creating some of the more esoteric mutex variants. Generally, you can just pass NULL, which will make it use the defaults, the same ones PTHREAD_MUTEX_INITIALIZER gives you in the example.

3.4.2 pthread_mutex_destroy()

Destroys the mutex, making it unusable. You only need to call this to clean up mutexes initialised with pthread_mutex_init(). This doesn't deallocate the mutex itself, however, and you'll still have to call free() for that.

    int pthread_mutex_destroy(pthread_mutex_t* mutex);

3.4.3 pthread_mutex_trylock()

Like pthread_mutex_lock(), but if the mutex is already locked, it returns EBUSY immediately rather than putting the calling thread to sleep.

    int pthread_mutex_trylock(pthread_mutex_t* mutex);
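Here's a sketch of the whole life cycle of a heap-allocated mutex, with a second thread trying its luck with pthread_mutex_trylock() while the first still holds the lock. Prober() and TryLockWhileHeld() are names invented for the purpose.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

static int tryResult;   /* what the prober's trylock returned */

/* Thread function: has one go at the mutex without blocking. */
static void* Prober(void* arg)
{
    pthread_mutex_t* mutex = arg;

    tryResult = pthread_mutex_trylock(mutex);
    if (tryResult == 0)                 /* got it after all: give it back */
        pthread_mutex_unlock(mutex);
    return NULL;
}

/* Returns what trylock reports in a second thread while the
   calling thread is holding the mutex. */
int TryLockWhileHeld(void)
{
    pthread_mutex_t* mutex = malloc(sizeof *mutex);
    pthread_t idThread;
    int rc;

    assert(mutex != NULL);
    rc = pthread_mutex_init(mutex, NULL);   /* NULL: default attributes */
    assert(rc == 0);

    pthread_mutex_lock(mutex);
    rc = pthread_create(&idThread, NULL, Prober, mutex);
    assert(rc == 0);
    rc = pthread_join(idThread, NULL);
    assert(rc == 0);
    pthread_mutex_unlock(mutex);

    pthread_mutex_destroy(mutex);   /* tidy up the mutex... */
    free(mutex);                    /* ...then the memory it lived in */
    return tryResult;
}
```

The order at the end matters: unlock, then destroy, then free. Destroying a mutex that's still locked, or freeing one that some thread might still use, is asking for trouble.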

3.5 Condition variables

I'll deal with these in an additional appendix.

3.6 Synchronisation problems

3.6.1 Deadlocks

Deadlocks occur where one thread needs another thread to do something before it can proceed, and the second needs the first to do something too. They've gone and got themselves stuck in the proverbial door, neither one able to free itself. This, you might guess, is a Bad Thing. The only way to avoid it is to be careful. Always acquire locks in the same order. Always [10]. There should be very little reason for you to do otherwise.

3.6.2 Race conditions

Threads have a little side-effect: they introduce an element of nondeterminacy [11] into your otherwise sane and predictable program. Where deadlocks come from taking locks out of order, race conditions usually come from touching shared data without holding the lock that's supposed to protect it.
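To show what "always acquire locks in the same order" from 3.6.1 looks like in code, here's a rough sketch with two accounts, each with its own mutex, and two threads shifting money in opposite directions. All the names are made up, and locking the lower-numbered account first is just one convenient way of fixing an order.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define TRANSFERS 50000

/* Hypothetical accounts, each guarded by its own mutex. */
static long balances[2] = { 1000, 1000 };
static pthread_mutex_t locks[2] = {
    PTHREAD_MUTEX_INITIALIZER,
    PTHREAD_MUTEX_INITIALIZER
};

/* Moves one unit between accounts. Whatever the direction, the
   lower-numbered account's lock is always taken first. */
static void Transfer(int from, int to)
{
    int first = from < to ? from : to;
    int second = from < to ? to : from;

    pthread_mutex_lock(&locks[first]);
    pthread_mutex_lock(&locks[second]);
    balances[from]--;
    balances[to]++;
    pthread_mutex_unlock(&locks[second]);   /* release in reverse order */
    pthread_mutex_unlock(&locks[first]);
}

/* Thread function: repeatedly moves money out of its own account. */
static void* Mover(void* arg)
{
    int from = *(int*) arg;
    int i;

    for (i = 0; i < TRANSFERS; i++)
        Transfer(from, 1 - from);
    return NULL;
}

/* Runs both movers to completion and returns the combined balance. */
long TotalAfterTransfers(void)
{
    pthread_t ids[2];
    int froms[2] = { 0, 1 };
    int i, rc;

    for (i = 0; i < 2; i++) {
        rc = pthread_create(&ids[i], NULL, Mover, &froms[i]);
        assert(rc == 0);
    }
    for (i = 0; i < 2; i++) {
        rc = pthread_join(ids[i], NULL);
        assert(rc == 0);
    }
    return balances[0] + balances[1];
}
```

Had Transfer() simply locked from and then to, one thread could end up holding lock 0 and waiting on lock 1 while the other held lock 1 and waited on lock 0, and that's your deadlock. With the fixed order, the program always finishes and the combined balance stays put.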

4 Le Fin

If you've any questions or comments, you'll always be able to get me at <[email protected]>, though <[email protected]> should work too. The latest version of this document should be downloadable from either my website or my weblog.
[10] If you don't, expect me to be down at your house ready to whack a soggy haddock against the back of your head.
[11] i.e. randomness: you can't tell what order things will happen in.
