Introduction To Concurrency

I have recently been researching concurrency control (specifically uses of Hardware Transactional Memory) for my Master's thesis, so I figured I would start off with some posts about basic concurrency control problems.

In computer systems, multiple processors (or threads) require coordination when operating on the same data similar to how several people working together require coordination when working with the same resources (such as files). Concurrency control techniques are used to provide coordination between concurrent operations on the data.

For a real-world example where concurrency control is needed, consider booking airline flights (this example is from a discussion with Fredrik Kjolstad). The airline gives me a list of seats that are free, I pay for one of them, and the airline records that the seat is mine and no longer free. Now what if two people, Alice and Bob, are booking flights at the same time? Here is an ordering of actions that results in a major problem: If both people get the list of seats at the same time, they get identical lists. Alice wants a window seat near the front of the plane, so she decides on seat 10F. She tells the airline that she wants seat 10F, she then pays for it, and the airline records that seat 10F belongs to "Alice." Bob also wants a window seat near the front, so he decides on 10F as well. There is already a problem: Bob sees that 10F is free because he got the list before Alice made her reservation. He tells the airline that he wants seat 10F, pays for it, and then the airline records that seat 10F belongs to "Bob." Now both Alice and Bob were confirmed for the same seat, but when Alice shows up to the airport her reservation has disappeared and she will be unable to board.

The problem here is fairly obvious. The airline has to check that the seat is still free when they make a new reservation. This step, however, was not required when there was only one person booking flights at a time. There is no way that a seat that was free when the list was retrieved is no longer free at the time of payment if there are no concurrent reservations.

One way to solve the problem is to allow only one person to make reservations at a time. This is called mutual exclusion (implemented using a "lock"). During this time, one person has exclusive access to the booking system. They may read the list of seats and reserve one without interruption. It is obvious, however, that this is a terrible idea in this instance. An airline that allows only one person in the world to make a reservation at a time will leave a lot of people waiting for their turn. Additionally, a delay for one customer affects everyone. If I am in the middle of making a reservation, but then go off to get coffee, the entire world must wait for me.

A better strategy is to "lock" each individual seat rather than the whole booking system. It doesn't matter if I am booking seat 10F at the same time that someone else is booking seat 9D. The new algorithm to book a seat is: 1) Get the list of seats and choose one, 2) Request exclusive access to the seat ("lock" it), 3) When we have exclusive access, check that it is still free, 4) Make a reservation if it is free, otherwise go back to step 1 and choose another seat (also, no matter what happens, make sure to unlock the seat when we are done). This solves the problem of making sure that the seat is still free when booking it, but it does so at a finer granularity and doesn't lock everyone out of the entire booking system. This strategy also has several issues, but I will address those in a future post.

While the issue with the airline booking system is obvious, concurrency issues in computer programs can be much more subtle. Let's look at a simple example that is similar to the "like" functionality on Facebook.

int likes = 0;

void like_post()
{
    likes += 1;
}

The like_post() function increases the number of "likes" that a post has (for simplicity let's just say there is only one post, and it's number of likes is in the "likes" variable, which starts at 0). All that this function does is add one to the likes variable, however, there is a major problem with calling like_post() concurrently.

I wrote a program that creates two threads that each call like_post() 10,000 times (source code in my blog examples repository here). The output we would expect is, 'Post liked 20000 times', but I ran that program several times and saw 'Post liked 13449 times', 'Post liked 19949 times', and several other incorrect values.

The (very subtle) problem is in the "likes += 1" line. When the compiler generates an add to "likes" it does the following:

  1. Load the old value of "likes" into a CPU register
  2. Increment that value by 1
  3. Store the new value back into "likes"

Looking at it this way, the problem is very similar to the airline example. Each thread (or "user") reads the old value (step 1), modifies it (step 2), and stores the new value back (step 3). Both threads can read the value 0, independently increment it (both increment to 1), and store it back (both store back the value 1). The value we would expect from liking a post twice is "2".

The fundamental problem is that the assembly instructions for increment are not executed atomically. Each thread may see partial results from the other (in this case, they see the original "likes" value even though the other thread is at step 2 and has not written back the new value yet). Similar to the airline example, we may solve this problem using a lock that allows only a single thread to read, modify, and write back the "likes" value at a time:

int likes = 0;

pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

void like_post()
{
    pthread_mutex_lock(&lock);
    likes += 1;
    pthread_mutex_unlock(&lock);
}

Running this code results in the correct output, 'Post liked 20000 times'. Full source code is available here.

The gcc and clang compilers include atomic built-ins that can solve this problem as well. The issue was that incrementing a number is not done atomically, but many CPU instruction sets include instructions that can atomically fetch and increment a number.  The compiler intrinsic function "__sync_fetch_and_add" uses one of these instructions:

int likes = 0;

void like_post()
{
    __sync_fetch_and_add(&likes, 1);
}

This also results in the correct output. Full source code is available here.

That's it for now. In one of my next posts I will go into more detail about concurrency control, locking, and what exactly I'm doing for my thesis.

New blog

I've decided to start this blog about my various projects and interests. I never really got into the whole blogging craze when it started up, but I think it will be beneficial to share what I am doing with a larger audience and possibly get some feedback (also, it seems like every programmer has a blog these days).

About me:

The "About me" page I already have on here was copied from my old homepage, and it reads like a CV. Since I am just starting here, I will take this opportunity to introduce myself. My name is Christopher Johnson (but I just go by Chris), and I am a Ph.D student at MIT CSAIL in the Parallel and Distributed Operating Systems group. I am currently working on my Master's thesis on how Hardware Transactional Memory can be used to ease programming of concurrent data structures in operating system kernels. In my free time I work on various Android applications and side projects, which I will describe in more detail in this blog. My leisure interests include video gaming, reading, playing board games with friends, and exploring around Boston. Most of these will probably appear in this blog at some point in time.

I think that's about it for now. Time to get back to work.