Sunday, January 27, 2008

Getting the rug pulled out from under you

At some point when you start writing multi-threaded programs in C++, you come to the realization that the this pointer is no longer the trustworthy soul that it once was. Instead it has become a fair-weather friend that is just waiting for you to make that one little mistake, and then it completely pulls the rug out from under you. Consider this innocuous-looking code:

void Foo::function()
{
    sleep(1000);
    ++m_variable;
}

where m_variable is a member variable of class Foo. Looks totally OK, right? Now consider calling Foo::function in the following way:

void other_function()
{
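    Foo f;  // lives on this thread's stack
    // Returns right away, so f is destroyed while the other thread may still be using it.
    run_in_other_thread(boost::bind(&Foo::function, boost::ref(f)));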
}


If you were to trace through the call to Foo::function that run_in_other_thread runs in another thread, you would likely see it seg fault on ++m_variable (strictly speaking the behavior is undefined, so it could just as easily scribble over unrelated memory without crashing). How can this be? We're in a member function of a Foo object, so how can m_variable not be valid? The problem is that this is no longer a valid pointer. The object that this points at in the call to Foo::function goes out of scope in the original thread, invalidating this as used in Foo::function.

In single-threaded code there is no danger in Foo::function. In order to call Foo::function you need to have a valid reference to the object that you are calling the method on. In the multi-threaded world all bets are off. You no longer have any guarantee that the object that you called Foo::function on still exists or, if it does still exist, that it will continue to exist all the way through the call to Foo::function. The object could exist on another thread's stack and when that stack frame goes away, BOOM, out go the lights in the call to Foo::function.

About now you're probably cursing C++'s lack of garbage collection[1]. In languages like Java or C# the garbage collector saves us. In order to call Foo::function you have to have a live reference to the Foo object in the current thread, thus the garbage collector can't get rid of it halfway through the call to Foo::function. There still could be other resource allocation issues but as with most things the garbage collector takes care of about 90% of your problems.

But if we're stuck with C++ due to legacy code and adding garbage collection isn't an option, we need to deal with this issue in some way. The first thing that comes to mind is to only use methods in contexts where you can be sure that the object whose method is being called will outlast the method call. When the object is on the stack and you're only using it in the thread that owns that stack, then you're fine. That's a good argument for keeping as much of your state local to each thread as possible.
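
One way to picture that rule, as a minimal sketch: keep the object on the stack of the thread that starts the work, and wait for the work to finish before the object's scope ends. This uses std::thread from C++11 as a stand-in for run_in_other_thread, which is an assumption on my part, and value() and run_joined_example are names made up for the illustration.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

class Foo {
public:
    void function() {
        // Stand-in for the sleep(1000) above, shortened so the sketch runs quickly.
        std::this_thread::sleep_for(std::chrono::milliseconds(50));
        ++m_variable;
    }
    int value() const { return m_variable; }  // hypothetical accessor

private:
    int m_variable = 0;
};

// Hypothetical helper: f outlives the call because we join before f's scope ends.
int run_joined_example() {
    Foo f;
    std::thread worker([&f] { f.function(); });
    worker.join();               // block until Foo::function has returned
    assert(!worker.joinable());  // the worker is finished with f
    return f.value();
}
```

The price is that the calling thread has to sit and wait, which is fine when it has nothing else to do and not much use otherwise.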

There are probably a few other cases where you can reason about object lifetimes well enough to be able to say definitively that an object will outlive all of its method calls. But these cases are going to be rare and, as with lock-based programming, you are taking your life in your hands. It would be nice to have other options, as we do for data synchronization.

When you need to start sharing state across threads, boost::shared_ptr becomes a very good friend[2]. At the very least this means that any object that is going to be shared across threads needs to be allocated on the heap and stored in a shared_ptr. Each thread should have its own shared_ptr pointing at the object so you can be sure that the object will stick around as long as there are threads referencing it. Just be sure that you create the new pointers in threads that already have a shared_ptr instance of their own[3].
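
Here is a rough sketch of that arrangement. It assumes C++11's std::shared_ptr and std::async in place of boost::shared_ptr and run_in_other_thread, and start_worker, run_shared_example and value() are hypothetical names used only for this example.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <memory>
#include <thread>

class Foo {
public:
    void function() {
        // Stand-in for the sleep(1000) above, shortened so the sketch runs quickly.
        std::this_thread::sleep_for(std::chrono::milliseconds(50));
        ++m_variable;
    }
    int value() const { return m_variable; }  // hypothetical accessor

private:
    int m_variable = 0;
};

// Hypothetical helper: taking the shared_ptr by value means the copy is made
// in the calling thread, which already holds a reference of its own.
std::future<int> start_worker(std::shared_ptr<Foo> foo) {
    assert(foo != nullptr);  // the caller must hand over a live object
    return std::async(std::launch::async, [foo] {
        foo->function();     // the lambda's own copy keeps the Foo alive
        return foo->value();
    });
}

int run_shared_example() {
    std::future<int> result;
    {
        std::shared_ptr<Foo> foo = std::make_shared<Foo>();
        result = start_worker(foo);
    }  // the creating scope's reference is released here, probably mid-call
    return result.get();
}
```

The creating scope lets go of its pointer almost immediately, yet the call still completes because the worker holds a copy of its own. Note that the copy is handed over by value; passing a reference to the caller's shared_ptr into the other thread would bring back exactly the lifetime problem described above.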

So you need to be careful about separating object ownership and object access. Being a method in a class gives a function the latter. Object ownership is a thread-level concept, not an object or function level concept. So you need to consider whether the thread that a function is executing in has some sort of ownership of the object that you are using. If not, then your method call could have the rug yanked out from under it at any time[4].


  1. If you're not then you really should be.
  2. Just be sure to respect shared_ptr's boundaries when it comes to threads. Failure to do this will very quickly turn boost::shared_ptr into a mortal enemy.
  3. Just wanted to really stress that shared_ptr has very specific threading issues and you should really take the time to understand them before using shared_ptr in a multi-threaded program.
  4. Again, the lack of garbage collection should really concern you.
