So I was listening to the Stack Overflow podcast and Jeff Atwood declared that he thought it was ridiculous that kids are graduating with a CS degree without being taught about source code control. Now I agree that source code control is an important part of software engineering, but it doesn't seem like a good subject for a college class. Implementing a source code control system sounds like a good subject for a class, but using one? It's not like using a source code control system is all that hard. Where I work we use Seapine Surround, which is far from well known, but we usually have new people up to speed with its basic use in a couple of minutes. It's just not that hard. Now I would look askance at someone who was supposedly experienced yet had never used a source code control system, but a new grad I wouldn't really mind.
Saturday, May 3, 2008
Sunday, January 27, 2008
Getting the rug pulled out from under you
At some point when you start writing multi-threaded programs in C++ you come to the realization that the this pointer is no longer the trustworthy soul that it once was. Instead it has become a fair-weather friend that is just waiting for you to make that one little mistake and then it completely pulls the rug out from under you. Consider this innocuous-looking code:
void Foo::function()
{
    sleep(1000);
    ++m_variable;
}
where m_variable is a member variable of class Foo. Looks totally OK, right? Now consider calling Foo::function in the following way:
void other_function()
{
    Foo f;
    run_in_other_thread(boost::bind(&Foo::function, ref(f)));
}
If you were to trace through the call to Foo::function that run_in_other_thread runs in another thread, you would see that it segfaults on ++m_variable. How can this be? We're in a member function of a Foo object; how can m_variable not be valid? The problem is that this is no longer a valid pointer. The object that this points at in the call to Foo::function goes out of scope in the original thread, invalidating this as used in Foo::function.
In single-threaded code there is no danger in Foo::function. In order to call Foo::function you need to have a valid reference to the object that you are calling the method on. In the multi-threaded world all bets are off. You no longer have any guarantee that the object you called Foo::function on still exists, or, if it does still exist, that it will continue to exist all the way through the call to Foo::function. The object could exist on another thread's stack, and when that stack frame goes away, BOOM, out go the lights in the call to Foo::function.
About now you're probably cursing C++'s lack of garbage collection1. In languages like Java or C# the garbage collector saves us: in order to call Foo::function you have to have a live reference to the Foo object in the current thread, so the garbage collector can't get rid of it halfway through the call to Foo::function. There could still be other resource allocation issues, but as with most things the garbage collector takes care of about 90% of your problems.
But if we're stuck with C++ due to legacy code and adding garbage collection isn't an option, we need to deal with this issue in some way. The first thing that comes to mind is to only call methods in contexts where you can be sure that the object whose method is being called will outlast the method call. When the object is on the stack and you're only using it in the thread that owns that stack, you're fine. That's a good argument for keeping as much of your state local to each thread as possible.
There are probably a few other cases where you can reason about object lifetimes well enough to say definitively that an object will outlive all of its method calls. But these cases are going to be rare, and as with lock-based programming you are taking your life in your hands; really it would be nice to have other options, as we do for data synchronization.
When you need to start sharing state across threads, boost::shared_ptr becomes a very good friend2. At the very least this means that any object that is going to be shared across threads needs to be allocated on the heap and stored in a shared_ptr. Each thread should have its own shared_ptr pointing at the object, so you can be sure that the object will stick around as long as there are threads referencing it. Just be sure that you create the new pointers in threads that already have a shared_ptr instance of their own3.
So you need to be careful about separating object ownership and object access. Being a method in a class gives a function the latter. Object ownership is a thread-level concept, not an object- or function-level concept. So you need to consider whether the thread that a function is executing in has some sort of ownership of the object that you are using. If not, then your method call could have the rug yanked out from under it at any time4.
- If you're not then you really should be.↩
- Just be sure to respect shared_ptr's boundaries when it comes to threads. Failure to do this will very quickly turn boost::shared_ptr into a mortal enemy.↩
- Just wanted to really stress that shared_ptr has very specific threading issues, and you should really take the time to understand them before using shared_ptr in a multi-threaded program.↩
- Again, the lack of garbage collection should really concern you.↩
Monday, January 14, 2008
Haskell Shuffling
After reading Jeff Atwood's post about shuffling I decided it would be interesting to implement both his naive shuffling algorithm and the Fisher-Yates algorithm in Haskell. It seemed like a good little exercise and gave me a chance to check out the ST monad and the various Array data types in Haskell.
This post is a literate Haskell program, just copy it into a file named Shuffle.lhs and you can import the Shuffle module into any other Haskell program. Here's the module definition and imports:
> module Shuffle (naiveShuffle, fyShuffle) where
>
> import System.Random
> import Data.Array.ST
> import Data.Array.MArray
> import Data.Array.IArray
> import Data.Array.Unboxed
> import Control.Monad.ST
> import Control.Monad.State
So what we're trying to do is take a sequence of numbers and generate a new sequence that is a random re-ordering of the initial one. Since we're going to be randomly re-ordering elements of a sequence, a list would not be the greatest idea given the O(n) cost of accessing and updating arbitrary list elements. Instead we want to use an Array. To be more specific, I'm going to use STUArray since I want a mutable array while I'm doing the shuffling. Also, since I'm just shuffling Ints, I'm using an unboxed array that stores the values directly. Were we shuffling a non-primitive type we would need to use STArray instead, which would cost a bit in performance and memory usage since pointers to the elements are stored in the array instead of the elements themselves. We will also be working in the ST monad, as is required when using mutable arrays outside of the IO monad.
Instead of working in the ST monad we could use IOUArray, but then our shuffling routines would only be usable in the IO monad. Using the ST monad gives us a bit more flexibility.
It turns out that the only real difference between the naive and Fisher-Yates (FY) shuffling algorithms is how we choose the elements in the array to swap. In both cases we start with the last element in the array and swap it with a random element from the array, then do the same with the second-to-last element, and so on until we get to the start of the array. In the naive algorithm we swap with any element in the array, while in the FY algorithm we only consider elements before the current element for swapping. We can encode these rules in the following two functions, which pass a bounds filter to the actual shuffle algorithm implementation. The bounds filter function just takes the array bounds and the index of the element being swapped and returns the bounds for generating a random element to swap with:
> naiveShuffle :: (Int, Int) -> StdGen -> (UArray Int Int, StdGen)
> naiveShuffle bs randGen = shuffleImpl bs randGen boundsFilter
>   where
>     boundsFilter bs _ = bs
>
> fyShuffle :: (Int, Int) -> StdGen -> (UArray Int Int, StdGen)
> fyShuffle bs randGen = shuffleImpl bs randGen boundsFilter
>   where
>     boundsFilter (lo, _) cur = (lo, cur)
Concentrate on boundsFilter for the moment; we'll get to the rest in a minute. boundsFilter takes the array bounds and the index of the current element and generates bounds for the random element index to swap the current element with. In the naive case the bounds for the random index are just the full array bounds, while in the FY case the bounds run from the lo end of the array bounds to the current index. This is really the only difference between the two algorithms.
The rest of the shuffling algorithm is defined in shuffleImpl, which takes the bounds for the array to shuffle, a random number generator and a filter for the bounds. It is implemented as:
> shuffleImpl :: (Int, Int) -> StdGen ->
>                ((Int, Int) -> Int -> (Int, Int)) ->
>                (UArray Int Int, StdGen)
> shuffleImpl bs@(lo, hi) randGen boundsFilter = runST runShuffle
>   where
>     runShuffle :: ST s (UArray Int Int, StdGen)
>     runShuffle = do a <- createArray bs
>                     r' <- doShuffle hi a randGen
>                     a' <- unsafeFreeze a
>                     return (a', r')
>     doShuffle :: Int -> (STUArray s Int Int) -> StdGen ->
>                  ST s StdGen
>     doShuffle cur a r
>       | cur > lo = do (n, r') <- return $ randomR (boundsFilter bs cur) r
>                       swapElems a cur n
>                       doShuffle (cur - 1) a r'
>       | otherwise = return r
>     swapElems :: (STUArray s Int Int) -> Int -> Int -> ST s ()
>     swapElems a n1 n2 = do v1 <- readArray a n1
>                            v2 <- readArray a n2
>                            writeArray a n1 v2
>                            writeArray a n2 v1
shuffleImpl returns the shuffled array and the updated random number generator. The first thing to note is that we start with a call to runST. We have to use runST instead of runSTUArray because we want to get the updated random number generator out of the ST computation, and runSTUArray only returns the computed array. You've probably noticed that there are type annotations on all of the function definitions so far. So far none of them have been necessary; they're there for pedagogical purposes1. Now for the definition of createArray:
> createArray :: (Int, Int) -> (ST s (STUArray s Int Int))
> createArray bs@(low, _) = newListArray bs [low..]
When we define createArray the type annotation is required so that the call to MArray.newListArray knows which type of array to create. All newListArray knows is that we want something that is in the MArray type class. The explicit type annotation tells the compiler to use the STUArray instance of MArray when the call to newListArray is made.
So really all shuffleImpl does is use runST to run the runShuffle computation. In runShuffle we use createArray to create a new array initialized to the integers in our bounds in ascending order. Then doShuffle is run, which iterates over the elements of the array, swapping them according to our random number generation scheme. Note that the updates to the random number generator have to be threaded through the calls to doShuffle. When doShuffle is done we have to freeze the mutable array so that it can be sent out of the ST monad and back to the caller of shuffleImpl. We use unsafeFreeze here, which avoids an array copy when the immutable array is created. Since we are not going to use the mutable array anymore beyond this point, this is actually a safe thing to do. Finally the immutable array and the updated random number generator are returned.
One thing that gave me trouble when I first started trying to use the ST monad was that I wanted to put forall s . on all my type annotations. The definition of ST involves forall, so I thought that I needed forall all over the place as well. The thing is, in all of the ST s types above the compiler fills in s for you. The type for s is hidden in the call to runST, and the user of the ST monad does not get to know what it is. Its only purpose is to keep the state of one call to runST separate from any other calls to runST.
Did you notice in doShuffle how we're passing StdGens all over the place? This is screaming out for the State monad, or in our case its cousin the StateT monad transformer. So we're now going to wrap our ST monad in StateT so we don't have to pass random number generators all over the place. We'll call the new version of shuffleImpl that uses StateT shuffleImpl'.
> type ShuffleState s a = StateT StdGen (ST s) a
>
> shuffleImpl' :: (Int, Int) -> StdGen ->
>                 ((Int, Int) -> Int -> (Int, Int)) ->
>                 (UArray Int Int, StdGen)
> shuffleImpl' bs@(lo, hi) randGen boundsFilter =
>     runST (runStateT runShuffle randGen)
>   where
>     runShuffle :: ShuffleState s (UArray Int Int)
>     runShuffle = do a <- lift $ createArray bs
>                     doShuffle hi a
>                     lift $ unsafeFreeze a
>     doShuffle :: Int -> (STUArray s Int Int) -> ShuffleState s ()
>     doShuffle cur a
>       | cur > lo = do n <- getRandom $ boundsFilter bs cur
>                       swapElems a cur n
>                       doShuffle (cur - 1) a
>       | otherwise = return ()
>     getRandom :: (Int, Int) -> ShuffleState s Int
>     getRandom bs = do r <- get
>                       (n, r') <- return $ randomR bs r
>                       put r'
>                       return n
>     swapElems :: (STUArray s Int Int) -> Int -> Int ->
>                  ShuffleState s ()
>     swapElems a n1 n2 = do v1 <- lift $ readArray a n1
>                            v2 <- lift $ readArray a n2
>                            lift $ writeArray a n1 v2
>                            lift $ writeArray a n2 v1
The first thing we do is define our state type ShuffleState. Note that it is parameterized on both the type of the monadic value a and the ST monad type s. This is important. I originally tried only parameterizing on a and introducing s on the right side using forall. As with the non-State implementation, the use of forall is the wrong thing to do. The compiler is smart enough to figure out what s should be in all the uses of ShuffleState.
The big change in shuffleImpl' is that we put a call to runStateT inside the call to runST. This runs the computation in the combined ST and State monads. Our state is the random number generator. We no longer pass the generator around; instead we stick it in the state in the call to runStateT, and then in getRandom we grab the generator from the state, get a random number and stick the updated generator back in the state. Otherwise things work mostly the same as in shuffleImpl, modulo a few calls to lift that are needed to lift values from the ST monad into the combined monad. In our case we need to lift any value that is only in the ST monad, like the results of readArray and writeArray.
You might have noticed that shuffleImpl' is actually bigger than shuffleImpl. This is due to getRandom. While it is bigger, the actual code is a bit cleaner, so I think it's worth the trade-off. If we were doing random number generation in more than just the one spot then we would probably see a net gain in code size.
So there you go, a quick tutorial on using mutable arrays in the ST monad, both on its own and combined with StateT.
- I suppose pedagogical could mean anal in this case. Normally I wouldn't declare the types of functions defined in a where clause but it seems instructive to do so in this case.↩
Saturday, December 29, 2007
Dunce Cap Removal
OK, I have something to admit. I realized the other day that I totally misunderstood how variable bindings in Haskell do blocks propagate to later statements in the block. Once I realized this and saw how wrong I had been it was sort of like taking off a dunce cap that I didn't know I was wearing. I guess it was a good thing that I never went shooting my mouth off about this subject, else my dunce cap would have become visible to everybody else as well.
To be a bit more explicit about what I was so wrong about consider this bit of Haskell code:
do
  x <- someAction
  y <- someOtherAction
  putStrLn $ show $ x + y
How does the last line access the x binding? For some reason I had concocted this crazy complicated mechanism where the compiler was creating ever-longer tuples containing all the variables bound in the do block and then automatically extracting the proper element from the tuple in each statement that a variable was used in. It was really hand-wavy, and looking back on it I guess I was being a bit lazy. Had I really sat down and worked out how this ever-growing tuple thing worked I would have seen how clunky it was and probably noticed the dunce cap a lot sooner.
Then the other day I decided to manually de-sugar a do block in some code that I was working on. Doing the same to the above results in
someAction >>=
  (\x -> someOtherAction >>=
    (\y -> putStrLn $ show $ x + y))
It was at this point I realized how big of an idiot I had been. The compiler doesn't need to start creating tuples of ever-growing length. In the putStrLn expression y is passed in, so it is obviously available. But notice that the lambda expression containing putStrLn is defined within another lambda expression that x is passed into. x is part of the putStrLn lambda's environment and thus is accessible. So instead of building bigger and bigger tuples as we go further into the do block, the compiler is instead nesting more and more lambda expressions. Since the previous lines in a do block become lambda expressions containing the current expression, any parameters they take are available in the inner lambda expressions.
So now my dunce cap is off. Maybe if I had exposed it to someone earlier they would have disabused me of my misunderstanding earlier. Maybe they would have just pointed and laughed or let me go on thinking the way that I was, sort of an intellectual kick-me sign. At least getting rid of this dunce cap means that I am learning. Though now I'm left wondering how many more dunce caps I'm wearing that I don't know about.
Saturday, November 17, 2007
Poor-man's Closure
So C++ is lacking in some areas of modern programming language design; that's not a big secret. But given the flexibility of C++ the question becomes: how hard is it to add missing features? Here I'm going to take a look at closures and lambda expressions.
First off we can talk about closures. As far as I know it isn't possible to add full closures to C++. But we can get a poor man's closure using the boost::bind library. If you don't know about this library yet you should run (not walk) over to boost.org and check it out. If you work at all with STL algorithms it will cut down immensely on the amount of code you write. So as a demonstration of the usefulness of boost::bind as a sort of closure mechanism, let's say that you want to add two to all the elements in some std::vector<int>. You could do this using std::bind2nd and std::plus but that's kind of a pain in the ass. Instead you can do
std::transform(values.begin(), values.end(),
               values.begin(),
               boost::bind(std::plus<int>(), _1, 2));
Things start getting interesting when you want to use bind to modify a value from outside of your STL algorithm. Take a look at this:
int lastVal = 0;
std::for_each(values.begin(), values.end(),
              boost::bind(sum, boost::ref(lastVal), _1));
Assuming that sum is a function that takes two ints and puts the result into its first argument, this would leave the sum of the vector in lastVal1.
This last example can also be used to illustrate why boost::bind doesn't create a true closure. Suppose that instead of using our bind expression immediately we return it using boost::function2. When we try to use the bind object, very bad things will happen, a crash if we're lucky. This happens because the boost::ref object inside the bind object references a variable that no longer exists3. If we had a true closure then the variable would continue to exist as long as the closure did and we'd be safe. Without garbage collection I don't see how to make this work4.
Once you have closures, lambda expressions (or local function definitions) become very useful. Unfortunately C++ has neither. There is the boost::lambda library, which can be useful in this area, but it is somewhat limited and in my experience it doesn't work so hot with VC++, even the newest incarnation of the compiler5. We can get local function definitions, though, by making a local structure definition6:
int sum(const std::vector<int>& values)
{
    int total = 0;
    struct Calculate
    {
        Calculate(int& total_) : m_total(total_) {}
        int& m_total;
        void operator()(int v2)
        {
            m_total += v2;
        }
    };
    std::for_each(values.begin(), values.end(), Calculate(total));
    return total;
}
One thing to note about the above is that in order to pull total into the Calculate closure we had to have Calculate hold a reference to it7. Local structures like Calculate can refer to local variables in the enclosing function, but only if they are static. Now we probably don't want to go declaring total static since that will lead to reentrancy problems. This is unfortunate; if we didn't have this static-only restriction then Calculate would be much more compact. We can use bind to build the closure, though, and make things a bit more compact8:
int sum(const std::vector<int>& values)
{
    int total = 0;
    struct Calculate
    {
        static void run(int& total_, int v)
        {
            total_ += v;
        }
    };
    std::for_each(values.begin(), values.end(),
                  boost::bind(Calculate::run, boost::ref(total), _1));
    return total;
}
While this makes things a bit more compact, the difference when you're binding more variables is much more dramatic. If we could make true local function definitions the change in size would be slightly better still, and without the refer-only-to-static-local-variables restriction it would be better yet.
So we can get pretty close to closures and local function definitions in C++ without having them built into the language. Obviously you have to be careful to keep your local function definitions short in order to maintain readability. While it's nice to have a short block of code that you're passing into a function defined right where you pass it in, once the code gets too long it will start obscuring the enclosing function's flow and really belongs off on its own. But even if the block of code in question is defined outside of your function you can still close over it with bind.
- I realize this is a really contrived example and that there are much better ways to sum a sequence (std::accumulate comes to mind). But this example is a simple illustration of a technique that is useful when combined with the poor man's lambda that I talk about later.↩
- Note that we can't return the bind expression directly because the type of boost::bind(sum, boost::ref(lastVal), _1) is some hideous thing that the documentation calls various variations on unspecified. Instead we need to capture the bind expression in a boost::function<void(int)> and return that.↩
- You might be debating the usefulness of this construct and you'd be right. In this case it's not so useful to create a bind that binds to a local variable and is returned from the function where that local variable exists. But we could instead bind to a member variable in an object and register that bind object as a callback somewhere. This is useful but puts you on dangerous ground, where we need the object that contains the bound member variable to outlast the callback registration.↩
- One could address this by cluttering up the code with boost::shared_ptrs, but that can quickly get messy. I've heard that the C++ standardization committee is talking about adding closures to the language but for the life of me I can't figure out how that's going to work without garbage collection. Maybe they're only talking about adding closures to Managed C++.↩
- Visual C++ * with the service pack at the time of this writing.↩
- Yep, you can make structure definitions within a function definition. I didn't know about this until a few months ago and it has made my life a lot easier. One thing to note is that you cannot have static data members in a local structure, so their usefulness can be somewhat curtailed in some situations. You also can't use templates at all in the local structure.↩
- The underscore on the end of total_ is only there to make explicit that the constructor parameter is different from the total local variable. The constructor parameter could just as well have been called total since nothing in Calculate can refer to the local variable total.↩
- Note that if you're using VC++ 2005 without the service pack a compiler bug could prevent this code from compiling.↩
Thursday, September 13, 2007
Missing the point
OK, I'll admit it: I've jumped on the SICP bandwagon. I've worked through about half the book1 and am really liking it. Of course I've come to the book a bit late, not being a computer science graduate, though it seems these days one probably wouldn't get exposed to it anyway, or so I hear. It seems like one of those books that's great but people dread taking a class that uses it, kind of like Jackson or Goldstein for physics majors. Anyways, had the bandwagon not come by for me to jump on I probably wouldn't have ever found it, so here's to bandwagons.
Enough blowing smoke up Abelson and Sussman's asses about their book. It's great, we all get it. Though it seems like some people have decided they liked the book but not the language and are going to do something about it. And to me it seems like that's missing the point. Sort of like how people go all ga-ga over peanut butter and chocolate together2, like somehow these two things are just meant to go together. Well in this case they are. A&S chose lisp for a reason, and not just because they're lisp weenies.
The way I feel about it is that there are certain language categories that every programmer should learn at least one language from. They are machine level3, static imperative4, dynamic5, functional6 and lisp7.
Learning a machine level language will help you understand what the machine is doing with your code. Learning lisp will teach you what the compiler is doing with your code. Think of it as the assembly language of computation, whereas assembly is the, well, assembly language of computers. That's why A&S chose it. While there are other languages that look more like lambda calculus (ML anyone?), when you're writing lisp you're writing out the abstract syntax trees for your program. And understanding how your program goes from source code to a computation is powerful stuff.
Now I don't want to bag on the folks doing SICP in other languages, but if you use another language when working through SICP then you're sort of missing the point. Going back and using the exercises later when you're learning a new language is fine, but the first time through use lisp. It won't bite, but you might wear out your ( and ) keys.
- Kind of slow going, but when you've got a two-year-old at home you don't get many solid blocks of time to dig too deeply into things.↩
- Never understood that one myself. Sure they taste OK together but I'm actually happier keeping them apart. But what do I know, I don't even like my ice cream with chunky bits in it and have problems using my microwave.↩
- C is probably good enough but some assembly wouldn't hurt.↩
- Java, C#, C++, etc. For better or worse it's where the jobs are so you probably can't avoid learning at least one of these languages.↩
- Python, Ruby, Javascript, etc.↩
- Preferably pure, lazy and strongly typed, but I guess we can compromise on a point or two.↩
Monday, June 25, 2007
pointer-less?
One thing I remember about Java back when I was working on enterprisey sorts of things is that for a language that supposedly has no pointers, it sure seemed like we got a lot of NullPointerExceptions. NullPointerException was definitely the biggest cause of production bugs, followed closely by ClassCastException. So if a language is purported to not have pointers, why do they seem to be causing so many problems? Isn't Java supposed to be better than C++ due to this lack of pointers?
The problem is that Java does have pointers. In fact everything that isn't a primitive type is handled through pointers. There is no stack/heap dichotomy like there is in C/C++: you either have a primitive type that lives on the stack (or is part of a non-primitive object) or you have a non-primitive type that lives on the heap in the object store. These non-pointer pointers do have some advantages over their C/C++ cousins, the biggest being that dereferencing a null pointer doesn't bring your whole program crashing down around you. Ah, the advantages of a managed run-time, where an invalid pointer dereference doesn't have to be treated as a possible attack and require the slaying of the perpetrator with extreme prejudice.
While it is nice that you can catch NullPointerExceptions, it would be nicer if people actually did it more than half the time. Like I said above, when I worked on enterprisey stuff NullPointerException was the single biggest cause of production problems. Now I don't think that I or any of the people I was working with in said shop were morons. And it seemed like failure to account for the possibility of null pointers was one of the biggest problems we saw in programming tests we gave during interviews. I'd say that this is a product of the "Java has no pointers" mentality, but C/C++ programmers make their fair share of the same sort of error. Maybe certain languages that have done away with pointers altogether are on to something. Anyway, the next time you're bagging on C/C++ pointers compared to Java/C#, keep in mind that you're not really that far ahead of the game; a few steps maybe, but not leaps and bounds.