Friday, February 13, 2009
Hoops to Jump Through
disable_shared_from_this
About once a year I end up thinking that using boost::enable_shared_from_this is a good idea. And every time it ends up wasting almost an hour of my time. You see, there's a restriction on shared_from_this that disallows its use in the constructor of the object that is being shared from. Unfortunately this restriction is not mentioned in the documentation. I always try to use shared_from_this from the constructor and get bad_weak_ptr errors at run time. I then spend twenty minutes flailing around in the source code trying to figure out what I'm doing wrong until a bit of googling finally brings the not-in-the-constructor restriction out into the open. Once I realize this and start trying to fix things so I don't need to use shared_from_this in the constructor, I find that I don't need enable_shared_from_this at all. This happens every time. I'm guessing that most of the boost folks have had the same experience, since they don't go out of their way to call attention to enable_shared_from_this. Maybe I should stay away from libraries whose documentation can only be reached from a link in an answer to a FAQ about another library. That is just not a good sign.
boost::enable_shared_from_this turned out to be useful.
Monday, September 8, 2008
Why Emacs?
In the process of hacking some C# compatibility into my Emacs config I came across this page. The page itself is moderately interesting, and another entry in the guy's blog held the solution to a problem I was facing. But what jumped out at me were some of the comments. The most egregious was:
You are so lame man... you spent all that time just so you have MOST of the features you get for free on the C# express. lol you FAIL at seeing the big picture
Yes, there are features in Visual Studio that Emacs lacks. I'll give him that. But then Visual Studio lacks the quality that Emacs has. The problem is that to get at this quality you can't have this poster's attitude:
The problem is that while emacs and vim will always be flexible, they require a lot of knowledge, time etc... to gather stuff.

And while you may know and enjoy writing elisp code, I simply refuse to learn something specifically for one program. I could live with python or ruby these days, or with a human-readable nice and easy special language, but learning lisp just so that i maximize using emacs sounds like sure overkill, especially cuz i would know i wouldnt really use lisp outside from that (sorry lispers)
The power of Emacs comes from Elisp1. This is similar to the problem with this comment (and a few others like it)
Hey you do know that their is an Emacs mode inside of VS 2005 and higher; just go to Tools -> Options -> Environment -> Keyboard and select Emacs.
It's not the keyboard shortcuts that make people use Emacs. It's the control that Elisp gives you. And I don't just mean writing huge amounts of code to implement huge features like JDEE2. It's the little things. Like today I needed to insert a new function into a COM interface and then shift the id numbers of all the following functions up by one. I could have just gone through and retyped all the numbers, but instead I did a regex replace from "id(([0-9]+))" to "id(\,(+ 1 (string-to-number \1)))" and boom, it was done in a split second. A lot less time, including the time spent writing the regex, and way less brain-numbing. And that's a pretty minor example; once you get the hang of it you start automating all sorts of little things that bug you, things too specific for MS, Jetbrains or the Eclipse folks to bother dealing with. And, at least in my experience with Visual Studio, adding these features to the IDE would be more pain than it's worth.
Maybe the best way to put it is: do you control what your tool can do or does your tool control what you can do? 3
- Though it's funny, if the poster actually went out and learned a proper lisp like Scheme or CL then the complaint would be coming from the opposite direction, given the craptastic dialect that Elisp is. ↩
- Though I bet it's a lot less code than you'd write in VBScript or whatever it is that Visual Studio is using these days. ↩
- Insert Beavis and Butthead snickering here if you are so inclined. ↩
Wednesday, May 7, 2008
Blame the Management
So I've come up with a new reason that C++'s lack of garbage collection is problematic. The strange thing is that the reason is in one of C++'s usual sweet spots: performance.
Really I should qualify the previous statement: I'm talking about performance in multi-threaded programs where you actually want your software to work in some sane fashion as opposed to just flailing around and crashing. To accomplish this you need to be making liberal use of boost::shared_ptr or something similar in order to avoid having the rug pulled out from under you on a regular basis1.
Turns out that the shared_ptr has a dirty little secret: copying one in a multi-threaded program is orders of magnitude slower than copying a bare pointer. The reason is that the shared_ptr has a reference count in it. When you copy the shared_ptr that reference count needs to be incremented in a thread-safe manner. On platforms that support it this is done using atomic integer operations; otherwise OS synchronization primitives come into play. Either way you're talking a lot more cycles than a plain pointer copy, since at minimum you need to wait for the atomic operations to maintain some semblance of processor cache sanity.
So does this mean that we abandon shared_ptr in multi-threaded programs? In C++ the alternative (bare pointers) is much, much worse, so I would say no.
Now if we had a modern2 garbage collected system we wouldn't need to sacrifice cycles to the cache sanity gods. Instead we could copy pointers to our heart's content. Occasionally we would need to give the garbage man some cycles, so in a way the garbage collector amortizes the cycles we need to spend on memory management. But if well implemented this is a much lower tax to pay. In extreme circumstances you could even take control of the collector and only allow it to run when it's not going to get in the way of more important things.
So how do we mitigate things? One option would be to hook up the Boehm garbage collector to our programs; at some point I'm going to sit down and benchmark the collector versus boost::shared_ptr. But that might not be an option depending on how much legacy code we're talking about.
If you're stuck with no garbage collection then the thing to do is minimize how often a shared_ptr is being copied. In the vast majority of cases you should take a const reference to a shared_ptr as a parameter. Unless you're doing something wrong, somewhere either on the stack or (shudder) in a global variable3 there is an instance of the shared_ptr, so you shouldn't have to worry about losing the pointed-at object while your function is running. The exception is when passing the shared_ptr to another thread. In that case you will need to create a copy of the shared_ptr for the other thread. Note that taking a const reference saves you from a double-whammy: if you take the pointer by value then you pay for the new shared_ptr instance when the function is called and again when that instance is destroyed as the function exits.
You should also prefer plain old dynamic_cast over dynamic_pointer_cast. The latter will increment the reference count on the pointer being cast. If you have the original shared_ptr then you know the pointed-at object isn't going away, so just cast the underlying pointer. Obviously if you are going to store the pointer away somewhere you'll have to pay for the copy.
Another thing to watch out for is locking a boost::weak_ptr. You pay every time, since the reference count will be locked. If you often use the pointer value without using the pointed-to object, consider storing the bare pointer along with the weak_ptr. Then you can use the pointer value without paying to lock, but still have the object-lifetime monitoring of weak_ptr when you need it.
boost::shared_ptr is great, but be aware that its use isn't free. Even so, as long as real garbage collection is off the table, the alternatives are worse.
Saturday, May 3, 2008
It's not rocket science
So I was listening to the Stack Overflow podcast and Jeff Atwood declared that he thought it was ridiculous that kids are graduating with a CS degree without being taught about source code control. Now I agree that source code control is an important part of software engineering, but it doesn't seem like a good subject for a college class. Implementing a source code control system sounds like a good subject for a class, but using one? It's not like using a SCCS is all that hard. Where I work we use Seapine Surround, which is far from well known, but we usually have new people up to speed with its basic use in a couple of minutes. It's just not that hard. Now I would look askance at someone who was supposedly experienced yet had never used a source code control system, but a new grad I wouldn't really mind.
Sunday, January 27, 2008
Getting the rug pulled out from under you
At some point when you start writing multi-threaded programs in C++ you come to the realization that the this pointer is no longer the trustworthy soul it once was. Instead it has become a fair-weather friend that is just waiting for you to make that one little mistake before it completely pulls the rug out from under you. Consider this innocuous looking code:
void Foo::function()
{
    sleep(1000);
    ++m_variable;
}
where m_variable is a member variable of class Foo. Looks totally OK, right? Now consider calling Foo::function in the following way:
void other_function()
{
    Foo f;
    run_in_other_thread(boost::bind(&Foo::function, ref(f)));
}
If you were to trace through the call to Foo::function that run_in_other_thread runs in another thread, you would see that it seg faults on ++m_variable. How can this be? We're in a member function of a Foo object, how can m_variable not be valid? The problem is that this is no longer a valid pointer. The object that this points at goes out of scope in the original thread, invalidating this as used in Foo::function.
In single-threaded code there is no danger in Foo::function. In order to call Foo::function you need a valid reference to the object you are calling the method on. In the multi-threaded world all bets are off. You no longer have any guarantee that the object you called Foo::function on still exists, or, if it does, that it will continue to exist all the way through the call. The object could live on another thread's stack, and when that stack frame goes away, BOOM, out go the lights in the call to Foo::function.
About now you're probably cursing C++'s lack of garbage collection1. In languages like Java or C# the garbage collector saves us. In order to call Foo::function you have to have a live reference to the Foo object in the current thread, so the garbage collector can't get rid of it halfway through the call. There could still be other resource allocation issues, but as with most things the garbage collector takes care of about 90% of your problems.
But if we're stuck with C++ due to legacy code and adding garbage collection isn't an option, we need to deal with this issue in some way. The first thing that comes to mind is to only use methods in contexts where you can be sure that the object whose method is being called will outlast the method call. When the object is on the stack and you're only using it in the thread that owns that stack then you're fine. That's a good argument for keeping as much of your state local to each thread as possible.
There are probably a few other cases where you can reason about object lifetimes well enough to say definitively that an object will outlive all of its method calls. But these cases are going to be rare, and as with lock-based programming you are taking your life in your hands; really it would be nice to have other options, as we do for data synchronization.
When you need to start sharing state across threads, boost::shared_ptr becomes a very good friend2. At the very least this means that any object that is going to be shared across threads needs to be allocated on the heap and stored in a shared_ptr. Each thread should have its own shared_ptr pointing at the object, so you can be sure that the object will stick around as long as there are threads referencing it. Just be sure that you create the new pointers in threads that already have a shared_ptr instance of their own3.
So you need to be careful about separating object ownership and object access. Being a method in a class gives a function the latter. Object ownership is a thread-level concept, not an object- or function-level concept. So you need to consider whether the thread that a function is executing in has some sort of ownership of the object that you are using. If not, your method call could have the rug yanked out from under it at any time4.
- If you're not then you really should be.↩
- Just be sure to respect shared_ptr's boundaries when it comes to threads. Failure to do this will very quickly turn boost::shared_ptr into a mortal enemy.↩
- Just wanted to really stress that shared_ptr has very specific threading issues and you should really take the time to understand them before using shared_ptr in a multi-threaded program.↩
- Again, the lack of garbage collection should really concern you.↩
Monday, January 14, 2008
Haskell Shuffling
After reading Jeff Atwood's post about shuffling I decided it would be interesting to implement both his naive shuffling algorithm and the Fisher-Yates algorithm in Haskell. It seemed like a good little exercise and gave me a chance to check out the ST monad and the various Array data types in Haskell.
This post is a literate Haskell program, just copy it into a file named Shuffle.lhs and you can import the Shuffle module into any other Haskell program. Here's the module definition and imports:
> module Shuffle (naiveShuffle, fyShuffle) where
>
> import System.Random
> import Data.Array.ST
> import Data.Array.MArray
> import Data.Array.IArray
> import Data.Array.Unboxed
> import Control.Monad.ST
> import Control.Monad.State
So what we're trying to do is take a sequence of numbers and generate a new sequence that is a random re-ordering of the initial one. Since we're going to be randomly re-ordering elements of a sequence, using a list would not be the greatest idea given lists' O(n) performance for inserting and deleting elements. Instead we want to use an Array. To be more specific, I'm going to use STUArray since I want a mutable array while I'm doing the shuffling. Also, since I'm just shuffling Ints, I'm using an unboxed array that directly stores the values. Were we shuffling a non-primitive type we would need to use STArray instead, which would cost a bit in performance and memory usage since pointers to the elements are stored in the array instead of the elements themselves. We will also be working in the ST monad, as is required when using mutable arrays outside of the IO monad.
Instead of working in the ST monad we could use IOUArray, but then our shuffling routines would only be usable in the IO monad. Using the ST monad gives us a bit more flexibility.
It turns out that the only real difference between the naive and Fisher-Yates (FY) shuffling algorithms is how we choose the elements in the array to swap. In both cases we start with the last element in the array and swap it with a random element from the array, then do the same with the second-to-last element and so on until we get to the start of the array. In the naive algorithm we swap with any element in the array, while in the FY algorithm we only consider elements before the current element for swapping. We can encode these rules in the following two functions, which pass a bounds filter to the actual shuffle algorithm implementation. The bounds filter function just takes the array bounds and the index of the element being swapped and returns the bounds for generating a random element to swap with:
> naiveShuffle :: (Int, Int) -> StdGen -> (UArray Int Int, StdGen)
> naiveShuffle bs randGen = shuffleImpl bs randGen boundsFilter
>     where
>       boundsFilter bs _ = bs
>
> fyShuffle :: (Int, Int) -> StdGen -> (UArray Int Int, StdGen)
> fyShuffle bs randGen = shuffleImpl bs randGen boundsFilter
>     where
>       boundsFilter (lo, _) cur = (lo, cur)
Concentrate on boundsFilter for the moment; we'll get to the rest in a minute. boundsFilter takes the array bounds and the index of the current element and generates bounds for the random element index to swap the current element with. In the naive case the bounds are just the full array bounds, while in the FY case the bounds run from the lo end of the array bounds to the current index. This is really the only difference between the two algorithms.
The rest of the shuffling algorithm is defined in shuffleImpl, which takes the bounds for the array to shuffle, a random number generator and a filter for the bounds. It is implemented as:
> shuffleImpl :: (Int, Int) -> StdGen ->
>                ((Int, Int) -> Int -> (Int, Int)) ->
>                (UArray Int Int, StdGen)
> shuffleImpl bs@(lo, hi) randGen boundsFilter = runST runShuffle
>     where
>       runShuffle :: ST s (UArray Int Int, StdGen)
>       runShuffle = do a  <- createArray bs
>                       r' <- doShuffle hi a randGen
>                       a' <- unsafeFreeze a
>                       return (a', r')
>       doShuffle :: Int -> (STUArray s Int Int) -> StdGen ->
>                    ST s StdGen
>       doShuffle cur a r
>           | cur > lo =
>               do (n, r') <- return $ randomR (boundsFilter bs cur) r
>                  swapElems a cur n
>                  doShuffle (cur - 1) a r'
>           | otherwise = return r
>       swapElems :: (STUArray s Int Int) -> Int -> Int -> ST s ()
>       swapElems a n1 n2 = do
>           v1 <- readArray a n1
>           v2 <- readArray a n2
>           writeArray a n1 v2
>           writeArray a n2 v1
shuffleImpl returns the shuffled array and the updated random number generator. The first thing to note is that we start with a call to runST. We have to use runST instead of runSTUArray because we want to get the updated random number generator out of the ST computation, and runSTUArray only returns the computed array. You've probably noticed that there are type annotations on all of the function definitions so far. None of them have been necessary; they're there for pedagogical purposes1. Now for the definition of createArray:
> createArray :: (Int, Int) -> (ST s (STUArray s Int Int))
> createArray bs@(low, _) = newListArray bs [low..]
When we define createArray the type annotation is required so that the call to newListArray knows which type of array to create. All newListArray knows is that we want something in the MArray type class. The explicit type annotation tells the compiler to use the STUArray instance of MArray when the call to newListArray is made.
So really all shuffleImpl does is use runST to run the runShuffle computation. In runShuffle we use createArray to create a new array initialized to the integers in our bounds in ascending order. Then doShuffle is run, which iterates over the elements of the array, swapping them according to our random number generation scheme. Note that the updates to the random number generator have to be threaded through the calls to doShuffle. When doShuffle is done we have to freeze the mutable array so that it can be sent out of the ST monad and back to the caller of shuffleImpl. We use unsafeFreeze here, which avoids an array copy when the immutable array is created. Since we are not going to use the mutable array beyond this point, this is actually safe. Finally the immutable array and the updated random number generator are returned.
One thing that gave me trouble when I first started using the ST monad was that I wanted to put forall s . on all my type annotations. The definition of ST involves forall, so I thought I needed forall all over the place as well. The problem is that in all of the ST s types above the compiler fills in s for you. The type for s is hidden in the call to runST, and the user of the ST monad does not get to know what it is. Its only purpose is to keep the state of one call to runST separate from any other calls to runST.
Did you notice in doShuffle how we're passing StdGens all over the place? This is screaming out for the State monad, or in our case its cousin, the StateT monad transformer. So we're now going to wrap our ST monad in StateT so we don't have to pass random number generators around. We'll call the new version of shuffleImpl that uses StateT shuffleImpl'.
> type ShuffleState s a = StateT StdGen (ST s) a
>
> shuffleImpl' :: (Int, Int) -> StdGen ->
>                 ((Int, Int) -> Int -> (Int, Int)) ->
>                 (UArray Int Int, StdGen)
> shuffleImpl' bs@(lo, hi) randGen boundsFilter =
>     runST (runStateT runShuffle randGen)
>     where
>       runShuffle :: ShuffleState s (UArray Int Int)
>       runShuffle = do a <- lift $ createArray bs
>                       doShuffle hi a
>                       lift $ unsafeFreeze a
>       doShuffle :: Int -> (STUArray s Int Int) -> ShuffleState s ()
>       doShuffle cur a
>           | cur > lo =
>               do n <- getRandom $ boundsFilter bs cur
>                  swapElems a cur n
>                  doShuffle (cur - 1) a
>           | otherwise = return ()
>       getRandom :: (Int, Int) -> ShuffleState s Int
>       getRandom bs = do r <- get
>                         (n, r') <- return $ randomR bs r
>                         put r'
>                         return n
>       swapElems :: (STUArray s Int Int) -> Int -> Int ->
>                    ShuffleState s ()
>       swapElems a n1 n2 = do
>           v1 <- lift $ readArray a n1
>           v2 <- lift $ readArray a n2
>           lift $ writeArray a n1 v2
>           lift $ writeArray a n2 v1
The first thing we do is define our state type ShuffleState. Note that it is parameterized on both the type of the monadic value a and the ST monad type s. This is important. I originally tried only parameterizing on a and introducing s on the right side using forall. As with the non-State implementation, the use of forall is the wrong thing to do. The compiler is smart enough to figure out what s should be in all the uses of ShuffleState.
The big change in shuffleImpl' is that we put a call to runStateT inside the call to runST. This runs the computation in the combined ST and State monads. Our state is the random number generator. We no longer pass it around explicitly; instead we stick it in the state in the call to runStateT, and then in getRandom we grab the generator from the state, get a random number and stick the updated generator back in the state. Otherwise things work mostly the same as in shuffleImpl, modulo a few calls to lift that are needed to lift values from the ST monad into the combined monad. In our case we need to lift any value that is only in the ST monad, like the results of readArray and writeArray.
You might have noticed that shuffleImpl' is actually bigger than shuffleImpl. This is due to getRandom. While it is bigger, the actual code is a bit cleaner, so I think it's worth the trade-off. If we were doing random number generation in more than just the one spot we would probably see a net gain in code size.
So there you go, a quick tutorial on using mutable arrays in the ST monad, on its own and with StateT.
- I suppose pedagogical could mean anal in this case. Normally I wouldn't declare the types of functions defined in a where clause, but it seems instructive to do so in this case.↩