codeburst

Bursts of code to power through your day. Web Development articles, tutorials, and news.

The Haskell Concurrency Primitive Shootout

Recently I saw this tweet:

I took a look at System.Timeout, and it doesn’t appear to use a global lock, but it does use two global IORefs and atomicModifyIORef. If you dig deep enough into atomicModifyIORef’s implementation, it does lead to some locking. Maybe that is causing contention (spoiler: probably not)?
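For readers who have not used it, here is a minimal sketch of what System.Timeout provides. It only shows the public timeout function from base (the budget is given in microseconds) and says nothing about the internals discussed above.

```haskell
import Control.Concurrent (threadDelay)
import System.Timeout (timeout)

main :: IO ()
main = do
  -- Finishes well inside the 1 s budget, so the result is wrapped in Just.
  fast <- timeout 1000000 (pure (42 :: Int))
  -- Sleeps for 2 s against a 0.1 s budget, so it is interrupted: Nothing.
  slow <- timeout 100000 (threadDelay 2000000 >> pure (0 :: Int))
  print (fast, slow)
```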

To see if I could cause contention I wrote the following benchmark:

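{-# LANGUAGE BangPatterns #-}
import Control.Concurrent (getNumCapabilities)
import Control.Concurrent.Async (asyncOn, wait)
import Control.Monad (forM, replicateM_)
import Criterion.Main (bench, bgroup, defaultMain, whnfIO) -- assuming criterion
import Data.IORef (atomicModifyIORef, newIORef)

main :: IO ()
main = do
  ref <- newIORef (0 :: Int)
  maxCap <- getNumCapabilities
  defaultMain $ flip map [0 .. maxCap - 1] $ \n ->
    bgroup (show (n + 1) ++ " threads")
      [ bench "IORef" $ whnfIO $ do
          -- Pin one worker to each of the first n + 1 capabilities.
          xs <- forM [0 .. n] $ \i -> asyncOn i $
            replicateM_ 10000 $ do
              b <- atomicModifyIORef ref $ \x ->
                let !x' = x + 1 in x' `seq` (x', ())
              return $! b
          mapM_ wait xs
      ]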

Running it with stack run benchmarks -oIORef.html, I get:

[Chart: IORef benchmark results. Time in milliseconds (ms)]

So IORef does slow down considerably as the number of concurrent modifications increases.

Is this slow? How does it compare to the other Haskell concurrency primitives?

I extended the benchmark to include MVar and TVar. I also compared an alternative implementation of atomic IORef modification, atomic-primops’s atomicModifyIORefCAS, and threw in the AtomicCounter just for fun. Here are the results:

[Chart: benchmark results for all primitives. Time in milliseconds (ms)]
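For reference, here is a rough sketch of what the same "add one" operation might look like on each of the primitives compared here. The bump* names are hypothetical, and this assumes the stm and atomic-primops packages (with AtomicCounter taken from Data.Atomics.Counter); the actual benchmark code may differ.

```haskell
{-# LANGUAGE BangPatterns #-}
import Control.Concurrent.MVar (MVar, modifyMVar_, newMVar, readMVar)
import Control.Concurrent.STM (TVar, atomically, modifyTVar', newTVarIO, readTVarIO)
import Control.Monad (replicateM_, void)
import Data.Atomics (atomicModifyIORefCAS)
import Data.Atomics.Counter (AtomicCounter, incrCounter, newCounter, readCounter)
import Data.IORef (IORef, atomicModifyIORef, newIORef, readIORef)

-- IORef via base's atomicModifyIORef; forcing the result forces the new value.
bumpIORef :: IORef Int -> IO ()
bumpIORef ref = do
  b <- atomicModifyIORef ref $ \x -> let !x' = x + 1 in (x', ())
  return $! b

-- Same update, but through atomic-primops's CAS-based variant.
bumpIORefCAS :: IORef Int -> IO ()
bumpIORefCAS ref = do
  b <- atomicModifyIORefCAS ref $ \x -> let !x' = x + 1 in (x', ())
  return $! b

-- MVar: take the value, write back the strictly evaluated successor.
bumpMVar :: MVar Int -> IO ()
bumpMVar mv = modifyMVar_ mv $ \x -> return $! x + 1

-- TVar: a one-variable STM transaction with a strict update.
bumpTVar :: TVar Int -> IO ()
bumpTVar tv = atomically $ modifyTVar' tv (+ 1)

-- AtomicCounter: incrCounter returns the new value, which is discarded here.
bumpCounter :: AtomicCounter -> IO ()
bumpCounter c = void (incrCounter 1 c)

main :: IO ()
main = do
  ref <- newIORef 0
  refCas <- newIORef 0
  mv <- newMVar 0
  tv <- newTVarIO 0
  ctr <- newCounter 0
  -- Single-threaded smoke test: apply every variant 1000 times.
  replicateM_ 1000 $ do
    bumpIORef ref
    bumpIORefCAS refCas
    bumpMVar mv
    bumpTVar tv
    bumpCounter ctr
  totals <- sequence
    [readIORef ref, readIORef refCas, readMVar mv, readTVarIO tv, readCounter ctr]
  print totals
```

This sketch only checks that the five variants agree when run from one thread. It does not reproduce the contention measured in the chart.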

So what does it mean? Well, MVar is the slowest. It is also consistently slower than TVar. This is surprising, since I would have assumed the overhead of TVar’s transactional guarantees would make it slower. Here is a good theory on why this might be the case.

The next surprising thing is that atomicModifyIORefCAS is much faster than atomicModifyIORef. I don’t get this. Both atomicModifyIORefCAS and atomicModifyIORef are doing the same thing: they both call a compare-and-swap primitive and loop until the swap succeeds. Actually, I assumed atomicModifyIORef would be faster because its CAS loop is implemented in C. Please look over my benchmarks here to ensure I’m not doing anything stupid.
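To make the "compare and swap, then loop" idea concrete, here is a minimal sketch of such a retry loop written against the ticket API in atomic-primops (readForCAS, peekTicket, casIORef). It illustrates the general technique only. It is not the library’s implementation of atomicModifyIORefCAS, nor GHC’s implementation of atomicModifyIORef, and casIncrement and hammer are hypothetical names.

```haskell
{-# LANGUAGE BangPatterns #-}
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM, replicateM_)
import Data.Atomics (casIORef, peekTicket, readForCAS)
import Data.IORef (IORef, newIORef, readIORef)

-- | Increment the counter with an explicit compare-and-swap retry loop.
casIncrement :: IORef Int -> IO ()
casIncrement ref = readForCAS ref >>= loop
  where
    loop ticket = do
      -- Compute the candidate value from the snapshot the ticket represents.
      let !new = peekTicket ticket + 1
      -- The swap only succeeds if nobody has written since that snapshot.
      (ok, current) <- casIORef ref ticket new
      -- On failure, retry from the ticket for the value that is there now.
      if ok then pure () else loop current

-- | Hypothetical helper: increment one shared counter from several threads
-- and return the final value once every thread has finished.
hammer :: Int -> Int -> IO Int
hammer threads perThread = do
  ref <- newIORef 0
  dones <- forM [1 .. threads] $ \_ -> do
    done <- newEmptyMVar
    _ <- forkIO (replicateM_ perThread (casIncrement ref) >> putMVar done ())
    pure done
  mapM_ takeMVar dones
  readIORef ref

main :: IO ()
main = hammer 4 10000 >>= print
```

If no updates are lost, the printed total equals threads times perThread. Under contention, each failed swap costs one extra trip around the loop.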

Conclusion

1. Prefer TVar to MVar. STM is easier to use, and it’s not clear that MVar offers any performance benefit (there was zero benefit in this test).

2. atomicModifyIORefCAS is considerably faster than atomicModifyIORef.

3. AtomicCounter is very fast and performs well under contention.

Published in codeburst
