Why is writeSTRef faster than an if-expression?

Time: 2012-03-27 05:24:51

Tags: haskell

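Two writeSTRef calls per iteration:

{-# LANGUAGE BangPatterns #-}
import Control.Monad (replicateM_)
import Control.Monad.ST (runST)
import Data.STRef (newSTRef, readSTRef, writeSTRef)

fib3 :: Int -> Integer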
fib3 n = runST $ do
    a <- newSTRef 1
    b <- newSTRef 1
    replicateM_ (n-1) $ do
        !a' <- readSTRef a
        !b' <- readSTRef b
        writeSTRef a b'
        writeSTRef b $! a'+b'
    readSTRef b

One writeSTRef call per iteration:

fib4 :: Int -> Integer
fib4 n = runST $ do
    a <- newSTRef 1
    b <- newSTRef 1
    replicateM_ (n-1) $ do
        !a' <- readSTRef a
        !b' <- readSTRef b
        if a' > b'
          then writeSTRef b $! a'+b'
          else writeSTRef a $! a'+b'
    a'' <- readSTRef a
    b'' <- readSTRef b
    if a'' > b''
      then return a''
      else return b''

Benchmark, with n = 20000:

benchmarking 20000/fib3
mean: 5.073608 ms, lb 5.071842 ms, ub 5.075466 ms, ci 0.950
std dev: 9.284321 us, lb 8.119454 us, ub 10.78107 us, ci 0.950

benchmarking 20000/fib4
mean: 5.384010 ms, lb 5.381876 ms, ub 5.386099 ms, ci 0.950
std dev: 10.85245 us, lb 9.510215 us, ub 12.65554 us, ci 0.950

fib3 is slightly faster than fib4.

1 answer:

Answer 0 (score: 17)

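I think you already got some answers from #haskell; basically, each writeSTRef boils down to one or two writes to memory, which are cheap in this case because they probably never go beyond the level 1 cache.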

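On the other hand, the branch produced by the if-then-else in fib4 creates two paths that alternate on successive iterations, which is a bad case for many CPU branch predictors and adds bubbles to the pipeline. See http://en.wikipedia.org/wiki/Instruction_pipeline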
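The alternation can be seen without touching ST at all. The following is a minimal sketch, not code from the question: branchTrace is a hypothetical helper that models fib4's (a, b) state purely and records which branch each iteration takes (True for the then branch, False for the else branch).

```haskell
-- Hypothetical helper (not from the original post): a pure model of the
-- loop body in fib4. Each list element says which branch one iteration took.
branchTrace :: Int -> [Bool]
branchTrace k = take k (go 1 1)
  where
    go :: Integer -> Integer -> [Bool]
    go a b
      | a > b     = True  : go a (a + b)   -- then branch: b is overwritten
      | otherwise = False : go (a + b) b   -- else branch: a is overwritten
```

In GHCi, branchTrace 6 evaluates to [False,True,False,True,False,True]: whichever reference was just written now holds the larger value, so the next iteration takes the other branch.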

What about a pure version?

fib0 :: Int -> Integer
fib0 = go 0 1 where
    go :: Integer -> Integer -> Int -> Integer
    go a b n = case n > 0 of
        True -> go b (a + b) (n - 1)
        False -> b

It is even faster:

benchmarking fib0 40000
mean: 17.14679 ms, lb 17.12902 ms, ub 17.16739 ms, ci 0.950
std dev: 97.28594 us, lb 82.39644 us, ub 120.1041 us, ci 0.950

benchmarking fib3 40000
mean: 17.32658 ms, lb 17.30739 ms, ub 17.34931 ms, ci 0.950
std dev: 106.7610 us, lb 89.69371 us, ub 126.8279 us, ci 0.950

benchmarking fib4 40000
mean: 18.13887 ms, lb 18.11173 ms, ub 18.16868 ms, ci 0.950
std dev: 145.9772 us, lb 127.6892 us, ub 168.3347 us, ci 0.950
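
Because fib0 starts from go 0 1 while fib3 and fib4 initialise both references to 1, it is worth confirming that the three functions compute the same values before comparing their timings. A minimal sketch, assuming the definitions above are copied unchanged; allAgree is a hypothetical helper and not part of the original posts.

```haskell
{-# LANGUAGE BangPatterns #-}
import Control.Monad (replicateM_)
import Control.Monad.ST (runST)
import Data.STRef (newSTRef, readSTRef, writeSTRef)

-- Pure accumulator version, as in the answer.
fib0 :: Int -> Integer
fib0 = go 0 1 where
    go :: Integer -> Integer -> Int -> Integer
    go a b n = case n > 0 of
        True -> go b (a + b) (n - 1)
        False -> b

-- Two writes per iteration, as in the question.
fib3 :: Int -> Integer
fib3 n = runST $ do
    a <- newSTRef 1
    b <- newSTRef 1
    replicateM_ (n-1) $ do
        !a' <- readSTRef a
        !b' <- readSTRef b
        writeSTRef a b'
        writeSTRef b $! a'+b'
    readSTRef b

-- One write per iteration, chosen by a branch, as in the question.
fib4 :: Int -> Integer
fib4 n = runST $ do
    a <- newSTRef 1
    b <- newSTRef 1
    replicateM_ (n-1) $ do
        !a' <- readSTRef a
        !b' <- readSTRef b
        if a' > b'
          then writeSTRef b $! a'+b'
          else writeSTRef a $! a'+b'
    a'' <- readSTRef a
    b'' <- readSTRef b
    return (max a'' b'')

-- Hypothetical helper: True when all three agree for every n in [1..k].
allAgree :: Int -> Bool
allAgree k = all (\n -> fib0 n == fib3 n && fib3 n == fib4 n) [1 .. k]
```

Evaluating allAgree 200 in GHCi checks the first 200 inputs; with these definitions, fib0 10, fib3 10 and fib4 10 all give 89.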