代码之家  ›  专栏  ›  技术社区  ›  Joey Adams

Haskell中非装箱数组的循环性能

  •  3
  • Joey Adams  · 技术社区  · 16 年前

    首先,它很棒。然而,我遇到了这样一种情况:我的基准测试出现了奇怪的结果。我是Haskell的新手,这是我第一次被可变数组和单子弄脏手。以下代码基于 this example .

    for 接受数字和步进函数而不是范围的函数(如 forM_ 对于 函数(循环A)反对嵌入等价的递归函数(循环B)。循环A明显比循环B快。更奇怪的是,将循环A和B放在一起比单独使用循环B快(但比单独使用循环A慢一点)。

    我能为这些差异想出一些可能的解释。请注意,这些只是猜测:

    • 循环B以低于循环a的缓存效率的方式对数组进行故障处理。为什么?
    • 我犯了个愚蠢的错误;循环A和循环B实际上是不同的。
      • 请注意,在所有3种情况下,都有循环A和循环B中的一个或两个,程序会产生相同的输出。

    这是密码。我试过了 ghc -O2 for.hs 使用GHC版本6.10.4。

    import Control.Monad
    import Control.Monad.ST
    import Data.Array.IArray
    import Data.Array.MArray
    import Data.Array.ST
    import Data.Array.Unboxed
    
    for :: (Num a, Ord a, Monad m) => a -> a -> (a -> a) -> (a -> m b) -> m ()
    for start end step f = loop start where
        loop i
            | i <= end   = do
                f i
                loop (step i)
            | otherwise  = return ()
    
    primesToNA :: Int -> UArray Int Bool
    primesToNA n = runSTUArray $ do
        a <- newArray (2,n) True :: ST s (STUArray s Int Bool)
        let sr = floor . (sqrt::Double->Double) . fromIntegral $ n+1
    
        -- Loop A
        for 4 n (+ 2) $ \j -> writeArray a j False
    
        -- Loop B
        let f i
            | i <= n     = do
                writeArray a i False
                f (i+2)
            | otherwise  = return ()
            in f 4
    
        forM_ [3,5..sr] $ \i -> do
            si <- readArray a i
            when si $
                forM_ [i*i,i*i+i+i..n] $ \j -> writeArray a j False
        return a
    
    primesTo :: Int -> [Int]
    primesTo n = [i | (i,p) <- assocs . primesToNA $ n, p]
    
    main = print $ primesTo 30000000
    
    2 回复  |  直到 16 年前
        1
  •  2
  •   Travis Brown    16 年前

    我只是试着用 Criterion

    另外,如果您的step函数实际上只是一个step,并且它的参数没有做任何奇怪的事情,那么 for 似乎要快一点,尤其是对于较小的阵列:

    for' :: (Enum a, Num a, Ord a, Monad m) => a -> a -> (a -> a) -> (a -> m b) -> m ()
    for' start end step = forM_ $ enumFromThenTo start (step start) end
    

    以下是标准的结果,其中 loopA' 你的循环是用我的吗 for' ,在哪里 loopC A和B在一起:

    benchmarking loopA...
    mean: 2.372893 s, lb 2.370982 s, ub 2.374914 s, ci 0.950
    std dev: 10.06753 ms, lb 8.820194 ms, ub 11.66965 ms, ci 0.950
    
    benchmarking loopA'...
    mean: 2.368167 s, lb 2.354312 s, ub 2.381413 s, ci 0.950
    std dev: 69.50334 ms, lb 65.94236 ms, ub 73.17173 ms, ci 0.950
    
    benchmarking loopB...
    mean: 2.423160 s, lb 2.419131 s, ub 2.427260 s, ci 0.950
    std dev: 20.78412 ms, lb 18.06613 ms, ub 24.99021 ms, ci 0.950
    
    benchmarking loopC...
    mean: 4.308503 s, lb 4.304875 s, ub 4.312110 s, ci 0.950
    std dev: 18.48732 ms, lb 16.19325 ms, ub 21.32299 ms, ci 0.950<
    

    module Main where
    
    import Control.Monad
    import Control.Monad.ST
    import Data.Array.ST
    import Data.Array.Unboxed
    
    import Criterion.Main
    
    for :: (Num a, Ord a, Monad m) => a -> a -> (a -> a) -> (a -> m b) -> m ()
    for start end step f = loop start where
        loop i
            | i <= end   = do
                f i
                loop (step i)
            | otherwise  = return ()
    
    for' :: (Enum a, Num a, Ord a, Monad m) => a -> a -> (a -> a) -> (a -> m b) -> m ()
    for' start end step = forM_ $ enumFromThenTo start (step start) end
    
    loopA  arr n = for  4 n (+ 2) $ flip (writeArray arr) False
    loopA' arr n = for' 4 n (+ 2) $ flip (writeArray arr) False
    
    loopB arr n =
      let f i | i <= n     = do writeArray arr i False
                                f (i+2)
              | otherwise  = return ()
      in f 4
    
    loopC arr n = do
      loopA arr n
      loopB arr n
    
    runPrimes loop n = do
        let sr = floor . (sqrt::Double->Double) . fromIntegral $ n+1
        a <- newArray (2,n) True :: (ST s (STUArray s Int Bool))
    
        loop a n
    
        forM_ [3,5..sr] $ \i -> do
            si <- readArray a i
            when si $
                forM_ [i*i,i*i+i+i..n] $ \j -> writeArray a j False
        return a
    
    primesA  n = [i | (i,p) <- assocs $ runSTUArray $ runPrimes loopA  n, p]
    primesA' n = [i | (i,p) <- assocs $ runSTUArray $ runPrimes loopA' n, p]
    primesB  n = [i | (i,p) <- assocs $ runSTUArray $ runPrimes loopB  n, p]
    primesC  n = [i | (i,p) <- assocs $ runSTUArray $ runPrimes loopC  n, p]
    
    main = let n = 10000000 in
      defaultMain [ bench "loopA"  $ nf primesA  n
                  , bench "loopA'" $ nf primesA' n
                  , bench "loopB"  $ nf primesB  n
                  , bench "loopC"  $ nf primesC  n ]
    
        2
  •  2
  •   Don Stewart    16 年前

    或许可以和枪战nsieve程序进行比较和对比?在任何情况下,唯一知道真正发生了什么的方法就是 look at the core (例如 the ghc-core tool

    {-# OPTIONS -O2 -optc-O -fbang-patterns -fglasgow-exts -optc-march=pentium4 #-}
    --
    -- The Computer Language Shootout
    -- http://shootout.alioth.debian.org/
    --
    -- Contributed by Don Stewart 2005
    -- nsieve over an ST monad Bool array
    --
    
    import Control.Monad.ST
    import Data.Array.ST
    import Data.Array.Base
    import System
    import Control.Monad
    import Data.Bits
    import Text.Printf
    
    main = do
        n <- getArgs >>= readIO . head :: IO Int
        mapM_ (\i -> sieve (10000 `shiftL` (n-i))) [0, 1, 2]
    
    sieve n = do
       let r = runST (do a <- newArray (2,n) True :: ST s (STUArray s Int Bool)
                         go a n 2 0)
       printf "Primes up to %8d %8d\n" (n::Int) (r::Int) :: IO ()
    
    go !a !m !n !c
        | n == m    = return c
        | otherwise = do
              e <- unsafeRead a n
              if e then let loop j
                              | j < m     = do
                                  x <- unsafeRead a j
                                  when x $ unsafeWrite a j False
                                  loop (j+n)
    
                              | otherwise = go a m (n+1) (c+1)
                        in loop (n `shiftL` 1)
                   else go a m (n+1) c