【Question title】: Most efficient way to do list comprehension with filter
【Posted】: 2016-03-11 00:37:03
【Question】:

I'm learning Haskell and coming from Python, so list comprehensions are familiar. Take this list comprehension (please):

[x^2 | x <- [1..10], x^2 < 50]
[1,4,9,16,25,36,49]

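Is the expression x^2 evaluated twice here for each value of x? Is there a way to write this comprehension so that x^2 is computed only once? Would it make sense to do this instead: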

filter (< 50) [x^2 | x <- [1..10]]
[1,4,9,16,25,36,49]

Is this the more "Haskell way" of doing things? And is it more efficient?

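【Question comments】:

  • It depends on whether GHC decides to do common subexpression elimination. This looks simple enough to me that I'd expect GHC to optimize it for you (provided you have enabled optimizations, of course), so that the two implementations perform identically. Someone with more low-level GHC experience than me could show you the generated Core and point out the optimization. I do think your second implementation is easier to read, though.
  • For what it's worth, I'd probably use point-free style for this: filter (< 50) $ map (^ 2) [1..10]
  • @ChrisMartin I guess I don't know how idiomatic comprehensions are in Haskell. I use them everywhere in Python. Any thoughts?
  • List comprehensions also support let bindings: [y | x <- [1..10], let y = x ^ 2, y < 50]
  • What @JonPurdy said. --- Also, can you guess what [y | x <- [1..], let y = x ^ 2, y < 50] will do? (Then have a look at Data.List.takeWhile.)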

Tags: haskell


【Solution 1】:

You can use let in a list comprehension:

[ z | x <- [1..10], let z = x^2, z < 50]

Then x^2 is computed only once for each x.

【Comments】:

    【Solution 2】:
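
As a quick check that the let form yields the same list as the comprehension from the question, here is a minimal sketch. The names squaresTwice and squaresLet are hypothetical and chosen only for illustration. Whether the first form really evaluates x^2 twice depends on the compiler's optimizations, so this only demonstrates that the results agree, not how much work is done.

```haskell
-- Hypothetical names for illustration; neither appears in the question.

-- Form from the question: x^2 is written twice, once in the guard
-- and once in the result.
squaresTwice :: [Int]
squaresTwice = [x^2 | x <- [1..10], x^2 < 50]

-- let form: z is bound once per element and shared by guard and result.
squaresLet :: [Int]
squaresLet = [z | x <- [1..10], let z = x^2, z < 50]

main :: IO ()
main = do
  print squaresLet                    -- [1,4,9,16,25,36,49]
  print (squaresTwice == squaresLet)  -- True
```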

    I would do it this way, which is similar to your second example:

    filter (<50) (map (^2) [1..10])
    

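    I have a bias against list comprehensions. They basically do only three things (map, filter, and cross product), and you want a larger vocabulary of operations than those three. Study the Data.List module.

    As for performance, we can benchmark it with the criterion library without much effort. (I've put a repo here; you can build it with the Stack tool.)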
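
One example of that wider vocabulary is takeWhile, which a comment on the question also points to. The sketch below is only an illustration: squaresBelow is a hypothetical name, and it relies on the squares of 1, 2, 3, ... being strictly increasing, so that everything after the first failing element can be skipped.

```haskell
-- A guard on an infinite generator keeps testing elements forever, so
--   [y | x <- [1..], let y = x^2, y < 50]
-- yields its seven elements and then never terminates.
-- takeWhile stops at the first element that fails the predicate instead.

-- Hypothetical helper: all squares of positive integers below a limit.
squaresBelow :: Int -> [Int]
squaresBelow limit = takeWhile (< limit) (map (^2) [1..])

main :: IO ()
main = print (squaresBelow 50)  -- [1,4,9,16,25,36,49]
```

A filter cannot do this, because it has no way of knowing that no later element will pass the test.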

    import Criterion.Main
    
    main = defaultMain
           [ bgroup "one" [ bench "10"    $ nf one 10
                          , bench "100"   $ nf one 100
                          , bench "1000"  $ nf one 1000
                          , bench "10000" $ nf one 10000
                          ]
           , bgroup "two" [ bench "10"    $ nf two 10
                          , bench "100"   $ nf two 100
                          , bench "1000"  $ nf two 1000
                          , bench "10000" $ nf two 10000
                          ]
           , bgroup "three" [ bench "10"    $ nf three 10
                            , bench "100"   $ nf three 100
                            , bench "1000"  $ nf three 1000
                            , bench "10000" $ nf three 10000
                            ]
           ]
    
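    -- Comprehension that writes x^2 twice: in the guard and in the result.
    one :: Int -> Int
    one n = sum [x^2 | x <- [1..n], x^2 < n*5]
    
    -- Comprehension for the squares, followed by a separate filter.
    two :: Int -> Int
    two n = sum (filter (<(5*n)) [x^2 | x <- [1..n]])
    
    -- Plain map and filter, no comprehension.
    three :: Int -> Int
    three n = sum (filter (<(5*n)) (map (^2) [1..n]))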
    

    I got these results, which to my eye show little difference between the three (if any):

    % stack install --ghc-options='-O2'
    Copied executables to /Users/luis.casillas/.local/bin:
    - comprehension
    
    % comprehension
    benchmarking one/10
    time                 18.40 ns   (18.35 ns .. 18.45 ns)
                         1.000 R²   (1.000 R² .. 1.000 R²)
    mean                 18.38 ns   (18.33 ns .. 18.42 ns)
    std dev              143.7 ps   (116.9 ps .. 173.6 ps)
    
    benchmarking one/100
    time                 89.11 ns   (88.49 ns .. 89.72 ns)
                         1.000 R²   (1.000 R² .. 1.000 R²)
    mean                 88.78 ns   (88.42 ns .. 89.44 ns)
    std dev              1.582 ns   (1.231 ns .. 2.103 ns)
    variance introduced by outliers: 23% (moderately inflated)
    
    benchmarking one/1000
    time                 649.2 ns   (640.7 ns .. 658.7 ns)
                         0.998 R²   (0.998 R² .. 0.999 R²)
    mean                 647.6 ns   (637.8 ns .. 658.0 ns)
    std dev              31.40 ns   (24.70 ns .. 40.84 ns)
    variance introduced by outliers: 66% (severely inflated)
    
    benchmarking one/10000
    time                 6.197 μs   (6.079 μs .. 6.282 μs)
                         0.997 R²   (0.996 R² .. 0.998 R²)
    mean                 6.180 μs   (6.058 μs .. 6.295 μs)
    std dev              436.0 ns   (371.1 ns .. 531.8 ns)
    variance introduced by outliers: 77% (severely inflated)
    
    benchmarking two/10
    time                 20.23 ns   (19.89 ns .. 20.56 ns)
                         0.999 R²   (0.998 R² .. 0.999 R²)
    mean                 19.89 ns   (19.71 ns .. 20.11 ns)
    std dev              709.8 ps   (582.1 ps .. 939.1 ps)
    variance introduced by outliers: 58% (severely inflated)
    
    benchmarking two/100
    time                 83.95 ns   (83.14 ns .. 84.90 ns)
                         0.999 R²   (0.999 R² .. 1.000 R²)
    mean                 83.34 ns   (82.59 ns .. 83.99 ns)
    std dev              2.354 ns   (1.890 ns .. 3.043 ns)
    variance introduced by outliers: 44% (moderately inflated)
    
    benchmarking two/1000
    time                 645.3 ns   (635.8 ns .. 655.4 ns)
                         0.998 R²   (0.997 R² .. 0.999 R²)
    mean                 652.9 ns   (643.1 ns .. 664.5 ns)
    std dev              35.54 ns   (29.67 ns .. 46.19 ns)
    variance introduced by outliers: 71% (severely inflated)
    
    benchmarking two/10000
    time                 6.268 μs   (6.142 μs .. 6.385 μs)
                         0.998 R²   (0.997 R² .. 0.999 R²)
    mean                 6.200 μs   (6.099 μs .. 6.367 μs)
    std dev              397.6 ns   (261.9 ns .. 637.4 ns)
    variance introduced by outliers: 73% (severely inflated)
    
    benchmarking three/10
    time                 18.96 ns   (18.66 ns .. 19.32 ns)
                         0.998 R²   (0.998 R² .. 0.999 R²)
    mean                 19.17 ns   (18.92 ns .. 19.49 ns)
    std dev              990.6 ps   (774.2 ps .. 1.393 ns)
    variance introduced by outliers: 75% (severely inflated)
    
    benchmarking three/100
    time                 89.01 ns   (88.39 ns .. 89.78 ns)
                         0.998 R²   (0.997 R² .. 0.999 R²)
    mean                 92.60 ns   (90.78 ns .. 98.08 ns)
    std dev              9.138 ns   (5.755 ns .. 14.22 ns)
    variance introduced by outliers: 91% (severely inflated)
    
    benchmarking three/1000
    time                 638.9 ns   (627.9 ns .. 648.7 ns)
                         0.996 R²   (0.994 R² .. 0.998 R²)
    mean                 643.6 ns   (627.9 ns .. 660.6 ns)
    std dev              48.67 ns   (38.78 ns .. 61.57 ns)
    variance introduced by outliers: 83% (severely inflated)
    
    benchmarking three/10000
    time                 6.060 μs   (5.989 μs .. 6.119 μs)
                         0.998 R²   (0.997 R² .. 0.999 R²)
    mean                 6.124 μs   (6.036 μs .. 6.240 μs)
    std dev              359.7 ns   (294.9 ns .. 431.9 ns)
    variance introduced by outliers: 69% (severely inflated)
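
Timings this close are only meaningful if the three functions compute the same value. A minimal sanity check, assuming the definitions of one, two and three from the benchmark above (repeated here so the snippet stands alone), might look like this. The helper agree is hypothetical and not part of the original benchmark.

```haskell
-- Definitions copied from the benchmark so this file compiles on its own.
one, two, three :: Int -> Int
one n   = sum [x^2 | x <- [1..n], x^2 < n*5]
two n   = sum (filter (<(5*n)) [x^2 | x <- [1..n]])
three n = sum (filter (<(5*n)) (map (^2) [1..n]))

main :: IO ()
main = print (all agree [10, 100, 1000, 10000])  -- True
  where
    -- Hypothetical helper: do all three variants give the same sum for n?
    agree n = one n == two n && two n == three n
```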
    

    【Comments】:
