Most efficient way to do list comprehension with filter: answers

[Question title]: Most efficient way to do list comprehension with filter
[Posted]: 2016-03-11 00:37:03
[Question]:

I'm learning Haskell and coming from Python, so list comprehensions are familiar. Take this list comprehension (please):

[x^2 | x <- [1..10], x^2 < 50]
[1,4,9,16,25,36,49]

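Is the expression x^2 evaluated twice for each value of x here? Is there a way to write this comprehension so that x^2 is computed only once? Would it make sense to do this instead: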

filter (< 50) [x^2 | x <- [1..10]]
[1,4,9,16,25,36,49]

Is this the more "Haskell way" of doing things? And is it more efficient?

[Comments on the question]:

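It depends on whether GHC decides to do common subexpression elimination. This looks simple enough to me that I would expect GHC to optimize it for you (provided you have optimizations enabled, of course), so that both implementations perform the same. Someone with better low-level GHC experience than me could probably show you the generated Core and point out the optimization. I do think your second implementation is easier to read, though.
For what it's worth, I would probably use point-free style for this: filter (< 50) $ map (^ 2) [1..10]
@ChrisMartin I guess I don't know how idiomatic comprehensions are in Haskell. I use them all over the place in Python. Any thoughts?
List comprehensions also support let bindings: [y | x <- [1..10], let y = x ^ 2, y < 50].
What @JonPurdy said. --- Also, can you guess what [y | x <- [1..], let y = x ^ 2, y < 50] will do? (Then take a look at Data.List.takeWhile.)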
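
To make the hint in the last comment concrete, here is a minimal sketch. With an infinite source list, the comprehension has no way of knowing that no later x can pass the guard, so it produces the matching prefix and then keeps testing candidates without ever finishing the list. takeWhile stops at the first element that fails the test. The name squaresBelow is made up for illustration, and the approach assumes the squares arrive in increasing order, which holds for x = 1, 2, 3, ...

```haskell
-- Hypothetical helper (not from the thread): squares of 1, 2, 3, ...
-- that are strictly below the given limit.
squaresBelow :: Int -> [Int]
squaresBelow limit = takeWhile (< limit) (map (^ 2) [1 ..])

main :: IO ()
main = print (squaresBelow 50)  -- prints [1,4,9,16,25,36,49]
```

Unlike filter, takeWhile never looks past the first failing element, which is why it terminates on an infinite list.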

Tags: haskell

[Answer 1]:

You can use let in a list comprehension:

[ z | x <- [1..10], let z = x^2, z < 50]

Then x^2 is computed only once per x, because the guard and the result both refer to the same binding z.
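
One way to observe the sharing is to instrument the squaring with Debug.Trace, as in the sketch below. The names square, twice and once are made up for illustration. In GHCi or in a build without optimization, twice should log two messages per element and once a single message per element; with -O, GHC may apply common subexpression elimination to twice, so the counts can come out the same.

```haskell
import Debug.Trace (trace)

-- Hypothetical instrumented squaring: writes a message to stderr
-- each time the result is actually evaluated.
square :: Int -> Int
square x = trace ("squaring " ++ show x) (x ^ 2)

-- The guard and the result each contain their own call to square.
twice :: [Int]
twice = [square x | x <- [1 .. 3], square x < 50]

-- The let binding shares a single thunk between guard and result.
once :: [Int]
once = [z | x <- [1 .. 3], let z = square x, z < 50]

main :: IO ()
main = do
  print twice
  print once
```

Both lists have the same contents; only the number of trace messages differs.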

[Comments]:

[Answer 2]:

I would do it like this, which is similar to your second example:

filter (<50) (map (^2) [1..10])

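I'm biased against list comprehensions. They basically do only three things (map, filter, and cross product), and you want a much larger vocabulary of operations than those three. Study the Data.List module.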
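
As a small illustration of that larger vocabulary, the sketch below applies three list functions to the same squares; none of them can be written as a single map, filter, or cross product. The top-level names are made up for this example. partition is imported from Data.List, while takeWhile and scanl1 are also exported by the Prelude.

```haskell
import Data.List (partition)

squares :: [Int]
squares = map (^ 2) [1 .. 10]

-- takeWhile: keep the prefix up to the first element that fails the test.
belowFifty :: [Int]
belowFifty = takeWhile (< 50) squares

-- partition: split into (elements that pass, elements that fail) in one pass.
evensAndOdds :: ([Int], [Int])
evensAndOdds = partition even squares

-- scanl1: running totals of the list.
runningTotals :: [Int]
runningTotals = scanl1 (+) squares

main :: IO ()
main = do
  print belowFifty
  print evensAndOdds
  print runningTotals
```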

As for performance, we can benchmark it without much effort using the criterion library. (I've put up a repo here; you can build it with the Stack tool.)

import Criterion.Main

main :: IO ()
main = defaultMain
       [ bgroup "one" [ bench "10"    $ nf one 10
                      , bench "100"   $ nf one 100
                      , bench "1000"  $ nf one 1000
                      , bench "10000" $ nf one 10000
                      ]
       , bgroup "two" [ bench "10"    $ nf two 10
                      , bench "100"   $ nf two 100
                      , bench "1000"  $ nf two 1000
                      , bench "10000" $ nf two 10000
                      ]
       , bgroup "three" [ bench "10"    $ nf three 10
                        , bench "100"   $ nf three 100
                        , bench "1000"  $ nf three 1000
                        , bench "10000" $ nf three 10000
                        ]
       ]

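-- Comprehension with x^2 written in both the result and the guard.
one :: Int -> Int
one n = sum [x^2 | x <- [1..n], x^2 < n*5]

-- Comprehension for the squares, filtered afterwards.
two :: Int -> Int
two n = sum (filter (<(5*n)) [x^2 | x <- [1..n]])

-- No comprehension at all: map, then filter.
three :: Int -> Int
three n = sum (filter (<(5*n)) (map (^2) [1..n]))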

I got these results, which to me don't show much of a difference (if any):

% stack install --ghc-options='-O2'
Copied executables to /Users/luis.casillas/.local/bin:
- comprehension

% comprehension
benchmarking one/10
time                 18.40 ns   (18.35 ns .. 18.45 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 18.38 ns   (18.33 ns .. 18.42 ns)
std dev              143.7 ps   (116.9 ps .. 173.6 ps)

benchmarking one/100
time                 89.11 ns   (88.49 ns .. 89.72 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 88.78 ns   (88.42 ns .. 89.44 ns)
std dev              1.582 ns   (1.231 ns .. 2.103 ns)
variance introduced by outliers: 23% (moderately inflated)

benchmarking one/1000
time                 649.2 ns   (640.7 ns .. 658.7 ns)
                     0.998 R²   (0.998 R² .. 0.999 R²)
mean                 647.6 ns   (637.8 ns .. 658.0 ns)
std dev              31.40 ns   (24.70 ns .. 40.84 ns)
variance introduced by outliers: 66% (severely inflated)

benchmarking one/10000
time                 6.197 μs   (6.079 μs .. 6.282 μs)
                     0.997 R²   (0.996 R² .. 0.998 R²)
mean                 6.180 μs   (6.058 μs .. 6.295 μs)
std dev              436.0 ns   (371.1 ns .. 531.8 ns)
variance introduced by outliers: 77% (severely inflated)

benchmarking two/10
time                 20.23 ns   (19.89 ns .. 20.56 ns)
                     0.999 R²   (0.998 R² .. 0.999 R²)
mean                 19.89 ns   (19.71 ns .. 20.11 ns)
std dev              709.8 ps   (582.1 ps .. 939.1 ps)
variance introduced by outliers: 58% (severely inflated)

benchmarking two/100
time                 83.95 ns   (83.14 ns .. 84.90 ns)
                     0.999 R²   (0.999 R² .. 1.000 R²)
mean                 83.34 ns   (82.59 ns .. 83.99 ns)
std dev              2.354 ns   (1.890 ns .. 3.043 ns)
variance introduced by outliers: 44% (moderately inflated)

benchmarking two/1000
time                 645.3 ns   (635.8 ns .. 655.4 ns)
                     0.998 R²   (0.997 R² .. 0.999 R²)
mean                 652.9 ns   (643.1 ns .. 664.5 ns)
std dev              35.54 ns   (29.67 ns .. 46.19 ns)
variance introduced by outliers: 71% (severely inflated)

benchmarking two/10000
time                 6.268 μs   (6.142 μs .. 6.385 μs)
                     0.998 R²   (0.997 R² .. 0.999 R²)
mean                 6.200 μs   (6.099 μs .. 6.367 μs)
std dev              397.6 ns   (261.9 ns .. 637.4 ns)
variance introduced by outliers: 73% (severely inflated)

benchmarking three/10
time                 18.96 ns   (18.66 ns .. 19.32 ns)
                     0.998 R²   (0.998 R² .. 0.999 R²)
mean                 19.17 ns   (18.92 ns .. 19.49 ns)
std dev              990.6 ps   (774.2 ps .. 1.393 ns)
variance introduced by outliers: 75% (severely inflated)

benchmarking three/100
time                 89.01 ns   (88.39 ns .. 89.78 ns)
                     0.998 R²   (0.997 R² .. 0.999 R²)
mean                 92.60 ns   (90.78 ns .. 98.08 ns)
std dev              9.138 ns   (5.755 ns .. 14.22 ns)
variance introduced by outliers: 91% (severely inflated)

benchmarking three/1000
time                 638.9 ns   (627.9 ns .. 648.7 ns)
                     0.996 R²   (0.994 R² .. 0.998 R²)
mean                 643.6 ns   (627.9 ns .. 660.6 ns)
std dev              48.67 ns   (38.78 ns .. 61.57 ns)
variance introduced by outliers: 83% (severely inflated)

benchmarking three/10000
time                 6.060 μs   (5.989 μs .. 6.119 μs)
                     0.998 R²   (0.997 R² .. 0.999 R²)
mean                 6.124 μs   (6.036 μs .. 6.240 μs)
std dev              359.7 ns   (294.9 ns .. 431.9 ns)
variance introduced by outliers: 69% (severely inflated)
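
Timings are only comparable if the three functions compute the same value. The sketch below is not part of the original answer: it copies the three definitions from the benchmark and adds a hypothetical helper, agree, that checks them against each other at the benchmarked sizes.

```haskell
-- The three definitions, copied from the benchmark.
one, two, three :: Int -> Int
one n = sum [x^2 | x <- [1..n], x^2 < n*5]
two n = sum (filter (<(5*n)) [x^2 | x <- [1..n]])
three n = sum (filter (<(5*n)) (map (^2) [1..n]))

-- Hypothetical check (an assumption of this sketch, not in the answer):
-- do all three definitions return the same sum for a given n?
agree :: Int -> Bool
agree n = one n == two n && two n == three n

main :: IO ()
main = print (all agree [10, 100, 1000, 10000])  -- prints True
```

Note that every variant still walks all n candidates even though only the squares below 5*n are summed, so none of them stops early.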

[Comments]: