Ruby if statement optimization refactor best practice: answers

【Question title】: Ruby if statement optimization refactor best practice
【Posted】: 2018-07-03 13:11:09
【Question】:

I've run into a very common refactoring situation, and after reading a few blog posts I still haven't found a satisfactory answer, so I'm asking here.

h = {
  a: 'a',
  b: 'b'
}
new_hash = {}
new_hash[:a] = h[:a].upcase if h[:a].present?

According to a friend of mine, this code can be refactored as follows to improve performance.

a = h[:a]
new_hash[:a] = a.upcase if a.present?

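At first glance it looks a little more optimized. But does it make a real difference, or is it over-optimization? Which style should be preferred?

Looking for expert advice :)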

Update: Benchmark with n = 1000

              user     system      total        real
hash lookup  0.000000   0.000000   0.000000 (  0.000014)
new var      0.000000   0.000000   0.000000 (  0.000005)
AND op       0.000000   0.000000   0.000000 (  0.000018)
try          0.000000   0.000000   0.000000 (  0.000046)

Update: Memory benchmark using the benchmark-memory gem

Calculating -------------------------------------
         hash lookup    40.000  memsize (    40.000  retained)
                         1.000  objects (     1.000  retained)
                         1.000  strings (     1.000  retained)
             new var     0.000  memsize (     0.000  retained)
                         0.000  objects (     0.000  retained)
                         0.000  strings (     0.000  retained)
              AND op    40.000  memsize (    40.000  retained)
                         1.000  objects (     1.000  retained)
                         1.000  strings (     1.000  retained)
                 try   200.000  memsize (    40.000  retained)
                         5.000  objects (     1.000  retained)
                         1.000  strings (     1.000  retained)

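【Comments】:

It depends on what you intend to do. More context would definitely help. If you're fine with mutating h, you can do h[:a] && h[:a].upcase! or h[:a].try(:upcase!)
To answer your question about performance and optimization with the second approach, it's best to run some benchmarks around it. By storing the value in a variable, you are saving the second hash lookup.
@kiddorails upcase! or upcase will both do the same thing here, since I'm storing the value in another hash anyway. I also tried benchmarking with the try method.
They won't do the same thing: upcase! will change the value in h and may also return nil. For example, 'A'.upcase! #=> nil. @kiddorails's point is that new_hash is unnecessary if changing h[:a], and therefore h, directly is an acceptable solution. That approach would technically be more efficient, albeit infinitesimally.
@engineersmnky new_hash is important to me.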
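
To make the upcase versus upcase! distinction from the comments concrete, here is a minimal sketch. The method names upcased_copy and upcase_in_place! are hypothetical, chosen only for illustration; the first mirrors the question's approach with a separate hash, the second mirrors the h[:a] && h[:a].upcase! suggestion.

```ruby
# Non-mutating: builds a separate hash and leaves the input untouched.
def upcased_copy(h)
  new_hash = {}
  new_hash[:a] = h[:a].upcase if h[:a]
  new_hash
end

# Mutating: upcases the string stored in h itself; no second hash involved.
def upcase_in_place!(h)
  h[:a] && h[:a].upcase!
  h
end

h = { a: 'a'.dup }     # dup so the string is mutable even if literals are frozen
upcased_copy(h)        # returns { a: "A" }, h is still { a: "a" }
upcase_in_place!(h)    # h is now { a: "A" }

# upcase! returns nil when the string was already upper case,
# so its return value is not a safe thing to store in another hash.
'A'.dup.upcase!        # returns nil
```

Because of that nil return, new_hash[:a] = h[:a].upcase! would store nil for a value that is already upper case, which is why the two methods are not interchangeable when new_hash matters.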

Tags: ruby performance optimization refactoring ruby-hash

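【Solution 1】:

Depending on your situation, Rails methods like present? can be dirty and will definitely impact performance. If you only care about a nil check, rather than things like an empty Array or a blank String, then using pure Ruby will be "much faster" (the quotes are there to emphasize that performance is utterly inconsequential in this basic example).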
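
As a rough illustration of what a plain truthiness check does and does not cover, the sketch below compares the two on a few sample values. present_like? is a hypothetical stand-in written so the snippet runs without Rails; it only approximates ActiveSupport's present? (nil, false, empty collections and whitespace-only strings count as not present).

```ruby
# Hypothetical approximation of ActiveSupport's present?, for illustration only.
def present_like?(value)
  return false if value.nil? || value == false
  return !value.strip.empty? if value.is_a?(String)   # whitespace-only is blank
  return !value.empty? if value.respond_to?(:empty?)  # empty Array/Hash is blank
  true
end

samples = [nil, false, '', '   ', [], {}, 'a', 0]
samples.each do |v|
  # !!v is what a bare `if v` tests: only nil and false are falsy in Ruby.
  puts format('%-8s truthy=%-5s present_like=%s', v.inspect, !!v, present_like?(v))
end
```

The two checks agree for nil, false, 'a' and 0, and differ for empty or whitespace-only strings and empty collections, so dropping present? is only safe when those values cannot occur or do not need special handling.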

Since we are benchmarking things anyway:

Setup

h = {
  a: 'a',
  b: 'b'
}


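# Stand-in for Rails' generic Object#present? / Object#blank?,
# so the benchmark can run without loading ActiveSupport.
class Object
  def present?
    !blank?
  end
  def blank?
    respond_to?(:empty?) ? !!empty? : !self
  end
end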

def hash_lookup(h)
  new_hash = {}
  new_hash[:a] = h[:a].upcase if h[:a].present?
  new_hash
end

def new_var(h)
  new_hash = {}
  a = h[:a]
  new_hash[:a] = a.upcase if a.present?
  new_hash
end

def hash_lookup_w_safe_nav(h)
  new_hash = {}
  new_hash[:a] = h[:a]&.upcase
  new_hash
end

def hash_lookup_wo_rails(h)
  new_hash = {}
  new_hash[:a] = h[:a].upcase if h[:a]
  new_hash
end

def new_var_wo_rails(h)
  new_hash = {}
  a = h[:a]
  new_hash[:a] = a.upcase if a
  new_hash
end
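
One thing worth noting about the setup above: the safe-navigation variant is not behaviorally identical to the guarded ones when the key is missing. A minimal sketch, using the hypothetical names with_guard and with_safe_nav for shortened copies of two of the methods:

```ruby
# Guarded assignment: the key is only written when a value exists.
def with_guard(h)
  new_hash = {}
  new_hash[:a] = h[:a].upcase if h[:a]
  new_hash
end

# Safe navigation: the assignment always runs, so the key is always written.
def with_safe_nav(h)
  new_hash = {}
  new_hash[:a] = h[:a]&.upcase
  new_hash
end

missing = { b: 'b' }
with_guard(missing)     # {}          key :a is never written
with_safe_nav(missing)  # { a: nil }  key :a is written with a nil value
```

For the benchmark input, where :a is always present, both return the same hash, so the timings remain comparable.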

Benchmark

N = [1_000,10_000,100_000]
require 'benchmark'
N.each do |n|
  puts "OVER #{n} ITERATIONS"
  Benchmark.bm do |x|
    x.report(:new_var) { n.times {new_var(h)}}
    x.report(:hash_lookup) { n.times {hash_lookup(h)}}
    x.report(:hash_lookup_w_safe_nav) { n.times {hash_lookup_w_safe_nav(h)}}
    x.report(:hash_lookup_wo_rails) { n.times {hash_lookup_wo_rails(h)}}
    x.report(:new_var_wo_rails) { n.times {new_var_wo_rails(h)}}
  end
end
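
With differences this small, run order and warm-up can influence the numbers. As an optional variation (not what was used for the output in this answer), the standard library's Benchmark.bmbm runs a rehearsal pass before the measured pass. A minimal, self-contained sketch with two of the methods repeated:

```ruby
require 'benchmark'

def new_var_wo_rails(h)
  new_hash = {}
  a = h[:a]
  new_hash[:a] = a.upcase if a
  new_hash
end

def hash_lookup_wo_rails(h)
  new_hash = {}
  new_hash[:a] = h[:a].upcase if h[:a]
  new_hash
end

h = { a: 'a', b: 'b' }
n = 100_000

# bmbm prints a "Rehearsal" table first, then the measured table.
Benchmark.bmbm do |x|
  x.report('new_var_wo_rails')     { n.times { new_var_wo_rails(h) } }
  x.report('hash_lookup_wo_rails') { n.times { hash_lookup_wo_rails(h) } }
end
```

The absolute numbers will vary by machine and Ruby version; only the relative ordering within one run is meaningful, and even that may flip between runs when the gap is this narrow.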

Output

OVER 1000 ITERATIONS
                        user     system      total        real
new_var                 0.001075   0.000159   0.001234 (  0.001231)
hash_lookup             0.002441   0.000000   0.002441 (  0.002505)
hash_lookup_w_safe_nav  0.001077   0.000000   0.001077 (  0.001077)
hash_lookup_wo_rails    0.001100   0.000000   0.001100 (  0.001145)
new_var_wo_rails        0.001015   0.000000   0.001015 (  0.001016)
OVER 10000 ITERATIONS
                        user     system      total        real
new_var                 0.010321   0.000000   0.010321 (  0.010329)
hash_lookup             0.010104   0.000015   0.010119 (  0.010123)
hash_lookup_w_safe_nav  0.007211   0.000000   0.007211 (  0.007213)
hash_lookup_wo_rails    0.007508   0.000000   0.007508 (  0.017302)
new_var_wo_rails        0.008186   0.000026   0.008212 (  0.016679)
OVER 100000 ITERATIONS
                        user     system      total        real
new_var                 0.099400   0.000249   0.099649 (  0.192481)
hash_lookup             0.101419   0.000009   0.101428 (  0.199788)
hash_lookup_w_safe_nav  0.078156   0.000010   0.078166 (  0.140796)
hash_lookup_wo_rails    0.078743   0.000000   0.078743 (  0.166815)
new_var_wo_rails        0.073271   0.000000   0.073271 (  0.125869)

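【Comments】:

Thanks for the more detailed benchmark. It looks like new_var_wo_rails performs better and also reads as more descriptive in the code. I will remove the present? check.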

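【Solution 2】:

Optimization can mean different things: memory optimization, performance optimization, and also readability and code structure.

Performance: there is almost no impact on speed, because a hash is accessed in O(1). Try it with benchmark and see for yourself that there is almost no difference.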

You can check this article about hash lookup and why it's so fast

Memory: your friend's code is less optimized than yours, because he initializes another object, a, while yours doesn't.
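
As a way to check the memory point: in Ruby, assigning h[:a] to a local binds a name to the existing string rather than copying it, which can be observed with equal?. A minimal sketch, where local_is_same_object? is a hypothetical helper name:

```ruby
# Hypothetical helper: is the local the very same object as the hash value?
def local_is_same_object?(h)
  a = h[:a]          # binds the name a to the existing string, no copy is made
  a.equal?(h[:a])    # equal? compares object identity, not contents
end

local_is_same_object?({ a: 'a', b: 'b' })  # true

# By contrast, upcase builds a new string in either version of the code.
s = 'a'
s.upcase.equal?(s)   # false
```

Under that reading, the extra local costs a reference rather than a second string, which is in line with the memory benchmark in the question not showing more allocations for the variable version.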

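Readability and style: at first glance, your friend's code looks shorter and more descriptive. But keep in mind that you may need to do this for every key/value pair in the hash, so you might end up needing a, b, and so on as your hash grows (at that point it is of course better to iterate over the hash). There isn't much to choose between the two here.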
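
The "iterate over the hash" idea could look roughly like the sketch below. upcase_values is a hypothetical name, and the code assumes every value is either a String or nil.

```ruby
# Upcase every non-nil value; keys whose value is nil are skipped.
def upcase_values(h)
  h.each_with_object({}) do |(key, value), new_hash|
    new_hash[key] = value.upcase if value
  end
end

upcase_values({ a: 'a', b: 'b', c: nil })  # { a: "A", b: "B" }
```

On Ruby 2.4 or later, h.compact.transform_values(&:upcase) is a shorter alternative under the same assumption about the values.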

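【Comments】:

I have updated the question with a benchmark of 1000 iterations. It seems that assigning a new variable is much faster. In the memory benchmark as well, assigning a new variable seems to consume less memory.
Thanks for providing the benchmark! I'm not sure the difference between 0.000014 and 0.000005 can be called "much faster". The times are "almost" the same in both cases, so the impact on speed can easily be ignored.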