功能性Scala的重构/布局答案

【问题标题】：Refactoring / layout of functional Scala功能性Scala的重构/布局
【发布时间】：2010-11-21 16:30:56
【问题描述】：

这一个班轮...

 Console.println(io.Source.fromFile("names.txt").getLines.mkString.split(",").map{x:String => x.slice(1, x.length -1)}.sortBy { x => x}.zipWithIndex.map{t =>{ (t._2 +1)*(t._1.map{_.toChar - "A"(0).toChar + 1}.sum)}}.sum);

... 是我对Project Euler problem 22 的解决方案。它似乎有效，而且它是用（我尝试的）功能风格编写的。

这个例子有点极端，但我的问题有点笼统——你更喜欢如何编写/格式化/注释函数式风格的代码？函数式方法似乎鼓励了一系列方法调用，我发现这可能会变得不可读，而且也没有明显的地方可以放置 cmets。

另外，当我编写过程代码时，我发现我编写的每个小方法都有一个目的和有意义的名称。当我编写函数式代码时，我似乎正在养成一种习惯，它产生的行有点像上面的行，其中（对我而言）含义很难破译——而且单个计算也很难在其他地方重用。我在网上看到的很多功能代码示例都同样简洁（对我而言）晦涩难懂。

我该怎么办？为计算的每个部分编写在当前上下文中有意义的名称的小函数？（即使它们只不过是地图的包装器，比方说？）

对于我给出的示例，有什么更好的编写和呈现方式？

（像所有风格问题一样，这个问题是主观的。不过，它没有理由引起争论！）

【问题讨论】：

我没有足够的 Scala 经验给你很好的建议，但如果你将每个 map 分成不同的行可能会有所帮助？我很想知道真正的 Scala 人对此有什么看法。
一个非常重要且与语言/范式无关的规则是行长度不要超过 80 个（或其他合理值）字符。 190 个字符简直太疯狂了（如果你在打高尔夫球，至少把它降到 130，否则你永远不会打败 perl！;)）。
同意，这是因为教学原因。但我的问题部分是关于如何最好地将它分成更小的块，而不是选择中断只是为了使行

标签： scala refactoring functional-programming coding-style

【解决方案1】：

整理它的一个简单的第一次尝试是删除前导 Console. 尾随 ; 和显式 :String 类型 - 所有这些都是不必要的 - 添加一些缩进并导入 io.Source:

import io.Source
println(
  Source.fromFile("names.txt").getLines.mkString.split(",").map{
    x => x.slice(1, x.length -1)
  }.sortBy {x => x}.zipWithIndex.map{
    t =>{ (t._2 +1)*(t._1.map{_.toChar - "A"(0).toChar + 1}.sum)}
  }.sum
)

下一步是稍微清理一下，在映射元组列表时使用模式匹配和identity 而不是x=>x。字符也不需要toChar，可以用单引号来表示字符字面量。

import io.Source
println(
  Source.fromFile("names.txt").getLines.mkString.split(",").map {
    x => x.slice(1, x.length -1)
  }.sortBy(identity).zipWithIndex.map {
    case (v, idx) =>{ (idx+1)*(v.map{_ - 'A' + 1}.sum)}
  }.sum
)

更多更改也有助于使代码的意图更加清晰：

import io.Source
println(
  Source.fromFile("names.txt").getLines.mkString.split(",")
  .map { _.stripPrefix("\"").stripSuffix("\"") }
  .sortBy(identity)
  .map { _.map{_ - 'A' + 1}.sum }
  .zipWithIndex
  .map { case (v, idx) => (idx+1) * v }
  .sum
)

下一步，为了使其更“实用”，是将其分解为“功能”（偷偷摸摸，嗯？）。理想情况下，每个函数都有一个清楚地表达其用途的名称，并且它会很短（目标是单个表达式，因此不需要大括号）：

import io.Source

def unquote(s:String) = s.stripPrefix("\"").stripSuffix("\"")

def wordsFrom(fname:String) =
  Source.fromFile(fname).getLines.mkString.split(",").map(unquote)

def letterPos(c:Char) = c - 'A' + 1

println(
  wordsFrom("names.txt")
  .sortBy(identity)
  .map { _.map(letterPos).sum }
  .zipWithIndex
  .map { case (v, idx) => (idx+1) * v }
  .sum
)

wordsFrom 是一个明显的 1-liner，但我将其拆分以便在 stackOverflow 上进行格式化

【讨论】：

谢谢。我仍然是 Scala 的新手，所以这是非常有用的东西。重构几乎就是我会做的。不过，我以某种方式获得了（可能是错误的）观点，即这种小型的单一表达式函数方法不是做事的方法。
在全屏查看代码后，我不得不将其回滚几个档次，在我将所有内容分解为函数后，它看起来并没有比原来的好多少:)
+1 这与我认为的这个问题的完美答案非常接近。轻微的挑剔：sorted 不是比sortBy(identity) 更可取吗？
@Aaron Novstrup，是的，排序更好。
轻微挑剔：我不喜欢fname。请记住，从 Scala 2.8 开始，由于命名参数，参数名称是方法的公共合同的一部分。 path呢？

【解决方案2】：

以下是我认为可能是更好的布局方式：

Console.println(
    io.Source.fromFile("names.txt")
    .getLines.mkString.split(",")
    .map{x:String => x.slice(1, x.length -1)}
    .sortBy { x => x}
    .zipWithIndex
    .map{t =>{ (t._2 +1)*(t._1.map{_.toChar - "A"(0).toChar + 1}.sum)}}
    .sum);

我觉得在我的大脑深处，有一些算法可以决定代码布局在水平和垂直空间之间的权衡，但我似乎无法直接访问以使我能够清楚地表达它。 :)

关于引入名称而不是使用 lambda，我认为我通常会做的是，如果我很想发表简短的评论来描述代码的意图，但是一个好的标识符名称可能会这样做，那么我可能会考虑一次性将 lambda 转换为命名函数，以获得标识符名称的可读性优势。 toChar 调用的那一行是上面唯一一个对我来说看起来像候选人的行。（为了清楚起见，我会在 map 中考虑（部分）lambda，但 map 调用本身。）或者，垂直空格的引入为您提供了一个悬挂 //comment 的地方，这是一个替代引入标识符名称。

（免责声明：我不编写 Scala，所以如果我说的任何内容与样式约定冲突，请忽略我 :)，但我想很多建议大多与语言无关。）

【讨论】：

【解决方案3】：

严格来说如何格式化该代码，而不进行任何结构更改，我会这样做：

Console println (
  (
    io.Source
    fromFile "names.txt"
    getLines ()
    mkString ""
    split ","
    map (x => x.slice(1, x.length - 1))
    sortBy (x => x)
    zipWithIndex
  )
  map (t => (t._2 + 1) * (t._1 map (_.toChar - "A"(0).toChar + 1) sum) )
  sum
)

或者，也许，为了绕过无参数方法，我会这样做：

Console println (
  io.Source
  .fromFile("names.txt")
  .getLines
  .mkString
  .split(",")
  .map(x => x.slice(1, x.length - 1))
  .sortBy(x => x)
  .zipWithIndex
  .map(t => (t._2 + 1) * (t._1 map (_.toChar - "A"(0).toChar + 1) sum) )
  .sum
)

请注意，cmets 有足够的空间，但是，一般来说，什么正在做的事情通常很清楚。不习惯的人有时会迷路中途，没有变量来跟踪的含义/类型转换后的值。

现在，我会做一些不同的事情：

println ( // instead of Console println
  Source // put import elsewhere
  .fromFile("names.txt")
  .mkString // Unless you want to get rid of /n, which is unnecessary here
  .split(",")
  .map(x => x.slice(1, x.length - 1))
  .sorted // instead of sortBy
  .zipWithIndex
  .map { // use { only for multiple statements and, as in this case, pattern matching
    case (c, index) => (index + 1) * (c map (_ - 'A' + 1) sum) // chars are chars
  }
  .sum
)

我也不会在同一步骤中进行求和和乘法，所以：

  .sorted
  .map(_ map (_ - 'A' + 1) sum)
  .zipWithIndex
  .map { case (av, index) => av * (index + 1) }
  .sum

最后，我不太喜欢调整字符串大小，所以我可能会求助于正则表达式。添加一点重构，这就是我可能会写的东西：

  import scala.io.Source
  def names = Source fromFile "names.txt" mkString

  def wordExtractor = """"(.*?)"""".r
  def words = for {
    m <- wordExtractor findAllIn names matchData
  } yield m group 1

  def alphabeticValue(s: String) = s map (_ - 'A' + 1) sum
  def wordsAV = words.toList.sorted map alphabeticValue

  def multByIndex(t: (Int, Int)) = t match {
    case (av, index) => av * (index + 1)
  }
  def wordsAVByIndex = wordsAV.zipWithIndex map multByIndex

  println(wordsAVByIndex.sum)

编辑

下一步将是重命名重构——选择能够更好地传达代码各部分功能的名称。 Ken 建议在 cmets 中使用更好的名称，我会将它们用于另一种变体（它还很好地展示了 更好的 命名在多大程度上提高了可读性）。

import scala.io.Source
def rawData = Source fromFile "names.txt" mkString

// I'd rather write "match" than "m" in the next snippet, but
// the former is a keyword in Scala, so "m" has become more
// common in my code than "i". Also, make the return type of
// getWordsOf clear, because iterators can be tricky.
// Returning a list, however, makes a much less cleaner
// definition.

def wordExtractor = """"(.*?)"""".r
def getWordsOf(input: String): Iterator[String] = for {
  m <- wordExtractor findAllIn input matchData
} yield m group 1
def wordList = getWordsOf(rawData).toList

// I stole letterPosition from Kevin's solution. There, I said it. :-)

def letterPosition(c: Char) = c.toUpper - 'A' + 1 // .toUpper isn't necessary
def alphabeticValueOfWord(word: String) = word map letterPosition sum
def alphabeticValues = wordList.sorted map alphabeticValueOfWord

// I don't like multAVByIndex, but I haven't decided on a better
// option yet. I'm not very fond of declaring methods that return
// functions either, but I find that better than explicitly handling
// tuples (oh, for the day tuples/arguments are unified!).

def multAVByIndex = (alphabeticValue: Int, index: Int) => 
  alphabeticValue * (index + 1)
def scores = alphabeticValues.zipWithIndex map multAVByIndex.tupled

println(scores sum)

【讨论】：

我最喜欢这个。我会重命名一些变量以使事情更清晰：names -> rawData（反映它是一个大字符串），wordsAV -> alphabeticalValues（避免使用首字母缩写词），wordsAVByIndex -> scores（避免使用首字母缩写词，称它们为分数，因为这就是欧拉计划打电话给他们）
@Ken 我没有重命名重构。 :-) 事实上，我更喜欢你的所有建议，而不是目前命名的变量。也许我什至应该更改代码，使其更具可读性？
是的，请更改它（通过在您的答案中添加另一个变体），以便该页面的未来读者可以从您对此的看法中受益。

【解决方案4】：

除了凯文的解决方案，

关键是要清楚整齐地拆分功能，同时考虑到可重用性和可读性。

为了让代码更短更简洁，好像可以用for表达式。


def inputString(inputFile: String) = io.Source.fromFile(inputFile).getLines.mkString

def inputWords(input: String) = input.split("[,\"]").filter("" != )

Console.println {
    (for { (s, idx) <- inputWords(inputString("names.txt")).sorted.zipWithIndex }
        yield s.map(_ - 'A' + 1).sum * (idx + 1)).sum
}

s.map(_-'A'+1) 部分可以进一步放入一个函数中，比如 wordSum，如果你想让它更明显的话。

【讨论】：

谢谢。当然，这完全取决于个人喜好，但我发现 Kevin 的返工更清晰。