'Default Behavior' for Haskell recursive data types: answers

【Question title】: 'Default Behavior' for Haskell recursive data types
【Posted】: 2015-05-25 20:48:37
【Question】:

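I'm trying to write a propositional logic solver in Haskell. I represent logical expressions with a recursive data type called "Sentence", which has several constructors for the different operations: "AndSentence", "OrSentence", and so on. So I guess it's a tree with several kinds of nodes, each of which has 0, 1, or 2 children.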

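It seems to work, but some of the code is rather repetitive and I think there should be a better way to express it. Basically I have several functions where the "default behavior" is to have the function act recursively on the node's children, bottoming out on certain node types (typically the "AtomicSentences", which are the leaves). So I write a function like this: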

imply_remove :: Sentence Symbol -> Sentence Symbol
imply_remove (ImplySentence s1 s2) = OrSentence (NotSentence (imply_remove s1)) (imply_remove s2)
imply_remove (AndSentence s1 s2) = AndSentence (imply_remove s1) (imply_remove s2)
imply_remove (OrSentence s1 s2) = OrSentence (imply_remove s1) (imply_remove s2)
imply_remove (NotSentence s1) = NotSentence (imply_remove s1)
imply_remove (AtomicSentence s1) = AtomicSentence s1
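The question never shows the declaration of Sentence. The sketch below is one declaration that would fit the function above; the exact shape, the `Symbol` synonym, and the `children` helper are assumptions made for illustration, not the asker's actual code.

```haskell
-- Assumed declaration: the question does not show the real one.
data Sentence sym
  = AtomicSentence sym                           -- leaf, 0 children
  | NotSentence (Sentence sym)                   -- 1 child
  | AndSentence (Sentence sym) (Sentence sym)    -- 2 children
  | OrSentence (Sentence sym) (Sentence sym)     -- 2 children
  | ImplySentence (Sentence sym) (Sentence sym)  -- 2 children
  deriving (Show, Eq)

-- Assumed synonym; any symbol type would do.
type Symbol = String

-- Hypothetical helper listing the direct sub-sentences of a node,
-- which makes the "0, 1, or 2 children" structure explicit.
children :: Sentence sym -> [Sentence sym]
children (AtomicSentence _)  = []
children (NotSentence s)     = [s]
children (AndSentence a b)   = [a, b]
children (OrSentence a b)    = [a, b]
children (ImplySentence a b) = [a, b]
```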

I'd like a more concise way to write the lines for "AndSentence", "OrSentence", and "NotSentence".

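Functors seem similar to what I want, but that didn't work out: I want to operate on the subtrees, not on the values contained in each node of the subtrees.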
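To make the mismatch concrete, here is a minimal sketch assuming the Sentence type is declared with the symbol type as its last parameter (the declaration and the `shout` helper are assumptions, not code from the question). A derived Functor instance maps over the symbols stored in the leaves and never sees the connectives, so it cannot rewrite an ImplySentence node.

```haskell
{-# LANGUAGE DeriveFunctor #-}

import Data.Char (toUpper)

-- Assumed declaration, with a derived Functor over the symbol type.
data Sentence sym
  = AtomicSentence sym
  | NotSentence (Sentence sym)
  | AndSentence (Sentence sym) (Sentence sym)
  | OrSentence (Sentence sym) (Sentence sym)
  | ImplySentence (Sentence sym) (Sentence sym)
  deriving (Show, Eq, Functor)

-- Hypothetical helper: fmap only touches the symbols at the leaves.
-- The shape of the tree, including every ImplySentence, is unchanged.
shout :: Sentence String -> Sentence String
shout = fmap (map toUpper)
```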

Is there a right way to do this? Or a more natural way to structure my data?

【Question comments】:

I suggest using the syntactic library.

Tags: haskell functor recursive-datastructures

【Solution 1】:

This looks like a good case for recursion-schemes.

First, we describe your Sentence sym type as the type-level fixed point of a suitable functor.

{-# LANGUAGE DeriveFunctor, LambdaCase #-}

import Data.Functor.Foldable  -- from the recursion-schemes package

-- The functor describing the recursive data type
data SentenceF sym r
   = AtomicSentence sym
   | ImplySentence r r
   | AndSentence r r
   | OrSentence r r
   | NotSentence r
   deriving (Functor, Show)

-- The original type recovered via a fixed point
type Sentence sym = Fix (SentenceF sym)

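The Sentence sym type above is almost the same as your original one, except that everything must be wrapped in Fix. Adapting the original code to this type is completely mechanical: wherever we used (Constructor ...), we now use Fix (Constructor ...). For example: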

type Symbol = String

-- A simple formula: not (p -> (p || q))
testSentence :: Sentence Symbol
testSentence = 
   Fix $ NotSentence $
      Fix $ ImplySentence
         (Fix $ AtomicSentence "p")
         (Fix $ OrSentence
            (Fix $ AtomicSentence "p")
            (Fix $ AtomicSentence "q"))
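One common way to keep the Fix wrappers out of sight is a set of small "smart constructors". The sketch below is an illustration only: the names `atom`, `neg`, `implies`, `conj`, and `disj` are hypothetical, and Fix is redefined locally as a stand-in for the one exported by recursion-schemes so that the block compiles without the package.

```haskell
{-# LANGUAGE DeriveFunctor, FlexibleContexts, StandaloneDeriving, UndecidableInstances #-}

-- Local stand-in for the Fix type from recursion-schemes (assumption:
-- defined here only so the sketch is self-contained).
newtype Fix f = Fix (f (Fix f))
deriving instance Show (f (Fix f)) => Show (Fix f)
deriving instance Eq (f (Fix f)) => Eq (Fix f)

data SentenceF sym r
   = AtomicSentence sym
   | ImplySentence r r
   | AndSentence r r
   | OrSentence r r
   | NotSentence r
   deriving (Functor, Show, Eq)

type Sentence sym = Fix (SentenceF sym)
type Symbol = String

-- Hypothetical smart constructors: each one hides a single Fix wrapper.
atom :: sym -> Sentence sym
atom = Fix . AtomicSentence

neg :: Sentence sym -> Sentence sym
neg = Fix . NotSentence

implies, conj, disj :: Sentence sym -> Sentence sym -> Sentence sym
implies a b = Fix (ImplySentence a b)
conj a b    = Fix (AndSentence a b)
disj a b    = Fix (OrSentence a b)

-- The same formula as testSentence: not (p -> (p || q))
testSentence' :: Sentence Symbol
testSentence' = neg (atom "p" `implies` (atom "p" `disj` atom "q"))
```

Pattern matching still has to go through Fix, so this only tidies up the construction side.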

Here is your original code, redundancy included (the extra Fixes make it even worse).

-- The original code, adapted
imply_remove :: Sentence Symbol -> Sentence Symbol
imply_remove (Fix (ImplySentence s1 s2)) =
  Fix $ OrSentence (Fix $ NotSentence (imply_remove s1)) (imply_remove s2)
imply_remove (Fix (AndSentence s1 s2)) =
  Fix $ AndSentence (imply_remove s1) (imply_remove s2)
imply_remove (Fix (OrSentence s1 s2)) =
  Fix $ OrSentence (imply_remove s1) (imply_remove s2)
imply_remove (Fix (NotSentence s1)) =
  Fix $ NotSentence (imply_remove s1)
imply_remove (Fix (AtomicSentence s1)) =
  Fix $ AtomicSentence s1

Let's run a test by evaluating imply_remove testSentence. The result is what we expect:

 -- Output: not ((not p) || (p || q))
 Fix (NotSentence
   (Fix (OrSentence
      (Fix (NotSentence (Fix (AtomicSentence "p"))))
      (Fix (OrSentence
         (Fix (AtomicSentence "p"))
         (Fix (AtomicSentence "q")))))))

Now, let's use a nuke borrowed from recursion-schemes:

imply_remove2 :: Sentence Symbol -> Sentence Symbol
imply_remove2 = cata $ \case
   -- Rewrite ImplySentence as follows
   ImplySentence s1 s2 -> Fix $ OrSentence (Fix $ NotSentence s1) s2
   -- Keep everything else as it is (after it had been recursively processed)
   s -> Fix s

If we run the test imply_remove2 testSentence, we get the same output as with the original code.

What does cata do? Very roughly, when applied to a function f, as in cata f, it builds a catamorphism, i.e. a function which

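- breaks a formula down into its subcomponents
- recursively applies cata f to the subcomponents it finds
- reassembles the transformed components into a formula
- passes that last formula (with its already-processed subformulae) to f, so that the topmost connective can be affected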

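The last step is the one that does the actual work. The \case above performs only the wanted transformation. Everything else is handled by cata (and by the automatically generated Functor instance).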
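The steps above can be sketched in a few lines. This is a simplified stand-in, not the definition used by recursion-schemes (whose cata is more general): `cataFix` and the local Fix type are assumptions introduced so the block is self-contained, and `atoms` is a hypothetical second fold showing that the result need not be a formula.

```haskell
{-# LANGUAGE DeriveFunctor, FlexibleContexts, LambdaCase, StandaloneDeriving, UndecidableInstances #-}

-- Local stand-in for the library's Fix type.
newtype Fix f = Fix (f (Fix f))
deriving instance Show (f (Fix f)) => Show (Fix f)
deriving instance Eq (f (Fix f)) => Eq (Fix f)

data SentenceF sym r
   = AtomicSentence sym
   | ImplySentence r r
   | AndSentence r r
   | OrSentence r r
   | NotSentence r
   deriving (Functor, Show, Eq)

-- Simplified catamorphism, specialised to Fix:
--   * matching on Fix exposes one layer of the formula,
--   * fmap (cataFix f) recursively processes the sub-formulae,
--   * f receives the layer whose children are already processed.
cataFix :: Functor f => (f a -> a) -> Fix f -> a
cataFix f (Fix layer) = f (fmap (cataFix f) layer)

-- The transformation from the answer, written against cataFix.
implyRemove :: Fix (SentenceF sym) -> Fix (SentenceF sym)
implyRemove = cataFix $ \case
   ImplySentence s1 s2 -> Fix (OrSentence (Fix (NotSentence s1)) s2)
   s -> Fix s

-- Hypothetical second fold: collect the symbols, left to right.
atoms :: Fix (SentenceF sym) -> [sym]
atoms = cataFix $ \case
   AtomicSentence a    -> [a]
   NotSentence xs      -> xs
   ImplySentence xs ys -> xs ++ ys
   AndSentence xs ys   -> xs ++ ys
   OrSentence xs ys    -> xs ++ ys
```

In `atoms` every constructor has to be handled, because the result type differs from the input type and there is no "keep it as it is" fallback.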

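All that said, I would not recommend that anyone move to recursion-schemes lightly. Using cata can lead to very elegant code, but it requires understanding the machinery involved, which may not be immediate to grasp (it certainly wasn't for me).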

【Comments】:

【Solution 2】:

You can write a default function that defines how a sentence should be processed when no specific transformation applies to it:

default_transformation :: (Sentence Symbol -> Sentence Symbol) -> Sentence Symbol -> Sentence Symbol
default_transformation f (ImplySentence s1 s2) = ImplySentence (f s1) (f s2)
default_transformation f (AndSentence s1 s2) = AndSentence (f s1) (f s2)
default_transformation f (OrSentence s1 s2) = OrSentence (f s1) (f s2)
default_transformation f (NotSentence s1) = NotSentence (f s1)
default_transformation f (AtomicSentence s1) = AtomicSentence s1

The function takes the specific transformation as an argument.

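When you write a specific transformation, you only need to write the cases that differ from the default, and then add the default as the last case: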

imply_remove :: Sentence Symbol -> Sentence Symbol
imply_remove (ImplySentence s1 s2) = OrSentence (NotSentence (imply_remove s1)) (imply_remove s2)
imply_remove s = default_transformation imply_remove s

An advantage of this approach is that it is probably easier to implement, since it needs no dependencies.
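The same default can be reused by every pass. As a sketch, here is a second, hypothetical transformation built on it, `remove_double_negation`, which is not part of the original answer. The Sentence declaration is an assumption (the question does not show it), and default_transformation is repeated only to keep the block self-contained.

```haskell
-- Assumed declaration of the question's type.
data Sentence sym
  = AtomicSentence sym
  | NotSentence (Sentence sym)
  | AndSentence (Sentence sym) (Sentence sym)
  | OrSentence (Sentence sym) (Sentence sym)
  | ImplySentence (Sentence sym) (Sentence sym)
  deriving (Show, Eq)

type Symbol = String

-- The default from the answer: apply f to every direct child.
default_transformation :: (Sentence Symbol -> Sentence Symbol) -> Sentence Symbol -> Sentence Symbol
default_transformation f (ImplySentence s1 s2) = ImplySentence (f s1) (f s2)
default_transformation f (AndSentence s1 s2) = AndSentence (f s1) (f s2)
default_transformation f (OrSentence s1 s2) = OrSentence (f s1) (f s2)
default_transformation f (NotSentence s1) = NotSentence (f s1)
default_transformation _ (AtomicSentence s1) = AtomicSentence s1

-- Hypothetical second pass: only the interesting case is spelled out,
-- everything else falls through to the default.
remove_double_negation :: Sentence Symbol -> Sentence Symbol
remove_double_negation (NotSentence (NotSentence s)) = remove_double_negation s
remove_double_negation s = default_transformation remove_double_negation s
```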

【Comments】:

【Solution 3】:

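What you are looking for is called "generic programming" in Haskell: https://wiki.haskell.org/Generics; an early form of it is called "Scrap Your Boilerplate", which you may also want to search for. I haven't tested this, but I think that if you use Uniplate's Data.Generics.Uniplate and Data.Generics.Uniplate.Data modules, you can define imply_remove as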

imply_remove = transform w where
    w (ImplySentence s1 s2) = OrSentence (NotSentence s1) s2
    w s = s

transform performs the recursion for you.
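Since the answer is untested, here is a fuller sketch of what it might look like as a complete module. It assumes the uniplate package is installed and that Sentence is declared as below with a derived Data instance; the declaration is an assumption, as the question does not show it. Only Data.Generics.Uniplate.Data is imported here, since it exports transform together with the Data-based instance.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}

import Data.Data (Data)
import Data.Generics.Uniplate.Data (transform)  -- from the uniplate package

-- Assumed declaration; the Data instance is what Uniplate traverses.
data Sentence sym
  = AtomicSentence sym
  | NotSentence (Sentence sym)
  | AndSentence (Sentence sym) (Sentence sym)
  | OrSentence (Sentence sym) (Sentence sym)
  | ImplySentence (Sentence sym) (Sentence sym)
  deriving (Show, Eq, Data)

type Symbol = String

-- transform applies w bottom-up, so s1 and s2 have already been
-- rewritten by the time w sees an ImplySentence node.
imply_remove :: Sentence Symbol -> Sentence Symbol
imply_remove = transform w
  where
    w (ImplySentence s1 s2) = OrSentence (NotSentence s1) s2
    w s = s
```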

【Comments】: