在 Python 中测试数学表达式的等价性答案

【问题标题】：Testing equivalence of mathematical expressions in Python在 Python 中测试数学表达式的等价性
【发布时间】：2011-09-03 12:57:53
【问题描述】：

我在 Python 中有两个字符串，

A m * B s / (A m + C m)

和

C m * B s / (C m + A m)

它们都是无序集（A，C）和无序集（B）的等价函数。 m 和 s 表示可以在同一单元之间交换但不能与另一个单元交换的单元。

到目前为止，我正在对 A、B 和 C 进行排列，并使用 eval 和 SymPy 的 == 运算符对其进行测试。这有多个缺点：

对于更复杂的表达式，我必须生成大量排列（在我的例子中是 8 个嵌套的 for 循环）
我需要将 A、B、C 定义为符号，当我不知道我将拥有哪些参数时，这不是最佳选择（因此我必须生成所有这些参数 -> 效率极低并弄乱了我的变量命名空间）

有没有一种pythonian方法来测试这种等价性？它应该适用于任意表达式。

【问题讨论】：

“Pythonians”来自 Python 国度。我们将我们的编码习惯称为“Pythonic”。
我不知道 SymPy，但你不能生成你的排列 without for loops？
您还有其他有关字符串的示例吗？我倾向于建议一个与它们匹配的正则表达式 - 我将为所演示的两个添加一个答案。
我不太明白你的问题（可能是关于我的英语不好:)），但在 itertools 模块中有一个名为 permutations() 的好函数。
我正在使用 itertools 排列，但我必须遍历我生成的所有排列。

标签： python equality sympy mathematical-expressions

【解决方案1】：

这是基于我之前回答的简化方法。

这个想法是，如果两个表达式在排列下是等价的，那么携带一个到另一个的排列必须将第一个字符串中的第 i 个符号（按第一次出现的索引排序）映射到第二个字符串中的第 i 个符号（再次排序按第一次出现的索引）。此原理可用于构造置换，将其应用于第一个字符串，然后检查与第二个字符串是否相等 - 如果相等，则相等，否则不相等。

这是一种可能的实现方式：

import re

# Unique-ify list, preserving order
def uniquify(l):
    return reduce(lambda s, e: s + ([] if e in s else [e]), l, [])

# Replace all keys in replacements with corresponding values in str
def replace_all(str, replacements):
    for old, new in replacements.iteritems():
        str = str.replace(old, new)
    return str

class Expression:
    units = ["m", "s"]

    def __init__(self, exp):
        self.exp = exp

    # Returns a list of symbols in the expression that are preceded
    # by the given unit, ordered by first appearance. Assumes the
    # symbol and unit are separated by a space. For example:
    # Expression("A m * B s / (A m + C m)").symbols_for_unit("m")
    # returns ['A', 'C']
    def symbols_for_unit(self, unit):
        sym_re = re.compile("(.) %s" % unit)
        symbols = sym_re.findall(self.exp)
        return uniquify(symbols)

    # Returns a string with all symbols that have units other than
    # unit "muted", that is replaced with the empty string. Example:
    # Expression("A m * B s / (A m + C m)").mute_symbols_for_other_units("m")
    # returns "A m *  s / (A m + C m)"
    def mute_symbols_for_other_units(self, unit):
        other_units = "".join(set(self.units) - set(unit))
        return re.sub("(.) ([%s])" % "".join(other_units), " \g<2>", self.exp)

    # Returns a string with all symbols that have the given unit
    # replaced with tokens of the form $0, $1, ..., by order of their
    # first appearance in the string, and all other symbols muted. 
    # For example:
    # Expression("A m * B s / (A m + C m)").canonical_form("m")
    # returns "$0 m *  s / ($0 m + $1 m)"
    def canonical_form(self, unit):
        symbols = self.symbols_for_unit(unit)
        muted_self = self.mute_symbols_for_other_units(unit)
        for i, sym in enumerate(symbols):
            muted_self = muted_self.replace("%s %s" % (sym, unit), "$%s %s" % (i, unit))
        return muted_self

    # Define a permutation, represented as a dictionary, according to
    # the following rule: replace $i with the ith distinct symbol
    # occurring in the expression with the given unit. For example:
    # Expression("C m * B s / (C m + A m)").permutation("m")
    # returns {'$0':'C', '$1':'A'}
    def permutation(self, unit):
        enum = enumerate(self.symbols_for_unit(unit))
        return dict(("$%s" % i, sym) for i, sym in enum)

    # Return a string produced from the expression by first converting it
    # into canonical form, and then performing the replacements defined
    # by the given permutation. For example:
    # Expression("A m * B s / (A m + C m)").permute("m", {"$0":"C", "$1":"A"})
    # returns "C m *  s / (C m + A m)"
    def permute(self, unit, permutation):
        new_exp = self.canonical_form(unit)
        return replace_all(new_exp, permutation) 

    # Test for equality under permutation and muting of all other symbols 
    # than the unit provided. 
    def eq_under_permutation(self, unit, other_exp):
        muted_self = self.mute_symbols_for_other_units(unit)        
        other_permuted_str = other_exp.permute(unit, self.permutation(unit))
        return muted_self == other_permuted_str    

    # Test for equality under permutation. This is done for each of
    # the possible units using eq_under_permutation
    def __eq__(self, other):
        return all([self.eq_under_permutation(unit, other) for unit in self.units])

e1 = Expression("A m * B s / (A m + C m)")
e2 = Expression("C m * B s / (C m + A m)")
e3 = Expression("A s * B s / (A m + C m)")

f1 = Expression("A s * (B s + D s) / (A m + C m)")
f2 = Expression("A s * (D s + B s) / (C m + A m)")
f3 = Expression("D s")

print "e1 == e2: ", e1 == e2 # True
print "e1 == e3: ", e1 == e3 # False
print "e2 == e3: ", e2 == e3 # False

print "f1 == f2: ", f1 == f2 # True
print "f1 == f3: ", f1 == f3 # False

正如您所指出的，这会在不考虑数学等价的情况下检查排列下的字符串等价，但这是成功的一半。如果您有数学表达式的规范形式，您可以在规范形式的两个表达式上使用这种方法。也许 sympy 的Simplify 之一可以解决问题。

【讨论】：

这是一个经过深思熟虑的答案，有很好的想法 - 我只是觉得有必要评论它不是一个适用于所有任意 sympy 表达式的强大解决方案。

【解决方案2】：

与其迭代所有可能的排列，不如假设一个存在并尝试构建它。我相信以正确的方式完成，算法失败将意味着不存在置换。

以下是应用于上述表达式的思想大纲：

让：

str1 = "A m * B s / (A m + C m)"
str2 = "C m * B s / (C m + A m)"

我们正在寻找可以使表达式相同的集合 (A, C) 的排列。根据 A 和 C 在 str2 中首次出现的顺序，将 A 和 C 重新标记为 X1 和 X2，因此：

X1 = C
X2 = A

因为 C 在 str2 中出现在 A 之前。接下来，创建数组 Y，使得 y[i] 是第 i 个符号 A 或 C，按 str1 中的首次出现顺序。所以：

Y[1] = A
Y[2] = C

因为 A 在 str1 中出现在 C 之前。

现在通过用 X1 和 X2 替换 A 和 C，从 str2 构造 str3：

str3 = "X1 m * B s / (X1 m + X2 m)"

然后开始用 Xi 代替 Y[i]。首先，X1 变为 Y[1]=A：

str3_1 = "A m * Bs / (A m + X2 m)"

在这个阶段，比较 str3_1 和 str1 直到第一次出现的任何 Xi，在本例中为 X2，所以因为这两个字符串相等：

str3_1[:18] = "A m * B s / (A m + " 
str1[:18] = "A m * B s / (A m + "

你有机会构建排列。如果它们不相等，则您将证明不存在合适的排列（因为任何排列都必须至少进行该替换）并且可以推断出不等价。但是它们是相等的，所以你继续下一步，用 X2 代替 Y[2]=C：

str3_2 = "A m * B s / (A m + C m)"

这等于str1，所以你有你的排列（A->C，C->A）并且已经证明了表达式的等价性。

这只是算法在特定情况下的演示，但应该概括。不确定你能把它降到的最低顺序是多少，但它应该比 n!在 n 个变量上生成所有排列。

如果我正确理解单位的重要性，它们会限制哪些变量可以通过排列交换哪些其他变量。因此，在上述表达式中，A 可以用 C 代替，因为两者都有“m”单位，但不能用具有“s”单位的 B。您可以通过以下方式处理此问题：

通过删除所有没有m个单元的符号，从str1和str2构造表达式str1_m和str2_m，然后对str1_m和str2_m执行上述算法。如果构造失败，则不存在置换。如果构造成功，则保留该排列（称为 m 排列）并通过删除所有没有 s 单元的符号从 str1 和 str2 构造 str1_s 和 str2_s，然后对 str1_s 和 str2_s 再次执行该算法。如果构造失败，它们是不等价的。如果成功，最终的排列将是 m-permutation 和 s-permutation 的组合（尽管您可能甚至不需要构造它，您只需要关心它是否存在）。

【讨论】：

有趣的方法。明天我会更彻底地阅读它。
好的，我看到的主要问题是它测试相同的表达式，而不是数学上等价的表达式。例如，它不会找到 2*X = X+X。
sympy 提供的代数表达式是否有一些规范形式？例如，您可以通过 sympy.together 运行，然后在结果字符串上尝试上述算法。我整理了一些实现上述算法的简化版本的代码，我一会儿贴出来。

【解决方案3】：

如果您将字符串传递给 SymPy 的 sympify() 函数，它将自动为您创建符号（无需全部定义）。

>>> from sympy import *
>>> x
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'x' is not defined
>>> sympify("x**2 + cos(x)")
x**2 + cos(x)
>>> sympify("diff(x**2 + cos(x), x)")
2*x - sin(x)

【讨论】：

这个功能很棒！非常感谢。

【解决方案4】：

我做过一次，在一个数学研究的模拟器中。好吧，就我而言，我知道将使用哪些变量。

所以，我测试了将值放入变量中的结果。

A = 10
B = 20
C = 30
m = Math.e
s = Math.pi

于是，我们解决了：

s1 = 'A m * B s / (A m + C m)'
s2 = 'C m * B s / (C m + A m)'

如果 s1 != s2，证明不等价

用这种方法不可能说两个表达式是等价的，但是你可以说两者不等价

if s1 != s2:
   print "Not equivalent"
else:
   print "Try with another sample"

嗯..我希望这可以帮助你。

【讨论】：

我也想过这个，但是我正在测试大量的表达式并且不能容忍错误：-/
另外，不同的乘法顺序可以给出不同的浮点数结果。
Rosh Oxymoron，我不确定它是否发生在 python 中，因为谈到高级语言我想这种错误在 python 中不是问题。我在 C 中知道这一点， Pascal、Delphi 和直到 java，但我在 python 中没有看到类似的错误..
@JonatasTeixeira 运算顺序始终是浮点值的问题，与语言无关，因为它们不精确。如果您的语言提供或您正在使用库来提供无限精度（例如符号表示或理性定义并且您没有非理性值），那么顺序就无关紧要了。但是，使用浮点点，错误的传播会困扰您。

【解决方案5】：

与迄今为止的所有其他答案一样，这并不是问题的可靠解决方案，而是包含更多有用的信息，供我们未来细心的朋友解决它。

我提供了一个使用欧拉公式https://en.wikipedia.org/wiki/Euler%27s_formula的困难示例

我确信迄今为止所有其他溢出答案在我的示例中都不会成功。

我表明 sympy 网站上的所有建议在我的示例中也失败了。 (https://github.com/sympy/sympy/wiki/Faq)

#SOURCE FOR HELPERS:    https://github.com/sympy/sympy/wiki/Faq 
import sympy
import sympy.parsing.sympy_parser
ExampleExpressionString1 = 'exp( i*( (x0 - 1)*(x0 + 2) ) )'
ExampleExpressionSympy1 = sympy.parsing.sympy_parser.parse_expr(ExampleExpressionString1)

ExampleExpressionString2 = 'i*sin( (x0 - 1)*(x0 + 2) ) + cos( (x0 - 1)*(x0 + 2) )'
ExampleExpressionSympy2 = sympy.parsing.sympy_parser.parse_expr(ExampleExpressionString2)

print '(ExampleExpressionSympy1 == ExampleExpressionSympy2):'
print ' ', (ExampleExpressionSympy1 == ExampleExpressionSympy2)

print '(ExampleExpressionSympy1.simplify() == ExampleExpressionSympy2.simplify()):'
print ' ', (ExampleExpressionSympy1.simplify() == ExampleExpressionSympy2.simplify())

print '(ExampleExpressionSympy1.expand() == ExampleExpressionSympy2.expand()):'
print ' ', (ExampleExpressionSympy1.trigsimp() == ExampleExpressionSympy2.trigsimp())

print '(ExampleExpressionSympy1.trigsimp() == ExampleExpressionSympy2.trigsimp()):'
print ' ', (ExampleExpressionSympy1.trigsimp() == ExampleExpressionSympy2.trigsimp())

print '(ExampleExpressionSympy1.simplify().expand().trigsimp() == ExampleExpressionSympy2.simplify().expand().trigsimp()):'
print ' ', (ExampleExpressionSympy1.simplify().expand().trigsimp() == ExampleExpressionSympy2.simplify().expand().trigsimp())

更多注释：

我怀疑这是一个难以通用且稳健地解决的问题。要正确检查数学等价性，您不仅必须尝试顺序排列，还必须拥有数学等价变换库并尝试所有这些排列。

但我确实相信这可能是一个可解决的问题，因为 Wolfram Alpha 似乎有“替代表达式”部分，这似乎可以在大多数时间使用这些等价来为任意表达式提供所有排列。

总结：

我建议以下内容，希望它会中断：

import sympy
import sympy.parsing.sympy_parser
Expression.simplify().expand().trigsimp()

【讨论】：

一丝不苟的未来先生在哪里？