Convert sympy expressions to function of numpy arrays: answers

【Question title】: Convert sympy expressions to function of numpy arrays
【Posted】: 2016-05-27 14:37:31
【Question】:

I have a system of ODEs written in sympy:

from sympy import symbols
from sympy.parsing.sympy_parser import parse_expr

xs = symbols('x1 x2')
ks = symbols('k1 k2')
strs = ['-k1 * x1**2 + k2 * x2', 'k1 * x1**2 - k2 * x2']
syms = [parse_expr(item) for item in strs]

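I would like to turn this into a vector-valued function that accepts a 1D numpy array of the x values and a 1D numpy array of the k values, and returns a 1D numpy array of the equations evaluated at those points. The signature would look something like this: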

import numpy as np
x = np.array([3.5, 1.5])
k = np.array([4, 2])
xdot = my_odes(x, k)

The reason I want something like this is to pass the function to scipy.integrate.odeint, so it needs to be fast.
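To make the target interface concrete, here is a minimal sketch in which my_odes is written by hand for this two-equation system rather than generated from the sympy expressions. The wrapper name rhs and the time grid are assumptions chosen for illustration; odeint calls its function as func(y, t, *args), so the wrapper only drops t.

```python
import numpy as np
from scipy.integrate import odeint


def my_odes(x, k):
    # Hand-written stand-in for the function that should be generated
    # from the sympy expressions.
    x1, x2 = x
    k1, k2 = k
    return np.array([-k1 * x1**2 + k2 * x2, k1 * x1**2 - k2 * x2])


def rhs(x, t, k):
    # odeint passes the state first, then the time, then the extra args.
    return my_odes(x, k)


x = np.array([3.5, 1.5])
k = np.array([4.0, 2.0])
xdot = my_odes(x, k)

t = np.linspace(0.0, 1.0, 6)  # illustrative time grid
sol = odeint(rhs, x, t, args=(k,))
```

The two right-hand sides sum to zero, so x1 + x2 should stay constant along the solution, which gives a quick sanity check on any generated version.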

Attempt 1: subs

Of course, I can write a wrapper for subs:

def my_odes(x, k):
    all_dict = dict(zip(xs, x))
    all_dict.update(dict(zip(ks, k)))
    return np.array([sym.subs(all_dict) for sym in syms])

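But this is super slow, especially for my real system, which is much bigger and is run many times. I need to compile this operation to C code.

Attempt 2: theano

I can get close with sympy's integration with theano: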

from sympy.printing.theanocode import theano_function

f = theano_function(xs + ks, syms)

def my_odes(x, k):
    return np.array(f(*np.concatenate([x, k])))

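This compiles each expression, but all the packing and unpacking of the inputs and outputs slows it back down. The function returned by theano_function accepts numpy arrays as arguments, but it needs one array per symbol rather than one element per symbol. functify and ufuncify behave the same way. I don't need the broadcasting behavior; I need each element of the array to be interpreted as a different symbol.

Attempt 3: DeferredVector

If I use DeferredVector, I can make a function that accepts numpy arrays, but I can't compile it to C code or return a numpy array without packing it myself.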

import numpy as np
import sympy as sp
from sympy import DeferredVector, lambdify

x = sp.DeferredVector('x')
k =  sp.DeferredVector('k')
deferred_syms = [s.subs({'x1':x[0], 'x2':x[1], 'k1':k[0], 'k2':k[1]}) for s in syms]
f = [lambdify([x,k], s) for s in deferred_syms]

def my_odes(x, k):
    return np.array([f_i(x, k) for f_i in f])

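With DeferredVector I don't need to unpack the inputs, but I still need to pack the outputs. Also, I can use lambdify, but the ufuncify and theano_function versions fail, so no fast C code is generated.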

from sympy.utilities.autowrap import ufuncify
f = [ufuncify([x,k], s) for s in deferred_syms] # error

from sympy.printing.theanocode import theano_function
f = theano_function([x,k], deferred_syms) # error

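【Comments】:

Tags: python numpy scipy sympy

【Solution 1】:

I wrote a module named JiTCODE, which is tailored to problems such as yours. It takes symbolic expressions, converts them to C code, wraps a Python extension around it, compiles it, and loads it for use with scipy.integrate.ode or scipy.integrate.solve_ivp.

Your example would look like this: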

from jitcode import y, jitcode
from sympy.parsing.sympy_parser import parse_expr
from sympy import symbols

xs = symbols('x1 x2')
ks = symbols('k1 k2')
strs = ['-k1 * x1**2 + k2 * x2', 'k1 * x1**2 - k2 * x2']
syms = [parse_expr(item) for item in strs]

substitutions = {x_i:y(i) for i,x_i in enumerate(xs)}
f = [sym.subs(substitutions) for sym in syms]

ODE = jitcode(f,control_pars=ks)

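You can then use ODE pretty much like an instance of scipy.integrate.ode.

Although you would not need it for your application, you can also extract and use the compiled function: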

ODE.compile_C()
import numpy as np
x = np.array([3.5, 1.5])
k = np.array([4, 2])
print(ODE.f(0.0,x,*k))

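Note that, contrary to your specification, k is not passed as a NumPy array. For most ODE applications this should not matter, because it is more efficient to hardcode the control parameters.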
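The idea of hardcoding the control parameters can be illustrated without JiTCODE. The following is a minimal sketch using plain sympy and lambdify (not JiTCODE's own mechanism): the k values are substituted into the expressions once, so the generated function only takes x. The names k_values, fixed, f_fixed and my_odes_fixed are made up for this illustration, and it assumes a SymPy version whose lambdify accepts a nested argument list.

```python
import numpy as np
from sympy import symbols, lambdify
from sympy.parsing.sympy_parser import parse_expr

xs = symbols('x1 x2')
ks = symbols('k1 k2')
strs = ['-k1 * x1**2 + k2 * x2', 'k1 * x1**2 - k2 * x2']
syms = [parse_expr(item) for item in strs]

# Substitute the control parameters once, before generating the function.
k_values = dict(zip(ks, [4, 2]))
fixed = [sym.subs(k_values) for sym in syms]

# The nested argument list [xs] makes the function take one sequence
# and unpack it into x1, x2.
f_fixed = lambdify([xs], fixed, modules='numpy')


def my_odes_fixed(x):
    # f_fixed returns a Python list; pack it into an array.
    return np.array(f_fixed(x))


xdot = my_odes_fixed(np.array([3.5, 1.5]))
```

The trade-off is that changing k means regenerating the function, which is acceptable when the parameters stay fixed for a whole integration.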

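Finally, note that for this small example you may not get the best performance because of the overhead of scipy.integrate.ode or scipy.integrate.solve_ivp (also see SciPy Issue #8257 or this answer of mine). For large differential equations (such as yours), this overhead becomes irrelevant.

【Comments】:

【Solution 2】:

You can use the sympy function lambdify. For example,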

from sympy import symbols, lambdify
from sympy.parsing.sympy_parser import parse_expr
import numpy as np

xs = symbols('x1 x2')
ks = symbols('k1 k2')
strs = ['-k1 * x1**2 + k2 * x2', 'k1 * x1**2 - k2 * x2']
syms = [parse_expr(item) for item in strs]

# Convert each expression in syms to a function with signature f(x1, x2, k1, k2):
funcs = [lambdify(xs + ks, f) for f in syms]


# This is not exactly the same as the `my_odes` in the question.
# `t` is included so this can be used with `scipy.integrate.odeint`.
# The value returned by `sym.subs` is wrapped in a call to `float`
# to ensure that the function returns python floats and not sympy Floats.
def my_odes(x, t, k):
    all_dict = dict(zip(xs, x))
    all_dict.update(dict(zip(ks, k)))
    return np.array([float(sym.subs(all_dict)) for sym in syms])

def lambdified_odes(x, t, k):
    x1, x2 = x
    k1, k2 = k
    xdot = [f(x1, x2, k1, k2) for f in funcs]
    return xdot


if __name__ == "__main__":
    from scipy.integrate import odeint

    k1 = 0.5
    k2 = 1.0
    init = [1.0, 0.0]
    t = np.linspace(0, 1, 6)
    sola = odeint(lambdified_odes, init, t, args=((k1, k2),))
    solb = odeint(my_odes, init, t, args=((k1, k2),))
    print(np.allclose(sola, solb))

True is printed when the script is run.

It is much faster (note the change in units of the timing results):

In [79]: t = np.linspace(0, 10, 1001)

In [80]: %timeit sol = odeint(my_odes, init, t, args=((k1, k2),))
1 loops, best of 3: 239 ms per loop

In [81]: %timeit sol = odeint(lambdified_odes, init, t, args=((k1, k2),))
1000 loops, best of 3: 610 µs per loop

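【Comments】:

This is indeed much faster than any of my versions: 120 ms for subs, 8.7 ms for theano_function, and 0.6 ms for lambdify.
If we could figure out how to do the whole evaluation in C, rather than packing and unpacking Python lists on every iteration, we should be able to save even more.
If I change the calls in lambdified_odes to use splatting, [f(*np.concatenate([x,k])) for f in funcs], which is necessary when the number of states varies, the time goes up to 1.4 ms, which is still the best.
If you really want speed, you can try autowrap or ufuncify.
As far as I can tell, ufuncify and autowrap both require unpacking the input arguments if you have a numpy array, and repacking the output values if you want a numpy array.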
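On the splatting and the per-expression function list discussed in these comments: a minimal sketch, assuming a SymPy version whose lambdify accepts nested argument lists, is to lambdify all expressions in one call with the signature f(x, k). The unpacking of x and k then happens inside the generated function and works for any number of states. This is still Python-level code, not compiled C, and whether it beats the list of per-expression functions would have to be timed. The wrapper name rhs is an assumption for illustration.

```python
import numpy as np
from scipy.integrate import odeint
from sympy import symbols, lambdify
from sympy.parsing.sympy_parser import parse_expr

xs = symbols('x1 x2')
ks = symbols('k1 k2')
strs = ['-k1 * x1**2 + k2 * x2', 'k1 * x1**2 - k2 * x2']
syms = [parse_expr(item) for item in strs]

# One generated function for the whole system; each nested tuple of
# symbols is filled from one sequence argument.
f = lambdify([xs, ks], syms, modules='numpy')


def my_odes(x, k):
    # f returns a list with one value per expression.
    return np.array(f(x, k))


def rhs(x, t, k):
    # Adapter for odeint, which calls func(y, t, *args).
    return my_odes(x, k)


xdot = my_odes(np.array([3.5, 1.5]), np.array([4, 2]))

t = np.linspace(0, 1, 6)
sol = odeint(rhs, [1.0, 0.0], t, args=((0.5, 1.0),))
```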