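I'm assuming here that "So how can i choose a generic set of samples each time I need to compute the gradient at a given point?" means that the dimension of the function is fixed and can be derived from your starting point.
Consider this a demo using scipy's approx_fprime, an easy-to-use wrapper for numerical differentiation that scipy's optimizers also use when a Jacobian is needed but not supplied.
Of course, you can't ignore the epsilon parameter, which can make a difference depending on the data.
(This code also ignores minimize's args parameter, which is usually a good idea to use; I'm relying on the fact that A and b are in scope here, which is surely not best practice.)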
import numpy as np
from scipy.optimize import approx_fprime, minimize
np.random.seed(1)
# Synthetic data
A = np.random.random(size=(1000, 20))
noiseless_x = np.random.random(size=20)
b = A.dot(noiseless_x) + np.random.random(size=1000) * 0.01
# Loss function
def fun(x):
    return np.linalg.norm(A.dot(x) - b, 2)
# Optimize without any explicit jacobian
x0 = np.zeros(len(noiseless_x))
res = minimize(fun, x0)
print(res.message)
print(res.fun)
# Get numerical-gradient function
eps = np.sqrt(np.finfo(float).eps)
my_gradient = lambda x: approx_fprime(x, fun, eps)
# Optimize with our gradient
res = minimize(fun, x0, jac=my_gradient)
print(res.message)
print(res.fun)
# Eval gradient at some point
print(my_gradient(np.ones(len(noiseless_x))))
Output:
Optimization terminated successfully.
0.09272331925776327
Optimization terminated successfully.
0.09272331925776327
[15.77418041 16.43476772 15.40369129 15.79804516 15.61699104 15.52977276
15.60408688 16.29286766 16.13469887 16.29916573 15.57258797 15.75262356
16.3483305 15.40844536 16.8921814 15.18487358 15.95994091 15.45903492
16.2035532 16.68831635]
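As for the args parameter mentioned earlier, here is a minimal sketch of the same demo with A and b passed explicitly instead of being read from the enclosing scope. It assumes that approx_fprime forwards extra positional arguments to the function and that minimize forwards args to both fun and jac, as their documented signatures suggest. The data comes from a different random generator than above, so the printed numbers will not match the earlier output.

```python
import numpy as np
from scipy.optimize import approx_fprime, minimize

rng = np.random.default_rng(1)

# Synthetic data with the same shapes as in the demo
A = rng.random(size=(1000, 20))
noiseless_x = rng.random(size=20)
b = A.dot(noiseless_x) + rng.random(size=1000) * 0.01

def fun(x, A, b):
    # The loss receives the data explicitly instead of using globals
    return np.linalg.norm(A.dot(x) - b, 2)

eps = np.sqrt(np.finfo(float).eps)

def my_gradient(x, A, b):
    # Extra positional arguments after eps are passed on to fun
    return approx_fprime(x, fun, eps, A, b)

x0 = np.zeros(len(noiseless_x))
# args is handed to both fun and jac
res = minimize(fun, x0, args=(A, b), jac=my_gradient)
print(res.message)
print(res.fun)
```

This keeps fun reusable for other data sets, at the cost of a slightly longer call.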
Replacing the gradient definition with:
# Get numerical-gradient function with a way too big eps-value
eps = 1e-3
my_gradient = lambda x: approx_fprime(x, fun, eps)
shows that eps is a critical parameter. The second optimization now reports:
Desired error not necessarily achieved due to precision loss.
0.09323354898565098
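To see the effect of eps more directly, one option is to compare approx_fprime against the closed-form gradient of this particular loss, which is A^T (A x - b) / ||A x - b||_2 wherever the residual is nonzero. The sketch below is only an illustration: exact_gradient and the list of step sizes are my own additions, and the data is regenerated with a different random generator than in the demo.

```python
import numpy as np
from scipy.optimize import approx_fprime

rng = np.random.default_rng(1)
A = rng.random(size=(1000, 20))
b = A.dot(rng.random(size=20)) + rng.random(size=1000) * 0.01

def fun(x):
    return np.linalg.norm(A.dot(x) - b, 2)

def exact_gradient(x):
    # Closed-form gradient of ||A x - b||_2 (valid for a nonzero residual)
    r = A.dot(x) - b
    return A.T.dot(r) / np.linalg.norm(r, 2)

x = np.ones(A.shape[1])
sqrt_eps = np.sqrt(np.finfo(float).eps)

# Largest componentwise deviation from the exact gradient per step size
errors = {}
for eps in (1e-3, 1e-5, sqrt_eps, 1e-12):
    approx = approx_fprime(x, fun, eps)
    errors[eps] = np.max(np.abs(approx - exact_gradient(x)))
    print(f"eps={eps:.1e}  max abs error={errors[eps]:.3e}")
```

With forward differences, a large step adds truncation error and a very small step amplifies floating-point round-off, so the error is usually smallest somewhere in between. The square root of machine epsilon is a common default for that reason, but the best value still depends on the function and the data.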