Algorithms to find the min/max of a single-variable function on a fixed domain: answers

【Question title】: Algorithms to find min/max of a single variable function in fixed domain
【Posted】: 2019-06-08 12:01:46
【Question description】:

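I am looking for a numerical algorithm that finds the global minimum or maximum of a function on a given interval [a, b], for example the minimum and maximum of the function

f(x) = sin(x)

on the domain [3*pi/4, 5*pi/4].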
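For reference, the expected answer for this particular example can be worked out by hand, which gives a value to check any numerical method against. The derivative is negative on the whole interval, so the function is strictly decreasing there and both extrema sit at the endpoints:

```latex
f'(x) = \cos x < 0
\quad \text{for } x \in \left[\tfrac{3\pi}{4}, \tfrac{5\pi}{4}\right] \subset \left(\tfrac{\pi}{2}, \tfrac{3\pi}{2}\right)
\;\Longrightarrow\;
\max f = \sin\tfrac{3\pi}{4} = \tfrac{\sqrt{2}}{2} \approx 0.7071,
\qquad
\min f = \sin\tfrac{5\pi}{4} = -\tfrac{\sqrt{2}}{2} \approx -0.7071 .
```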

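I know how to find the global min/max of a multivariable function using gradient descent or gradient ascent, but I can only use these algorithms over the function's entire domain. For example, when I run gradient descent on sin(x), it gives me -1, which is correct for the domain [0, 2*pi] but not for [3*pi/4, 5*pi/4]. Any help?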

So far I have arrived at this solution (code in Python 2.7; the language is not important, my question is about the algorithm):

import math
import random

# function
def f(x):
    return math.sin(x)

# xmin-xmax interval
xmin = 3.0 * math.pi / 4.0
xmax = 5.0 * math.pi / 4.0

# find ymin-ymax
steps = 10000
ymin = f(xmin)
ymax = ymin

for i in range(steps + 1):  # steps + 1 so that the right endpoint xmax is sampled too
    x = xmin + (xmax - xmin) * float(i) / steps
    y = f(x)
    if y < ymin: ymin = y
    if y > ymax: ymax = y

print(ymin)
print(ymax)

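Answer

Thanks to @BlackBear, I wrote a program that does what I actually need. This function searches the interval [a, b] using the gradient descent algorithm: in each outer loop it starts from a new random point between a and b, then compares the values, and finally returns the x at which the minimum occurs.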

double gradientDescentInterval(const char *expression, double a, double b, double ete, double ere, double gamma,
                               unsigned int maxiter, int mode) {
    /*
     * Gradient descent is a first-order iterative optimization algorithm for finding the minimum of a function.
     * To find a local minimum of a function using gradient descent, one takes steps proportional to the negative of
     * the gradient (or approximate gradient) of the function at the current point.
     *
     * This function searches minimum on an interval [a, b]
     *
     * ARGUMENTS:
     * expression   the function expression, it must be a string array like "x^2+1"
     * a            starting point of interval [a, b]
     * b            ending point of interval [a, b]
     * ete          estimated true error
     * ere          estimated relative error
     * gamma        step size (also known as learning rate)
     * maxiter      maximum iteration threshold
     * mode         show process {0: no, 1: yes}
     *
     */

    // fix interval reverse
    if (a > b) {
        double temp = a;
        a = b;
        b = temp;
    } // end of if

    // check error thresholds
    if (ere < 0 || ete < 0) {
        printf("\nError: ete or ere argument is not valid\n");
        Exit();
        exit(EXIT_FAILURE);
    } // end of if

    // check mode
    if (mode != 0 && mode != 1) {
        printf("\nError: mode argument is not valid\n");
        Exit();
        exit(EXIT_FAILURE);
    } // end of if

    // check maxiter to be more than zero
    if (maxiter <= 0) {
        printf("Error: argument maxiter must be more than zero!\n");
        Exit();
        exit(EXIT_FAILURE);
    } // end of maxiter check

    // initializing variables
    unsigned int iter = 0, innerIter = 0;
    // choose an arbitrary result at midpoint between a and b to be updated later
    double coefficient = (b - a), result = a + coefficient / 2;
    double x, past_x, fx, fresult;
    double ete_err, ere_err;
    double fa = function_1_arg(expression, a);
    double fb = function_1_arg(expression, b);

    // set the seed for random number generator
    seed();

    while (iter < maxiter) {
        // try maxiter times to find minimum in given interval [a, b] and return lowest result
        // update fresult with new result
        fresult = function_1_arg(expression, result);
        // choose a random starting point
        x = a + coefficient * zeroToOneUniformRandom();

        // set inner iter to zero before new loop
        innerIter = 0;
        // go in a loop to find a minimum with random starting point
        while (innerIter < maxiter) {
            // calculate new x by subtracting the derivative of function at x multiplied by gamma from x
            past_x = x;
            x -= firstDerivative_1_arg(expression, x, DX) * gamma;
            fx = function_1_arg(expression, x);

            // calculate errors
            ete_err = fabs(past_x - x);
            ere_err = fabs(ete_err / x);

            if (mode) {
                printf("\nIn this iteration [#%u][#%u], x = %.5e f(x) = %.5e\n"
                       "and estimated true error = %.5e and estimated relative error = %.5e,\n",
                       iter, innerIter, x, fx, ete_err, ere_err);
            } // end if(mode)

            // Termination Criterion
            // if new x goes beyond interval lower than a
            if (x < a) {
                if (mode) {
                    printf("\nIn this iteration the calculated x is less than a : %.5e < %f "
                           "so minimum of the function occurs at a\n",
                           x, a);
                } // end if(mode)

                // if fa is lower than f(result), then a is where the minimum occurs
                if (fa < fresult) {
                    result = a;
                } // end of if
                break;
            } // end of if

            // if new x goes beyond interval bigger than b
            if (x > b) {
                if (mode) {
                    printf("\nIn this iteration the calculated x is bigger than b : %.5e > %f "
                           "so minimum of the function occurs at b\n",
                           x, b);
                } // end if(mode)

                // if fb is lower than f(result), then b is where the minimum occurs
                if (fb < fresult) {
                    result = b;
                } // end of if
                break;
            } // end of if

            // if calculated error is less than estimated true error threshold
            if (ete != 0 && ete_err < ete) {
                if (mode) {
                    printf("\nIn this iteration the calculated estimated true error is less than the threshold\n"
                           "(estimated true error) %.5e < %.5e (threshold)\n"
                           "so the calculated x is the point on domain that minimum of the function happens\n",
                           ete_err, ete);
                } // end if(mode)

                // if fx is lower than f(result), then x is where the minimum occurs
                if (fx < fresult) {
                    result = x;
                } // end of if
                break;
            } // end of estimated true error check

            // if calculated error is less than estimated relative error threshold
            if (ere != 0 && ere_err < ere) {
                if (mode) {
                    printf("\nIn this iteration the calculated estimated relative error is less than the threshold\n"
                           "(estimated relative error) %.5e < %.5e (threshold)\n"
                           "so the calculated x is the point on domain that minimum of the function happens\n",
                           ere_err, ere);
                } // end if(mode)

                // if fx is lower than f(result), then x is where the minimum occurs
                if (fx < fresult) {
                    result = x;
                } // end of if
                break;
            } // end of estimated relative error check
            innerIter++;
        } // end of inner while loop
        iter++;
    } // end of while loop

    // return result
    return result;
}

Many of the functions used here may be unfamiliar; they are implemented in separate files. You can see them in my Github repository.

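【Comments on the question】:

Your question is completely unclear. What is a fixed domain? Are you after global or local maxima? An analytical or a numerical solution? One-dimensional or nD?
If you are looking for a general-purpose, bulletproof numerical optimization algorithm, be aware that no such thing exists.
@YvesDaoust Sorry, this is my first question on Stack Overflow [and on the internet], and I did not know where to start.
It would help to know what the application is and what you really need. For instance, do you need exact or approximate values for an analytic function, and do you need the location of said global min/max? A periodic function has a whole set of minima/maxima; is it important to capture all of them? I have implemented a number of algorithms for finding global maxima in data, and some of them are very simple if your requirements are not too strict.
I would say consider Brent's method, but since you want a global maximum it will not satisfy your requirements. @YvesDaoust's comments are on point.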

Tags: python algorithm max min numerical-methods

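【Solution 1】:

Gradient ascent/descent can only find local optima. To find the "global" optimum, you simply run the procedure many times with random initialization and take the best value you find.

You can do the same in your situation: pick a random initial point and follow the gradient, stopping at convergence or when you step outside the domain.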
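The procedure described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the answerer's code: the names `derivative`, `multistart_descent` and `maximize` are hypothetical, the derivative is approximated by a central difference, and the step size `gamma`, tolerance `tol` and number of restarts are arbitrary defaults that would need tuning for other functions.

```python
import math
import random


def derivative(f, x, h=1e-6):
    """Central-difference approximation of f'(x) (an assumption of this sketch)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def multistart_descent(f, a, b, restarts=50, gamma=0.01, tol=1e-8,
                       max_steps=10000, seed=0):
    """Return (x, f(x)) for the lowest value found on [a, b]."""
    rng = random.Random(seed)
    # The endpoints are always candidates, because a descent that leaves
    # the interval is cut off at the boundary it crossed.
    best_x = a if f(a) <= f(b) else b
    best_y = f(best_x)
    for _ in range(restarts):
        x = rng.uniform(a, b)  # random initial point inside the domain
        for _step in range(max_steps):
            new_x = x - gamma * derivative(f, x)
            if new_x <= a or new_x >= b:
                x = min(max(new_x, a), b)  # left the domain: stop at the edge
                break
            if abs(new_x - x) < tol:  # converged to a stationary point
                x = new_x
                break
            x = new_x
        y = f(x)
        if y < best_y:  # keep the best value over all restarts
            best_x, best_y = x, y
    return best_x, best_y


def maximize(f, a, b, **kwargs):
    """Maximise f by minimising -f with the same routine."""
    x, y = multistart_descent(lambda t: -f(t), a, b, **kwargs)
    return x, -y


if __name__ == "__main__":
    lo, hi = 3.0 * math.pi / 4.0, 5.0 * math.pi / 4.0
    print(multistart_descent(math.sin, lo, hi))
    print(maximize(math.sin, lo, hi))
```

For the sin(x) example from the question, every descent runs into the right edge of the interval and every ascent into the left edge, so the sketch reports the two endpoints. The result is still only the best of the local optima that the restarts happened to reach.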

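You can make this a bit faster by dynamically restricting the domain whenever you exit it. For example, suppose you are maximizing between -10 and 10 and pick 6 as the initial point; you run gradient ascent and reach 10. You can now exclude the interval [6,10] from the random initialization, since you know you would end up reaching 10 and stopping there.

But I would actually suggest that you use Bayesian optimization. Its advantages over gradient ascent/descent are:

It does not require the gradient
It is made for global optimization
It allows you to set bounds on the parameters
It requires fewer function evaluations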
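To make the idea concrete, here is a toy, pure-Python sketch of Bayesian optimization on an interval. It is an illustration under several assumptions, not a reference implementation: the surrogate is a Gaussian process with a squared-exponential kernel of unit prior variance (so it suits functions whose values are of order 1), the length scale is fixed at a fifth of the interval, the acquisition rule is a lower confidence bound evaluated on a fixed grid, and all names (`rbf`, `invert`, `bayes_minimize`) are hypothetical. In practice one would use an established library rather than this sketch.

```python
import math


def rbf(x1, x2, length):
    """Squared-exponential kernel with unit variance."""
    return math.exp(-0.5 * ((x1 - x2) / length) ** 2)


def invert(m):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(m)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def bayes_minimize(f, a, b, n_calls=15, n_grid=201, kappa=2.0, noise=1e-6):
    """Minimise f on [a, b]; returns the best (x, f(x)) among the evaluations."""
    n_calls = min(n_calls, n_grid)
    length = (b - a) / 5.0  # assumed kernel length scale
    grid = [a + (b - a) * i / (n_grid - 1.0) for i in range(n_grid)]
    xs = [grid[0], grid[n_grid // 2], grid[-1]]  # initial design: ends and middle
    ys = [f(x) for x in xs]
    while len(xs) < n_calls:
        n = len(xs)
        mean_y = sum(ys) / n
        # Kernel matrix of the evaluated points, with a small jitter on the diagonal.
        k = [[rbf(p, q, length) + (noise if i == j else 0.0)
              for j, q in enumerate(xs)] for i, p in enumerate(xs)]
        k_inv = invert(k)
        alpha = [sum(k_inv[i][j] * (ys[j] - mean_y) for j in range(n))
                 for i in range(n)]
        next_x, next_lcb = None, float("inf")
        for x in grid:
            if x in xs:
                continue  # already evaluated
            ks = [rbf(x, p, length) for p in xs]
            mu = mean_y + sum(ki * ai for ki, ai in zip(ks, alpha))
            v = [sum(k_inv[i][j] * ks[j] for j in range(n)) for i in range(n)]
            var = max(1.0 - sum(ki * vi for ki, vi in zip(ks, v)), 0.0)
            lcb = mu - kappa * math.sqrt(var)  # low mean or high uncertainty wins
            if lcb < next_lcb:
                next_x, next_lcb = x, lcb
        xs.append(next_x)
        ys.append(f(next_x))
    i_best = min(range(len(ys)), key=lambda i: ys[i])
    return xs[i_best], ys[i_best]


if __name__ == "__main__":
    print(bayes_minimize(math.sin, 3.0 * math.pi / 4.0, 5.0 * math.pi / 4.0))
```

The sketch uses no derivative, only looks inside [a, b], and spends a fixed budget of `n_calls` function evaluations, which mirrors the advantages listed above. To maximize instead, pass the negated function.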

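Finally, an obligatory word of caution: this problem cannot be solved in the general case. Consider, for example, a function that equals 1 at x=3.4131242351 and 0 everywhere else. In practice, however, you should be fine.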
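The caveat is easy to demonstrate. The sketch below builds the function from the example and runs a plain grid search over [0, 10], in the same spirit as the loop in the question; the names `NEEDLE`, `needle` and `grid_max` and the choice of interval are assumptions made for illustration.

```python
NEEDLE = 3.4131242351


def needle(x):
    """Equals 1 at exactly one point and 0 everywhere else."""
    return 1.0 if x == NEEDLE else 0.0


def grid_max(f, a, b, steps=10000):
    """Largest value of f over steps + 1 evenly spaced points in [a, b]."""
    return max(f(a + (b - a) * float(i) / steps) for i in range(steps + 1))


if __name__ == "__main__":
    print(grid_max(needle, 0.0, 10.0))  # the grid never lands on NEEDLE
    print(needle(NEEDLE))               # the true maximum
```

No sampled point coincides with the single point where the function is 1, so the search reports a maximum of 0. The gradient is zero wherever it is defined, so gradient-based methods get no information either.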

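【Comments】:

Regarding "excluding" intervals: it might be easier to keep a list of intervals, initially containing only the whole interval. Then pop a random interval from that list, pick a random starting point in that interval, and add one or two new intervals to the list (from the interval's lower bound up to min(start, best), and from max(start, optimum) upward).
@tobias_k That is a nice extension. Sadly, intervals cannot be eliminated in higher dimensions.
"You can now exclude the interval [6,10]": no. Unless the function has some special property, you can never exclude an interval by mere sampling.
@YvesDaoust It is not mere sampling: since we are using gradient ascent, we know the function is increasing on [6,10].
@BlackBear: No, you don't know that. Not at all. Sampling tells you nothing about the function.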