Confusion on MiniMax algorithm

【Question title】: Confusion on MiniMax algorithm
【Posted】: 2017-10-12 17:55:11
【Question description】:

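I'm currently working on an assignment built around the MiniMax algorithm; the game is a combination of Mancala and NIM. The program asks the user for the current state of the board and is supposed to output the first move the user should make in order to win the game. What confuses me is this: am I supposed to generate the entire game tree with all possible solutions, with the utility function already evaluated at the leaf nodes, and then have the MiniMax algorithm run over it recursively, or is the tree created inside the MiniMax algorithm? I'm sorry if this question is unclear, but I'm stuck on this idea and can't seem to get my head around it.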

【Question comments】:

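In practice, the tree is generated on the fly. There is one important reason: you won't use pure min-max but some form of alpha-beta pruning, so the whole tree may never be searched (important: good move ordering). Second reason: in most games you cannot search all states (unbounded depth), so iterative deepening is used to limit the search to some fixed depth/ply, which is increased while time remains.
The game tree is not generated explicitly; it is only traversed. At no point during the execution of minimax do you have the whole tree in memory. As sascha mentioned, things are done on the fly, because at any node (any board configuration) you can easily generate its successor states. The key point is that when you apply a move to a board configuration (and thereby obtain another board configuration), you are in effect moving through this conceptual game tree.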
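
The two comments above can be sketched in Python. This is only an illustration under stated assumptions: the rules of the asker's Mancala/NIM game are not given, so the example uses a hypothetical single-pile Nim (take 1 to 3 stones, whoever takes the last stone wins). The names `successors`, `alphabeta`, and `iterative_deepening` are invented for this sketch, the depth cutoff returns a placeholder value of 0, and the deepening loop runs to a fixed maximum depth rather than a time budget.

```python
import math

def successors(stones):
    """Yield child states lazily; no tree is ever stored in memory."""
    for take in (1, 2, 3):
        if take <= stones:
            yield stones - take

def alphabeta(stones, depth, alpha, beta, maximizing):
    """Value of the position from the maximizing player's point of view."""
    if stones == 0:
        # The previous player took the last stone and won.
        return -1 if maximizing else 1
    if depth == 0:
        return 0  # depth cutoff: placeholder evaluation (assumption)
    if maximizing:
        best = -math.inf
        for child in successors(stones):
            best = max(best, alphabeta(child, depth - 1, alpha, beta, False))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # prune: the minimizer will never allow this line
        return best
    best = math.inf
    for child in successors(stones):
        best = min(best, alphabeta(child, depth - 1, alpha, beta, True))
        beta = min(beta, best)
        if alpha >= beta:
            break  # prune: the maximizer already has a better option
    return best

def iterative_deepening(stones, max_depth):
    """Search depth 1, 2, ... up to max_depth and keep the last result."""
    value = 0
    for depth in range(1, max_depth + 1):
        value = alphabeta(stones, depth, -math.inf, math.inf, True)
    return value
```

A real engine would stop deepening when its time budget runs out and would reuse the result of the previous iteration to order moves, which is what makes the pruning effective.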

Tags: algorithm minimax

【Solution 1】:

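The proper way to write a minimax function is to traverse the search tree by making and unmaking moves. You only store one game state at a time, and by making and unmaking moves on that game state you traverse the entire tree. If this is confusing, it helps to look at some minimax pseudocode. Note that there are two commonly used variants of minimax: regular minimax and negamax. The pseudocode below is regular minimax, since it is more intuitive, but in practice I would recommend the negamax variant, as it is much simpler: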

int max(int depth){
    if(this state is terminal){//won, lost, drawn, or desired search depth is reached
        return value;
    }
    //if the state is non-terminal
    //we want to examine all child nodes. We do this by making all possible moves from this state, calling the min function
    //(all children of max nodes are min nodes) and then unmaking the moves.
    int bestVal = -infinity;
    generate move list;
    for(all moves in move list){
        makeMove(this move in move list);
        int val = min(depth -1);
        unMakeMove(this move in move list);
        bestVal = max(val,bestVal);
    }
    return bestVal;
}

int min(int depth){
    if(this state is terminal){//won, lost, drawn, or desired search depth is reached
        return value;
    }
    //if the state is non-terminal
    //we want to examine all child nodes. We do this by making all possible moves from this state, calling the max function
    //(all children of min nodes are max nodes) and then unmaking the moves.
    int bestVal = +infinity;
    generate move list;
    for(all moves in move list){
        makeMove(this move in move list);
        int val = max(depth -1);
        unMakeMove(this move in move list);
        bestVal = min(val,bestVal);
    }
    return bestVal;
}
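
For comparison, here is a minimal sketch of the negamax variant mentioned above, written in Python. It assumes a hypothetical single-pile Nim (take 1 to 3 stones, taking the last stone wins) as a stand-in for the real game; the `NimState` class and its method names are invented for illustration, and the depth cutoff returns a placeholder value of 0. The point is that one function replaces the max/min pair, because each player's score is the negation of the opponent's.

```python
class NimState:
    """Hypothetical single-pile Nim: take 1-3 stones, last stone wins."""

    def __init__(self, stones):
        self.stones = stones

    def legal_moves(self):
        return [take for take in (1, 2, 3) if take <= self.stones]

    def make_move(self, take):
        self.stones -= take

    def unmake_move(self, take):
        self.stones += take

def negamax(state, depth):
    """Return the value from the point of view of the player to move."""
    if state.stones == 0:
        return -1  # the opponent took the last stone, so the side to move lost
    if depth == 0:
        return 0   # depth cutoff: placeholder evaluation (assumption)
    best = -float("inf")
    for move in state.legal_moves():
        state.make_move(move)                          # step down the tree
        best = max(best, -negamax(state, depth - 1))   # flip the sign
        state.unmake_move(move)                        # step back up
    return best
```

Only one `NimState` object exists for the whole search; after `negamax` returns, it holds the same pile size it started with.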

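So you traverse the entire tree by keeping track of a single game state and recursively making and unmaking moves on that state. Once you understand this, look into alpha-beta pruning. Also note that this function only returns the value of the best move, not the move itself. You will need a special function, called at the root, that also keeps track of the best move.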
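
A root function of that kind might look like the following Python sketch. As before, the game is a hypothetical single-pile Nim (take 1 to 3 stones, taking the last stone wins), and `Nim`, `max_value`, `min_value`, and `best_move` are names made up for this example; the depth cutoff returns a placeholder value of 0 rather than a real evaluation.

```python
import math

class Nim:
    """Hypothetical single-pile Nim: take 1-3 stones, last stone wins."""

    def __init__(self, stones):
        self.stones = stones

    def moves(self):
        return [take for take in (1, 2, 3) if take <= self.stones]

    def make(self, take):
        self.stones -= take

    def unmake(self, take):
        self.stones += take

def max_value(game, depth):
    if game.stones == 0:
        return -1  # the min player took the last stone
    if depth == 0:
        return 0   # depth cutoff: placeholder evaluation (assumption)
    best = -math.inf
    for move in game.moves():
        game.make(move)
        best = max(best, min_value(game, depth - 1))
        game.unmake(move)
    return best

def min_value(game, depth):
    if game.stones == 0:
        return 1   # the max player took the last stone
    if depth == 0:
        return 0
    best = math.inf
    for move in game.moves():
        game.make(move)
        best = min(best, max_value(game, depth - 1))
        game.unmake(move)
    return best

def best_move(game, depth):
    """Root call: same loop as max_value, but it remembers the move."""
    best_val, best = -math.inf, None
    for move in game.moves():
        game.make(move)
        val = min_value(game, depth - 1)
        game.unmake(move)
        if val > best_val:
            best_val, best = val, move
    return best, best_val
```

For example, `best_move(Nim(7), 10)` searches every line of play from seven stones and returns the move together with its value, so the caller can report the move to the user.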
