Minimax algorithm in Python using tic tac toe: answers

【Question title】: Minimax algorithm in python using tic tac toe
【Posted】: 2020-11-02 11:17:29
【Question】:

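I'm trying to build a tic tac toe AI that plays optimally using the minimax algorithm. I got it running, only to notice that it does not make optimal moves, and when it plays against itself the 'X' player always wins (the result should be a draw). Here is my code for the algorithm: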

def getBestMove(state, player):
    '''
    Minimax Algorithm
    '''
    winner_loser , done = check_current_state(state)
    if done == "Done" and winner_loser == 'O': # If AI won
        return 1
    elif done == "Done" and winner_loser == 'X': # If Human won
        return -1
    elif done == "Draw":    # Draw condition
        return 0
        
    moves = []
    empty_cells = []
    for i in range(3):
        for j in range(3):
            if state[i][j] is ' ':
                empty_cells.append(i*3 + (j+1))
    
    for empty_cell in empty_cells:
        move = {}
        move['index'] = empty_cell
        new_state = copy_game_state(state)
        play_move(new_state, player, empty_cell)
        
        if player == 'O':    # If AI
            result = getBestMove(new_state, 'X')    # make more depth tree for human
            move['score'] = result
        else:
            result = getBestMove(new_state, 'O')    # make more depth tree for AI
            move['score'] = result
        
        moves.append(move)

    # Find best move
    best_move = None
    if player == 'O':   # If AI player
        best = -infinity
        for move in moves:
            if move['score'] > best:
                best = move['score']
                best_move = move['index']
    else:
        best = infinity
        for move in moves:
            if move['score'] < best:
                best = move['score']
                best_move = move['index']
                
    return best_move

What can I do here to fix it?

【Question comments】:

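I'm not a Python person, but about return best_move: shouldn't it return the score?
What score? Do you mean return move['score']?
In, for example, if move['score'] > best, move['score'] appears to be a numeric value, but with return best_move you seem to return a move (not a score), which in turn gets stored as a score after the recursive call. That looks wrong to me, but maybe Python does something implicit here that I'm not aware of.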
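
The point raised in these comments can be illustrated with a minimal sketch in which every recursive call returns a (score, move) pair, so the caller compares scores and only the top-level caller uses the move. This is not the asker's code: the flat 9-element list board and the winner helper are assumptions made here to keep the example self-contained, in place of the question's check_current_state, copy_game_state and play_move.

```python
import math

# All eight winning lines on a board stored as a flat list of 9 cells.
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None

def minimax(board, player):
    """Return (score, move) for the side to move; 'O' maximizes, 'X' minimizes."""
    won = winner(board)
    if won == 'O':
        return 1, None
    if won == 'X':
        return -1, None
    if ' ' not in board:
        return 0, None  # draw

    best_score = -math.inf if player == 'O' else math.inf
    best_move = None
    for cell in range(9):
        if board[cell] != ' ':
            continue
        child = board[:cell] + [player] + board[cell + 1:]  # copy with the move applied
        # Keep the child's score and discard its move: only scores are compared.
        score, _ = minimax(child, 'X' if player == 'O' else 'O')
        if (player == 'O' and score > best_score) or (player == 'X' and score < best_score):
            best_score, best_move = score, cell
    return best_score, best_move

# 'X' threatens to complete the top row, so 'O' has to block at index 2.
score, move = minimax(['X', 'X', ' ', ' ', 'O', ' ', ' ', ' ', ' '], 'O')
print(score, move)
```

Returning both values keeps the recursion's comparisons on scores, which is the mismatch the commenter noticed in return best_move.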

Tags: python algorithm minimax

【Solution 1】:

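I think it would be easier if you followed the standard minimax algorithm, for example the one here. I would also suggest adding alpha-beta pruning to make it a bit faster, even though that is not really necessary for tic tac toe. Below is an example from a game I made a long time ago that you can use for inspiration; it is basically taken from the linked Wikipedia page with a few small tweaks, such as the if beta <= alpha check for alpha-beta pruning: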

import copy
import math

# Top-level call; run it once minimax and evaluate (below) are defined:
# move, evaluation = minimax(board, 8, -math.inf, math.inf, True)

def minimax(board, depth, alpha, beta, maximizing_player):

    if depth == 0 or board.is_winner() or board.is_board_full():
        return None, evaluate(board)

    children = board.get_possible_moves(board)
    best_move = children[0]
    
    if maximizing_player:
        max_eval = -math.inf        
        for child in children:
            board_copy = copy.deepcopy(board)
            board_copy.board[child[0]][child[1]].player = 'O'
            current_eval = minimax(board_copy, depth - 1, alpha, beta, False)[1]
            if current_eval > max_eval:
                max_eval = current_eval
                best_move = child
            alpha = max(alpha, current_eval)
            if beta <= alpha:
                break
        return best_move, max_eval

    else:
        min_eval = math.inf
        for child in children:
            board_copy = copy.deepcopy(board)
            board_copy.board[child[0]][child[1]].player = 'X'
            current_eval = minimax(board_copy, depth - 1, alpha, beta, True)[1]
            if current_eval < min_eval:
                min_eval = current_eval
                best_move = child
            beta = min(beta, current_eval)
            if beta <= alpha:
                break
        return best_move, min_eval

def evaluate(board):
    if board.is_winner('X'):
        return -1
    if board.is_winner('O'):
        return 1
    return 0
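
As a rough check on the pruning suggestion in this answer, the sketch below runs the same search with and without the beta <= alpha cutoff and counts visited nodes. It does not use the answer's board class: the flat list board, the winner helper and the counter argument are assumptions introduced here so the example runs on its own, and it undoes each move in place instead of deep-copying.

```python
import math

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(cells):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in LINES:
        if cells[a] != ' ' and cells[a] == cells[b] == cells[c]:
            return cells[a]
    return None

def search(cells, o_to_move, alpha, beta, prune, counter):
    """Minimax value of the position ('O' maximizes); counter[0] counts visited nodes."""
    counter[0] += 1
    won = winner(cells)
    if won is not None:
        return 1 if won == 'O' else -1
    if ' ' not in cells:
        return 0  # draw
    best = -math.inf if o_to_move else math.inf
    for i in range(9):
        if cells[i] != ' ':
            continue
        cells[i] = 'O' if o_to_move else 'X'  # make the move in place
        value = search(cells, not o_to_move, alpha, beta, prune, counter)
        cells[i] = ' '                        # undo it before trying the next one
        if o_to_move:
            best = max(best, value)
            alpha = max(alpha, best)
        else:
            best = min(best, value)
            beta = min(beta, best)
        if prune and beta <= alpha:
            break  # the opponent already has a better option elsewhere
    return best

# Position after 'X' has taken the center; 'O' is to move.
start = [' '] * 9
start[4] = 'X'
plain, pruned = [0], [0]
v_plain = search(list(start), True, -math.inf, math.inf, False, plain)
v_pruned = search(list(start), True, -math.inf, math.inf, True, pruned)
print(v_plain, v_pruned, plain[0], pruned[0])
```

Both runs should agree on the value of the position, since with a full starting window the cutoff only skips branches that cannot change the result.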

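Note that it is important to make a deep copy of the board (or to call an undo-move function after the recursive minimax call); otherwise you are changing the state of the original board and will get some strange behavior.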
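
The two options mentioned in this note can be sketched as follows, assuming a board stored as a plain list of row lists rather than the answer's board class. The helper names apply_on_copy and try_move_in_place are hypothetical and exist only for this illustration.

```python
import copy

def apply_on_copy(board, row, col, player):
    """Return a new board with the move applied; the caller's board is untouched."""
    new_board = copy.deepcopy(board)
    new_board[row][col] = player
    return new_board

def try_move_in_place(board, row, col, player, evaluate):
    """Play the move on the shared board, evaluate it, then undo it."""
    board[row][col] = player
    try:
        return evaluate(board)
    finally:
        board[row][col] = ' '  # undo, so the caller sees the original position

def count_filled(board):
    """Toy evaluation used for the demo: number of occupied cells."""
    return sum(cell != ' ' for row in board for cell in row)

board = [[' '] * 3 for _ in range(3)]

# Pitfall: a shallow copy makes a new outer list but shares the row objects.
shallow = list(board)
shallow[0][0] = 'X'
leaked = board[0][0]   # the original board was modified as well
board[0][0] = ' '      # reset for the rest of the demo

child = apply_on_copy(board, 1, 1, 'X')                     # option 1: deep copy
seen = try_move_in_place(board, 0, 0, 'O', count_filled)    # option 2: make, then undo
print(leaked, child[1][1], seen, count_filled(board))
```

Deep copying is the simpler option; undoing the move avoids allocating a new board at every node, at the cost of having to restore the state on every path out of the call.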

【Comments】: