Question

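I'm developing a Connect Four game with an AI opponent in Unity3D (C#). I use the MiniMax algorithm, following this (German) pseudocode.

The AI still plays quite badly. It tries to get four in a row even when only three free fields are left. It always works through the rows in order and blocks only when it has to. Unfortunately, it does not always block either.

Where is the problem hiding? What did I forget?

How can I integrate random moves for the AI when nobody wins or loses on the next move?

Here is my source code: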

minimaxDepth = 4

Function call: Max (minimaxDepth, fieldCopy);

int Max (int depth, int[,] fieldCopy)
{
    int maxValue = -9999;
    int moveValue;
    bool winAI = false;
    bool winHuman = false;
    bool isStoneBelow = false;

    for (int y = 0; y < 6; y++) {
        for (int x = 0; x < 7; x++) {
            if (y > 0) {
                //is a stone under it?
                if (fieldCopy [x, y - 1] == 1 || fieldCopy [x, y - 1] == 2) {
                    isStoneBelow = true;
                } else {
                    isStoneBelow = false;
                }
            } else {
                isStoneBelow = true;
            }

            // possible move?
            if (fieldCopy [x, y] != 1 && fieldCopy [x, y] != 2 && isStoneBelow == true) {   
                isStoneBelow = false;
                fieldCopy [x, y] = 2; //simulate move
                winAI = false;
                winHuman = false;

                //Is there a winner?
                if (CheckWin (x, y, 2, fieldCopy)) {
                    winAI = true;
                    winHuman = false;
                }

                //No more moves possible?
                if (depth <= 1 || winAI == true) { 
                    moveValue = evaluationFunction (winAI, winHuman);       //evaluate the move
                } else {
                    moveValue = Min (depth - 1, fieldCopy);
                }

                fieldCopy [x, y] = 0; //Reset simulated move

                if (moveValue > maxValue) {
                    maxValue = moveValue;
                    if (depth == minimaxDepth) {
                        aiMoveX = x; // next move
                    }
                }
            }
        }
    }
    return maxValue;
}

int Min (int depth, int[,] fieldCopy)
{
    int minValue = 9999;
    int moveValue;
    bool winAI = false;
    bool winHuman = false;
    bool isStoneBelow = false;
    bool loopBreak = false;

    for (int y = 0; y < 6; y++) {
        for (int x = 0; x < 7; x++) {
            if (y > 0) {
                //is a stone under it?
                if (fieldCopy [x, y - 1] == 1 || fieldCopy [x, y - 1] == 2) {
                    isStoneBelow = true;
                } else {
                    isStoneBelow = false;
                }
            } else {
                isStoneBelow = true;
            }

            // possible move?
            if (fieldCopy [x, y] != 1 && fieldCopy [x, y] != 2 && isStoneBelow == true) {   
                isStoneBelow = false;
                fieldCopy [x, y] = 1; //simulate move
                winHuman = false;
                winAI = false;

                //Is there a winner?    
                if (CheckWin (x, y, 1, fieldCopy)) {
                    winHuman = true;
                    winAI = false;
                }

                //No more moves possible?
                if (depth <= 1 || winHuman == true) {  
                    moveValue = evaluationFunction (winAI, winHuman);       //evaluate the move
                } else {
                    moveValue = Max (depth - 1, fieldCopy);
                }

                fieldCopy [x, y] = 0; //Reset simulated move

                if (moveValue < minValue) {
                    minValue = moveValue;
                }
            }
        }
    }
    return minValue;
}


int evaluationFunction (bool winAI, bool winHuman)
{
    if (winAI) { 
        return 1;
    } else if (winHuman) {
        return -1;
    } else {
        return 0;
    }
}

Thanks for your help!

Answer 1

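I believe the problem is in your evaluation function. Instead of evaluating only whether one of the players has won, you should evaluate the state of the game for each player. One possible measure is the length of the longest contiguous chain of that player's pieces. This shows how close each player is to winning, rather than returning the same value for every position, including those in which a player is about to win but has not yet done so.

Here is pseudocode for clarity: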

int evaluationFunction (int depth)
{
    // max_ai_chain and max_human_chain are global variables that 
    // should be updated at each piece placement

    if (depth % 2 == 0) // assuming AI gets calculated on even depth (you mentioned your depth is 4)
    { 
        return max_ai_chain;
    } else {
        return max_human_chain;
    }
}
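One way to obtain the chain lengths is to rescan the board instead of maintaining global counters. The following is only a sketch under assumptions: the board layout (`field[x, y]`, 7 columns by 6 rows, 0 = empty, 1 = human, 2 = AI) is taken from the question, while the names `ChainEvaluator`, `LongestChain` and `Evaluate` are made up for illustration. It also returns the difference of the two chain lengths rather than switching on depth, which is a variation on the pseudocode above, not the answerer's exact proposal.

```csharp
using System;

public static class ChainEvaluator
{
    const int Columns = 7;
    const int Rows = 6;

    // Scan directions: right, up, up-right, down-right.
    static readonly int[,] Directions = { { 1, 0 }, { 0, 1 }, { 1, 1 }, { 1, -1 } };

    // Length of the longest unbroken line of `player` stones on the board.
    public static int LongestChain(int[,] field, int player)
    {
        int best = 0;
        for (int x = 0; x < Columns; x++) {
            for (int y = 0; y < Rows; y++) {
                if (field[x, y] != player) continue;
                for (int d = 0; d < 4; d++) {
                    int dx = Directions[d, 0], dy = Directions[d, 1];
                    int length = 0;
                    int cx = x, cy = y;
                    // Walk along the direction while the stones belong to `player`.
                    while (cx >= 0 && cx < Columns && cy >= 0 && cy < Rows
                           && field[cx, cy] == player) {
                        length++;
                        cx += dx;
                        cy += dy;
                    }
                    if (length > best) best = length;
                }
            }
        }
        return best;
    }

    // Positive when the AI (2) has the longer chain, negative when the human (1) does.
    public static int Evaluate(int[,] field)
    {
        return LongestChain(field, 2) - LongestChain(field, 1);
    }
}
```

Because the score is signed from the AI's point of view, the same value can be used unchanged in both Max and Min.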

Answer 2

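Some time ago I also used Connect 4 as an example for a minimax algorithm and experimented with the trade-off between the evaluation function and the search depth. I was surprised to find that, for this game, you need hardly any search to get fairly reasonable play, as long as the evaluation function uses a strong heuristic. This seemed to perform better than a weak heuristic combined with a deep search.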

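The winning condition of the game is four in a row, so to get there you first need three in a row (vertically, horizontally or diagonally) or a pattern such as O-OO or OO-O (horizontally or diagonally). The more of these "almost four in a row" patterns a player has, the higher the evaluation score should be. Two in a row can contribute a smaller score. Placing your counters in the centre of the board is also advantageous, because more lines of four pass through the centre, so the evaluation function should also reward counters nearer the centre. Other refinements are possible depending on the state of play.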
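These ideas can be sketched by scoring every window of four cells on the board. A window holding three stones of one colour and one empty cell covers OOO-, O-OO and OO-O alike. The weights below (1000, 10, 2 and the centre bonus) are arbitrary assumptions for illustration and would need tuning, and `WindowHeuristic` is a hypothetical name; the board layout follows the question (`field[x, y]`, 0 = empty, 1 = human, 2 = AI).

```csharp
using System;

public static class WindowHeuristic
{
    const int Columns = 7, Rows = 6;
    const int Human = 1, AI = 2;

    // Score one window of four cells from the AI's point of view.
    static int ScoreWindow(int ai, int human)
    {
        if (ai > 0 && human > 0) return 0; // mixed window: nobody can complete it
        if (ai == 4) return 1000;
        if (ai == 3) return 10;
        if (ai == 2) return 2;
        if (human == 4) return -1000;
        if (human == 3) return -10;
        if (human == 2) return -2;
        return 0;
    }

    public static int Evaluate(int[,] field)
    {
        int[,] dirs = { { 1, 0 }, { 0, 1 }, { 1, 1 }, { 1, -1 } };
        int score = 0;
        for (int x = 0; x < Columns; x++) {
            for (int y = 0; y < Rows; y++) {
                // Every window of four that starts at (x, y).
                for (int d = 0; d < 4; d++) {
                    int endX = x + 3 * dirs[d, 0], endY = y + 3 * dirs[d, 1];
                    if (endX >= Columns || endY < 0 || endY >= Rows) continue;
                    int ai = 0, human = 0;
                    for (int i = 0; i < 4; i++) {
                        int cell = field[x + i * dirs[d, 0], y + i * dirs[d, 1]];
                        if (cell == AI) ai++;
                        else if (cell == Human) human++;
                    }
                    score += ScoreWindow(ai, human);
                }
                // Centre bonus: 3 in the middle column, falling to 0 at the edges.
                int centreBonus = 3 - Math.Abs(x - 3);
                if (field[x, y] == AI) score += centreBonus;
                else if (field[x, y] == Human) score -= centreBonus;
            }
        }
        return score;
    }
}
```

Windows that contain stones of both colours score zero because neither player can ever complete them, which keeps the heuristic from rewarding dead lines.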

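One important refinement is that you can force a win if you have two potential winning positions in the same column of the board. You force your opponent to block the first one, and then drop your stone on top of it in the same column to win.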
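A minimal sketch of detecting such a situation follows. It interprets "two potential winning positions in the same column" as two vertically adjacent empty cells that would each complete four in a row, which is an assumption, and it ignores whose turn it is. `ThreatFinder`, `CompletesFour` and `HasStackedThreats` are hypothetical names; the board layout follows the question.

```csharp
public static class ThreatFinder
{
    const int Columns = 7, Rows = 6;

    // True if a `player` stone placed at the empty cell (x, y) would complete a line of four.
    static bool CompletesFour(int[,] field, int x, int y, int player)
    {
        int[,] dirs = { { 1, 0 }, { 0, 1 }, { 1, 1 }, { 1, -1 } };
        for (int d = 0; d < 4; d++) {
            int count = 1; // the stone at (x, y) itself
            // Count matching stones in both directions along the line.
            for (int sign = -1; sign <= 1; sign += 2) {
                int cx = x + sign * dirs[d, 0], cy = y + sign * dirs[d, 1];
                while (cx >= 0 && cx < Columns && cy >= 0 && cy < Rows
                       && field[cx, cy] == player) {
                    count++;
                    cx += sign * dirs[d, 0];
                    cy += sign * dirs[d, 1];
                }
            }
            if (count >= 4) return true;
        }
        return false;
    }

    // True if some column holds two vertically adjacent empty cells
    // that would both win the game for `player`.
    public static bool HasStackedThreats(int[,] field, int player)
    {
        for (int x = 0; x < Columns; x++) {
            for (int y = 0; y + 1 < Rows; y++) {
                if (field[x, y] != 0 || field[x, y + 1] != 0) continue;
                if (CompletesFour(field, x, y, player)
                    && CompletesFour(field, x, y + 1, player)) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

An evaluation function could add a large bonus when `HasStackedThreats` is true for the AI and a large penalty when it is true for the human.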

If your evaluation function encodes these ideas, you should get reasonable play when it is combined with minimax.
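To illustrate the combination, here is a compact depth-limited minimax sketch that calls a heuristic at the leaves. It is not the asker's code: `MinimaxSketch`, `Search`, `DropRow`, `IsWin` and the `WinScore` constant are hypothetical, and the heuristic is passed in as a delegate so that any evaluation function can be plugged in. Adding the remaining depth to the win score is an assumption made here so that faster wins are preferred over slower ones.

```csharp
using System;

public static class MinimaxSketch
{
    const int Columns = 7, Rows = 6;
    const int Human = 1, AI = 2;
    const int WinScore = 100000;

    // Lowest empty row in column x, or -1 if the column is full.
    static int DropRow(int[,] field, int x)
    {
        for (int y = 0; y < Rows; y++)
            if (field[x, y] == 0) return y;
        return -1;
    }

    // True if the stone at (x, y) is part of a line of four of its own colour.
    static bool IsWin(int[,] field, int x, int y)
    {
        int player = field[x, y];
        int[,] dirs = { { 1, 0 }, { 0, 1 }, { 1, 1 }, { 1, -1 } };
        for (int d = 0; d < 4; d++) {
            int count = 1;
            for (int sign = -1; sign <= 1; sign += 2) {
                int cx = x + sign * dirs[d, 0], cy = y + sign * dirs[d, 1];
                while (cx >= 0 && cx < Columns && cy >= 0 && cy < Rows
                       && field[cx, cy] == player) {
                    count++;
                    cx += sign * dirs[d, 0];
                    cy += sign * dirs[d, 1];
                }
            }
            if (count >= 4) return true;
        }
        return false;
    }

    // Score of the position with `player` to move, from the AI's point of view.
    // bestColumn receives the chosen column, or -1 if the board is full.
    public static int Search(int[,] field, int depth, int player,
                             Func<int[,], int> evaluate, out int bestColumn)
    {
        bestColumn = -1;
        bool maximizing = player == AI;
        int best = maximizing ? int.MinValue : int.MaxValue;
        for (int x = 0; x < Columns; x++) {
            int y = DropRow(field, x);
            if (y < 0) continue;
            field[x, y] = player; // simulate the move
            int value;
            if (IsWin(field, x, y)) {
                value = maximizing ? WinScore + depth : -WinScore - depth;
            } else if (depth <= 1) {
                value = evaluate(field); // leaf: fall back to the heuristic
            } else {
                int ignored;
                value = Search(field, depth - 1, maximizing ? Human : AI, evaluate, out ignored);
            }
            field[x, y] = 0; // undo the move
            if (maximizing ? value > best : value < best) {
                best = value;
                bestColumn = x;
            }
        }
        // No legal move: treat the full board as a quiet position.
        return bestColumn < 0 ? evaluate(field) : best;
    }
}
```

A call such as `MinimaxSketch.Search(fieldCopy, 4, 2, myHeuristic, out column)` would then return the column to play, where `myHeuristic` stands for whichever evaluation function you choose.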
