阿尔法贝塔剪枝在井字游戏中无法阻止对手,是评估函数的问题吗?

我已经调试了几天了,我不知道我在这个井字游戏和AI的代码片段中犯了什么错误(好吧,我知道这不是真正的AI,但是…)我选择使用的是阿尔法贝塔剪枝。因为这是7×7的棋盘,单纯的极小极大算法计算量太大。

我遇到的问题是,我无法弄清楚为什么阿尔法贝塔剪枝无法阻止玩家拖延游戏,等待玩家的移动并利用有利于自己的正确移动,或者只是简单地打成平局。

我决定棋盘中心的分数(用于最终得分)会比边缘的分数高。我认为更靠近中心的移动比边缘的移动更有成功的几率,这就是为什么我创建了AddScoreToMove函数来评估那个移动。为了确保评估函数会检查棋盘上的每一个可能的移动,我没有让那个函数只查找第一个xxx(例如在第0行和第0列、第1列、第2列)然后返回(因为可能有4个X或4个O)。另外,4个X或4个O的得分远高于其他情况,并且应该被视为胜利。
此时,我的PC玩家(O)像这样玩enter image description here

有人能告诉我我哪里做错了吗?这是我写的第二个AI程序,第一个是用极小极大算法在3×3棋盘上运行的,运行得很好。

C#代码如下

VB.NET代码:

Public Class Form1    Dim board As Char(,) = {            {" ", " ", " ", " ", " ", " ", " "},            {" ", " ", " ", " ", " ", " ", " "},            {" ", " ", " ", " ", " ", " ", " "},            {" ", " ", " ", " ", " ", " ", " "},            {" ", " ", " ", " ", " ", " ", " "},            {" ", " ", " ", " ", " ", " ", " "},            {" ", " ", " ", " ", " ", " ", " "}}    Class Move        Public row, col As Integer    End Class        Dim BestMoveRow As Integer = 0    Dim BestMoveCol As Integer = 0    Dim BestMoveScore As Integer = 0    Shared player As Char = "X", opponent As Char = "O"    Shared Function AddScoreToMove(thatMove As Move) As Integer        Dim row As Integer = thatMove.row        Dim col As Integer = thatMove.col        '0分,移动在边界上         If ((row >= 1 And row <= 5) And col = 0) Then            Return 0        ElseIf ((row >= 1 And row <= 5) And col = 6) Then            Return 0        ElseIf (row = 0 And (col >= 0 And col <= 6)) Then            Return 0        ElseIf (row = 6 And (col >= 0 And col <= 6)) Then            Return 0        End If        '1分,移动在边界上+1        If ((row >= 2 And row <= 4) And col = 1) Then            Return 1        ElseIf ((row >= 2 And row <= 4) And col = 5) Then            Return 1        ElseIf (row = 1 And (col >= 1 And col <= 5)) Then            Return 1        ElseIf (row = 5 And (col >= 1 And col <= 5)) Then            Return 1        End If        '2分,移动在边界上+2        If (row = 2 And col = 2) Then            Return 2        ElseIf (row = 2 And col = 4) Then            Return 2        ElseIf (row = 2 And (col >= 2 And col <= 4)) Then            Return 2        ElseIf (row = 4 And (col >= 2 And col <= 4)) Then            Return 2        End If        '3分,移动在中心         If (row = 3 And col <= 3) Then            Return 3        End If        Return 0 '错误,未添加路径    End Function    Private Shared Function eval(ByVal b As Char(,)) As Integer        Dim playerScorerow As Integer = 0        Dim playerScorecol As Integer = 0        Dim playerScorecross As Integer = 0        Dim pcScorerow As Integer = 0        Dim pcScorecol As Integer = 0        Dim pcScorecross As Integer = 0                ''评估行         For row As Integer = 0 To 3            For col As Integer = 0 To 6            '初始化要评估的移动                Dim move3 As New Move With {                                        .row = row + 3,                                        .col = col                                    }                Dim move2 As New Move With {                                        .row = row + 2,                                        .col = col                                    }                Dim move1 As New Move With {                                        .row = row + 1,                                        .col = col                                    }                Dim move0 As New Move With {                                        .row = row,                                        .col = col                                    }                                                    If Not b(row, col) = " " Then '不是空的 - 玩家或电脑在这里移动                    Dim moveScore As Integer = AddScoreToMove(move0) '评估那个移动                     If b(row, col) = b(row + 1, col) Then '有2个X或2个O                        Dim move1Score As Integer = AddScoreToMove(move1)                        If b(row + 1, col) = b(row + 2, col) Then '有3个X或3个O                            Dim move2Score As Integer = AddScoreToMove(move2)                            If b(row + 2, col) = b(row + 3, col) Then '有4个X或4个O                                Dim move3Score As Integer = AddScoreToMove(move3)                                If b(row, col) = player Then '玩家在这里有4个X                                    playerScorerow = Math.Max(playerScorerow, 100 + move3Score + move2Score + move1Score + moveScore) '获取所有评估中最高的                                ElseIf b(row, col) = opponent Then '电脑在这里有4个O                                    pcScorerow = Math.Min(pcScorerow, -100 - move3Score - move2Score - move1Score - moveScore)                                End If                            End If                            If b(row, col) = player Then                                playerScorerow = Math.Max(playerScorerow, 5 + move2Score + move1Score + moveScore)                            ElseIf b(row, col) = opponent Then                                pcScorerow = Math.Min(pcScorerow, -5 - move2Score - move1Score - moveScore)                            End If                        End If                        If b(row, col) = player Then                            playerScorerow = Math.Max(playerScorerow, 2 + move1Score + moveScore)                        ElseIf b(row, col) = opponent Then                            pcScorerow = Math.Min(pcScorerow, -2 - move1Score - moveScore)                        End If                    End If                    If b(row, col) = player Then                        playerScorerow = Math.Max(playerScorerow, moveScore)                    ElseIf b(row, col) = opponent Then                        pcScorerow = Math.Min(pcScorerow, -moveScore)                    End If                End If            Next        Next        ''列胜利        For row As Integer = 0 To 6            For col As Integer = 0 To 3                Dim move3 As New Move With {                                        .row = row + 3,                                        .col = col                                    }                Dim move2 As New Move With {                                        .row = row + 2,                                        .col = col                                    }                Dim move1 As New Move With {                                        .row = row + 1,                                        .col = col                                    }                Dim move0 As New Move With {                                        .row = row,                                        .col = col                                    }                If Not b(row, col) = " " Then                    Dim moveScore As Integer = AddScoreToMove(move0)                    If b(row, col) = b(row, col + 1) Then                        Dim moveScore1 As Integer = AddScoreToMove(move1)                        If b(row, col + 1) = b(row, col + 2) Then                            Dim moveScore2 As Integer = AddScoreToMove(move2)                            If b(row, col + 2) = b(row, col + 3) Then                                Dim moveScore3 As Integer = AddScoreToMove(move3)                                If b(row, col) = player Then                                    playerScorerow = Math.Max(playerScorerow, 100 + moveScore3 + moveScore2 + moveScore1 + moveScore)                                ElseIf b(row, col) = opponent Then                                    pcScorerow = Math.Min(pcScorerow, -100 - moveScore3 - moveScore2 - moveScore1 - moveScore)                                End If                            End If                            If b(row, col) = player Then                                playerScorerow = Math.Max(playerScorerow, 5 + moveScore2 + moveScore1 + moveScore)                            ElseIf b(row, col) = opponent Then                                pcScorerow = Math.Min(pcScorerow, -5 - moveScore2 - moveScore1 - moveScore)                            End If                        End If                        If b(row, col) = player Then                            playerScorerow = Math.Max(playerScorerow, 2 + moveScore1 + moveScore)                        ElseIf b(row, col) = opponent Then                            pcScorerow = Math.Min(pcScorerow, -2 - moveScore1 - moveScore)                        End If                    End If                    If b(row, col) = player Then                        playerScorerow = Math.Max(playerScorerow, moveScore)                    ElseIf b(row, col) = opponent Then                        pcScorerow = Math.Min(pcScorerow, -moveScore)                    End If                End If            Next        Next        '未完全实现        '斜线胜利        For row As Integer = 0 To 3            For col As Integer = 0 To 3                If Not b(row, col) = " " Then                    If (b(row, col) = b(row + 1, col + 1) AndAlso b(row + 1, col + 1) = b(row + 2, col + 2) AndAlso b(row + 2, col + 2) = b(row + 3, col + 3)) Then                        If b(row, col) = player Then                            Return +10                        ElseIf b(row, col) = opponent Then                            Return -10                        End If                    End If                End If            Next        Next        '未完全实现        '斜线胜利        For row As Integer = 0 To 3            For col As Integer = 3 To 6                If Not b(row, col) = " " Then                    If (b(row, col) = b(row + 1, col - 1) AndAlso b(row + 1, col - 1) = b(row + 2, col - 2) AndAlso b(row + 2, col - 2) = b(row + 3, col - 3)) Then                        If b(row, col) = player Then                            Return +10                        ElseIf b(row, col) = opponent Then                            Return -10                        End If                    End If                End If            Next        Next        Dim scoreValues() As Integer = {playerScorerow, playerScorecol, playerScorecross, pcScorerow, pcScorecol, pcScorecross}        Dim max = scoreValues.OrderByDescending(Function(z) Math.Abs(z)).FirstOrDefault()                Return max            End Function    Private Shared Function MiniMax(ByVal board As Char(,), ByVal machineMove As Boolean, ByVal depth As Integer) As Integer        Const alpha As Integer = -10_000        Const beta As Integer = 10_000        Return AlphaBetaPruning(board, machineMove, depth, beta, alpha)    End Function    Private Shared Function AlphaBetaPruning(ByVal board As Char(,), ByVal machineMove As Boolean, ByVal depth As Integer, ByVal beta As Integer, ByVal alpha As Integer) As Integer            If depth = 0 Then Return eval(board)        If machineMove Then '电脑移动            For i As Integer = 0 To 6                For j As Integer = 0 To 6                    If board(i, j) = " " Then                        board(i, j) = opponent                        Dim score As Integer = Math.Min(AlphaBetaPruning(board, Not machineMove, depth - 1, beta, alpha), eval(board))                        board(i, j) = " "                        If score < beta Then                            Form1.BestMoveRow = i                            Form1.BestMoveCol = j                            Form1.BestMoveScore = score                            beta = score                        End If                        If alpha >= beta Then Exit For '剪枝                    End If                Next            Next            Return beta        Else '玩家移动            For i As Integer = 0 To 6                For j As Integer = 0 To 6                    If board(i, j) = " " Then                        board(i, j) = player                        Dim score As Integer = Math.Max(AlphaBetaPruning(board, Not machineMove, depth - 1, beta, alpha), eval(board))                        board(i, j) = " "                        If score > alpha Then                            alpha = score                        End If                        If alpha >= beta Then Exit For                    End If                Next            Next            Return alpha        End If    End FunctionEnd Class

C#代码

public class Form1{    private char[,] board = new[] { { " ", " ", " ", " ", " ", " ", " " }, { " ", " ", " ", " ", " ", " ", " " }, { " ", " ", " ", " ", " ", " ", " " }, { " ", " ", " ", " ", " ", " ", " " }, { " ", " ", " ", " ", " ", " ", " " }, { " ", " ", " ", " ", " ", " ", " " }, { " ", " ", " ", " ", " ", " ", " " } };    class Move    {        public int row, col;    }    private int BestMoveRow = 0;    private int BestMoveCol = 0;    private int BestMoveScore = 0;    private static char player = "X";    private static char opponent = "O";    public static int AddScoreToMove(Move thatMove)    {        int row = thatMove.row;        int col = thatMove.col;        // 0分,移动在边界上         if (((row >= 1 & row <= 5) & col == 0))            return 0;        else if (((row >= 1 & row <= 5) & col == 6))            return 0;        else if ((row == 0 & (col >= 0 & col <= 6)))            return 0;        else if ((row == 6 & (col >= 0 & col <= 6)))            return 0;        // 1分,移动在边界上+1        if (((row >= 2 & row <= 4) & col == 1))            return 1;        else if (((row >= 2 & row <= 4) & col == 5))            return 1;        else if ((row == 1 & (col >= 1 & col <= 5)))            return 1;        else if ((row == 5 & (col >= 1 & col <= 5)))            return 1;        // 2分,移动在边界上+2        if ((row == 2 & col == 2))            return 2;        else if ((row == 2 & col == 4))            return 2;        else if ((row == 2 & (col >= 2 & col <= 4)))            return 2;        else if ((row == 4 & (col >= 2 & col <= 4)))            return 2;        // 3分,移动在中心         if ((row == 3 & col <= 3))            return 3;        return 0; // 错误,未添加路径    }    private static int eval(char[,] b)    {        int playerScorerow = 0;        int playerScorecol = 0;        int playerScorecross = 0;        int pcScorerow = 0;        int pcScorecol = 0;        int pcScorecross = 0;        // 评估行         for (int row = 0; row <= 3; row++)        {            for (int col = 0; col <= 6; col++)            {                // 初始化要评估的移动                Move move3 = new Move()                {                    row = row + 3,                    col = col                };                Move move2 = new Move()                {                    row = row + 2,                    col = col                };                Move move1 = new Move()                {                    row = row + 1,                    col = col                };                Move move0 = new Move()                {                    row = row,                    col = col                };                if (!b[row, col] == " ")                {                    int moveScore = AddScoreToMove(move0); // 评估那个移动                     if (b[row, col] == b[row + 1, col])                    {                        int move1Score = AddScoreToMove(move1);                        if (b[row + 1, col] == b[row + 2, col])                        {                            int move2Score = AddScoreToMove(move2);                            if (b[row + 2, col] == b[row + 3, col])                            {                                int move3Score = AddScoreToMove(move3);                                if (b[row, col] == player)                                    playerScorerow = Math.Max(playerScorerow, 100 + move3Score + move2Score + move1Score + moveScore); // 获取所有评估中最高的                                else if (b[row, col] == opponent)                                    pcScorerow = Math.Min(pcScorerow, -100 - move3Score - move2Score - move1Score - moveScore);                            }                            if (b[row, col] == player)                                playerScorerow = Math.Max(playerScorerow, 5 + move2Score + move1Score + moveScore);                            else if (b[row, col] == opponent)                                pcScorerow = Math.Min(pcScorerow, -5 - move2Score - move1Score - moveScore);                        }                        if (b[row, col] == player)                            playerScorerow = Math.Max(playerScorerow, 2 + move1Score + moveScore);                        else if (b[row, col] == opponent)                            pcScorerow = Math.Min(pcScorerow, -2 - move1Score - moveScore);                    }                    if (b[row, col] == player)                        playerScorerow = Math.Max(playerScorerow, moveScore);                    else if (b[row, col] == opponent)                        pcScorerow = Math.Min(pcScorerow, -moveScore);                }            }        }        // 列胜利        for (int row = 0; row <= 6; row++)        {            for (int col = 0; col <= 3; col++)            {                Move move3 = new Move()                {                    row = row + 3,                    col = col                };                Move move2 = new Move()                {                    row = row + 2,                    col = col                };                Move move1 = new Move()                {                    row = row + 1,                    col = col                };                Move move0 = new Move()                {                    row = row,                    col = col                };                if (!b[row, col] == " ")                {                    int moveScore = AddScoreToMove(move0);                    if (b[row, col] == b[row, col + 1])                    {                        int moveScore1 = AddScoreToMove(move1);                        if (b[row, col + 1] == b[row, col + 2])                        {                            int moveScore2 = AddScoreToMove(move2);                            if (b[row, col + 2] == b[row, col + 3])                            {                                int moveScore3 = AddScoreToMove(move3);                                if (b[row, col] == player)                                    playerScorerow = Math.Max(playerScorerow, 100 + moveScore3 + moveScore2 + moveScore1 + moveScore);                                else if (b[row, col] == opponent)                                    pcScorerow = Math.Min(pcScorerow, -100 - moveScore3 - moveScore2 - moveScore1 - moveScore);                            }                            if (b[row, col] == player)                                playerScorerow = Math.Max(playerScorerow, 5 + moveScore2 + moveScore1 + moveScore);                            else if (b[row, col] == opponent)                                pcScorerow = Math.Min(pcScorerow, -5 - moveScore2 - moveScore1 - moveScore);                        }                        if (b[row, col] == player)                            playerScorerow = Math.Max(playerScorerow, 2 + moveScore1 + moveScore);                        else if (b[row, col] == opponent)                            pcScorerow = Math.Min(pcScorerow, -2 - moveScore1 - moveScore);                    }                    if (b[row, col] == player)                        playerScorerow = Math.Max(playerScorerow, moveScore);                    else if (b[row, col] == opponent)                        pcScorerow = Math.Min(pcScorerow, -moveScore);                }            }        }        // 未完全实现        // 斜线胜利        for (int row = 0; row <= 3; row++)        {            for (int col = 0; col <= 3; col++)            {                if (!b[row, col] == " ")                {                    if ((b[row, col] == b[row + 1, col + 1] && b[row + 1, col + 1] == b[row + 2, col + 2] && b[row + 2, col + 2] == b[row + 3, col + 3]))                    {                        if (b[row, col] == player)                            return +10;                        else if (b[row, col] == opponent)                            return -10;                    }                }            }        }        // 未完全实现        // 斜线胜利        for (int row = 0; row <= 3; row++)        {            for (int col = 3; col <= 6; col++)            {                if (!b[row, col] == " ")                {                    if ((b[row, col] == b[row + 1, col - 1] && b[row + 1, col - 1] == b[row + 2, col - 2] && b[row + 2, col - 2] == b[row + 3, col - 3]))                    {                        if (b[row, col] == player)                            return +10;                        else if (b[row, col] == opponent)                            return -10;                    }                }            }        }        int[] scoreValues = new[] { playerScorerow, playerScorecol, playerScorecross, pcScorerow, pcScorecol, pcScorecross };        var max = scoreValues.OrderByDescending(z => Math.Abs(z)).FirstOrDefault();        return max;    }    private static int MiniMax(char[,] board, bool machineMove, int depth)    {        const int alpha = -10_000;        const int beta = 10_000;        return AlphaBetaPruning(board, machineMove, depth, beta, alpha);    }    private static int AlphaBetaPruning(char[,] board, bool machineMove, int depth, int beta, int alpha)    {        if (depth == 0)            return eval(board);        if (machineMove)        {            for (int i = 0; i <= 6; i++)            {                for (int j = 0; j <= 6; j++)                {                    if (board[i, j] == " ")                    {                        board[i, j] = opponent;                        int score = Math.Min(AlphaBetaPruning(board, !machineMove, depth - 1, beta, alpha), eval(board));                        board[i, j] = " ";                        if (score < beta)                        {                            Form1.BestMoveRow = i;                            Form1.BestMoveCol = j;                            Form1.BestMoveScore = score;                            beta = score;                        }                        if (alpha >= beta)                            break; // 剪枝                    }                }            }            return beta;        }        else        {            for (int i = 0; i <= 6; i++)            {                for (int j = 0; j <= 6; j++)                {                    if (board[i, j] == " ")                    {                        board[i, j] = player;                        int score = Math.Max(AlphaBetaPruning(board, !machineMove, depth - 1, beta, alpha), eval(board));                        board[i, j] = " ";                        if (score > alpha)                            alpha = score;                        if (alpha >= beta)                            break;                    }                }            }            return alpha;        }    }}

回答:

经过又一天的调试,我有点迷失了,但我找到了解决这个问题的办法,这可能是一个真正的解决方案。阿尔法贝塔剪枝只关注赢棋。这实际上与评估函数有关,如果我们考虑更多因素来评估函数,那么这个函数就会更好。这就是为什么我们有

  1. 基本赢棋因素 -> 这是评估赢棋移动的
  2. 阻挡对手因素 -> 用来拖延游戏
  3. 还有一个我尚未实现的分叉因素。相关信息在这里:https://en.wikipedia.org/wiki/Tic-tac-toe#Strategy以及来自https://stackoverflow.com/a/19406575/9283016

简而言之 – 只有简单赢棋评估的简单阿尔法贝塔函数不考虑游戏时间。我们需要编写合适的阻挡和分叉函数。

Private Shared Function evalBlock(ByVal b As Char(,)) As Move    Dim blockingMove As New Move With {         .row = -1,        .col = -1        }    ''行阻挡    For row As Integer = 0 To 4        For col As Integer = 0 To 6            If Not b(row, col) = " " Then                If b(row, col) = b(row + 1, col) Then '2个X或2个O                    If b(row, col) = player Then                        If b(row + 2, col) = " " Then                            blockingMove.row = row + 2                            blockingMove.col = col                            Return blockingMove                        End If                        If row > 0 Then                            If b(row - 1, col) = " " Then                                blockingMove.row = row - 1                                blockingMove.col = col                                Return blockingMove                            End If                        End If                    End If                End If            End If        Next    Next    ''列阻挡    For row As Integer = 0 To 6        For col As Integer = 0 To 4            If Not b(row, col) = " " Then                If b(row, col) = b(row, col + 1) Then '2个X或2个O                    If b(row, col) = player Then                        If b(row, col + 2) = " " Then                            blockingMove.row = row                            blockingMove.col = col + 2                            Return blockingMove                        End If                        If col > 1 Then                            If b(row, col - 1) = " " Then                                blockingMove.row = row                                blockingMove.col = col - 1                                Return blockingMove                            End If                        End If                    End If                End If            End If        Next    Next    '\ 斜线阻挡    For row As Integer = 0 To 4        For col As Integer = 0 To 4            If Not b(row, col) = " " Then                If (b(row, col) = b(row + 1, col + 1)) Then                    If b(row, col) = player Then                        If b(row + 2, col + 2) = " " Then                            blockingMove.row = row + 2                            blockingMove.col = col + 2                        End If                        If (row > 0 And col > 0) Then                            If b(row - 1, col - 1) = " " Then                                blockingMove.row = row - 1                                blockingMove.col = col - 1                            End If                        End If                    End If                End If            End If        Next    Next    '/ 斜线阻挡    For row As Integer = 0 To 4        For col As Integer = 2 To 6            If Not b(row, col) = " " Then                If (b(row, col) = b(row + 1, col - 1)) Then                    If b(row, col) = player Then                        If b(row, col) = " " Then                            blockingMove.row = row + 2                            blockingMove.col = col - 2                        End If                    End If                End If            End If        Next    Next    blockingMove.row = -1    blockingMove.col = -1    Return blockingMoveEnd Function

当然,返回阻挡移动并不会起作用。在阿尔法贝塔函数中,我们必须为此分配适当的值。所以我发现我们的AI会在玩家做出2个赢棋移动后阻挡他,而且阻挡比赢棋更优先。另外,在玩家移动2次后阻挡比在3次后阻挡更有意义,因为如果我们使用4个移动获胜的规则,可能会太晚了。

像这样:

                Dim score As Integer = Math.Min(AlphaBetaPruning(board, Not machineMove, depth - 1, beta, alpha), eval(board) + depth)                Dim BlockingMove As Move = evalBlock(board)                If BlockingMove.row <> -1 Then                    Dim blockingMoveScore As Integer = 5000                    If score < blockingMoveScore Then                        Form1.BestMoveRow = BlockingMove.row                        Form1.BestMoveCol = BlockingMove.col                        Form1.BestMoveScore = score                    End If                End If

这就是我在与我的AI的第一场比赛中输掉的方式。enter image description here

Related Posts

L1-L2正则化的不同系数

我想对网络的权重同时应用L1和L2正则化。然而,我找不…

使用scikit-learn的无监督方法将列表分类成不同组别,有没有办法?

我有一系列实例,每个实例都有一份列表,代表它所遵循的不…

f1_score metric in lightgbm

我想使用自定义指标f1_score来训练一个lgb模型…

通过相关系数矩阵进行特征选择

我在测试不同的算法时,如逻辑回归、高斯朴素贝叶斯、随机…

可以将机器学习库用于流式输入和输出吗?

已关闭。此问题需要更加聚焦。目前不接受回答。 想要改进…

在TensorFlow中,queue.dequeue_up_to()方法的用途是什么?

我对这个方法感到非常困惑,特别是当我发现这个令人费解的…

发表回复

您的邮箱地址不会被公开。 必填项已用 * 标注