What am I doing wrong with this AI?

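I am creating a very naive AI (it maybe shouldn't even be called an AI, since it just tries out a lot of possibilities and picks the best one for itself) for a board game I am making. The goal is to cut down the amount of manual testing I need to do to balance the game.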

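The AI plays alone and does the following: in each turn, the AI, playing one of the heroes, attacks one of the up to 9 monsters on the battlefield. Its goal is to finish the battle as fast as possible (in the fewest turns) and with the fewest monster activations.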

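To achieve this, I've implemented a think-ahead algorithm for the AI: instead of performing the best possible move at the moment, it selects a move based on the possible outcomes of the future moves of the other heroes. This is the code snippet where it does this, written in PHP: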

    /**
     * Perform think ahead moves
     *
     * @param int         $thinkAheadLeft      (the number of think ahead moves left)
     * @param int         $innerIterator       (the iterator for the move)
     * @param array       $performedMoves      (the moves performed so far)
     * @param Battlefield $originalBattlefield (the previous state of the Battlefield)
     */
    public function performThinkAheadMoves($thinkAheadLeft, $innerIterator, $performedMoves, $originalBattlefield, $tabs) {
        if ($thinkAheadLeft == 0) return $this->quantify($originalBattlefield);

        $nextThinkAhead = $thinkAheadLeft-1;
        $moves = $this->getPossibleHeroMoves($innerIterator, $performedMoves);
        $Hero = $this->getHero($innerIterator);
        $innerIterator++;
        $nextInnerIterator = $innerIterator;

        foreach ($moves as $moveid => $move) {
            $performedUpFar = $performedMoves;
            $performedUpFar[] = $move;
            $attack = $Hero->getAttack($move['attackid']);
            $monsters = array();
            foreach ($move['targets'] as $monsterid) $monsters[] = $originalBattlefield->getMonster($monsterid)->getName();

            if (self::$debug) echo $tabs . "Testing sub move of " . $Hero->Name. ": $moveid of " . count($moves) . "  (Think Ahead: $thinkAheadLeft | InnerIterator: $innerIterator)\n";

            $moves[$moveid]['battlefield']['after']->performMove($move);

            if (!$moves[$moveid]['battlefield']['after']->isBattleFinished()) {
                if ($innerIterator == count($this->Heroes)) {
                    $moves[$moveid]['battlefield']['after']->performCleanup();
                    $nextInnerIterator = 0;
                }
                $moves[$moveid]['quantify'] = $moves[$moveid]['battlefield']['after']->performThinkAheadMoves($nextThinkAhead, $nextInnerIterator, $performedUpFar, $originalBattlefield, $tabs."\t", $numberOfCombinations);
            } else $moves[$moveid]['quantify'] = $moves[$moveid]['battlefield']['after']->quantify($originalBattlefield);
        }

        usort($moves, function($a, $b) {
            if ($a['quantify'] === $b['quantify']) return 0;
            else return ($a['quantify'] > $b['quantify']) ? -1 : 1;
        });

        return $moves[0]['quantify'];
    }

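This function recursively checks future moves until the $thinkAheadLeft value runs out or until a solution is found (i.e. all monsters are defeated). When it reaches its exit condition, it quantifies the state of the battlefield compared to $originalBattlefield (the battlefield state before the first move). The calculation is done as follows: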

    /**
     * Quantify the current state of the battlefield
     *
     * @param Battlefield $originalBattlefield (the original battlefield)
     *
     * @return int (returns an integer with the battlefield quantification)
     */
    public function quantify(Battlefield $originalBattlefield) {
        $points = 0;

        foreach ($originalBattlefield->Monsters as $originalMonsterId => $OriginalMonster) {
            $CurrentMonster = $this->getMonster($originalMonsterId);

            $monsterActivated = $CurrentMonster->getActivations() - $OriginalMonster->getActivations();
            $points+=$monsterActivated*($this->quantifications['activations'] + $this->quantifications['activationsPenalty']);

            if ($CurrentMonster->isDead()) $points+=$this->quantifications['monsterKilled']*$CurrentMonster->Priority;
            else {
                $enragePenalty = floor($this->quantifications['activations'] * (($CurrentMonster->Enrage['max'] - $CurrentMonster->Enrage['left'])/$CurrentMonster->Enrage['max']));
                $points+=($OriginalMonster->Health['left'] - $CurrentMonster->Health['left']) * $this->quantifications['health'];
                $points+=(($CurrentMonster->Enrage['max'] - $CurrentMonster->Enrage['left']))*$enragePenalty;
            }
        }

        return $points;
    }

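When quantifying, some things add positive points to the state and some add negative points. Instead of deciding on a move by the points calculated right after the current move, the AI uses the points calculated after the think-ahead portion, and selects a move based on the possible moves of the other heroes.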

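Basically, what the AI is saying is: attacking Monster 1 is not the best option at the moment, but if the other heroes take these particular actions, it will give the best outcome in the long run.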

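After selecting a move, the AI performs a single move with the hero and then repeats the process for the next hero, calculating with one additional move.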
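For illustration, the selection step described above can be sketched on a toy model. This is a minimal sketch and not the code from the question: the state is just an array of monster health values, every move deals a fixed `$damage`, and the names `toyMoves`, `toyScore`, `lookahead` and `chooseMove` are made up for this example.

```php
<?php
// Toy model (hypothetical): a state is an array of monster health values,
// and a move is "hit monster $i for $damage".
function toyMoves(array $health, int $damage): array {
    $moves = [];
    foreach ($health as $i => $hp) {
        if ($hp > 0) {
            $next = $health;
            $next[$i] = max(0, $hp - $damage);
            $moves[$i] = $next; // keyed by target, value is the resulting state
        }
    }
    return $moves;
}

// Toy evaluation (assumption): less total remaining health is better.
function toyScore(array $health): int {
    return -array_sum($health);
}

// Best score reachable with at most $depth further moves.
function lookahead(array $health, int $damage, int $depth): int {
    $moves = toyMoves($health, $damage);
    if ($depth <= 0 || count($moves) === 0) {
        return toyScore($health);
    }
    $best = PHP_INT_MIN;
    foreach ($moves as $next) {
        $best = max($best, lookahead($next, $damage, $depth - 1));
    }
    return $best;
}

// Pick the target whose subtree has the best lookahead score.
function chooseMove(array $health, int $damage, int $depth): ?int {
    $bestMove = null;
    $bestScore = PHP_INT_MIN;
    foreach (toyMoves($health, $damage) as $i => $next) {
        $score = lookahead($next, $damage, $depth - 1);
        if ($score > $bestScore) {
            $bestScore = $score;
            $bestMove = $i;
        }
    }
    return $bestMove;
}
```

The move is chosen by the score of the best state reachable in its subtree, not by the score right after the move itself, which is the same idea as returning `$moves[0]['quantify']` after the `usort` in the snippet above.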

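The problem: I assumed that an AI that thinks 3-4 moves ahead should find a better solution than an AI that only performs the best possible move at the moment. My test cases show otherwise. In some cases an AI that does not use the think-ahead option, i.e. one that only plays the best possible move at the moment, beats an AI that thinks ahead 1 move. Sometimes an AI that thinks ahead 3 moves beats an AI that thinks ahead 4 or 5 moves. Why is this happening? Is my assumption incorrect? If so, why? Am I using the wrong weights? I investigated this by running a test that automatically calculates the weights to use: it tried a range of possible weights and kept the ones with the best outcome (i.e. the ones that yield the fewest turns and/or the fewest activations). The problem described above still persists with those weights.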

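With the current version of my script I am limited to a 5-move think-ahead, because any larger think-ahead number makes the script really slow (with a 5-move think-ahead it finds a solution in roughly 4 minutes, but with a 6-move think-ahead it had not even found the first possible move in 6 hours).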
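The slowdown is what an exhaustive search would be expected to show: if every step offers roughly the same number of moves, the number of end states grows as that number raised to the think-ahead depth. A small sketch of the arithmetic follows. The branching factor of 18 is a made-up figure for illustration (the real one depends on the heroes, attacks and living monsters), and `leafCount` is a hypothetical helper.

```php
<?php
// Number of end states a full search visits when every step offers
// $branching moves and the search goes $depth moves deep.
function leafCount(int $branching, int $depth): int {
    return $branching ** $depth;
}

// Hypothetical branching factor of 18, purely for illustration.
foreach ([3, 4, 5, 6] as $depth) {
    printf("depth %d: %d end states\n", $depth, leafCount(18, $depth));
}
```

Under this assumption each extra move of think-ahead multiplies the work by the branching factor, so the step from 5 to 6 moves costs far more than the step from 4 to 5.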

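How the battle works: a number of heroes (2-4) controlled by the AI, each with a number of different attacks (1-x) that can be used once or multiple times in a battle, attack a number of monsters (1-9). Depending on the values of the attack, the monsters lose health until they die. After each attack, the attacked monsters get enraged if they did not die, and after each of the heroes has performed a move, all monsters get enraged. When monsters reach their enrage limit, they activate.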
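As a rough sketch of these rules (not the actual `Battlefield` class), the enrage bookkeeping could look like the code below. Several details are assumptions, because the question does not spell them out: each enrage step costs one enrage point, a monster activates when its points reach zero, and the counter then resets to its maximum. The function and array key names are hypothetical.

```php
<?php
// Hypothetical monster record used only for this sketch.
function newMonster(int $health, int $enrageMax): array {
    return [
        'health'      => $health,
        'enrageLeft'  => $enrageMax,
        'enrageMax'   => $enrageMax,
        'activations' => 0,
    ];
}

// Assumed rule: one enrage step costs one point; at zero the monster
// activates and its counter resets. Dead monsters are left untouched.
function enrage(array $m): array {
    if ($m['health'] <= 0) {
        return $m;
    }
    $m['enrageLeft']--;
    if ($m['enrageLeft'] <= 0) {
        $m['activations']++;
        $m['enrageLeft'] = $m['enrageMax'];
    }
    return $m;
}

// A hero attack: deal damage, then enrage the target if it survived.
function attack(array $monsters, int $target, int $damage): array {
    $monsters[$target]['health'] = max(0, $monsters[$target]['health'] - $damage);
    $monsters[$target] = enrage($monsters[$target]);
    return $monsters;
}

// After every hero has moved, all living monsters get enraged.
function endOfRound(array $monsters): array {
    return array_map('enrage', $monsters);
}
```

Under these assumed rules, a monster that is attacked but not killed moves towards activation twice in that round: once from the attack and once from the end-of-round step.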

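Disclaimer: I know PHP is not the language for this kind of operation, but since this is only an in-house project, I preferred to sacrifice speed in order to code this as quickly as possible in the language I am familiar with.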

Update: the quantification values we currently use look like this:

    $Battlefield->setQuantification(array(
        'health'             =>  16,
        'monsterKilled'      =>  86,
        'activations'        => -46,
        'activationsPenalty' => -10
    ));

Answer:

If there is randomness in your game, then anything can happen. I am pointing that out because it is not clear from the material you have posted here.

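If there is no randomness and the actors can see the full state of the game, then a longer think-ahead absolutely should perform better. When it does not, that is a clear indication that your evaluation function is giving incorrect estimates of the value of a state.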

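Looking at your code, the values of your quantification are not listed, and in your simulation it looks like you just have the same player make moves repeatedly without considering the possible actions of the other actors. You need to run a full simulation, step by step, in order to produce accurate future states. You also need to look at the value estimates for the various states to see whether you agree with them, and adjust your quantification accordingly.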

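An alternative way to frame the problem of estimating value is to explicitly predict your chances of winning the round on a scale of 0.0 to 1.0, and then choose the move that gives you the highest chance of winning. Calculating the damage done and the number of monsters killed so far does not tell you much about how much you still have left to do in order to win the game.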
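One possible way to fold "how much is left to do" into a number between 0.0 and 1.0 is sketched below. It is a stand-in heuristic, not a real win probability, and everything in it is an assumption: the average damage per move, the cost assigned to an activation, and the function names `estimateTurnsLeft` and `progressScore` are all invented for the example.

```php
<?php
// Rough estimate of the hero moves still needed, assuming every move
// deals a hypothetical $avgDamage to a single monster.
function estimateTurnsLeft(array $healthLeft, int $avgDamage): int {
    $turns = 0;
    foreach ($healthLeft as $hp) {
        if ($hp > 0) {
            $turns += intdiv($hp + $avgDamage - 1, $avgDamage); // ceil($hp / $avgDamage)
        }
    }
    return $turns;
}

// Score in (0.0, 1.0]: 1.0 means the battle is over at no cost, and the
// score drops as turns used, estimated turns left and activations grow.
function progressScore(
    array $healthLeft,
    int $avgDamage,
    int $turnsUsed,
    int $activations,
    float $activationCost = 1.0
): float {
    $cost = $turnsUsed
          + estimateTurnsLeft($healthLeft, $avgDamage)
          + $activationCost * $activations;
    return 1.0 / (1.0 + $cost);
}
```

Because the score counts both the turns already spent and the estimated turns remaining, states reached at different think-ahead depths are compared on the same scale, which a "damage done so far" total does not do.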
