I am trying to build a neural network that plays Snake. Here is the training code:
    def train(self):
        self.build_model()
        for episode in range(self.max_episodes):
            self.current_episode = episode
            env = SnakeEnv(self.screen)
            episode_reward = 0
            for timestep in range(self.max_steps):
                env.render(self.screen)
                state = env.get_state()
                action = None
                epsilon = self.current_eps
                if epsilon > random.random():
                    action = np.random.choice(env.action_space)  # explore
                else:
                    values = self.policy_model.predict(env.get_state())  # exploit
                    action = np.argmax(values)
                #print(action)
                experience = env.step(action)
                if(experience['done'] == True):
                    break
                episode_reward += experience['reward']
                if(experience['done'] == True):
                    continue
                if(len(self.memory) < self.memory_size):
                    self.memory.append(Experience(experience['state'], experience['action'], experience['reward'], experience['next_state']))
                else:
                    self.memory[self.push_count % self.memory_size] = Experience(experience['state'], experience['action'], experience['reward'], experience['next_state'])
                self.push_count += 1
                self.decay_epsilon(episode)
                if self.can_sample_memory():
                    memory_sample = self.sample_memory()
                    #q_pred = np.zeros((self.batch_size, 1))
                    #q_target = np.zeros((self.batch_size, 1))
                    #i = 0
                    for memory in memory_sample:
                        memstate = memory.state
                        action = memory.action
                        next_state = memory.next_state
                        reward = memory.reward
                        max_q = reward + self.discount_rate * self.replay_model.predict(next_state)
                        #q_pred[i] = q_value
                        #q_target[i] = max_q
                        #i += 1
                        self.policy_model.fit(memstate, max_q, epochs=1, verbose=0)
                env.render(self.screen)
            print("Episode: ", episode, " Total Reward: ", episode_reward)
            if episode % self.target_update == 0:
                self.replay_model.set_weights(self.policy_model.get_weights())
        pygame.quit()
The screen initialization code looks like this:
    pygame.init()
    self.screen = pygame.display.set_mode((600, 600))
    pygame.display.set_caption("Snake")
The environment rendering code looks like this:
    def render(self, screen):
        screen.fill((0, 0, 0))
        for i in range(20):
            pygame.draw.line(screen, (255, 255, 255), (0, 30*i), (600, 30*i))
            pygame.draw.line(screen, (255, 255, 255), (30*i, 0), (30*i, 600))
        self.food.render()
        self.snake.render()
        pygame.display.flip()
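The render methods for the food and the snake just draw simple squares at the corresponding coordinates. When I run the training code, all I see is a white screen. When I stop the program with Ctrl+C, I briefly see the screen rendered correctly, and then it closes abruptly. How can I get it to render correctly?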
Answer:
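Your code may work on other operating systems, but in general you need to let pygame process your window manager's events by calling pygame.event.get() (or pygame.event.pump()). Otherwise, nothing will be displayed on the screen.
So in your loop you should process the events in the event queue, and handle at least the QUIT event, for example: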
    def render(self, screen):
        ...
        # or create a new function; it's up to you, as long as it runs once per frame
        events = pygame.event.get()
        for e in events:
            if e.type == pygame.QUIT:
                sys.exit()  # needs `import sys`; or exit the program some other way
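You could also do something more sophisticated to separate your training code from your drawing code, such as using callbacks or coroutines, but that is another topic.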
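As a rough illustration of the callback idea, here is a minimal sketch rather than a drop-in replacement for the code in the question. The training loop only knows about an optional on_frame callback, and all pygame work lives in that callback. The names on_frame and make_pygame_frame_callback are hypothetical, the training body is a placeholder, and env.render(screen) is assumed to behave like the render method shown in the question.

```python
from typing import Callable, Optional


def train(max_episodes: int, max_steps: int,
          on_frame: Optional[Callable[[int, int], bool]] = None) -> int:
    """Placeholder training loop; returns the number of steps actually run.

    on_frame (a hypothetical hook) is called once per timestep.
    If it returns False, training stops early.
    """
    steps_run = 0
    for episode in range(max_episodes):
        for timestep in range(max_steps):
            # ... choose an action, step the environment, fit the model ...
            steps_run += 1
            if on_frame is not None and not on_frame(episode, timestep):
                return steps_run
    return steps_run


def make_pygame_frame_callback(env, screen):
    """Hypothetical glue: process window events, then draw one frame."""
    import pygame  # imported here so train() itself does not depend on pygame

    def on_frame(episode: int, timestep: int) -> bool:
        # Draining the event queue keeps the window responsive.
        for event in pygame.event.get():
            if event.type == pygame.QUIT:
                return False  # ask the training loop to stop
        env.render(screen)  # assumed to draw and flip, as in the question
        return True

    return on_frame
```

With this split, something like train(max_episodes, max_steps, make_pygame_frame_callback(env, screen)) would train with a live window, while calling train without a callback would run headless.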