How do I create a new Gym environment in OpenAI?

Asked: 2017-07-12 22:32:21

Tags: machine-learning artificial-intelligence openai-gym

My assignment is to make an AI agent that learns to play a video game using ML. I want to create a new environment with OpenAI Gym because I don't want to use an existing one. How can I create a new, custom environment?

Also, is there any other way I can start developing an AI agent to play a specific video game without the help of OpenAI Gym?

2 Answers:

Answer 0 (score: 71)

See my banana-gym for an extremely small example environment.

Creating a new environment

See the main page of the repository:

https://github.com/openai/gym/blob/master/docs/creating-environments.md

The steps are:

  1. Create a new repository with a PIP-package structure.
  2. It should look like this:

    gym-foo/
      README.md
      setup.py
      gym_foo/
        __init__.py
        envs/
          __init__.py
          foo_env.py
          foo_extrahard_env.py
    

    For the contents of those files, follow the link above. One detail that is not mentioned there is what some of the functions in foo_env.py should look like. Looking at existing examples and at gym.openai.com/docs/ helps. Here is an example:

    class FooEnv(gym.Env):
        metadata = {'render.modes': ['human']}
    
        def __init__(self):
            pass
    
        def _step(self, action):
            """
    
            Parameters
            ----------
            action :
    
            Returns
            -------
            ob, reward, episode_over, info : tuple
                ob (object) :
                    an environment-specific object representing your observation of
                    the environment.
                reward (float) :
                    amount of reward achieved by the previous action. The scale
                    varies between environments, but the goal is always to increase
                    your total reward.
                episode_over (bool) :
                    whether it's time to reset the environment again. Most (but not
                    all) tasks are divided up into well-defined episodes, and done
                    being True indicates the episode has terminated. (For example,
                    perhaps the pole tipped too far, or you lost your last life.)
                info (dict) :
                     diagnostic information useful for debugging. It can sometimes
                     be useful for learning (for example, it might contain the raw
                     probabilities behind the environment's last state change).
                     However, official evaluations of your agent are not allowed to
                     use this for learning.
            """
            # Note: self.env, self.status and hfo_py below are placeholders
            # copied from the gym-soccer environment; replace them with your
            # own game logic.
            self._take_action(action)
            self.status = self.env.step()
            reward = self._get_reward()
            ob = self.env.getState()
            episode_over = self.status != hfo_py.IN_GAME
            return ob, reward, episode_over, {}
    
        def _reset(self):
            pass
    
        def _render(self, mode='human', close=False):
            pass
    
        def _take_action(self, action):
            pass
    
        def _get_reward(self):
            """ Reward is given for XY. """
            # FOOBAR, ABC and self.somestate are placeholders for your own
            # game's states; define them in your environment.
            if self.status == FOOBAR:
                return 1
            elif self.status == ABC:
                return self.somestate ** 2
            else:
                return 0
    
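    The step contract above can be exercised without Gym installed. Below is a standalone toy environment (everything in it, including the name `CountUpEnv` and the reward scheme, is made up for illustration) that returns the same `(ob, reward, episode_over, info)` tuple:

```python
# A standalone toy environment (no gym dependency) that follows the same
# (ob, reward, episode_over, info) step contract as the sketch above.

class CountUpEnv:
    """Agent picks 0 or 1; reward 1 for action 1; episode ends after 5 steps."""

    def __init__(self):
        self.steps = 0

    def reset(self):
        self.steps = 0
        return self.steps  # initial observation

    def step(self, action):
        self.steps += 1
        ob = self.steps
        reward = 1.0 if action == 1 else 0.0
        episode_over = self.steps >= 5
        return ob, reward, episode_over, {}


env = CountUpEnv()
ob = env.reset()
total_reward = 0.0
done = False
while not done:
    action = 1  # a trivial "policy": always pick action 1
    ob, reward, done, info = env.step(action)
    total_reward += reward

print(total_reward)  # 5.0: five steps, each rewarded 1.0
```

    Writing the loop this way first makes it easy to see what your real `_step` has to provide before you wire it into Gym.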

    Using your environment

    import gym
    import gym_foo
    env = gym.make('MyEnv-v0')  # the id you registered for your environment
    
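    For `gym.make('MyEnv-v0')` to succeed, the id has to be registered when the package is imported. A minimal sketch of what the two `__init__.py` files might contain (the id `MyEnv-v0` and the class name `FooEnv` are assumptions matching the example above):

```python
# gym_foo/__init__.py
from gym.envs.registration import register

register(
    id='MyEnv-v0',                      # the id you pass to gym.make()
    entry_point='gym_foo.envs:FooEnv',  # module path to your environment class
)

# gym_foo/envs/__init__.py
from gym_foo.envs.foo_env import FooEnv
```

    This is package wiring, not a runnable script on its own; it takes effect when `import gym_foo` runs.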

    Examples

    1. https://github.com/openai/gym-soccer
    2. https://github.com/openai/gym-wikinav
    3. https://github.com/alibaba/gym-starcraft
    4. https://github.com/endgameinc/gym-malware
    5. https://github.com/hackthemarket/gym-trading
    6. https://github.com/tambetm/gym-minecraft
    7. https://github.com/ppaquette/gym-doom
    8. https://github.com/ppaquette/gym-super-mario
    9. https://github.com/tuzzer/gym-maze

Answer 1 (score: 14)

It is definitely possible. They say so on the documentation page, near the end.

https://gym.openai.com/docs

As for how to do it, you should look at the source code of the existing environments for inspiration. It is available on GitHub:

https://github.com/openai/gym#installation

Most of their environments are not implemented from scratch; instead, they create a wrapper around an existing environment and give it an interface that is convenient for reinforcement learning.

If you want to make your own, you should work in this direction and try to adapt something that already exists to the Gym interface. Although there is a good chance this will be very time-consuming.
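As a sketch of that wrapping approach: the `LegacyGame` class below stands in for whatever game engine you already have (its methods are made up, not a real API), and the adapter translates it into Gym-style `reset`/`step` calls:

```python
# Wrapping an existing game behind a Gym-style reset/step interface.
# LegacyGame is a stand-in for a pre-existing, third-party game engine.

class LegacyGame:
    """Pretend game: pressing 'A' raises the score; the game ends at score 3."""

    def start(self):
        self.score = 0

    def press_button(self, button):
        if button == 'A':
            self.score += 1

    def is_over(self):
        return self.score >= 3


class LegacyGameEnv:
    """Adapter: maps discrete actions onto the game's own methods."""

    BUTTONS = ['A', 'B']  # action 0 -> 'A', action 1 -> 'B'

    def __init__(self):
        self.game = LegacyGame()

    def reset(self):
        self.game.start()
        return self.game.score  # initial observation

    def step(self, action):
        old_score = self.game.score
        self.game.press_button(self.BUTTONS[action])
        ob = self.game.score
        reward = float(self.game.score - old_score)  # reward = score gained
        episode_over = self.game.is_over()
        return ob, reward, episode_over, {}


env = LegacyGameEnv()
ob = env.reset()
done = False
while not done:
    ob, reward, done, info = env.step(0)  # always press 'A'

print(ob)  # 3: the game ends once the score reaches 3
```

Most of the adaptation work in practice is exactly this: deciding how to encode actions, what to expose as the observation, and how to derive a scalar reward from the game's internal state.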

There is one more option that might be useful for you. It is OpenAI's Universe:

https://universe.openai.com/

It can integrate with websites so that you can train your models on Kongregate games, for example. But Universe is not as easy to use as Gym.

If you are a beginner, my recommendation is that you start with a vanilla implementation on a standard environment. Once you get past the basics, move on to harder things...