Note: There is a small bug on line 7 of run_atari.py for a2c. It reads:
from baselines.ppo2.policies import CnnPolicy, LstmPolicy, LnLstmPolicy
but should be:
from baselines.a2c.policies import CnnPolicy, LstmPolicy, LnLstmPolicy
I have trained the A2C agent for several hours on Breakout, but I cannot get good results when I try to use the trained agent.
The CNN is set up to take a batch of observations, but when playing the game I only want to feed a single observation, so I changed ob_shape in a2c/policies.py from
ob_shape = (nbatch, nh, nw, nc)
to
ob_shape = (None, nh, nw, nc)
However, I am afraid this change may cause a problem with the weights being reused by the train_model.
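For what it's worth, here is a minimal sketch of how I understand the weight sharing (TF1-style graph; the layer sizes and scope name are made up, not the actual baselines policy): with a None batch dimension the same network definition accepts either a single frame or a full batch, and reuse=True on the variable scope is what makes the two heads share weights.

import tensorflow as tf

def build_pi(obs_ph, reuse):
    # toy convolutional policy head, only to illustrate variable sharing
    with tf.variable_scope("model", reuse=reuse):
        h = tf.layers.conv2d(obs_ph, 32, 8, strides=4, activation=tf.nn.relu)
        h = tf.layers.flatten(h)
        return tf.layers.dense(h, 4)  # Breakout has 4 discrete actions

step_obs = tf.placeholder(tf.float32, (None, 84, 84, 4))   # one frame at play time
train_obs = tf.placeholder(tf.float32, (None, 84, 84, 4))  # nbatch frames at train time
step_pi = build_pi(step_obs, reuse=False)
train_pi = build_pi(train_obs, reuse=True)  # same variables as step_pi

If that picture is right, using None for the batch dimension should not affect which variables are shared, since sharing is controlled by the variable scope rather than by the placeholder shape.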
I wrote the following function to play the game, but I have never scored more than one point on Breakout.
import gym
import numpy as np
from baselines.common.atari_wrappers import wrap_deepmind

def play_episode(env_name, model, seed):
    env = gym.make(env_name)
    env.seed(seed)
    env = wrap_deepmind(env, frame_stack=True, scale=True)
    obs, states, done = env.reset(), None, False
    episode_rew = 0
    while not done:
        obs = np.reshape(obs, (1, 84, 84, 4))
        # states are only used by the LSTM policies
        action, value, states, _ = model.step(obs, states, done)
        obs, rew, done, _ = env.step(action)
        episode_rew += rew
    env.close()
    return episode_rew
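For reference, this is roughly how I call it after training (assuming model is a trained A2C Model instance; the env id is just the one I trained on):

# assuming `model` is a trained A2C Model and the env id matches training
score = play_episode('BreakoutNoFrameskip-v4', model, seed=0)
print('episode reward:', score)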
What is the proper way to load and play the game to get the reported scores?
I used the step_model to play the game. Is this correct? Or should I be using the train_model?
Also, how are the weights of the step_model updated?