Use gym and atari wrappers instead of chainerrl.envs.ale #253

muupan · 2018-04-06T07:01:45Z

Resolves #251

Compare performance of agents
Compare speed
Apply to all examples under examples/ale

muupan · 2018-04-06T07:07:58Z

atari_wrappers.py in openai/baselines uses FireResetEnv to send FIRE after every reset, but there is some doubt whether DeepMind uses the same trick (see openai/baselines#240). Thus I disabled it by default.
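
For readers unfamiliar with the trick, here is a minimal sketch of what sending FIRE after every reset looks like. It is not the code from openai/baselines or from this PR: `ToyEnv`, `FireOnReset`, `make_env`, the `fire_reset` flag, and the FIRE action index of 1 are all assumptions made up for illustration, and the toy env only stands in for a real Atari env.

```python
class ToyEnv:
    """Hypothetical stand-in for an Atari env; the game starts only after FIRE."""

    FIRE = 1  # assumed action index, for illustration only

    def reset(self):
        self.started = False
        self.actions = []
        return 0  # dummy observation

    def step(self, action):
        self.actions.append(action)
        if action == self.FIRE:
            self.started = True
        reward = 1.0 if self.started else 0.0
        return 0, reward, False, {}


class FireOnReset:
    """Send FIRE once after every reset (sketch of the idea behind FireResetEnv)."""

    def __init__(self, env, fire_action=1):
        self.env = env
        self.fire_action = fire_action

    def reset(self):
        self.env.reset()
        # Press FIRE so that the agent never has to learn to start the game.
        obs, _, _, _ = self.env.step(self.fire_action)
        return obs

    def step(self, action):
        return self.env.step(action)


def make_env(fire_reset=False):
    """Build the toy env; FIRE-on-reset is off unless requested."""
    env = ToyEnv()
    return FireOnReset(env) if fire_reset else env
```

With `fire_reset=False` the agent has to press FIRE itself, which is the default behaviour chosen here.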

muupan · 2018-05-07T12:06:11Z

Below are the elapsed times (sec) for 10000 steps of chainerrl.envs.ale.ALE and of the new atari_wrappers (https://github.com/muupan/chainerrl/blob/atari-wrappers-speed/examples/ale/check_atari_wrappers_speed.py).
Measured on a MacBook Pro (Retina, 15-inch, Mid 2014).
The new atari_wrappers env is about 1.5x faster (roughly 10.4 s vs. 15.6 s per 10000 steps), in both training and test modes.

$ python examples/ale/check_atari_wrappers_speed.py
rom name: pong
env name: PongNoFrameskip-v4
old env
.............
15.631451486144215
.............
15.614390858914703
.............
15.652993701864034
new env
.............
10.40902072796598
.............
10.401655629044399
.............
10.374119991902262
old env (test)
.............
15.4871492870152
.............
15.508433992043138
.............
15.478705200133845
new env (test)
.............
10.310282153077424
.............
10.431455522077158
.............
10.324188519036397
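
The actual script is at the link above. The loop below is only a rough sketch of how such a measurement can be taken, not a copy of that script: `DummyEnv` and `time_steps` are hypothetical names, and the dummy env does no emulation, so its timings say nothing about Atari.

```python
import time


class DummyEnv:
    """Hypothetical env with a reset()/step() interface; does no real work."""

    def reset(self):
        self.t = 0
        return 0  # dummy observation

    def step(self, action):
        self.t += 1
        done = self.t % 100 == 0  # pretend an episode ends every 100 steps
        return 0, 0.0, done, {}


def time_steps(env, n_steps=10000):
    """Return elapsed seconds for n_steps calls to env.step(), resetting on done."""
    env.reset()
    start = time.perf_counter()
    for _ in range(n_steps):
        _, _, done, _ = env.step(0)
        if done:
            env.reset()
    return time.perf_counter() - start


elapsed = time_steps(DummyEnv())
print(elapsed)
```

Running the same function on the old and the new env, several times each, gives numbers comparable to the ones listed above.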

muupan · 2018-05-07T12:22:16Z

A3C (16 processes, 80M steps)

muupan · 2018-05-07T12:23:08Z

CategoricalDQN

muupan · 2018-05-07T12:31:50Z

PPO, DQN, ACER (only with atari_wrappers)

muupan · 2018-05-07T12:36:38Z

Since DeepMind papers limit evaluation episodes to 5 min (= 5 * 60 * 60 // 4 steps; see https://arxiv.org/abs/1507.04296 etc.), I applied the same limit to all the examples. It seems to affect performance, especially on BeamRider.
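
As a worked version of that arithmetic: the sketch below reads the factors as minutes, seconds per minute, emulator frames per second, and a frame skip of 4. That reading, and the names `FPS`, `FRAME_SKIP`, and `minutes_to_steps`, are assumptions for illustration; only the expression itself comes from the comment.

```python
FPS = 60        # assumed: emulator frames per second
FRAME_SKIP = 4  # assumed: emulator frames per agent step


def minutes_to_steps(minutes):
    """Convert a wall-clock game duration into a number of agent steps."""
    return minutes * 60 * FPS // FRAME_SKIP


print(minutes_to_steps(5))   # 5 * 60 * 60 // 4 = 4500
print(minutes_to_steps(30))  # 30 * 60 * 60 // 4 = 27000
```

So a 5-minute evaluation episode corresponds to at most 4500 agent steps.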

toslunar · 2018-05-17T05:39:39Z

It seems to me that the wrappers assume NoFrameskip envs. I'd like to have

chainerrl.envs.ale_unwrapped('pong')  # => gym.make('PongNoFrameskip-v4')

muupan · 2018-05-17T07:11:57Z

> chainerrl.envs.ale_unwrapped('pong')  # => gym.make('PongNoFrameskip-v4')

Don't you think stating that only `*NoFrameskip-v4` is supported is sufficient? The rule to convert 'pong' to 'PongNoFrameskip-v4' may not be simple, e.g., because of underscores in ROM names and version upgrades.

It would be helpful to list the possible env names in the error message if the user specifies an invalid one.
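
To make both points concrete, here is a hedged sketch. `rom_to_env_id` and `check_env_id` are hypothetical helpers, not ChainerRL or gym APIs, and `KNOWN_IDS` is an illustrative subset rather than the real registry. The naive CamelCase rule is an assumption that may not hold for every ROM, which is exactly the concern raised above.

```python
def rom_to_env_id(rom_name, version="v4"):
    """Naively map a ROM name such as 'space_invaders' to a gym env id."""
    camel = "".join(part.capitalize() for part in rom_name.split("_"))
    return "{}NoFrameskip-{}".format(camel, version)


def check_env_id(env_id, known_ids):
    """Return env_id if it is known; otherwise list the valid names in the error."""
    if env_id not in known_ids:
        raise ValueError(
            "Unknown env {!r}. Supported envs: {}".format(
                env_id, ", ".join(sorted(known_ids))))
    return env_id


# Illustrative subset only; a real check would consult the gym registry.
KNOWN_IDS = ["PongNoFrameskip-v4", "BreakoutNoFrameskip-v4"]

print(rom_to_env_id("pong"))
print(rom_to_env_id("space_invaders"))
```

The version suffix is a parameter because a hard-coded rule would break on a version upgrade.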

toslunar

LGTM, except that hacking 1.1 warns about gym.undo_logger_setup().

Add noqa to `gym.undo_logger_setup()`

muupan added 12 commits March 20, 2018 23:39

Copy atari_wrappers.py and comment on the license

b4be931

Fix style

1241675

Support old gym

e41d760

Use atari_wrappers

6ffeec9

Merge branch 'c51' into atari-wrappers

59abb04

Use make_atari as well as wrap_deepmind

85e0356

Turn FireResetEnv off by default

cda70e8

The model learned for Breakout with both FireResetEnv and EpisodicLifeEnv would not generalize to the env without EpisodicLifeEnv.

Add eval_max_episode_len as 5 minutes

8c448f9

which is used in http://arxiv.org/abs/1509.06461

Merge branch 'c51' into atari-wrappers

5b409ad

Merge branch 'c51' into atari-wrappers

6b83f18

Merge branch 'c51' into atari-wrappers

4287965

Merge branch 'c51' into atari-wrappers

eb60d7a

muupan added 17 commits April 16, 2018 18:33

Merge branch 'master' into atari-wrappers

d7e57b3

Add --logging-level to train_a3c_ale.py

5b98f3f

Merge branch 'logger-level-for-a3c' into atari-wrappers-a3c

10911c7

Use atari_wrappers in train_a3c_ale.py

caec04f

Cast to int since np.int64 doesn't work

a8dac37

Set --max-episode-len to 30 min by default

1362086

Merge branch 'master' into atari-wrappers

1bcffb6

Merge branch 'atari-wrappers' into atari-wrappers-a3c

883ec7e

Add --render and --monitor

739b401

Merge branch 'atari-wrappers-a3c' into atari-wrappers

162b8dd

Support both hwc and chw layout

f5e8fce

Add --render --monitor

9502232

Merge branch 'master' into atari-wrappers

cf12f9f

Use save_best_so_far_agent=False

23d7c3a

Use atari_wrappers for PPO

534757a

Set save_best_so_far_agent=False

d616d93

Remove --use-sdl

fba1e33

muupan added 8 commits April 21, 2018 02:51

Use atari_wrappers for ACER

457b3fa

Use atari_wrappers for NSQ

6fb96e7

Use 10 ** 7 steps

7cc2fb4

Add opencv-python as an optional dependency

068bdf0

Use 8 * 10 ** 8 steps again for A3C and NSQ

fa1e359

Merge branch 'master' into atari-wrappers

0dd1ae7

Replace pong with --env option

9e25356

Limit each episode 5min everywhere

c9b3ae1

muupan changed the title ~~[WIP] Use gym and atari wrappers instead of chainerrl.envs.ale~~ Use gym and atari wrappers instead of chainerrl.envs.ale May 7, 2018

muupan mentioned this pull request May 14, 2018

How do we get the rom for ale examples? #268

Closed

muupan mentioned this pull request May 17, 2018

Set ROM name to lowercase as expected in atari_py #269

Closed

toslunar and others added 2 commits May 29, 2018 13:56

Merge branch 'master' into atari-wrappers

3d07043

Add noqa as 731d6dd

493f7c7

toslunar reviewed May 29, 2018

Merge pull request #1 from toslunar/pr253

943869e

Add noqa to `gym.undo_logger_setup()`

toslunar approved these changes May 29, 2018

toslunar merged commit 7dc90e5 into chainer:master May 29, 2018

muupan added enhancement example and removed enhancement labels Jul 23, 2018

muupan added this to the v0.4 milestone Jul 23, 2018
