Mujoco Action Space #604

Closed
RuofanKong opened this issue May 30, 2017 · 3 comments

Comments

@RuofanKong

I was using the MuJoCo environment 'InvertedPendulum-v1' in OpenAI Gym and noticed something odd. I checked the action space bounds with the following code:

import gym

env = gym.make('InvertedPendulum-v1')
print("high: ", env.action_space.high)
print("low: ", env.action_space.low)

The output shows that valid actions lie in the range [-3, 3]. However, an action outside that range still appears to work. For example:

env.reset()  # start an episode before stepping
action = 40  # well outside the reported bounds [-3, 3]
observation, reward, done, info = env.step(action)

Is there a reason for this behavior, and does it apply to all MuJoCo environments?

@DanielTakeshi

@RuofanKong I think MuJoCo internally clips the action to 3 (the maximum) in that case. However, I couldn't find confirmation by browsing the source code. It seems logical, but it would be great if someone could confirm!
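
For illustration only, here is a minimal sketch of what clipping would look like if it were done explicitly on the user's side before calling `env.step`. It does not confirm what MuJoCo or Gym does internally. The helper name `clip_action` is hypothetical, and the bounds are the [-3, 3] values reported above for 'InvertedPendulum-v1'.

```python
def clip_action(action, low, high):
    """Clamp each action component into [low[i], high[i]].

    Hypothetical helper for illustration; not part of Gym or MuJoCo.
    """
    return [min(max(a, lo), hi) for a, lo, hi in zip(action, low, high)]


# Bounds reported in this thread for 'InvertedPendulum-v1'.
low, high = [-3.0], [3.0]

clipped = clip_action([40.0], low, high)
print(clipped)  # [3.0]
```

If the simulator did clip internally, stepping with 40 and stepping with the clipped value 3 should produce the same next observation from the same starting state, which is one way to test the hypothesis empirically.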

@fiberleif

Other MuJoCo tasks, such as "Swimmer-v1" and "Walker2d-v1", also have bounded action spaces, specifically the multi-dimensional cube [-1, 1] in every dimension. I cannot confirm whether the clipping hypothesis is correct here.
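
For a multi-dimensional action, it can help to see which components fall outside the cube. The sketch below assumes the same [-1, 1] bound on every dimension, as described above. The function name and the 6-dimensional sample action are made up for illustration.

```python
def out_of_bounds_indices(action, low=-1.0, high=1.0):
    """Return the indices of components outside [low, high].

    NaN components are also reported, because every comparison
    involving NaN is False.
    """
    return [i for i, a in enumerate(action) if not (low <= a <= high)]


# Hypothetical 6-dimensional action with two violations.
action = [0.2, -1.5, 0.9, 1.0, 2.0, -0.3]
print(out_of_bounds_indices(action))  # [1, 4]
```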

@christopherhesse
Contributor

In another issue, #1442, I proposed a wrapper that checks for invalid obs, rew, and act values and raises an error for the user. There is more discussion in baselines about how the agent should deal with this: openai/baselines#121

