
Giving actions to environments outside the limits still works and may mislead users #1442

Closed
moguzozcan opened this issue Apr 14, 2019 · 2 comments



moguzozcan commented Apr 14, 2019

The action space of the "MountainCarContinuous" environment accepts actions in [-1, 1]. However, there is no check for this in the code. If the user takes an action such as 3, the environment still returns the next state and reward without complaint. This can cause subtle problems for the user, for example if he/she made a mistake during discretization. I think it would be good to add this check to the code base and warn the user.

reward = 0
if done:
    reward = 100.0
reward -= math.pow(action[0], 2) * 0.1
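To illustrate the point, here is a minimal sketch reproducing the reward computation above as a standalone function (the `reward_for` helper is hypothetical, not part of Gym): an out-of-range action like 3 is silently accepted and yields a reward instead of an error.

```python
import math

# Hypothetical helper mirroring the env's reward computation above.
def reward_for(action, done=False):
    reward = 100.0 if done else 0.0
    # Quadratic action-cost penalty, applied regardless of bounds.
    reward -= math.pow(action[0], 2) * 0.1
    return reward

print(reward_for([3.0]))   # action outside [-1, 1], but no error is raised
print(reward_for([0.5]))   # in-range action, small penalty
```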
@abhinavsagar (Contributor)

@moguzozcan Sounds good. It would be much better if you open a PR so that we can test it properly.

@christopherhesse (Contributor)

I believe we have avoided checking the validity of observations, rewards, and actions in existing environments. It should be possible to construct a wrapper that checks these and throws an exception if they are outside the expected range.

If someone wants to make that wrapper we can link it from the wrappers page: https://github.com/openai/gym/blob/master/docs/wrappers.md

There's some more discussion around this topic over on baselines: openai/baselines#121

3 participants