Bugfix: Allow Nesting of Sync/Async VectorEnvs #2104
Conversation
Reviewer: @vwxyzjn
Force-pushed from 7c2d346 to 53772da
@lebrice thanks for contributing the PR. Would you mind fixing the linting error? Also, would you mind giving a short code sample to test it out? Maybe it's just me, but I had a hard time understanding the
@lebrice, could you please fix the tests?
@lebrice, and fix the merge conflicts?
Yes, will do! Currently in a rush for ICLR.
Signed-off-by: Fabrice Normandin <[email protected]>
Force-pushed from bcc5c54 to 846d026
@lebrice Do you have any plans for this PR?
Hey @pseudo-rnd-thoughts!
Thanks @lebrice, could you fix the lint issue?
This can be used to do chunking. Chunking means having multiple sequential operations within each asynchronous process. This is useful for reducing the overhead of multiprocessing, and it lets users use large batch sizes even with a limited number of cores. In this case, it means being able to use a very large number of vectorized environments. For example, in my experience, with only a small number of CPUs (e.g. 4), laptops start struggling really hard when using pure AsyncVectorEnvs with batch sizes of roughly 32 or more, even for simple envs like CartPole, since that involves creating 32 Python processes.
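The chunking idea can be sketched without gym itself. In the sketch below, each worker process steps a sequential chunk of toy envs, so an effective batch of 32 needs only 4 processes. `DummyEnv`, `chunk_worker`, and `step_chunked` are hypothetical stand-ins for illustration, not gym API:

```python
import multiprocessing as mp

class DummyEnv:
    """Hypothetical stand-in for a single env: step() returns a step counter."""
    def __init__(self):
        self.t = 0
    def step(self, action):
        self.t += 1
        return self.t

def chunk_worker(pipe, n_sub):
    # One process steps a *chunk* of n_sub envs sequentially, mimicking
    # a SyncVectorEnv living inside an AsyncVectorEnv worker process.
    envs = [DummyEnv() for _ in range(n_sub)]
    for actions in iter(pipe.recv, None):
        pipe.send([env.step(a) for env, a in zip(envs, actions)])

def step_chunked(n_procs=4, n_sub=8, n_steps=2):
    ctx = mp.get_context("fork")  # assumes a Unix-like host
    pipes, procs = [], []
    for _ in range(n_procs):
        parent, child = ctx.Pipe()
        proc = ctx.Process(target=chunk_worker, args=(child, n_sub), daemon=True)
        proc.start()
        pipes.append(parent)
        procs.append(proc)
    results = None
    for _ in range(n_steps):
        for pipe in pipes:
            pipe.send([0] * n_sub)  # broadcast dummy actions to every chunk
        results = [pipe.recv() for pipe in pipes]
    for pipe in pipes:
        pipe.send(None)  # sentinel: shut the workers down
    for proc in procs:
        proc.join()
    return results  # n_procs chunks of n_sub observations each

if __name__ == "__main__":
    out = step_chunked()
    # 4 processes x 8 sequential envs = an effective batch of 32
    print(len(out), len(out[0]))  # 4 8
```

With this PR, the same structure is obtained directly by passing `SyncVectorEnv` factories to an `AsyncVectorEnv`.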
Because this is more of a bugfix than an extension: it's an improvement to the existing components in gym, not a new component. Hope that clears it up, thanks for taking a look!
Hey @pseudo-rnd-thoughts @vwxyzjn @jkterry1, there seems to be a bug in the test setup here: https://github.com/openai/gym/runs/6114261693?check_suite_focus=true#step:4:96 Looks like a pygame error in an unrelated test, due to there not being a display? Not sure why it would have passed for Python 3.8 and not Python 3.9, though. Otherwise I think this is good to go on my end.
tests/vector/test_vector_env.py (outdated)

```python
assert batch_size(env.action_space) == n_outer_envs

with env:
```
Why are we using the environment as a context manager here?
So it gets closed and the resources are freed.
This isn't absolutely necessary, since the env going out of scope should have the same effect.
I like it since it's more explicit, and it also made it easier for me to spot errors when closing the nested AsyncVectorEnvs.
Ok, that makes sense. My issue is that it is a code style that we do not follow anywhere else in the tests, so could you just have an env.close() at the end of the test to make it simpler for anyone looking at the tests?
Sure, no problem.
Fixed in f47596f .
However, just for the record, I disagree:
I think having to remember to close the env at the end of the test is ugly, error-prone, and also unnecessary, since in all cases the env is closed when it gets out of scope. In my opinion, the only tests where an env should be closed explicitly are 1) the tests related to closing the envs, and 2) tests that require creating a temporary environment just to check spaces (as in the case here).
For example, consider this:
https://github.com/lebrice/gym/blob/49ee20904ac3a4a1dba3020d1ebd11076848f376/tests/vector/test_vector_env.py#L52-L54
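The tradeoff being debated can be illustrated with a toy stand-in (`ToyEnv` is hypothetical; it only mimics the context-manager protocol that an env can support): the `with` form guarantees `close()` runs even if the test body raises, which a trailing `env.close()` call does not.

```python
class ToyEnv:
    """Hypothetical minimal env supporting the context-manager protocol."""
    def __init__(self):
        self.closed = False
    def close(self):
        self.closed = True
    def __enter__(self):
        return self
    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # don't swallow exceptions

env = ToyEnv()
try:
    with env:
        raise RuntimeError("failure mid-test")
except RuntimeError:
    pass
print(env.closed)  # True: close() ran even though the block raised
```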
@lebrice that is a strange bug, as I can't replicate it locally, either from the terminal or a docker environment.
I believe you can re-run the failed workflows from the Actions view, @pseudo-rnd-thoughts. That would be better, IMO, than me making an empty commit.
tests/vector/test_vector_env.py (outdated)

```python
import numpy as np
import pytest

from gym.spaces import Tuple
from gym import Space, spaces
```
Could you remove `spaces`? I don't think you should need to import the module; instead, have the following line
@lebrice Additionally, could you update the gym documentation to explain this update?
Hey @pseudo-rnd-thoughts. Sure, I guess we could mention that this is now possible (or that the bug has been fixed, depending on your perspective) in the "advanced" portion of the VectorEnv API documentation. However, I think it's probably best that we hold off until #2072, so that we don't encourage users to use chunking manually, and instead properly document the new VectorEnv subclass in the docs. Does that sound reasonable?
Hi @lebrice, that sounds reasonable to me. It's just that at some point we need to update the vector API docs for these changes.
Hey @pseudo-rnd-thoughts, can we run the workflows again?
Hey @lebrice, will do. FYI, it will probably be at the weekend when this gets fully merged, I hope, as @RedTachyon is the other reviewer for this PR and is at a conference all week.
No rush, thanks @pseudo-rnd-thoughts!
gym/vector/async_vector_env.py (outdated), @@ -635,6 +635,16 @@ in `_worker(index, env_fn, pipe, parent_pipe, shared_memory, error_queue)`:

```python
assert shared_memory is None
env = env_fn()
parent_pipe.close()

def step_fn(actions):
```
Why are we defining this function? It doesn't seem to be called anywhere. If it's actually used, it'd need a type hint and some comments/docstring
Fixed in fcebd76
Oops. Actually fixed in 15208e3
gym/vector/async_vector_env.py (outdated)

```python
if isinstance(env, VectorEnv):
    # VectorEnvs take care of resetting the envs that are done.
    pass
elif done:
```
Would it maybe be better to just use a single condition, `if done and isinstance(env, Env)` or `if done and not isinstance(env, VectorEnv)`? (Not sure which would be clearer, but the `pass` gave me a bit of a pause, since it's rarely present in released code and felt more like something unfinished.)
Fixed in 055087b
```python
self._rewards = np.zeros((self.num_envs,), dtype=np.float64)
self._dones = np.zeros((self.num_envs,), dtype=np.bool_)
shape = (self.num_envs,)
if isinstance(self.envs[0].unwrapped, VectorEnv):
```
Is it ever possible that this condition would be different for `self.envs[0]` and `self.envs[1]`? Also, do we actually need to unwrap it? (I'm not sure off the top of my head how wrappers interact with VectorEnvs.)
I don't see how this would only be true for some envs; it would mean passing a mix of Envs and VectorEnvs to a VectorEnv, which I can't imagine being useful.
Yeah, unwrapping is necessary, since most wrappers can work the same way for both Envs and VectorEnvs.
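For context, a minimal sketch of why the `.unwrapped` walk matters (`ToyVectorEnv` and `ToyWrapper` are hypothetical stand-ins, not gym classes): a wrapper around a VectorEnv is not itself an instance of VectorEnv, so the isinstance check only sees through wrappers via `.unwrapped`.

```python
class ToyVectorEnv:
    """Stand-in for a VectorEnv-like class."""
    @property
    def unwrapped(self):
        return self

class ToyWrapper:
    """Stand-in for a Wrapper: delegates unwrapped to the inner env."""
    def __init__(self, env):
        self.env = env
    @property
    def unwrapped(self):
        return self.env.unwrapped

wrapped = ToyWrapper(ToyVectorEnv())
print(isinstance(wrapped, ToyVectorEnv))            # False: the wrapper hides the type
print(isinstance(wrapped.unwrapped, ToyVectorEnv))  # True: unwrapping recovers it
```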
```python
if isinstance(env, VectorEnv):
    # VectorEnvs take care of resetting the envs that are done.
    pass
elif done:
    info["terminal_observation"] = observation
```
Do we actually need this with the AutoReset wrapper introduced recently? And wouldn't it cause a redundant double reset in some cases?
Yeah, this is still necessary, since the `if done` check doesn't work with VectorEnvs. This check is there so we don't reset the env when the episode is done (as is currently still done in the case of a single env here).
Not sure if/how the AutoReset wrapper relates to this.
Lmk if that wasn't clear.
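A sketch of the single-env branch being discussed (`CountEnv` and `autoreset_step` are illustrative stand-ins, not gym code): on `done`, the worker stashes the last observation under `info["terminal_observation"]` and resets, so the caller always receives a fresh observation; a nested VectorEnv instead handles its own resets internally.

```python
class CountEnv:
    """Toy env: observation is a step counter; episode ends at t == 3."""
    def reset(self):
        self.t = 0
        return self.t
    def step(self, action):
        self.t += 1
        return self.t, 0.0, self.t >= 3, {}

def autoreset_step(env, action):
    # Mimics the worker's single-env branch: keep the terminal observation
    # in info, then reset so the caller always gets a fresh observation.
    obs, reward, done, info = env.step(action)
    if done:
        info["terminal_observation"] = obs
        obs = env.reset()
    return obs, reward, done, info

env = CountEnv()
obs = env.reset()
for _ in range(3):
    obs, reward, done, info = autoreset_step(env, action=None)
print(done, info)  # True {'terminal_observation': 3}; obs was reset to 0
```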
Fixed in 055087b
tests/vector/test_vector_env.py (outdated), @@ -58,3 +62,113 @@ in `test_custom_space_vector_env`:

```python
assert isinstance(env.single_action_space, CustomSpace)
assert isinstance(env.action_space, Tuple)


@pytest.mark.parametrize("base_env", ["Pendulum-v1", "CartPole-v1"])
```
Why parametrize only over the two envs instead of all of them?
You mean over all gym envs? I mean, sure, I'm all for it, but the current VectorEnv tests (i.e. the only test above) only use CartPole-v1.
Parametrized the test with all classic_control + toy_text envs in 9c0e308
Hey @RedTachyon, is this good (testing with all envs where `should_skipp_env_spec_for_test = False`)?
As-is, the parametrization of this test generates 579 tests, which take about 1:20 to run on my end (with pytest-xdist and 4 parallel workers). I think this will probably take something like 5 minutes to run on GitHub, depending on the machine's hardware.
(Here's a link for that function, btw: https://github.com/lebrice/gym/blob/e913bc81b83b43ae8ca9b3a02c981b74d31017ea/tests/envs/spec_list.py#L20)
@lebrice Could you fix the CI issues?
Hey there @pseudo-rnd-thoughts! Yes, will do!
Hey @pseudo-rnd-thoughts, I think I fixed everything on my end, but I'd need your approval so the CI hooks can run.
@lebrice There still seems to be an issue with the PR. Plus, looking at the runtime, the CI has jumped from around 1 to 2 minutes to 10+ minutes. Could you investigate why your PR is causing the time to increase?
Yes, that's what I was mentioning here: I must have misinterpreted what he meant by "using all of them" (gym envs)? In any case, I'd be happy to turn it down, if that's not what he meant.
Sorry, I hadn't seen that comment. Personally, I think that is overkill and we could test with a single env type, just to see if it works. I will message @RedTachyon about it.
@lebrice In the meantime, could you fix the CI so it works with the current tests, as the number of environments can be modified easily.
Allows for the nesting of Async/Sync VectorEnvs.

*: When nesting Async/Async environments, only the innermost env can have `daemon=True`, since daemonic processes cannot have children.

Somewhat related to #2072 (based on a suggestion from @tristandeleu). A new Wrapper could also eventually be introduced in another PR to "unchunk" the observations/actions/rewards/dones (flatten the first two batch dimensions).

There currently aren't any tests specifically about the handling of Tuple/Dict/non-standard observation/action spaces. I could add some, if those cases aren't already covered by `test_[sync/async]_vector_env.py`.

Signed-off-by: Fabrice Normandin [email protected]
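The daemon caveat in the note above can be demonstrated in isolation (`try_to_fork` and `demo` are illustrative helpers, not gym code): CPython's multiprocessing refuses to start a grandchild process from a daemonic process, which is why only the innermost env in an Async/Async nesting can be daemonic.

```python
import multiprocessing as mp

def try_to_fork(queue):
    # Runs inside a daemonic process; starting a grandchild should fail.
    try:
        child = mp.Process(target=print, args=("hi",))
        child.start()
        queue.put("started")
    except AssertionError as err:
        queue.put(str(err))

def demo():
    ctx = mp.get_context("fork")  # assumes a Unix-like host
    queue = ctx.Queue()
    proc = ctx.Process(target=try_to_fork, args=(queue,), daemon=True)
    proc.start()
    msg = queue.get()
    proc.join()
    return msg

if __name__ == "__main__":
    print(demo())  # "daemonic processes are not allowed to have children"
```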