translation(whd): add env_wrapper english file and correct mistake in its Chinese version #206
Conversation
@@ -0,0 +1,118 @@
Why do we need Env Wrapper
------------------------------------------------------
Environment module is one of the most vital modules in reinforcement learning。 We train our agents in these environmnets and we allow them to explore and learn in these envirnments。In addition to a number of benchmark environments (such as atari or mujoco)reinforcement learning may also include a variety of custom environments.Overall,The essence of the Env Wrapper is to add certain generic additional features to our custom environment.
The full-width 。 should be changed to a half-width .
The full-width （ in the Chinese version should also be changed to a half-width (.
For instance:When we are training intelligences, we usually need to change the definition of the environment in order to achieve better training results, and these processing techniques are somewhat universal.For some environments, normalising the observed state is a very common pre-processing method. This processing makes training faster and more stable. If we extract this common part and put this preprocessing in an Env Wrapper, we avoid duplicate development. That is, if we want to change the way we normalise the observation state in the future, we can simply change it in this Env Wrapper.
Maybe intelligence should be replaced with agent.
Env Wrapper offered by DI-engine
==============================================
DI-engine provides a large number of defined and generic Env Wrapper,The user can wrap it directly on top of the environment they need to use according to their needs.In the process of implementation,we refered `OpenAI Baselines <https://github.com/openai/baselines/blob/master/baselines/common/atari_wrappers.py>`_ ,and folloiw the form of gym.Wrapper,which is `Gym.Wrapper <https://www.gymlibrary.dev/api/wrappers/>`_ ,In total, these include the following:
defined -> pre-defined
， -> .
folloiw -> follow
- NoopResetEnv:add a reset method to the environment. Resets the environment after some no-operations.
resets -> reset
- MaxAndSkipEnv: Every ``skip`` frames (doing the same action), return the maximum value of the two most recent frames (for max pooling across time steps).
- WarpFrame: Convert the size of the image frame to 84x84, such as `Nature‘s paper <https://www.deepmind.com/publications/human-level-control-through-deep-reinforcement-learning>`_ and it's following work。(Note that this wrapper also converts RGB images to GREY images)
such as -> according to
。 -> .
- ScaledFloatFrame: Normalize state values to 0~1.
- ClipRewardEnv: Cuts the reward to {+1, 0, -1} by the positive or negative of the reward.
Cuts -> Clip
- FrameStack: Set the nearest state frame of the stacked n_frames to the current state.
Set the current state to be the stacked nearest n frames.
- RamWrapper: Convert the ram state of the original environment into an image-like state by extending the dimensionality of the observed state.
- EpisodicLifeEnv:Let the death of an intelligence in the environment mark the end of an episode (game over), and only reset the game when the real game is over. In general, this helps the algorithm to estimate the value.
intelligence -> agent
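To make the wrapper pattern in the list above concrete, here is a minimal sketch, in the gym.Wrapper style the text references, of how a reward-clipping wrapper such as the ClipRewardEnv item could be written (the class name and details are illustrative, not DI-engine's actual source):

.. code-block:: python

    import gym
    import numpy as np


    class ClipRewardWrapper(gym.RewardWrapper):
        """Clip the reward to {+1, 0, -1} by its sign (illustrative sketch)."""

        def reward(self, reward):
            # np.sign maps positive rewards to +1, negative ones to -1, zero to 0.
            return np.sign(reward)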
How to use Env Wrapper
------------------------------------
The next question is how should we wrap the environment with Env Wrapper. One solution is to wrap the environment manually and explicitly:
how should we -> how to
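(The code block that the next paragraph calls "the example above" is not shown in this diff excerpt. A minimal sketch of such explicit wrapping, assuming DI-engine exposes MaxAndSkipWrapper and ScaledFloatFrameWrapper under ding.envs.env_wrappers with these signatures, might look like:)

.. code-block:: python

    import gym
    # Assumed import path and signatures; adjust to the actual DI-engine API.
    from ding.envs.env_wrappers import MaxAndSkipWrapper, ScaledFloatFrameWrapper

    # Create the base environment, then wrap it layer by layer.
    env = gym.make('PongNoFrameskip-v4')
    env = MaxAndSkipWrapper(env, skip=4)    # repeat each action for 4 frames
    env = ScaledFloatFrameWrapper(env)      # scale observations to [0, 1]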
In particular, the Env Wrappers in the list are wrapped around the environment in order. In the example above, the environment is first wrapped in a MaxAndSkipWrapper and then in a ScaledFloatFrameWrapper; each Env Wrapper adds functionality without changing the original behaviour.
How to customise Env Wrapper (Example)
-----------------------------------------
How to customise an Env Wrapper
Taking ObsNormEnv wrapper as an example。In order to normalis the observed state,we only need to change two methods in the original environment class:step method and reset method,The rest of the method remains the same.
normalis -> normalise
。 -> .
， -> ,
： -> :
Note that in some cases, as the normalised bounds of the observed state change, info will need to be modified accordingly.Please also note that the essence of the ObsNormEnv wrapper is to add additional functionality to the original environment, which is what the wrapper is all about.
info needs to be modified accordingly
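For reference, a minimal sketch of what such an ObsNormEnv wrapper could look like, overriding only the step and reset methods as the text describes (the running mean/std bookkeeping here is an assumption, not DI-engine's actual implementation):

.. code-block:: python

    import gym
    import numpy as np


    class ObsNormEnv(gym.Wrapper):
        """Normalise observations with a running mean/std (illustrative sketch)."""

        def __init__(self, env):
            super().__init__(env)
            self.count = 1e-4   # avoid division by zero before the first update
            self.mean = 0.0
            self.var = 1.0

        def _normalize(self, obs):
            obs = np.asarray(obs, dtype=np.float32)
            # Update the running statistics incrementally.
            self.count += 1
            delta = obs - self.mean
            self.mean = self.mean + delta / self.count
            self.var = self.var + (delta * (obs - self.mean) - self.var) / self.count
            # Normalise and clip to keep values in a stable range.
            return np.clip((obs - self.mean) / np.sqrt(self.var + 1e-8), -10.0, 10.0)

        def step(self, action):
            # Only the returned observation changes; reward, done and info pass through.
            obs, reward, done, info = self.env.step(action)
            return self._normalize(obs), reward, done, info

        def reset(self, **kwargs):
            return self._normalize(self.env.reset(**kwargs))

Wrapped this way, any algorithm interacting with the environment sees normalised observations, while the original environment itself is left unmodified.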