translation(whd): add env_wrapper english file and correct mistake in its Chinese version #206

Closed
wants to merge 15 commits into from


walterwhd
Contributor

No description provided.

@PaParaZz1 changed the title from "translation(whd): add env_wrapper english file and correct mistake in its Chinese version" to "translation(whd): add env_wrapper english file and correct mistake in its Chinese version" on Nov 25, 2022
@walterwhd
Contributor Author

@TuTuHuss

@@ -0,0 +1,118 @@
Why do we need Env Wrapper
------------------------------------------------------
Environment module is one of the most vital modules in reinforcement learning。 We train our agents in these environmnets and we allow them to explore and learn in these envirnments。In addition to a number of benchmark environments (such as atari or mujoco)reinforcement learning may also include a variety of custom environments.Overall,The essence of the Env Wrapper is to add certain generic additional features to our custom environment.
Contributor

The full-width 。 should be changed to an ASCII period (.)

Contributor

The full-width （ in the Chinese version should also be changed to (

For instance:When we are training intelligences, we usually need to change the definition of the environment in order to achieve better training results, and these processing techniques are somewhat universal.For some environments, normalising the observed state is a very common pre-processing method. This processing makes training faster and more stable. If we extract this common part and put this preprocessing in an Env Wrapper, we avoid duplicate development. That is, if we want to change the way we normalise the observation state in the future, we can simply change it in this Env Wrapper.
Contributor

Maybe intelligence should be replaced with agent.

Env Wrapper offered by DI-engine
==============================================

DI-engine provides a large number of defined and generic Env Wrapper,The user can wrap it directly on top of the environment they need to use according to their needs.In the process of implementation,we refered `OpenAI Baselines <https://github.com/openai/baselines/blob/master/baselines/common/atari_wrappers.py>`_ ,and folloiw the form of gym.Wrapper,which is `Gym.Wrapper <https://www.gymlibrary.dev/api/wrappers/>`_ ,In total, these include the following:
Contributor

defined -> pre-defined
,-> .
folloiw -> follow



- NoopResetEnv:add a reset method to the environment. Resets the environment after some no-operations..
Contributor

resets -> reset


- MaxAndSkipEnv: Each ``skip`` frame(doing the same action)returns the maximum value of the two most recent frames.(for max pooling across time steps)。

- WarpFrame: Convert the size of the image frame to 84x84, such as `Nature‘s paper <https://www.deepmind.com/publications/human-level-control-through-deep-reinforcement-learning>`_ and it's following work。(Note that this registrar also converts RGB images to GREY images)
Contributor

such as -> according to
。 -> .


- ScaledFloatFrame: Normalize status values to 0~1。

- ClipRewardEnv: Cuts the reward to {+1, 0, -1} by the positive or negative of the reward.
Contributor

Cuts -> Clip



- FrameStack: Set the nearest state frame of the stacked n_frames to the current state.
Contributor

Set the current state to be the stacked nearest n frames.


- RamWrapper: Converting the ram state of the original environment into an image-like state by extending the dimensionality of the observed state.

- EpisodicLifeEnv:Let the death of an intelligence in the environment mark the end of an episode (game over), and only reset the game when the real game is over. In general, this helps the algorithm to estimate the value.
Contributor

intelligence -> agent
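
To make two of the wrappers listed in this hunk concrete, here is a minimal Python sketch of reward clipping and frame stacking. It is illustrative only: `CountingEnv`, `Wrapper`, `ClipReward` and `FrameStack` are hypothetical plain-Python stand-ins, not the DI-engine or Gym classes, and the 4-tuple return from `step` is an assumption made for brevity.

```python
from collections import deque


class CountingEnv:
    """Hypothetical toy env: observation is the step index, reward alternates in sign."""

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        reward = 5.0 if self.t % 2 else -3.0
        done = self.t >= 4
        return self.t, reward, done, {}


class Wrapper:
    """Minimal stand-in for a gym.Wrapper-style base class (assumption, not the real API)."""

    def __init__(self, env):
        self.env = env

    def reset(self):
        return self.env.reset()

    def step(self, action):
        return self.env.step(action)


class ClipReward(Wrapper):
    """Map every reward to its sign: +1, 0 or -1."""

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        clipped = (reward > 0) - (reward < 0)
        return obs, clipped, done, info


class FrameStack(Wrapper):
    """Use the n_frames most recent observations as the current state."""

    def __init__(self, env, n_frames):
        super().__init__(env)
        self.frames = deque(maxlen=n_frames)

    def reset(self):
        obs = self.env.reset()
        # Fill the buffer with copies of the first observation.
        for _ in range(self.frames.maxlen):
            self.frames.append(obs)
        return tuple(self.frames)

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self.frames.append(obs)  # the oldest frame drops out automatically
        return tuple(self.frames), reward, done, info


env = FrameStack(ClipReward(CountingEnv()), n_frames=3)
first = env.reset()
stacked, reward, done, info = env.step(0)
```

Each wrapper overrides only the method it needs and forwards everything else, which is why the two can be layered without knowing about each other.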


How to use Env Wrapper
------------------------------------
The next question is how should we wrap the environment with Env Wrapper. One solution is to wrap the environment manually and explicitly:
Contributor

how should we -> how to

In particular, the Env Wrappers in the list are wrapped outside the environment in order. In the example above, the Env Wrapper wraps a layer of MaxAndSkipWrapper and then a layer of ScaledFloatFrameWrapper, while the Env Wrapper serves to add functionality but does not change the original functionality.
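
The code example this hunk refers to is not shown in the conversation. As a rough stand-in, manual and explicit wrapping in the order described (first a max-and-skip layer, then a scaling layer) might look like the sketch below. All class names here are hypothetical plain-Python versions rather than DI-engine's own wrappers, and summing the reward over skipped frames follows the OpenAI Baselines convention, which is an assumption relative to the text.

```python
class ToyFrameEnv:
    """Hypothetical env whose observation is a 2-pixel frame with values in 0..255."""

    def reset(self):
        self.t = 0
        return [0, 255]

    def step(self, action):
        self.t += 1
        frame = [min(255, 51 * self.t), max(0, 255 - 51 * self.t)]
        return frame, 1.0, self.t >= 10, {}


class MaxAndSkip:
    """Repeat the action `skip` times; return the pixel-wise max of the last two frames."""

    def __init__(self, env, skip=4):
        self.env = env
        self.skip = skip

    def reset(self):
        return self.env.reset()

    def step(self, action):
        total_reward, done, info = 0.0, False, {}
        last_two = []
        for _ in range(self.skip):
            frame, reward, done, info = self.env.step(action)
            last_two = (last_two + [frame])[-2:]  # keep only the two newest frames
            total_reward += reward
            if done:
                break
        pooled = [max(pixels) for pixels in zip(*last_two)]
        return pooled, total_reward, done, info


class ScaledFloatFrame:
    """Scale pixel values from 0..255 to 0..1."""

    def __init__(self, env):
        self.env = env

    def _scale(self, frame):
        return [pixel / 255.0 for pixel in frame]

    def reset(self):
        return self._scale(self.env.reset())

    def step(self, action):
        frame, reward, done, info = self.env.step(action)
        return self._scale(frame), reward, done, info


# Wrappers are applied from the inside out, in the order they are written.
env = ToyFrameEnv()
env = MaxAndSkip(env, skip=4)
env = ScaledFloatFrame(env)

obs0 = env.reset()
obs, reward, done, info = env.step(0)
```

The order matters: here scaling is applied to the already pooled frame, so the outer wrapper sees the output of the inner one.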


How to customise Env Wrapper (Example)
Contributor

How to customise an Env Wrapper


How to customise Env Wrapper (Example)
-----------------------------------------
Taking ObsNormEnv wrapper as an example。In order to normalis the observed state,we only need to change two methods in the original environment class:step method and reset method,The rest of the method remains the same.
Contributor

normalis -> normalise
。 -> .
， -> ,
： -> :

Note that in some cases, as the normalised bounds of the observed state change, info will need to be modified accordingly.Please also note that the essence of the ObsNormEnv wrapper is to add additional functionality to the original environment, which is what the wrapper is all about. \
Contributor

info is needed to be modified accordingly
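
The idea in this hunk, that an observation-normalising wrapper only needs to override `step` and `reset`, can be sketched as follows. This is not DI-engine's `ObsNormEnv` implementation: `DriftEnv` and `ObsNorm` are hypothetical names, observations are scalars for simplicity, and keeping the running statistics across resets is a design assumption of this sketch.

```python
import math


class DriftEnv:
    """Hypothetical env with scalar observations 10, 20, 30, ..."""

    def reset(self):
        self.t = 0
        return 10.0

    def step(self, action):
        self.t += 1
        return 10.0 * (self.t + 1), 0.0, self.t >= 5, {}


class ObsNorm:
    """Normalise observations with a running mean and variance (Welford's update)."""

    def __init__(self, env, eps=1e-8):
        self.env = env
        self.eps = eps  # avoids division by zero before any variance is observed
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def _normalise(self, obs):
        self.count += 1
        delta = obs - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (obs - self.mean)
        var = self.m2 / self.count
        return (obs - self.mean) / math.sqrt(var + self.eps)

    def reset(self):
        # Only the returned observation changes; the wrapped env is reset as usual.
        return self._normalise(self.env.reset())

    def step(self, action):
        # Reward, done flag and info pass through untouched.
        obs, reward, done, info = self.env.step(action)
        return self._normalise(obs), reward, done, info


env = ObsNorm(DriftEnv())
first = env.reset()
second, reward, done, info = env.step(0)
```

If a real environment advertises observation bounds in its `info`, those bounds would also have to be updated to match the normalised range, as the hunk notes; this sketch leaves `info` unchanged.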

@PaParaZz1 changed the title from "translation(whd): add env_wrapper english file and correct mistake in its Chinese version" to "translation(whd): add env_wrapper English file and correct mistake in its Chinese version" on Dec 1, 2022
@walterwhd changed the title from "translation(whd): add env_wrapper English file and correct mistake in its Chinese version" to "translation(whd): add env_wrapper english file and correct mistake in its Chinese version" on Dec 7, 2022
@walterwhd closed this on Dec 7, 2022

3 participants