translation(whd): add env_wrapper english file and correct mistake in its Chinese version #206
Conversation
@@ -0,0 +1,118 @@
Why do we need Env Wrapper
------------------------------------------------------
Environment module is one of the most vital modules in reinforcement learning。 We train our agents in these environmnets and we allow them to explore and learn in these envirnments。In addition to a number of benchmark environments (such as atari or mujoco)reinforcement learning may also include a variety of custom environments.Overall,The essence of the Env Wrapper is to add certain generic additional features to our custom environment.
The full-width 。 should be changed to a half-width .
The full-width （ in the Chinese version should also be changed to a half-width (.
For instance:When we are training intelligences, we usually need to change the definition of the environment in order to achieve better training results, and these processing techniques are somewhat universal.For some environments, normalising the observed state is a very common pre-processing method. This processing makes training faster and more stable. If we extract this common part and put this preprocessing in an Env Wrapper, we avoid duplicate development. That is, if we want to change the way we normalise the observation state in the future, we can simply change it in this Env Wrapper.
Maybe intelligence should be replaced with agent.
Env Wrapper offered by DI-engine
==============================================
DI-engine provides a large number of defined and generic Env Wrapper,The user can wrap it directly on top of the environment they need to use according to their needs.In the process of implementation,we refered `OpenAI Baselines <https://github.com/openai/baselines/blob/master/baselines/common/atari_wrappers.py>`_ ,and folloiw the form of gym.Wrapper,which is `Gym.Wrapper <https://www.gymlibrary.dev/api/wrappers/>`_ ,In total, these include the following:
defined -> pre-defined
， -> .
folloiw -> follow
- NoopResetEnv:add a reset method to the environment. Resets the environment after some no-operations.
resets -> reset
- MaxAndSkipEnv: Every ``skip`` frames (doing the same action), return the maximum value of the two most recent frames (for max pooling across time steps).
- WarpFrame: Convert the size of the image frame to 84x84, such as `Nature‘s paper <https://www.deepmind.com/publications/human-level-control-through-deep-reinforcement-learning>`_ and it's following work。(Note that this wrapper also converts RGB images to GREY images)
such as -> according to
。 -> .
- ScaledFloatFrame: Normalize state values to 0~1.
- ClipRewardEnv: Cuts the reward to {+1, 0, -1} by the positive or negative of the reward.
Cuts -> Clip
- FrameStack: Set the nearest state frame of the stacked n_frames to the current state.
Set the current state to be the stacked nearest n frames.
- RamWrapper: Convert the ram state of the original environment into an image-like state by extending the dimensionality of the observed state.
- EpisodicLifeEnv:Let the death of an intelligence in the environment mark the end of an episode (game over), and only reset the game when the real game is over. In general, this helps the algorithm to estimate the value.
intelligence -> agent
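To make the wrapper pattern in the list above concrete, here is a minimal sketch, in the gym.Wrapper style the text references, of how a reward-clipping wrapper such as the ClipRewardEnv item could be written (the class name and details are illustrative, not DI-engine's actual source):

.. code-block:: python

    import gym
    import numpy as np


    class ClipRewardWrapper(gym.RewardWrapper):
        """Clip the reward to {+1, 0, -1} by its sign (illustrative sketch)."""

        def reward(self, reward):
            # np.sign maps positive rewards to +1, negative ones to -1, zero to 0.
            return np.sign(reward)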
How to use Env Wrapper
------------------------------------
The next question is how should we wrap the environment with Env Wrapper. One solution is to wrap the environment manually and explicitly:
how should we -> how to
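(The code block that the next paragraph calls "the example above" is not shown in this diff excerpt. A minimal sketch of such explicit wrapping, assuming DI-engine exposes MaxAndSkipWrapper and ScaledFloatFrameWrapper under ding.envs.env_wrappers with these signatures, might look like:)

.. code-block:: python

    import gym
    # Assumed import path and signatures; adjust to the actual DI-engine API.
    from ding.envs.env_wrappers import MaxAndSkipWrapper, ScaledFloatFrameWrapper

    # Create the base environment, then wrap it layer by layer.
    env = gym.make('PongNoFrameskip-v4')
    env = MaxAndSkipWrapper(env, skip=4)    # repeat each action for 4 frames
    env = ScaledFloatFrameWrapper(env)      # scale observations to [0, 1]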
In particular, the Env Wrappers in the list are wrapped around the environment in order. In the example above, the environment is first wrapped in a MaxAndSkipWrapper and then in a ScaledFloatFrameWrapper; each Env Wrapper adds functionality without changing the original behaviour.
How to customise Env Wrapper (Example)
-----------------------------------------
How to customise an Env Wrapper
Taking ObsNormEnv wrapper as an example。In order to normalis the observed state,we only need to change two methods in the original environment class:step method and reset method,The rest of the method remains the same.
normalis -> normalise
。 -> .
， -> ,
： -> :
Note that in some cases, as the normalised bounds of the observed state change, info will need to be modified accordingly.Please also note that the essence of the ObsNormEnv wrapper is to add additional functionality to the original environment, which is what the wrapper is all about.
info needs to be modified accordingly
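For reference, a minimal sketch of what such an ObsNormEnv wrapper could look like, overriding only the step and reset methods as the text describes (the running mean/std bookkeeping here is an assumption, not DI-engine's actual implementation):

.. code-block:: python

    import gym
    import numpy as np


    class ObsNormEnv(gym.Wrapper):
        """Normalise observations with a running mean/std (illustrative sketch)."""

        def __init__(self, env):
            super().__init__(env)
            self.count = 1e-4   # avoid division by zero before the first update
            self.mean = 0.0
            self.var = 1.0

        def _normalize(self, obs):
            obs = np.asarray(obs, dtype=np.float32)
            # Update the running statistics incrementally.
            self.count += 1
            delta = obs - self.mean
            self.mean = self.mean + delta / self.count
            self.var = self.var + (delta * (obs - self.mean) - self.var) / self.count
            # Normalise and clip to keep values in a stable range.
            return np.clip((obs - self.mean) / np.sqrt(self.var + 1e-8), -10.0, 10.0)

        def step(self, action):
            # Only the returned observation changes; reward, done and info pass through.
            obs, reward, done, info = self.env.step(action)
            return self._normalize(obs), reward, done, info

        def reset(self, **kwargs):
            return self._normalize(self.env.reset(**kwargs))

Wrapped this way, any algorithm interacting with the environment sees normalised observations, while the original environment itself is left unmodified.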