
The environmental action limits seems not work in Tianshou PPO #92

Closed
terrancelu92 opened this issue Dec 12, 2021 · 2 comments
terrancelu92 commented Dec 12, 2021

It seems that the actions selected by the PPO algorithm are not confined to the limits defined in the environment.
For example, the action space for the test case below is `self.action_space = spaces.Box(np.array([0, 0]).astype(np.float32), np.array([1, 1]).astype(np.float32))`, yet the actions recorded in the history fall outside the [0, 1] bounds.
mpc-drl-tl/testcases/gym-environments/five-zones-air/test_v2/test_ppo_tianshou.py

This is strange, since Tianshou's PPO has the attributes `action_scaling` and `action_bound_method`. Both are enabled in the above test case, but somehow the bounds are still violated.

Tianshou's `map_action` implementation is here.

Similar issues are reported in other RL libraries.
openai/baselines#121
rlworkgroup/garage#710
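To confirm the symptom, the recorded action history can be scanned against the Box bounds. The helper below is hypothetical (not part of Tianshou or the test case), written in plain Python for illustration:

```python
# Hypothetical helper: scan a history of recorded actions and return
# any action that violates the Box bounds in at least one dimension.

def out_of_bounds(actions, low, high):
    """Return the actions that fall outside [low, high] in any dimension."""
    return [
        a for a in actions
        if any(x < lo or x > hi for x, lo, hi in zip(a, low, high))
    ]

# Example with the [0, 1] x [0, 1] Box from the test case.
history = [[0.2, 0.9], [1.3, 0.5], [-0.1, 0.4]]
print(out_of_bounds(history, [0.0, 0.0], [1.0, 1.0]))
# -> [[1.3, 0.5], [-0.1, 0.4]]
```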

@terrancelu92 terrancelu92 self-assigned this Dec 12, 2021
@terrancelu92 terrancelu92 changed the title The environmental action limits are not considered in Tianshou PPO The environmental action limits do not work in Tianshou PPO Dec 12, 2021
@terrancelu92
Copy link
Collaborator Author

terrancelu92 commented Dec 12, 2021

In Tianshou, `map_action` is documented as follows:

    This function is called in :meth:`~tianshou.data.Collector.collect` and only
    affects action sending to env. Remapped action will not be stored in buffer
    and thus can be viewed as a part of env (a black box action transformation).
    Action mapping includes 2 standard procedures: bounding and scaling. Bounding
    procedure expects original action range is (-inf, inf) and maps it to [-1, 1],
    while scaling procedure expects original action range is (-1, 1) and maps it
    to [action_space.low, action_space.high]. Bounding procedure is applied first.
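The two procedures in the docstring can be sketched in plain Python as below. This is an illustrative sketch, not Tianshou's actual implementation; the tanh variant is shown only as the common smooth alternative to clipping for the bounding step:

```python
import math

# Illustrative sketch of the two procedures described above.
# Bounding maps (-inf, inf) into [-1, 1] (here via clipping, with a
# tanh squash as the smooth alternative); scaling then maps [-1, 1]
# onto [low, high].

def bound(act, method="clip"):
    if method == "clip":
        return [max(-1.0, min(1.0, x)) for x in act]
    return [math.tanh(x) for x in act]  # smooth alternative to clipping

def scale(act, low, high):
    return [lo + (hi - lo) * (x + 1.0) / 2.0
            for x, lo, hi in zip(act, low, high)]

# Bounding is applied first, then scaling, per the docstring.
raw = [2.7, -1.0]
print(scale(bound(raw), [0.0, 0.0], [1.0, 1.0]))
# -> [1.0, 0.0]
```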

@terrancelu92 terrancelu92 changed the title The environmental action limits do not work in Tianshou PPO The environmental action limits seems not work in Tianshou PPO Dec 12, 2021

terrancelu92 commented Dec 12, 2021

Since the remapped action is not stored in the buffer and is treated as part of the environment (a black-box action transformation), actions read back from the buffer must undergo the same transformation.

First, bound to [-1, 1]:

    act = np.clip(act, -1.0, 1.0)

Then, scale to the environment's action range:

    low, high = self.action_space.low, self.action_space.high
    act = low + (high - low) * (act + 1.0) / 2.0

Closing this issue.
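For post-processing a whole batch of buffered actions, the two snippets above can be bundled into one helper. This is a plain-Python version for illustration (the originals use NumPy; the arithmetic is identical):

```python
# Bundle the bounding and scaling steps into one helper for
# transforming actions read back from the replay buffer.

def transform_buffer_action(act, low, high):
    # Bounding: clip the raw policy output into [-1, 1].
    act = [max(-1.0, min(1.0, x)) for x in act]
    # Scaling: map [-1, 1] onto [action_space.low, action_space.high].
    return [lo + (hi - lo) * (x + 1.0) / 2.0
            for x, lo, hi in zip(act, low, high)]

# Raw actions as they might come out of the buffer, mapped into the
# [0, 1] x [0, 1] Box used by the test case.
for raw in ([0.0, 1.0], [3.0, -3.0]):
    print(transform_buffer_action(raw, [0.0, 0.0], [1.0, 1.0]))
# [0.0, 1.0] -> [0.5, 1.0]; [3.0, -3.0] -> [1.0, 0.0]
```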
