MrForExample changed the title from "[Classic Control Promble] Training baselines TF2-PPO2 to solve Pendulum-v0 extremely slow and unstable" to "[Classic Control Promble] Training baselines branch TF2-PPO2 to solve Pendulum-v0 extremely slow and unstable" on May 5, 2020
@DanielTakeshi Thanks for the information. I was looking for a PPO implementation that uses TF2, and stable-baselines uses TF1.
But never mind: I spent 2 days writing my own PPO implementation in TF2, and it now seems to work fine!
Command:
python -m baselines.run --alg=ppo2 --env=Pendulum-v0 --num_timesteps=1e6 --reward_scale=0.001 --save_path=./models/Pendulm_ppo2
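For context on the `--reward_scale=0.001` flag in the command above, here is a minimal sketch of how small the per-step rewards become. It assumes the flag simply multiplies each environment reward by the given constant, and it uses the Pendulum-v0 reward formula and limits from Gym. The names `pendulum_reward` and `scale_reward` are hypothetical helpers for illustration, not part of baselines.

```python
import math

# Pendulum-v0 limits as defined in the Gym classic-control environment.
MAX_SPEED = 8.0    # maximum angular velocity (rad/s)
MAX_TORQUE = 2.0   # maximum applied torque


def pendulum_reward(theta, theta_dot, torque):
    """Per-step Pendulum-v0 reward: -(theta^2 + 0.1*theta_dot^2 + 0.001*torque^2)."""
    return -(theta ** 2 + 0.1 * theta_dot ** 2 + 0.001 * torque ** 2)


def scale_reward(reward, reward_scale):
    """Assumption: --reward_scale multiplies every reward by a constant."""
    return reward * reward_scale


# Worst case: pendulum hanging straight down at full speed with full torque.
worst = pendulum_reward(math.pi, MAX_SPEED, MAX_TORQUE)
scaled_worst = scale_reward(worst, 0.001)

print("worst per-step reward:", worst)
print("after scaling by 0.001:", scaled_worst)
```

Under this assumption, every per-step reward the learner sees lies between roughly -0.016 and 0, so it may be worth comparing runs with a larger scale or with `--reward_scale` left at its default.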
The hyperparameters used are those in baselines/ppo2/defaults.py:
Example result:
The source code of baselines and PPO was downloaded with git and left untouched. I spent quite some time adjusting the hyperparameters, and it doesn't seem to have had much effect on the result, if any.
Does anyone have any idea what could be going wrong?