A2C some feedback #413

simoninithomas · 2018-05-27T09:28:19Z

Hi!

First of all, thank you very much for your awesome work. The baselines are really good, but please comment your code, because some parts of it are not intuitive at all.

I think it would be a good idea to have some documentation for each architecture, plus a small tutorial explaining why each architecture is implemented the way it is.

By the way, I wanted to know why you implement some functions yourselves instead of using the ones TensorFlow provides. For instance, cat_entropy. Is there a specific reason, or do you just prefer to implement them yourselves?
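To make the question concrete, here is a minimal sketch of what I understand a function like cat_entropy to compute: the entropy of the categorical distribution obtained by applying softmax to a vector of logits. This is plain Python rather than TensorFlow, the name `categorical_entropy` is my own, and it is only an illustration under those assumptions, not the baselines code.

```python
import math


def categorical_entropy(logits):
    """Entropy (in nats) of softmax(logits) for one list of logits.

    Hypothetical helper for illustration only; not taken from baselines.
    """
    # Subtract the max logit so exp() cannot overflow.
    # Softmax is unchanged by adding a constant to every logit.
    max_logit = max(logits)
    shifted = [x - max_logit for x in logits]

    exps = [math.exp(s) for s in shifted]
    z = sum(exps)  # softmax normaliser, always >= 1 after the shift
    log_z = math.log(z)

    # H = -sum_i p_i * log(p_i), where log(p_i) = shifted_i - log(z),
    # so each term is p_i * (log(z) - shifted_i).
    return sum((e / z) * (log_z - s) for e, s in zip(exps, shifted))


if __name__ == "__main__":
    # Uniform logits give the maximum entropy, log(number of actions).
    print(categorical_entropy([0.0, 0.0, 0.0, 0.0]))
    # Very large logits stay finite thanks to the max shift.
    print(categorical_entropy([1000.0, 1000.0]))
```

Working directly from the logits like this avoids taking the log of a probability that has underflowed to zero, which may be one reason to write it by hand, but that is a guess on my part.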

Thanks again for your work!
Have a great day!

simoninithomas changed the title ~~A2C some feedback and what are step_policy and train_policy?~~ A2C some feedback and some questions May 27, 2018

simoninithomas changed the title ~~A2C some feedback and some questions~~ A2C some feedback May 31, 2018

lhk mentioned this issue Jun 17, 2018

PPO some feedback #445

Open

araffin mentioned this issue Jul 10, 2018

why every algo reimplements policies that show up in others as well #460

Closed

araffin mentioned this issue Jul 28, 2018

Deobfuscation of the code base + pep8 and fixes #481

Closed

AdamGleave pushed a commit to HumanCompatibleAI/baselines that referenced this issue Jul 24, 2019

Add missing dependencies for documentation build (openai#413)

5cbdd31
