-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Deobfuscation of the code base + pep8 and fixes #481
Conversation
This reverts commit 54e6079.
HTML Docs
Hi @hill-a and @araffin! Any chance you guys are still interested in merging your changes? The amount of work that you have done on improving the code quality is incredible; and we'd love to cooperate instead of competing. If you are, let's reopen this and work on the merge conflicts / divergence points between refactors together. Putting @joschu, @miramurati and @jachiam as principal stakeholders in the loop on this as well. |
Hello @pzhokhov, We are glad you finally give some sign of life ;) (almost two months!) With @hill-a, we would love those changes to be merged (even though the two codebases have a bit diverged, and it will need some work to fix the conflicts). One of my first question is a general one: how is OpenAI Baselines managed, what is the roadmap and what is the importance that Openai gives to Baselines ? From what I understood, you are now the only one that is fully in charge of Baselines (as @unixpickle was some months ago), and the repo comes from different people for each algo (e.g. @joschu for TRPO, @matthiasplappert for HER, ...) which explains the code duplication and the different codestyles. Another question is that there seems to be an internal Baselines repo that differs from the github version. What are the differences and why is it needed? (I think that is related but could you explain me some weird commits like this one (lots of commit text for almost no changes): e791565) Finally, if we merged, we obviously would like to be credited in the README ;) |
Chiming in to say hello, introduce myself, and to answer a few of the questions. Intro: I'm Josh, and I'm a researcher on the safety team at OpenAI! Lately I've been thinking about getting more involved in making sure our public software releases are more user-friendly. On how Baslines is managed: You're correct that @pzhokhov is the primary owner of Baselines, and that the algorithms were originally implemented by many different people. OpenAI is currently thinking about the long term state of this project (and others), and will hopefully be able to discuss a complete plan in the near future. Since @pzhokhov has been writing the roadmap for Baselines, I'll leave it to him to talk about that. On the internal version of Baselines: recall that OpenAI is first and foremost a research organization. The internal repo changes rapidly to support research use cases, and includes code from in-progress research projects which aren't ready for release. Regarding credit: of course we would gratefully acknowledge such contributions. :) |
Hi @araffin and @hill-a ! Thanks for maintaining your interest in the project!
|
Hello @pzhokhov, Thank you for your answer. My main concern is still the internal repo. I understand perfectly the usefulness of a private repo for in-progress research. However, as for now, it impacts the public repo in an unpredictable way: commits messages coming from the subrepo do not reflect the changes and some unsused code (in the public repo) is also added. A solution for that would be to have a proper fork of baselines and do PR from time to time to update public repo. |
Public roadmap - fair point; we are working on finalizing it (should be done and published in a few days). |
Turns out the roadmap of baselines is a high enough visibility that bosses stepped in and wanted to polish it and make a press-release with it, so cannot publish it before that. |
Hi, @pzhokhov what does that mean in terms of timing? A month delay? |
Coming into this conversation from the outside... as a user of both pieces of software, I have found stable-baselines much more consistent. From a research side, it's great that you guys are publishing the algorithm implementations as it is a great resource to be able to have vetted implementations, especially when the authors of the original papers clarify subtle points in the openai implementations. From a world-impact perspective, the changes that stable-baselines have made take openai baselines and make it actually useful by taking the "raw" algorithms and making them consistent, approachable, and reusable ... I would argue for openai to consider stable-baselines as the project that openai should contribute to. I'm only half joking... presumably openai's goal is to have gym and baselines eventually take on a life of its own... this seems like a perfect way to do it. Both teams are doing great work... as a community member would love to find a way for you both to benefit from a strong relationship, as you each bring something valuable to the table! Back to my corner...
|
Commenting as a user as well: what OpenAI has done is great: It went for the unification, providing common gyms, providing baselines. The idea and some implementations are great, no question. And don't let the following change that, I'm a huge fan of what you do/did, it is one of those cornerstones that really moves the community forward! But: The whole repositories are not quite what they should be (talking of gym and baseline)
stable-baseline has done exactly all that. Baseline is an experimental, you-have-to-figure-it-out-yourself repository while My opinion: gym should become more like stable-baseline and stable-baseline should become standalone. It is just so much better to use (yeah, the extra effort is not too much. But it's about the way of approaching the user). Again: Great compliment for what you've done at OpenAI. But user-friendlyness and the last 20% of stability is what (for me, IMHO, as a user) seems missing and where stable-baseline seems to know what they do. |
As of today:
Considering the excellent reputation of OpenAI, I wouldn't be suprised if openai/baselines because the standard RL framework one day. However, a necessary condition would be to improve the spaghetti code, or at least provide a standard interface to the users. |
* Update DQN arguments: add double_q * Fix CI build * [ci skip] Update link * Replace pdf link with arxiv link
tf.session().__enter__()
being used, rather thansess = tf.session()
and passing the session to the objectsValueError: Cannot take the length of Shape with unknown rank.
inacktr
, when runningrun_atari.py
script.EOFError
when reading from connection in theworker
insubproc_vec_env.py
behavior_clone
weight loading and saving for GAILtrpo_mpi.py
reset_task
insubproc_vec_env.py
Missing: tests for acktr continuous (+ HER, gail but they rely on mujoco...)