You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I agree some documentation would be nice, particularly with the output that is logged, can anyone explain what the output variables mean? Why (raw score / avg reward) over time isn't included?
It looks like the raw env outputs are logged to a directory in /tmp created in monitor.py, which you can tail. It looks like they can be loaded after the fact with this.
Would it be possible to make the A2C code (and other algorithms) a bit more pedagogical?
The text was updated successfully, but these errors were encountered: