PPO cpp project and C++ export (openai#585)

* Adding PPO_CPP project description.
* Changing PPO_CPP project description.
* Changelog addition on new dependant project.
* Update changelog.rst
* Added section on C++ portability of Tensorflow models
shwang · Nov 28, 2019 · 04c35e1
1 parent b461adb
commit 04c35e1

Showing 3 changed files with 40 additions and 1 deletion.
diff --git a/docs/guide/export.rst b/docs/guide/export.rst
@@ -46,6 +46,33 @@ function to obtain model parameters, construct the network manually in PyTorch a
 See `discussion #372 <https://github.com/hill-a/stable-baselines/issues/372>`_ for details.
 
 
+Export to C++
+-----------------
+
+TensorFlow, the backbone of Stable Baselines, is fundamentally a C/C++ library, even though it is most commonly accessed
+through its Python frontend. As a result, models created at the Python level should generally be
+compatible with the corresponding C++ version of TensorFlow.
+
+.. warning::
+   It is advisable not to mix and match different versions of the TensorFlow libraries, particularly where the state (weights) is concerned.
+   Moving computational graphs is generally more forgiving. In fact, the `PPO_CPP <https://github.com/Antymon/ppo_cpp>`_ project mentioned below uses
+   graphs generated with Python TensorFlow 1.x in the C++ version of TensorFlow 2.
+
+Stable Baselines comes in handy when you want to migrate a computational graph and/or a state (weights), because
+the existing algorithms already define most of the necessary computations, so you do not need to recreate the core of the algorithms.
+This is exactly the idea behind the `PPO_CPP <https://github.com/Antymon/ppo_cpp>`_ project, which runs the training at the C++ level for the sake of
+computational efficiency. The graphs are exported from Stable Baselines' PPO2 implementation through the ``tf.train.export_meta_graph``
+function. Alternatively, and perhaps more commonly, you could use the C++ layer only for inference. That could be useful
+as a deployment step for server backends or as an optimization for more limited devices.
+
+.. warning::
+   As a word of caution, the C++-level APIs are more imperative than their Python counterparts or, more plainly speaking, cruder.
+   This is particularly apparent in TensorFlow 2.0, where the declarativeness of AutoGraph exists only at the Python level. The
+   C++ counterpart still operates on Session objects, which are known from earlier versions of TensorFlow. In our use case, the
+   availability of the graphs used by a Session depends on the use of ``tf.function`` decorators. However, as of November 2019, Stable Baselines still
+   uses TensorFlow 1.x in its main version, which is slightly easier to use in the context of C++ portability.
+
+
 Export to tensorflowjs / tfjs
 -----------------------------
 

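The export step mentioned in the new "Export to C++" section can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the code used by PPO_CPP: the helper name `export_ppo2_meta_graph` and the default file name `ppo2.meta` are hypothetical, and the sketch assumes TensorFlow 1.x and a Stable Baselines 2.x model that exposes its `tf.Graph` as `model.graph`.

```python
import os


def export_ppo2_meta_graph(model, export_dir, name="ppo2.meta"):
    """Write the TensorFlow graph behind a Stable Baselines model to disk.

    Assumption: `model` exposes its tf.Graph as `model.graph`.
    Returns the path of the written MetaGraphDef file.
    """
    # Imported lazily so the helper can be defined without TensorFlow
    # installed. TensorFlow 1.x is assumed here.
    import tensorflow as tf

    os.makedirs(export_dir, exist_ok=True)
    path = os.path.join(export_dir, name)

    # export_meta_graph serializes the graph structure only; the weights
    # (the "state") have to be saved separately, e.g. as a checkpoint.
    tf.train.export_meta_graph(filename=path, graph=model.graph)
    return path
```

With Stable Baselines installed, a call such as `export_ppo2_meta_graph(PPO2('MlpPolicy', 'CartPole-v1'), 'export')` would be expected to produce `export/ppo2.meta`, which a C++ program could then load. Under a TensorFlow 2 Python installation the same function is reachable as `tf.compat.v1.train.export_meta_graph`.
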
diff --git a/docs/misc/changelog.rst b/docs/misc/changelog.rst
@@ -64,6 +64,8 @@ Documentation:
 - Fix `result_plotter` example
 - Fix typo in algos.rst, "containes" to "contains" (@SyllogismRXS)
 - Fix outdated source documentation for load_results
+- Add PPO_CPP project (@Antymon)
+- Add section on C++ portability of TensorFlow models (@Antymon)
 
 Release 2.8.0 (2019-09-29)
 --------------------------
@@ -544,4 +546,4 @@ Thanks to @bjmuld @iambenzo @iandanforth @r7vme @brendenpetersen @huvar @abhiskk
 @EliasHasle @mrakgr @Bleyddyn @antoine-galataud @junhyeokahn @AdamGleave @keshaviyengar @tperol
 @XMaster96 @kantneel @Pastafarianist @GerardMaggiolino @PatrickWalter214 @yutingsz @sc420 @Aaahh @billtubbs
 @Miffyli @dwiel @miguelrass @qxcv @jaberkow @eavelardev @ruifeng96150 @pedrohbtp @srivatsankrishnan @evilsocket
-@MarvineGothic @jdossgollin @SyllogismRXS @rusu24edward @jbulow
+@MarvineGothic @jdossgollin @SyllogismRXS @rusu24edward @jbulow @Antymon
diff --git a/docs/misc/projects.rst b/docs/misc/projects.rst
@@ -168,3 +168,13 @@ this study are from stable-baselines.
 | Email: [email protected]
 | Github: https://github.com/harvard-edge/quarl
 | Paper: https://arxiv.org/pdf/1910.01055.pdf
+
+
+PPO_CPP: C++ version of a Deep Reinforcement Learning algorithm PPO
+-------------------------------------------------------------------
+Executes PPO at the C++ level, yielding notable execution speedups.
+Uses Stable Baselines to create a computational graph, which is then used for training with custom environments by a binary compiled to machine code.
+
+| Authors: Szymon Brych
+| Email: [email protected]
+| GitHub: https://github.com/Antymon/ppo_cpp