Add documentation on exporting models (openai#475)
* Add documentation on exporting models

* Update changelog

* Update doc (and maintainers list)

* Rework exporting doc
Miffyli authored and araffin committed Sep 16, 2019
1 parent 153ce70 commit 19ed2ca
Showing 5 changed files with 108 additions and 51 deletions.
4 changes: 2 additions & 2 deletions README.md
@@ -200,7 +200,7 @@ To cite this repository in publications:

```
@misc{stable-baselines,
author = {Hill, Ashley and Raffin, Antonin and Ernestus, Maximilian and Gleave, Adam and Traore, Rene and Dhariwal, Prafulla and Hesse, Christopher and Klimov, Oleg and Nichol, Alex and Plappert, Matthias and Radford, Alec and Schulman, John and Sidor, Szymon and Wu, Yuhuai},
author = {Hill, Ashley and Raffin, Antonin and Ernestus, Maximilian and Gleave, Adam and Kanervisto, Anssi and Traore, Rene and Dhariwal, Prafulla and Hesse, Christopher and Klimov, Oleg and Nichol, Alex and Plappert, Matthias and Radford, Alec and Schulman, John and Sidor, Szymon and Wu, Yuhuai},
title = {Stable Baselines},
year = {2018},
publisher = {GitHub},
@@ -211,7 +211,7 @@ To cite this repository in publications:

## Maintainers

Stable-Baselines is currently maintained by [Ashley Hill](https://github.com/hill-a) (aka @hill-a), [Antonin Raffin](https://araffin.github.io/) (aka [@araffin](https://github.com/araffin)), [Maximilian Ernestus](https://github.com/erniejunior) (aka @erniejunior) and [Adam Gleave](https://github.com/adamgleave) (@AdamGleave).
Stable-Baselines is currently maintained by [Ashley Hill](https://github.com/hill-a) (aka @hill-a), [Antonin Raffin](https://araffin.github.io/) (aka [@araffin](https://github.com/araffin)), [Maximilian Ernestus](https://github.com/erniejunior) (aka @erniejunior), [Adam Gleave](https://github.com/adamgleave) (@AdamGleave) and [Anssi Kanervisto](https://github.com/Miffyli) (@Miffyli).

**Important Note: We do not do technical support, nor consulting** and don't answer personal questions per email.

76 changes: 76 additions & 0 deletions docs/guide/export.rst
@@ -0,0 +1,76 @@
.. _export:


Exporting models
================

After training an agent, you may want to deploy or use it in another language
or framework, like PyTorch or `tensorflowjs <https://github.com/tensorflow/tfjs>`_.
Stable Baselines does not include tools to export models to other frameworks, but
this document aims to cover the parts that are required for exporting, along with
more detailed stories from users of Stable Baselines.


Background
----------

In Stable Baselines, the controller is stored inside :ref:`policies <policies>`, which convert
observations into actions. Each learning algorithm (e.g. DQN, A2C, SAC) contains
one or more policies, some of which are only used for training. An easy way to find
the policy used for inference is to check the code of the agent's ``predict`` function:
it should only call one policy with simple arguments.

Policies hold the necessary Tensorflow placeholders and tensors to do the
inference (i.e. predict actions), so it is enough to export these policies
to do inference in another framework.

.. note::
  Learning algorithms may also contain other Tensorflow placeholders that are used for training only and are
  not required for inference.


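.. warning::
  When using CNN policies, the observation is normalized internally (dividing by 255 to have values in [0, 1]).
  If you rebuild the network manually in another framework, you need to apply the same scaling yourself.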


Export to PyTorch
-----------------

A known working solution is to use the :func:`get_parameters <stable_baselines.common.base_class.BaseRLModel.get_parameters>`
function to obtain the model parameters, construct the network manually in PyTorch, and assign the parameters correctly.

.. warning::
  PyTorch and Tensorflow have internal differences with e.g. 2D convolutions (see discussion linked below).


See `discussion #372 <https://github.com/hill-a/stable-baselines/issues/372>`_ for details.
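One difference that usually needs handling when assigning parameters is the weight layout. The sketch below is not taken from the linked discussion. It assumes the common conventions that Tensorflow stores dense kernels as ``(in, out)`` and 2D convolution filters as ``(height, width, in, out)``, while PyTorch expects ``(out, in)`` and ``(out, in, height, width)``. The helper names and the example shapes are made up for illustration.

```python
import numpy as np


def dense_kernel_to_torch(kernel):
    """Convert a dense kernel from (in, out) to the (out, in) layout."""
    return np.ascontiguousarray(kernel.T)


def conv_kernel_to_torch(kernel):
    """Convert a conv2d filter from (H, W, in, out) to (out, in, H, W)."""
    return np.ascontiguousarray(np.transpose(kernel, (3, 2, 0, 1)))


# Stand-ins for arrays returned by get_parameters(); shapes are illustrative.
dense = np.arange(6, dtype=np.float32).reshape(2, 3)
conv = np.zeros((8, 8, 4, 32), dtype=np.float32)

print(dense_kernel_to_torch(dense).shape)  # (3, 2)
print(conv_kernel_to_torch(conv).shape)    # (32, 4, 8, 8)
```

The converted arrays could then be wrapped with ``torch.from_numpy`` and copied into the matching layers. Layout is only one of the differences: padding behaviour and the order in which convolution outputs are flattened may also differ, so it is worth comparing the outputs of both networks on the same observation.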


Export to tensorflowjs / tfjs
-----------------------------

This can be done via Tensorflow's `simple_save <https://www.tensorflow.org/api_docs/python/tf/saved_model/simple_save>`_ function
and `tensorflowjs_converter <https://www.tensorflow.org/js/tutorials/conversion/import_saved_model>`_.

See `discussion #474 <https://github.com/hill-a/stable-baselines/issues/474>`_ for details.


Export to Java
---------------

This can be done via Tensorflow's `simple_save <https://www.tensorflow.org/api_docs/python/tf/saved_model/simple_save>`_ function.

See `this discussion <https://github.com/hill-a/stable-baselines/issues/329>`_ for details.


Manual export
-------------

You can also manually export the required parameters (weights) and construct the
network in your desired framework, as done with the PyTorch example above.

You can access the parameters of the model via the agent's
:func:`get_parameters <stable_baselines.common.base_class.BaseRLModel.get_parameters>`
function. If you use default policies, you can find the architecture of the networks in
the source for :ref:`policies <policies>`. For DQN, SAC, DDPG or TD3, you need to check the ``policies.py`` file located
in their respective folders.
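As a minimal sketch of a manual export, the parameters can be written to a language-neutral format such as JSON. The example assumes that ``get_parameters`` returns an ordered mapping from variable names to NumPy arrays. To stay self-contained it uses a stand-in dictionary with made-up names and values instead of a trained model, and ``parameters_to_json`` is a hypothetical helper, not part of Stable Baselines.

```python
import json
from collections import OrderedDict


def parameters_to_json(params):
    """Serialize a name -> array mapping to a JSON string, keeping the order."""
    serializable = OrderedDict()
    for name, value in params.items():
        # NumPy arrays expose tolist(); plain nested lists pass through as-is.
        serializable[name] = value.tolist() if hasattr(value, "tolist") else value
    return json.dumps(serializable)


# Stand-in for model.get_parameters(); names and values are made up.
params = OrderedDict([
    ("model/pi_fc0/w:0", [[0.1, 0.2], [0.3, 0.4]]),
    ("model/pi_fc0/b:0", [0.0, 0.0]),
])

exported = parameters_to_json(params)
restored = json.loads(exported, object_pairs_hook=OrderedDict)
print(list(restored.keys()))
```

JSON is convenient for small networks because most languages can read it. For large networks a binary format is likely a better fit.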
28 changes: 14 additions & 14 deletions docs/guide/save_format.rst
@@ -4,24 +4,24 @@
On saving and loading
=====================

Stable baselines stores both neural network parameters and algorithm-related parameters such as
exploration schedule, number of environments and observation/action space. This allows continual learning and easy
Stable baselines stores both neural network parameters and algorithm-related parameters such as
exploration schedule, number of environments and observation/action space. This allows continual learning and easy
use of trained agents without training, but it is not without its issues. The following describes the two formats
used to save agents in stable baselines, along with their pros and shortcomings.

Terminology used in this page:

- *parameters* refer to neural network parameters (also called "weights"). This is a dictionary
mapping Tensorflow variable name to a NumPy array.
- *data* refers to RL algorithm parameters, e.g. learning rate, exploration schedule, action/observation space.
- *parameters* refer to neural network parameters (also called "weights"). This is a dictionary
mapping Tensorflow variable name to a NumPy array.
- *data* refers to RL algorithm parameters, e.g. learning rate, exploration schedule, action/observation space.
These depend on the algorithm used. This is a dictionary mapping class variable names to their values.


Cloudpickle (stable-baselines<=2.7.0)
-------------------------------------

Original stable baselines save format. Data and parameters are bundled up into a tuple ``(data, parameters)``
and then serialized with ``cloudpickle`` library (essentially the same as ``pickle``).
and then serialized with ``cloudpickle`` library (essentially the same as ``pickle``).

This save format is still available via an argument in model save function in stable-baselines versions above
v2.7.0 for backwards compatibility reasons, but its usage is discouraged.
@@ -32,31 +32,31 @@ Pros:
- Works with almost any type of Python object, including functions.


Cons:
Cons:

- Pickle/Cloudpickle is not designed for long-term storage or sharing between Python versions.
- If one object in file is not readable (e.g. wrong library version), then reading the rest of the
file is difficult.
- Python-specific format, hard to read stored files from other languages.


If part of a saved model becomes unreadable for any reason (e.g. different Tensorflow versions), then
If part of a saved model becomes unreadable for any reason (e.g. different Tensorflow versions), then
it may be tricky to restore any of the model. For this reason another save format was designed.


Zip-archive (stable-baselines>2.7.0)
-------------------------------------

A zip-archived JSON dump and NumPy zip archive of the arrays. The data dictionary (class parameters)
A zip-archived JSON dump and NumPy zip archive of the arrays. The data dictionary (class parameters)
is stored as a JSON file, model parameters are serialized with ``numpy.savez`` function and these two files
are stored under a single .zip archive.
are stored under a single .zip archive.

Any objects that are not JSON serializable are serialized with cloudpickle and stored as base64-encoded
string in the JSON file, along with some information that was stored in the serialization. This allows
inspecting stored objects without deserializing the object itself.

This format allows skipping elements in the file, i.e. we can skip deserializing objects that are
broken/non-serializable. This can be done via ``custom_objects`` argument to load functions.
broken/non-serializable. This can be done via ``custom_objects`` argument to load functions.

This is the default save format in stable baselines versions after v2.7.0.

@@ -69,7 +69,7 @@ File structure:
├── parameter_list JSON file of model parameters and their ordering (list)
├── parameters Bytes from numpy.savez (a zip file of the numpy arrays). ...
├── ... Being a zip-archive itself, this object can also be opened ...
├── ... as a zip-archive and browsed.
├── ... as a zip-archive and browsed.
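Because the archive is a regular zip file, its JSON parts can be inspected with the standard library alone. The sketch below builds a small stand-in archive in memory rather than loading a real saved model. It assumes that the class parameters are stored under an entry named ``data`` next to ``parameter_list``. The function name and the stored values are made up for illustration.

```python
import io
import json
import zipfile


def inspect_saved_model(file_obj):
    """Return (data, parameter_names) read from a zip-archived model."""
    with zipfile.ZipFile(file_obj) as archive:
        # Assumption: class parameters live in an entry called "data".
        data = json.loads(archive.read("data").decode("utf-8"))
        parameter_names = json.loads(archive.read("parameter_list").decode("utf-8"))
    return data, parameter_names


# Build a tiny stand-in archive in memory; contents are illustrative only.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("data", json.dumps({"gamma": 0.99, "n_envs": 1}))
    archive.writestr("parameter_list", json.dumps(["model/pi/w:0", "model/pi/b:0"]))
buffer.seek(0)

data, parameter_names = inspect_saved_model(buffer)
print(data["gamma"], parameter_names)
```

With a real model, the path of the saved ``.zip`` file could be passed instead of the in-memory buffer. The ``parameters`` entry is itself an archive written by ``numpy.savez``, so it would typically be read with NumPy.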


Pros:
@@ -80,7 +80,7 @@ Pros:
languages.


Cons:
Cons:

- More complex implementation.
- Still relies partly on cloudpickle for complex objects (e.g. custom functions).
- Still relies partly on cloudpickle for complex objects (e.g. custom functions).
3 changes: 2 additions & 1 deletion docs/index.rst
@@ -50,6 +50,7 @@ This toolset is a fork of OpenAI Baselines, with a major structural refactoring,
guide/pretrain
guide/checking_nan
guide/save_format
guide/export


.. toctree::
@@ -96,7 +97,7 @@ To cite this project in publications:
.. code-block:: bibtex
@misc{stable-baselines,
author = {Hill, Ashley and Raffin, Antonin and Ernestus, Maximilian and Gleave, Adam and Traore, Rene and Dhariwal, Prafulla and Hesse, Christopher and Klimov, Oleg and Nichol, Alex and Plappert, Matthias and Radford, Alec and Schulman, John and Sidor, Szymon and Wu, Yuhuai},
author = {Hill, Ashley and Raffin, Antonin and Ernestus, Maximilian and Gleave, Adam and Kanervisto, Anssi and Traore, Rene and Dhariwal, Prafulla and Hesse, Christopher and Klimov, Oleg and Nichol, Alex and Plappert, Matthias and Radford, Alec and Schulman, John and Sidor, Szymon and Wu, Yuhuai},
title = {Stable Baselines},
year = {2018},
publisher = {GitHub},
48 changes: 14 additions & 34 deletions docs/misc/changelog.rst
@@ -15,27 +15,31 @@ Breaking Changes:
extra. When `mpi4py` is not available, stable-baselines skips imports of
OpenMPI-dependent algorithms.
See :ref:`installation notes <openmpi>` and
`Issue #430 <https://github.com/hill-a/stable-baselines/issues/430>`.
`Issue #430 <https://github.com/hill-a/stable-baselines/issues/430>`_.
- SubprocVecEnv now defaults to a thread-safe start method, `forkserver` when
available and otherwise `spawn`. This may require application code be
wrapped in `if __name__ == '__main__'`. You can restore previous behavior
by explicitly setting `start_method = 'fork'`. See
`PR #428 <https://github.com/hill-a/stable-baselines/pull/428>`_.
- updated dependencies: tensorflow v1.8.0 is now required

New Features:
^^^^^^^^^^^^^
- **important change** Switch to using zip-archived JSON and Numpy `savez` for
storing models for better support across library/Python versions. (@Miffyli)

Bug Fixes:
^^^^^^^^^^
- Skip automatic imports of OpenMPI-dependent algorithms to avoid an issue
where OpenMPI would cause stable-baselines to hang on Ubuntu installs.
See :ref:`installation notes <openmpi>` and
`Issue #430 <https://github.com/hill-a/stable-baselines/issues/430>`.
`Issue #430 <https://github.com/hill-a/stable-baselines/issues/430>`_.
- Fix a bug when calling `logger.configure()` with MPI enabled (@keshaviyengar)
- set `allow_pickle=True` for numpy>=1.17.0 when loading expert dataset

Deprecations:
^^^^^^^^^^^^^
- Models saved with cloudpickle format (stable-baselines<=2.7.0) are now
- Models saved with cloudpickle format (stable-baselines<=2.7.0) are now
deprecated in favor of zip-archive format for better support across
Python/Tensorflow versions. (@Miffyli)

@@ -46,42 +50,15 @@ Others:
to `stable_baselines.common.noise`. The API remains backward-compatible;
for example `from stable_baselines.ddpg.noise import NormalActionNoise` is still
okay. (@shwang)
- **important change** Switch to using zip-archived JSON and Numpy `savez` for
storing models for better support across library/Python verions. (@Miffyli)
- docker images were updated

Documentation:
^^^^^^^^^^^^^^
- Add WaveRL project (@jaberkow)
- Add Fenics-DRL project (@DonsetPG)
- Fix and rename custom policy names (@eavelardev)



Pre-Release 2.7.1a0 (WIP)
--------------------------


Breaking Changes:
^^^^^^^^^^^^^^^^^
- updated dependencies: tensorflow v1.8.0 is now required

New Features:
^^^^^^^^^^^^^

Bug Fixes:
^^^^^^^^^^
- set `allow_pickle=True` for numpy>=1.17.0 when loading expert dataset

Deprecations:
^^^^^^^^^^^^^

Others:
^^^^^^^
- docker images were updated

Documentation:
^^^^^^^^^^^^^^

- Add documentation on exporting models.
- Update maintainers list (Welcome to @Miffyli)


Release 2.7.0 (2019-07-31)
@@ -476,14 +453,17 @@ Maintainers
-----------

Stable-Baselines is currently maintained by `Ashley Hill`_ (aka @hill-a), `Antonin Raffin`_ (aka `@araffin`_),
`Maximilian Ernestus`_ (aka @erniejunior) and `Adam Gleave`_ (`@AdamGleave`_).
`Maximilian Ernestus`_ (aka @erniejunior), `Adam Gleave`_ (`@AdamGleave`_) and `Anssi Kanervisto`_ (aka `@Miffyli`_).

.. _Ashley Hill: https://github.com/hill-a
.. _Antonin Raffin: https://araffin.github.io/
.. _Maximilian Ernestus: https://github.com/erniejunior
.. _Adam Gleave: https://gleave.me/
.. _@araffin: https://github.com/araffin
.. _@AdamGleave: https://github.com/adamgleave
.. _Anssi Kanervisto: https://github.com/Miffyli
.. _@Miffyli: https://github.com/Miffyli


Contributors (since v2.0.0):
----------------------------
