Generalization of the observations for GPI-LS #81

AdrienBolling · 2023-12-07T16:48:29Z

Here is my modification of the library to accept all types of Observations
More things are left up to the person using the library, especially through the implementation of the generic Observation class (that could admitedly be moved to MO-Gym alongside the associated wrapper)
This has been tester with the Minecart file to see if it was still compatible and it seems to be as it ran without issues.

The principle is just that all observations are just considered of type Observation and are handled as array of objects to keep as much of the previous code as possible. The conversion to tensor is left to the discretion of the user, and made in the forward function of the q_net

This PR also allows people to use custom q_net with their observations (Observations that are not a wrapper of a np vector will require a custom q_net by default)

The basic Observation class I made should be pretty compatible with a good number of observations, although not optimally.

Please don't take into account the changes made to enveloppe, these were old tweaks and mluch less transferable.

I think this implementation could be deployed in the entire library without too much hassle

Please excuse me for the mess of a PR this is, it is the first contribution I'm doing of the sort of my life

I have also not really touched the ProbabilisticEnsemble part. I think a reasonnable option would be to leave the choice for the user to provide a custom model for the env simulation if he wishes to use a "complex" observation.

(This modification mainly stems from my need to use your library with DGL Heterographs observations)

Modification of the pyproject to allow installation of both the original and modified packages

…al and modified packages

Buffers have also been modified

Modification of the pyproject to allow installation of both the original and modified packages

…al and modified packages

Buffers have also been modified

# Conflicts: # morl_baselines/multi_policy/envelope/envelope.py

Buffers have also been modified Creation of a wrapper for the observations as well as a Generic type to encapsulate them

AdrienBolling · 2023-12-07T16:51:20Z

I just noticed that I required a merge to your main branch. I'm not experienced with git except from a personnal use stadnpoint, so I don't know if you'll be able to redirect it into another branch or if I need to redo it towards another branch myself ?

LucasAlegre · 2023-12-19T18:37:27Z

Hi @AdrienBolling, I am not sure I understand the motivation for this PR. GPI-LS already supports vector and image observations. In case another type of observation would be used, this could be done by a wrapper converting to vector, right?

AdrienBolling · 2023-12-19T21:30:51Z

In my case : I am working with graph based observations
(More precisely heterogeneous dynamic graphs)

So I have several issues with just using a flattened array :

my observations don't have a fixed size (dynamic), which is handled by DGL but not your implementation of a replay buffer
It is a tedious work in terms of number of operations to convert everything to a flattened array then convert it back to a graph, and that is not only the issue with graphs, but even for image based
Which leads me to the next point, the networks that can be useful or interesting are not all just fully connected MLP, and with just a flatten wrapper, you are forced to unflatten in your forward function, for no good reason actually since the "array" structure of the flattened observation is not used, it is just always treated as an object
This in turn leads to a very high number of technically unnecessary conversion operations, which scales a lot
To use a more simple and less specific case than graphs, in a previous approach, I used a tuple of numpy matrixes as my observation, these matrixes being of different shapes (which doesn't seem so unlikely for an environment), flattening then unflattening in the forward function of my q_net, to treat each matrix with a separate sub-network was a costly task which slowed down things a lot

Buffers have also been modified Creation of a wrapper for the observations as well as a Generic type to encapsulate them

ffelten · 2024-02-07T12:07:37Z

Hi Adrien 🙂,

I will close this since we do not want to overcomplexify the codebase for now. We believe there are more advantages in staying close to SB3/cleanRL (which only support Gymnasium's spaces) while MORL is still not so well known.

In case you need any more help on this, you can still message us on Discord :-).

Cheers,

AdrienBolling added 11 commits October 11, 2023 11:24

Alterations to the originl package to accomodate the MoDMSE environment.

f97743b

Modification of the pyproject to allow installation of both the original and modified packages

Modification of the setup.py to allow installation of both the origin…

f414eae

…al and modified packages

Returned Envelope to its default implementation

90771dc

Modifications to allow any type of observation to be used with GPI-LS

1490ed8

Buffers have also been modified

Merge branch 'LucasAlegre:main' into graphs

e9c977b

Alterations to the originl package to accomodate the MoDMSE environment.

f0faa75

Modification of the pyproject to allow installation of both the original and modified packages

Modification of the setup.py to allow installation of both the origin…

0c96e2f

…al and modified packages

Returned Envelope to its default implementation

caad87a

Modifications to allow any type of observation to be used with GPI-LS

f39f2ab

Buffers have also been modified

Merge remote-tracking branch 'origin/graphs' into graphs

f8fc8cb

# Conflicts: # morl_baselines/multi_policy/envelope/envelope.py

Modifications to allow any type of observation to be used with GPI-LS

afe4fa9

Buffers have also been modified Creation of a wrapper for the observations as well as a Generic type to encapsulate them

ffelten requested review from LucasAlegre and ffelten December 11, 2023 00:32

Modifications to allow any type of observation to be used with GPI-LS

bf5a622

Buffers have also been modified Creation of a wrapper for the observations as well as a Generic type to encapsulate them

ffelten closed this Feb 7, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Generalization of the observations for GPI-LS #81

Generalization of the observations for GPI-LS #81

AdrienBolling commented Dec 7, 2023

AdrienBolling commented Dec 7, 2023

LucasAlegre commented Dec 19, 2023

AdrienBolling commented Dec 19, 2023

ffelten commented Feb 7, 2024

Generalization of the observations for GPI-LS #81

Generalization of the observations for GPI-LS #81

Conversation

AdrienBolling commented Dec 7, 2023

AdrienBolling commented Dec 7, 2023

LucasAlegre commented Dec 19, 2023

AdrienBolling commented Dec 19, 2023

ffelten commented Feb 7, 2024