Any help for multiple inputs observation space (Dict image and vector) #467

shaoxiang · 2024-06-07T02:23:06Z

The demos in the Lab all use a single observation space (single_observation_space), but my experiment needs a multi-input observation space that combines images and vectors. I want to understand how to set this up. Can anyone help? I also find the meaning of num_states hard to understand, and I could not find any explanation of it.
The following is the approach I found for Stable Baselines3:
Stable Baselines3 supports handling of multiple inputs by using Dict Gym space. This can be done using
MultiInputPolicy, which by default uses the CombinedExtractor features extractor to turn multiple
inputs into a single vector, handled by the net_arch network.
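
To make the "turn multiple inputs into a single vector" idea concrete, here is a minimal pure-Python sketch of the concept only, not Stable Baselines3's actual implementation. The function names, the `"image"`/`"vector"` keys, and the per-channel mean that stands in for a CNN are all hypothetical choices made for illustration.

```python
from typing import Callable, Dict, List

Vector = List[float]


def flatten(x) -> Vector:
    """Recursively flatten nested lists/tuples of numbers into one flat list."""
    if isinstance(x, (list, tuple)):
        out: Vector = []
        for item in x:
            out.extend(flatten(item))
        return out
    return [float(x)]


def mean_pool_image(image) -> Vector:
    """Stand-in for a CNN (assumption): average each channel of an H x W x C image."""
    pixels = [pixel for row in image for pixel in row]
    channels = len(pixels[0])
    return [sum(p[c] for p in pixels) / len(pixels) for c in range(channels)]


def combine_features(obs: Dict[str, object],
                     extractors: Dict[str, Callable[[object], Vector]]) -> Vector:
    """Run one extractor per observation key and concatenate the results."""
    features: Vector = []
    for key, extractor in extractors.items():
        features.extend(extractor(obs[key]))
    return features


# Hypothetical observation: a 1 x 2 image with 3 channels plus a 2-element vector.
obs = {
    "image": [[[0.0, 1.0, 2.0], [2.0, 1.0, 0.0]]],
    "vector": [0.5, -0.5],
}
extractors = {"image": mean_pool_image, "vector": flatten}

combined = combine_features(obs, extractors)
print(combined)  # one flat vector that a downstream MLP could consume
```

The design point is that each key gets its own extractor, and only the concatenated result is passed on to the shared network.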

FANG-Zhiwei · 2024-12-05T08:14:13Z

Hi shaoxiang, have you found a solution? I also find it hard to use Dict observations.
BTW, as far as I know, num_states/state_space is used for actor-critic policies.

2 replies

Toni-SM Dec 5, 2024
Maintainer

Hi @FANG-Zhiwei and @shaoxiang

These discussions may help:

shaoxiang Jan 22, 2025
Author

Thank you @Toni-SM for your answer; this is exactly what I wanted.
I noticed that skrl version 1.4.0 is about to be released. It is a big update: I will be able to use more reinforcement learning algorithms and more flexible observations. I'll try it now.

StrainFlow · 2025-01-10T19:05:53Z

Maintainer

FANG-Zhiwei is right: num_states is the number of privileged observations used for an actor-critic policy. These are observations used to speed up training that are not available to the final policy, such as measurements that are available in simulation but not in the real world.
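
As a rough illustration of that split, the sketch below builds separate inputs for the actor and the critic. It is not taken from any library: `StepInputs`, `build_inputs`, and the example values are hypothetical, and it assumes the critic's state is the policy observations plus the simulation-only measurements (an environment could equally define the state as a completely separate vector).

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class StepInputs:
    actor_input: List[float]   # what the deployed policy sees
    critic_input: List[float]  # what the value function sees during training


def build_inputs(observations: Sequence[float],
                 sim_only: Sequence[float]) -> StepInputs:
    """Actor gets deployable observations; critic also gets privileged ones."""
    return StepInputs(
        actor_input=list(observations),
        critic_input=list(observations) + list(sim_only),
    )


# Hypothetical values: two sensor readings plus one simulation-only quantity.
inputs = build_inputs(observations=[0.1, 0.2], sim_only=[0.7])

num_observations = len(inputs.actor_input)  # size of the policy's input
num_states = len(inputs.critic_input)       # size of the critic's privileged input
print(num_observations, num_states)
```

Under these assumptions, num_states is larger than the observation count only because the critic is given extra information during training; at deployment only `actor_input` is needed.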

Are you using a managed or direct environment? Can you elaborate on the issue you're having? I'm working on a direct environment right now, and it seems fairly straightforward to bring observations together from different sources.

1 reply

shaoxiang Jan 22, 2025
Author

Hi @StrainFlow, I am a fan of yours. I saw that you have been training JetRacer recently. I am still learning that.
I use a direct environment, and I want to combine the robot's camera image with vector states such as robot positions, speeds, and actions, and train on them together to achieve visual obstacle-avoidance navigation. Since the image and these one-dimensional state quantities have different dimensions, the RL library needs to handle both a CNN and an MLP, which becomes a bit troublesome. One option is to modify the RL code, such as rl_games, but that would break a lot of things, so I hope there is a ready-made solution.
As shown in the video below, I don’t want the robot to crash into other robots. I want to use vision to solve the obstacle avoidance problem.
https://github.com/user-attachments/assets/a2dc970b-b1da-4776-8858-b79dd779bf54
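
For the image-plus-vector setup described above, one piece of bookkeeping is the size of the tensor that results when CNN features are concatenated with the vector states. The sketch below works it out with the standard convolution output-size formula. The 84 x 84 image, the three-layer convolution stack, the 10-element state vector, and the choice to concatenate the flattened convolution output directly are all assumptions for illustration, not values from this thread.

```python
from typing import List, Tuple


def conv_out(size: int, kernel: int, stride: int, padding: int = 0) -> int:
    """Output length of one convolution along a single spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def cnn_flat_size(height: int, width: int,
                  layers: List[Tuple[int, int, int]]) -> int:
    """Flattened feature count after a stack of (out_channels, kernel, stride) layers."""
    channels = 0
    for out_channels, kernel, stride in layers:
        height = conv_out(height, kernel, stride)
        width = conv_out(width, kernel, stride)
        channels = out_channels
    return channels * height * width


# Assumed convolution stack: (out_channels, kernel, stride) per layer.
LAYERS = [(32, 8, 4), (64, 4, 2), (64, 3, 1)]

image_features = cnn_flat_size(84, 84, LAYERS)  # CNN branch output size
vector_features = 10                            # hypothetical number of vector states
head_input_size = image_features + vector_features

print(image_features, head_input_size)
```

The MLP that follows the concatenation would take `head_input_size` inputs. In practice a linear layer is often placed after the CNN first to shrink the image features before concatenation.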
