Controller difference for different envs #11

Open
HuFY-dev opened this issue Jul 6, 2024 · 9 comments

Comments

@HuFY-dev

HuFY-dev commented Jul 6, 2024

Hi, thanks for the great work! I'm playing around with your environments and found that both RT1 and Octo seem to be capable of only the google_robot tasks, not the widowx tasks.

Further, I noticed that the google_robot environments use the arm_pd_ee_delta_pose + gripper_pd_joint_target_delta_pos controllers, while the widowx environments use the arm_pd_ee_target_delta_pose + gripper_pd_joint_pos controllers, so both the arm and the gripper controllers differ. However, your model wrappers seem to treat world_vector and rot_axangle the same way regardless of the controller. I wonder if that's causing the models to fail the widowx tasks.

FYI, I got the controllers using env.unwrapped.control_mode. More details on controllers can be found here

@xuanlinli17
Collaborator

xuanlinli17 commented Jul 7, 2024

Hi,

Octo models do not fail on the widowx tasks. The controllers we use are here: https://github.com/simpler-env/SimplerEnv/blob/main/simpler_env/utils/env/env_builder.py .

Note that "delta_pose_align" is different from "delta_pose". You can see https://github.com/simpler-env/ManiSkill2_real2sim/blob/cd45dd27dc6bb26d048cb6570cdab4e3f935cc37/mani_skill2_real2sim/agents/controllers/pd_ee_pose.py#L202 for "align" vs "non-align". In simple terms, "delta_pose_align" decouples translation and rotation, instead of directly multiplying 2 SE(3) matrices.
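For readers who want a concrete picture of the "align" vs "non-align" distinction, here is a minimal sketch in plain Python. It is only one reading of "decouples translation and rotation", not the ManiSkill2_real2sim code: the helper names (`rot_z`, `compose_se3`, `compose_decoupled`), the choice of left-multiplying in the base frame, and the example numbers are all assumptions made for illustration. The linked pd_ee_pose.py is the authoritative reference.

```python
import math

def rot_z(theta):
    """3x3 rotation matrix about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(a, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]

def compose_se3(delta, prev):
    """Coupled update: multiply two SE(3) transforms (delta * prev).

    The delta rotation also rotates the previous position, so a pure
    rotation command moves the end effector through space.
    """
    (R_d, p_d), (R_p, p_p) = delta, prev
    rotated = matvec(R_d, p_p)
    return matmul(R_d, R_p), [a + b for a, b in zip(rotated, p_d)]

def compose_decoupled(delta, prev):
    """Decoupled update (assumed reading of "align"): the rotation is
    composed as before, but the translation is simply added to the
    previous position.
    """
    (R_d, p_d), (R_p, p_p) = delta, prev
    return matmul(R_d, R_p), [a + b for a, b in zip(p_p, p_d)]

# Hypothetical example: end effector 0.5 m in front of the base,
# commanded to rotate 90 degrees about z and move up by 1 cm.
prev = (rot_z(0.0), [0.5, 0.0, 0.2])
delta = (rot_z(math.pi / 2), [0.0, 0.0, 0.01])

R_coupled, p_coupled = compose_se3(delta, prev)
R_decoupled, p_decoupled = compose_decoupled(delta, prev)
```

In this example both updates end with the same orientation, but the coupled one swings the position to roughly (0, 0.5, 0.21), while the decoupled one keeps it at (0.5, 0, 0.21). A model wrapper that assumes one convention while the controller uses the other would therefore see translation errors whenever the rotation command is non-zero.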

@HuFY-dev
Author

HuFY-dev commented Jul 7, 2024

Thanks, let me look into that!

@HuFY-dev
Author

HuFY-dev commented Jul 7, 2024

Also, why is there no difference in how the model wrappers handle the different arm controllers? It seems that only the gripper actions are handled differently.

As for the gripper actions, can you explain why you set self.sticky_gripper_num_repeat = 15 for google and 1 for bridge? It seems to come from nowhere.

@xuanlinli17
Collaborator

xuanlinli17 commented Jul 7, 2024

The sticky_gripper_num_repeat was set only for Octo models to match the real eval. Our implementations of RT-* and Octo match the real eval setup.
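To make the role of sticky_gripper_num_repeat concrete, here is a rough sketch of what a sticky gripper action could look like. The values 15 and 1 come from this thread; everything else (the class name `StickyGripper`, the 1e-3 threshold, and the exact latch-and-release rule) is an assumption for illustration and is not copied from the SimplerEnv wrappers or from the real eval code.

```python
class StickyGripper:
    """Hold a non-zero gripper command for `num_repeat` consecutive steps.

    Hypothetical helper: once a command above `threshold` arrives, it is
    latched and re-emitted until it has been sent `num_repeat` times.
    """

    def __init__(self, num_repeat, threshold=1e-3):
        self.num_repeat = num_repeat
        self.threshold = threshold
        self.reset()

    def reset(self):
        self.is_on = False
        self.repeat_count = 0
        self.held_action = 0.0

    def step(self, gripper_action):
        # Latch a new command only when nothing is currently being held.
        if not self.is_on and abs(gripper_action) > self.threshold:
            self.is_on = True
            self.held_action = gripper_action

        if self.is_on:
            # While latched, ignore the incoming value and repeat the held one.
            self.repeat_count += 1
            gripper_action = self.held_action

        # Release the latch after the configured number of repeats.
        if self.repeat_count == self.num_repeat:
            self.reset()

        return gripper_action


# A single "close" command at step 1, followed by zeros.
commands = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]

sticky = StickyGripper(num_repeat=3)
held = [sticky.step(c) for c in commands]

passthrough = StickyGripper(num_repeat=1)
unheld = [passthrough.step(c) for c in commands]
```

With num_repeat=3 the single command is stretched over three steps; with num_repeat=1 the commands pass through unchanged, which under this sketch is what the bridge setting of 1 would amount to.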

OpenVLA likely uses a different real eval setup from RT-* and Octo, so we need to contact the authors to verify whether (and how) they implement sticky actions in real eval. Other things, such as action ensembling and obs len / action len, need to be verified too.

Could you share some videos of OpenVLA failing on Bridge? Is the arm reaching the object? If it's not reaching, most likely there are implementation issues.

@HuFY-dev
Author

HuFY-dev commented Jul 7, 2024

One failure case of OpenVLA:

output

Here are some explanations from the OpenVLA authors, but I don't think they are clear enough.

Can you share where you found the sticky_gripper_num_repeat in the original setting from the authors of Octo?

@xuanlinli17
Collaborator

xuanlinli17 commented Jul 7, 2024

This is almost surely an implementation issue. It looks like the rotation ordering might be wrong for OpenVLA on the Bridge envs: OpenVLA might output ypr instead of rpy for Bridge. That is, on Bridge evaluations, while Octo uses roll, pitch, yaw = action_rotation_delta, for OpenVLA it might be yaw, pitch, roll = action_rotation_delta, or something like pitch, roll, yaw = action_rotation_delta (you can try different combinations).
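One way to try the combinations systematically is sketched below in plain Python. This is an illustration under assumptions, not the wrapper code: the function names are hypothetical, and the Euler convention used here (R = Rz(yaw) Ry(pitch) Rx(roll), i.e. fixed-axis x, y, z) is assumed and should be checked against whatever conversion the wrapper actually calls.

```python
import math
from itertools import permutations

def rpy_to_axangle(roll, pitch, yaw):
    """Rotation vector (axis * angle) for R = Rz(yaw) @ Ry(pitch) @ Rx(roll).

    Intended for small per-step deltas; not robust near an angle of pi.
    """
    cr, sr = math.cos(roll), math.sin(roll)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    R = [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]
    trace = R[0][0] + R[1][1] + R[2][2]
    angle = math.acos(max(-1.0, min(1.0, (trace - 1.0) / 2.0)))
    if angle < 1e-8:
        return [0.0, 0.0, 0.0]
    k = angle / (2.0 * math.sin(angle))
    return [
        k * (R[2][1] - R[1][2]),
        k * (R[0][2] - R[2][0]),
        k * (R[1][0] - R[0][1]),
    ]

def candidate_orderings(action_rotation_delta):
    """Axis-angle result for each way of reading the 3 outputs as (roll, pitch, yaw).

    Key (i, j, k) means: roll = delta[i], pitch = delta[j], yaw = delta[k].
    """
    results = {}
    for perm in permutations(range(3)):
        roll, pitch, yaw = (action_rotation_delta[idx] for idx in perm)
        results[perm] = rpy_to_axangle(roll, pitch, yaw)
    return results

# Hypothetical model output: a small rotation in the first component only.
candidates = candidate_orderings([0.1, 0.0, 0.0])
as_rpy = candidates[(0, 1, 2)]  # first output read as roll
as_ypr = candidates[(2, 1, 0)]  # first output read as yaw
```

Read as rpy, the example rotates about x; read as ypr, the same numbers rotate about z. Permuting the ordering only changes the rotation part of the action, so it cannot by itself correct an error in the translation.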

sticky_gripper_num_repeat is not in the public Octo repo. I communicated with the authors directly and referenced their real eval implementations.

@HuFY-dev
Author

HuFY-dev commented Jul 7, 2024

I tried all the orderings and none of them worked. The task is to pick up the spoon, and the gripper should move forward to do that. However, every time it goes straight down without moving forward, which is very weird.

@Toradus

Toradus commented Jul 10, 2024

Have you tested with other tasks?
I'm currently training my own custom model and trying to evaluate it on SimplerEnv. This works very nicely for Fractal, but the model struggles a lot when trained on the Octo-provided Bridge_v2 and evaluated on the spoon task.
I also see the robot driving into the table on the spoon task, but on the carrot task it is at least aiming at the right target. Note that this could easily be a model problem on my end, but it's interesting that the OpenVLA eval looks about the same as mine. I copy-pasted the OctoInference provided by Simpler and adjusted it to work with my model. Did OpenVLA do the same?

Spoon.mp4
karotte.mp4

@HuFY-dev
Author

HuFY-dev commented Jul 10, 2024

Yes, my model wrapper was also modified from the Octo code. However, from my understanding, the main part of that code converts the rotation from rpy to axis-angle and performs the sticky gripper logic, while what seems to be going wrong is the translation. Here are some carrot examples:

output (1)

output (2)
