Vision encoder output dimension does not match #1

Open
yuaoze opened this issue Oct 18, 2024 · 15 comments

Comments

@yuaoze

yuaoze commented Oct 18, 2024

Hi, thanks for your excellent work! I'm trying to run bash eval_calvin.sh.
In FeedbackPolicy/models/policy.py, the vision_x input to vision_encoder has a spatial size of 192 * 192, which does not match the model's expected input size of 224 * 224.
So I interpolated vision_x to 224 * 224, but then the output of vision_encoder has shape 8 * 768, which does not match the dimensions expected by the rearrange operation: vision_x = rearrange(vision_x, "(b T) d h w -> b T (h w) d", b=b, T=T)
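
The mismatch is visible from the shapes alone: the einops pattern names four input axes, while an 8 * 768 tensor has only two. A minimal sketch in plain Python (no einops required; `axes_in_pattern` is a hypothetical helper written only for this illustration) that counts the axes the pattern expects:

```python
import re


def axes_in_pattern(pattern: str) -> int:
    """Count the input axes on the left of '->' in an einops-style pattern.

    A parenthesized group such as "(b T)" corresponds to one tensor axis.
    """
    left = pattern.split("->")[0]
    # Collapse each parenthesized group into a single placeholder token.
    collapsed = re.sub(r"\([^)]*\)", "G", left)
    return len(collapsed.split())


pattern = "(b T) d h w -> b T (h w) d"
encoder_output_shape = (8, 768)  # shape reported in this issue

expected = axes_in_pattern(pattern)
actual = len(encoder_output_shape)
print(f"pattern expects {expected} axes, tensor has {actual}")
```

A two-axis output of this kind would be consistent with the encoder returning one feature vector per frame rather than a spatial feature map, though that is an inference from the shape, not something the error itself states.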

@retsuh-bqw
Collaborator

Thanks for your interest in our work!
We modified the default input size of the VC1-Base model (from 224 to 192) in its config file. A small tweak to that config will let you use our evaluation scripts.

Further updates are welcome if it fails to solve your issue. 😃

@yuaoze
Author

yuaoze commented Oct 21, 2024

Hi, I followed your advice and modified the config file of the VC1-Base model, but an error still occurred. Here are the details.
[image]

@yuaoze
Author

yuaoze commented Oct 21, 2024

I solved this issue by specifying output_size: 192 under "transform" in the config file.
But the output of vision_encoder has shape 8 * 768, which cannot match the dimensions of the rearrange operation: vision_x = rearrange(vision_x, "(b T) d h w -> b T (h w) d", b=b, T=T)
Can you give me some advice?

@retsuh-bqw
Collaborator

My bad. You should also set use_cls to False in the config file. Then the encoder will return all feature tokens.
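
As a rough check of the shapes involved, the sketch below assumes that VC1-Base is a ViT with 16 x 16 patches and a 768-dimensional embedding, and that with use_cls set to False the encoder returns the patch features as a (b*T, d, h, w) map. Neither assumption is confirmed in this thread, the values of b and T are illustrative, and `patch_grid` and `rearranged_shape` are hypothetical helpers:

```python
def patch_grid(img_size: int, patch_size: int = 16) -> int:
    """Number of patches along one side of a square image."""
    if img_size % patch_size != 0:
        raise ValueError("img_size must be a multiple of patch_size")
    return img_size // patch_size


def rearranged_shape(bt: int, d: int, h: int, w: int, b: int, T: int) -> tuple:
    """Output shape of "(b T) d h w -> b T (h w) d" for a given input shape."""
    if bt != b * T:
        raise ValueError("first axis must equal b * T")
    return (b, T, h * w, d)


side = patch_grid(192)  # patches per side at the 192 input size
# b=1 and T=8 are illustrative values chosen so that b * T == 8.
shape = rearranged_shape(8, 768, side, side, b=1, T=8)
print(side * side, shape)
```

Under these assumptions each frame contributes side * side tokens of width 768, which is the layout the rearrange pattern in policy.py expects.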

@hkz103

hkz103 commented Oct 28, 2024

Hello! I ran into the same problem. After I set img_size to 192 and use_cls to False, an error still occurred: AssertionError("Input image height (224) doesn't match model (192)."). Can you give me more advice?

@retsuh-bqw
Collaborator

Is it because of the sanity check in the load_model function (lines 26-29) of VC-1?
You may change the function as follows:

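import logging

# Assumed to match the module-level logger already defined in the VC-1 file.
log = logging.getLogger(__name__)


def load_model(
    model,
    transform,
    metadata=None,
    checkpoint_dict=None,
):
    # Load the checkpoint weights if provided; no sanity check is run afterwards.
    if checkpoint_dict is not None:
        msg = model.load_state_dict(checkpoint_dict)
        log.warning(msg)

    return model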
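
The assertion reported above indicates that the image fed to the model (224) and the size the model expects (192) disagree. A minimal pre-flight sketch, assuming the settings named in this thread (img_size, use_cls, output_size) sit in a nested config with hypothetical "model" and "transform" sections; the real VC-1 config layout may differ, and `check_vision_config` is an illustrative helper, not part of VC-1 or CLOVER:

```python
def check_vision_config(cfg: dict) -> list:
    """Return a list of human-readable problems found in the config."""
    problems = []
    img_size = cfg["model"]["img_size"]
    out_size = cfg["transform"]["output_size"]
    # The transform must produce images of the size the model was built for.
    if img_size != out_size:
        problems.append(
            f"transform output_size ({out_size}) != model img_size ({img_size})"
        )
    # Per this thread, use_cls must be False to get all feature tokens.
    if cfg["model"]["use_cls"]:
        problems.append("use_cls is True, so not all feature tokens are returned")
    return problems


# Hypothetical config values for illustration only.
bad = {"model": {"img_size": 192, "use_cls": True}, "transform": {"output_size": 224}}
good = {"model": {"img_size": 192, "use_cls": False}, "transform": {"output_size": 192}}

print(check_vision_config(bad))
print(check_vision_config(good))
```

Running a check like this before building the encoder would surface both misconfigurations discussed in this issue at once, instead of one error per run.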

@hkz103

hkz103 commented Oct 30, 2024

It works! But I ran into a new problem:
[image: bug]

@retsuh-bqw
Collaborator

It seems to be an issue within CALVIN. Is your CALVIN env properly installed?

@gouyinghong

[image: WeChat image_20241030164904] Hi, I ran `bash eval_calvin.sh`, but got `failed to EGL with glad.` Do you know how to solve this?

@hkz103

hkz103 commented Oct 30, 2024

You are right. I hadn't installed CALVIN properly. However, the packages used by CALVIN and CLOVER seem to conflict. Can you provide a requirements.txt?

@retsuh-bqw
Collaborator

A requirements.txt is provided at visual_planner/requirements.txt.
Which package conflicts are you getting exactly?

@hkz103

hkz103 commented Oct 31, 2024

Now I get the "Cannot load URDF file" error again. The package conflicts are listed below. Can you give me more advice? Thanks for your help!
[image: problem]

@retsuh-bqw
Collaborator

You can try to downgrade your networkx to 2.2. I think the other packages are fine.

@hkz103

hkz103 commented Oct 31, 2024

With networkx 2.2, AttributeError: "module 'numpy' has no attribute 'int'" is reported, because recent versions of numpy no longer provide the int alias and networkx 2.2 may still use it.

@retsuh-bqw
Collaborator

You may try downgrading numpy as well. I'll add the relevant information in a new Troubleshooting section.
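
For context, the `np.int` alias was deprecated in NumPy 1.20 and removed in NumPy 1.24, so any release below 1.24 should still expose it. Which exact version works with the rest of the CALVIN and CLOVER environment is not stated in this thread. A small sketch (`numpy_has_int_alias` is a hypothetical helper) for checking a version string without importing numpy:

```python
def numpy_has_int_alias(version: str) -> bool:
    """True if this NumPy release still ships the `np.int` alias (removed in 1.24)."""
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) < (1, 24)


for v in ("1.23.5", "1.24.0", "2.0.1"):
    print(v, numpy_has_int_alias(v))
```

The installed version can be read with `importlib.metadata.version("numpy")` and passed to this helper before launching the evaluation script.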
