Unexpected label during prediction #66

Open
yuerout opened this issue Oct 31, 2024 · 5 comments

yuerout commented Oct 31, 2024

Hi! I'm running grounded_sam2_hf_model_demo.py on one of my images with the labels woman. swab. balcony. room. bucket. sky. However, the results that the processor returns look like this: 'labels': ['bucket', 'room', 'woman', 'womanab']. I am not sure where the womanab label comes from.
Here is the image that I used: [image: 1172]
Here is the image after post-processing: [image: groundingdino_annotated_image]
Does anyone have any insights on why this might be happening? Thanks!

@rentainhe
Collaborator

Hello @yuerout, there is a similar discussion in #50. You can refer to that issue for more details.

Author

yuerout commented Oct 31, 2024

Hi @rentainhe, thanks for the reply! I looked at the file, but it seems like the input there is a text prompt ("There is a cat and a dog in the image ."). My input is already parsed into individual labels. Do you know how I can adapt that code for my use? Thank you!

Author

yuerout commented Oct 31, 2024

I also noticed that for all the labels with an underscore, additional spaces were added into the output label after prediction. For example, in the demo grounded_sam2_hf_model_demo.py, if I change the label of the car to white_car, the predicted label in grounded_sam2_hf_model_demo_results.json becomes white _ car.
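
One plausible way to get white _ car is sketched below. It is a toy illustration, not the actual tokenizer used by the demo: basic_tokenize is a hypothetical helper, and the assumption is that the text tokenizer treats the underscore as a standalone punctuation token and that the label is later rebuilt by joining tokens with spaces.

```python
import re


def basic_tokenize(text):
    """Hypothetical stand-in for a tokenizer's punctuation splitting.

    Runs of letters/digits become one token; every other non-space
    character (including "_") becomes its own token.
    """
    return re.findall(r"[A-Za-z0-9]+|[^A-Za-z0-9\s]", text)


tokens = basic_tokenize("white_car")
print(tokens)

# Rebuilding the label by joining tokens with spaces does not restore
# the original string, because the underscore is now a separate token.
rebuilt = " ".join(tokens)
print(rebuilt)

# Possible workaround (an assumption, not a documented feature): prompt
# with spaces instead of underscores and map the label back afterwards.
prompt_label = "white_car".replace("_", " ")
restored = prompt_label.replace(" ", "_")
```

If that assumption holds, using white car in the prompt and converting back to white_car after prediction would avoid the extra spaces.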

@rentainhe
Collaborator

Hi @yuerout

About where womanab comes from: Grounding DINO first computes the similarity between each box region and each text. If the maximum score is higher than the pre-defined box_threshold, the box is kept in the final output list, and the box's label is the combination of all texts whose similarity score with this box is higher than the text_threshold. In this case, both woman and ab have a high score with this box, so the label becomes womanab. This can produce confusing results for users.
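
The thresholding described above can be sketched in a few lines of Python. This is a toy illustration, not the repository's or the library's actual post-processing code: label_for_box and decode_wordpieces are hypothetical helpers, the scores and thresholds are made up, and the split of swab into the sub-word pieces sw and ##ab is an assumption about how a WordPiece-style tokenizer might handle the prompt.

```python
def decode_wordpieces(pieces):
    """Join WordPiece-style tokens; a '##' prefix glues a piece to the previous one."""
    text = ""
    for piece in pieces:
        if piece.startswith("##"):
            text += piece[2:]
        elif text:
            text += " " + piece
        else:
            text = piece
    return text


def label_for_box(tokens, scores, box_threshold, text_threshold):
    """Return the label for one box, or None if the box is discarded."""
    # The box is kept only if its best token score clears box_threshold.
    if max(scores) < box_threshold:
        return None
    # The label is built from every token scoring above text_threshold.
    kept = [t for t, s in zip(tokens, scores) if s > text_threshold and t != "."]
    return decode_wordpieces(kept)


# Assumed tokenization of the prompt "woman. swab. balcony."
tokens = ["woman", ".", "sw", "##ab", ".", "balcony", "."]
# Made-up similarity scores between one box and each token.
scores = [0.62, 0.01, 0.10, 0.31, 0.01, 0.05, 0.01]

label = label_for_box(tokens, scores, box_threshold=0.4, text_threshold=0.3)
print(label)  # womanab
```

Under these assumptions, the ##ab piece of swab clears text_threshold while sw does not, so it gets glued onto woman. Raising text_threshold above 0.31 in this toy example would leave only woman.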

Author

yuerout commented Dec 12, 2024

Hi @rentainhe, thanks for the response! I'm still a bit confused -- Why would a score be computed for ab in the first place? This is not in the list of labels. It looks like a fusion of woman and swab.

rentainhe pushed a commit that referenced this issue Dec 21, 2024
…rious typos (#218)

(close #217, #66, #67, #69, #91, #126, #127, #145)