Qwen2-VL-7B inference results do not match the OpenCompass leaderboard #722

Open
zerovl opened this issue Jan 14, 2025 · 1 comment
zerovl commented Jan 14, 2025

I tested MMStar with the following commands:
export exp_name=./Qwen2-VL-7B-Instruct
export model_name=Qwen2-VL-7B
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 torchrun --master_port=25678 --nproc-per-node=8 run_rh2.py --data MMStar --model ${model_name} --model-path $exp_name --verbose

Local result:
image

OpenCompass leaderboard result:
image

What could cause the mismatch? In theory, these two results should match, right? 🤔

@PhoenixZ810
Collaborator

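Hello. Results can fluctuate by varying amounts depending on the environment (the versions of transformers, torch, flash-attn, CUDA, and so on).
Your MMStar result (59.6) is close to ours (60.7), so the gap can be attributed to small fluctuations caused by the difference in environment.
If you have any other questions, feel free to discuss them here.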
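
To narrow down which part of the environment differs, it may help to record the versions of the packages mentioned above on both machines and compare them. The following is only a minimal sketch, not part of the evaluation toolkit: the helper name `collect_env_versions` is hypothetical, and it assumes the packages are installed under the PyPI names `transformers`, `torch`, and `flash-attn`.

```python
import platform
from importlib import metadata

# Packages named in the reply above; the PyPI distribution names are an assumption.
PACKAGES = ("transformers", "torch", "flash-attn")


def collect_env_versions():
    """Return a dict mapping each component to its version string."""
    versions = {"python": platform.python_version()}
    for name in PACKAGES:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    # The CUDA version torch was built against, if torch can be imported.
    try:
        import torch
        versions["cuda"] = torch.version.cuda or "none (CPU-only build)"
    except Exception:
        versions["cuda"] = "unknown (torch could not be imported)"
    return versions


if __name__ == "__main__":
    for name, version in collect_env_versions().items():
        print(f"{name}: {version}")
```

Running this in both environments and diffing the output shows which of the listed components differ. It does not by itself explain the 1.1-point gap (60.7 vs. 59.6).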

PhoenixZ810 self-assigned this Jan 16, 2025