
[Draft] Add Multimodal RAG notebook #2497

Merged: 29 commits into openvinotoolkit:latest on Dec 23, 2024

Conversation

@openvino-dev-samples (Collaborator) commented Nov 1, 2024

[Two screenshots attached in the original description.]


@openvino-dev-samples changed the title from "[Draft] Add Multimodal RAG" to "[Draft] Add Multimodal RAG notebook" on Nov 4, 2024
transfer to optimum-intel (2 commits)
@eaidova (Collaborator) commented Nov 18, 2024

@openvino-dev-samples everything looks good to me, thanks.

A couple of comments:
It may be better to switch to accuracy-aware quantization for the VLM using optimum-cli; the --weight-format int4 --dataset contextual options need to be provided for that (fyi @nikita-savelyevv).

Are there any plans to integrate OV visual language models directly into llama-index?

@openvino-dev-samples (Collaborator, Author) commented Nov 18, 2024

> @openvino-dev-samples everything looks good to me, thanks.
>
> A couple of comments: It may be better to switch to accuracy-aware quantization for the VLM using optimum-cli; the --weight-format int4 --dataset contextual options need to be provided for that (fyi @nikita-savelyevv).
>
> Are there any plans to integrate OV visual language models directly into llama-index?

Thanks for your review. The integration is already done in llama-index:

https://docs.llamaindex.ai/en/stable/examples/multi_modal/openvino_multimodal/

BTW, is there an example of accuracy-aware quantization for phi3-vision?
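
For readers following the link above, a minimal sketch of that integration (the package is llama-index-multi-modal-llms-openvino; class and argument names below follow the linked docs page and should be verified against it, as the API may have changed):

from llama_index.core.schema import ImageDocument
from llama_index.multi_modal_llms.openvino import OpenVINOMultiModal

# Load an exported OpenVINO VLM; the model path is a hypothetical placeholder.
vlm = OpenVINOMultiModal(
    model_id_or_path="phi-3-vision-ov-int4",
    device="CPU",
)

# Standard llama-index multimodal interface: a text prompt plus image documents.
response = vlm.complete(
    prompt="Describe this image.",
    image_documents=[ImageDocument(image_path="sample.jpg")],
)
print(response.text)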

@nikita-savelyevv (Collaborator)

> A couple of comments: It may be better to switch to accuracy-aware quantization for the VLM using optimum-cli; the --weight-format int4 --dataset contextual options need to be provided for that (fyi @nikita-savelyevv).

I'll add that the algorithm itself needs to be specified, e.g. --weight-format int4 --dataset contextual --awq.

Also, the default of 128 calibration samples might be too large; it can be reduced with --num-samples 32.

@openvino-dev-samples (Collaborator, Author)

> I'll add that the algorithm itself needs to be specified, e.g. --weight-format int4 --dataset contextual --awq.
>
> Also, the default of 128 calibration samples might be too large; it can be reduced with --num-samples 32.

Hi, in my tests the accuracy with this configuration is not satisfactory:
optimum-cli export openvino --model {vlm_model_id} {vlm_model_path} --trust-remote-code --weight-format int4 --dataset contextual --awq --num-samples 32
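
For readability, the same command with the data-aware options spelled out ({vlm_model_id} and {vlm_model_path} remain placeholders, as above):

# Data-aware INT4 export with AWQ, flags as discussed in this thread:
#   --weight-format int4  : 4-bit weight compression
#   --dataset contextual  : calibration dataset for data-aware compression
#   --awq                 : enable the AWQ accuracy-aware algorithm
#   --num-samples 32      : fewer calibration samples than the default of 128
optimum-cli export openvino --model {vlm_model_id} {vlm_model_path} \
  --trust-remote-code --weight-format int4 --dataset contextual \
  --awq --num-samples 32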

add load image function
@nikita-savelyevv (Collaborator) commented Nov 19, 2024

> Hi, in my tests the accuracy with this configuration is not satisfactory: optimum-cli export openvino --model {vlm_model_id} {vlm_model_path} --trust-remote-code --weight-format int4 --dataset contextual --awq --num-samples 32

Thanks for the information! Have you compared it against the configuration below?

compression_config = {
    "mode": nncf.CompressWeightsMode.INT4_SYM,
    "group_size": 64,
    "ratio": 0.6,
}

@openvino-dev-samples replied: Yes, this configuration brings more reasonable responses compared to optimum-cli.
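
For context, a minimal sketch of how a config like the one above is applied with NNCF's weight-compression API (model paths are placeholders; the exact wiring in the notebook may differ):

import nncf
import openvino as ov

# Read an exported OpenVINO model (path is a placeholder).
core = ov.Core()
model = core.read_model("openvino_language_model.xml")

# Symmetric INT4 weights, group size 64; ratio=0.6 compresses roughly 60% of
# the weight layers to INT4 and leaves the rest in the INT8 fallback precision.
compressed = nncf.compress_weights(
    model,
    mode=nncf.CompressWeightsMode.INT4_SYM,
    group_size=64,
    ratio=0.6,
)
ov.save_model(compressed, "openvino_language_model_int4.xml")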

update the method of audio extraction
update with accuracy aware quantization
@openvino-dev-samples (Collaborator, Author)

> --trust-remote-code --weight-format int4 --dataset contextual --awq --num-samples 32

Sorry, I made a mistake before. The result looks good with this accuracy-aware config now.

However, I find it impossible to run this quantization method on a client PC with 32GB of RAM.

@nikita-savelyevv (Collaborator)

> Sorry, I made a mistake before. The result looks good with this accuracy-aware config now.
>
> However, I find it impossible to run this quantization method on a client PC with 32GB of RAM.

Could you please verify that NNCF 2.14 is installed? It was released recently and contains significant improvements in peak RAM usage during data-aware compression.
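
A quick check in a pip-managed environment (the version pin mirrors the release mentioned above):

# Print the installed NNCF version, then upgrade if it is older than 2.14.
python -c "import nncf; print(nncf.__version__)"
pip install --upgrade "nncf>=2.14"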

@brmarkus

> Could you please verify that NNCF 2.14 is installed? It was released recently and contains significant improvements in peak RAM usage during data-aware compression.

At least this pull request doesn't touch requirements.txt and doesn't check that a recent version of NNCF is present. On my 64GB laptop, system memory briefly reaches full utilization.
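
A minimal way to encode that constraint in the notebook's requirements (a hypothetical addition suggested by this comment, not part of the PR):

# requirements.txt
nncf>=2.14.0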

@openvino-dev-samples force-pushed the multimodal-rag branch 6 times, most recently from 1508f3c to e590031 on December 18, 2024 at 01:18
change the ASR model id (8 commits)
@eaidova merged commit 41280d6 into openvinotoolkit:latest on Dec 23, 2024. 16 checks passed.