
[Frontend][Core] Override HF config.json via CLI #5836

Merged: 14 commits merged into vllm-project:main on Nov 9, 2024

Conversation

@KrishnaM251 (Contributor) commented Jun 25, 2024

FIX #1542
FIX #2547
FIX #5205

PR Title and Classification

[Frontend] [Core]

Notes

  • While I believe the implementation is almost complete, the test I wrote only checks whether the new Optional[dict] parameter hf_kwargs is successfully set in ModelConfig (see the sketch after this list).
  • I would like to write a test that detects a change in the LLMEngine output when hf_kwargs is added as an EngineArgs parameter.
  • But I have two questions:
    • How can I write another test that runs hf_kwargs through an OpenAI compatible server? I wanted to repurpose a test in test_openai_server.py, however the EngineArgs params are set before all tests are run.
    • What are some test values that I can set in hf_kwargs which will generate output detectable in a function like: completion = client.completions.create(...) (motivating example). If this is not the best approach for testing, then any recommendations?
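
For reference, a rough sketch of that existing check (the hf_kwargs parameter and attribute names follow the draft described above and are assumptions, not a final API):

```python
# Sketch only: assumes EngineArgs accepts an hf_kwargs dict (the draft
# parameter discussed in this PR) and that create_engine_config() stores
# it on ModelConfig; names may differ from the merged code.
from vllm.engine.arg_utils import EngineArgs


def test_hf_kwargs_set_on_model_config():
    overrides = {"rope_theta": 1000000.0}
    engine_args = EngineArgs(model="facebook/opt-125m", hf_kwargs=overrides)
    model_config = engine_args.create_engine_config().model_config
    # Only asserts that the parameter was stored, not that it changes output.
    assert model_config.hf_kwargs == overrides
```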

@simon-mo (Collaborator)

@DarkLight1337 can you help answer the questions, since you recently touched the testing harness?

Additionally, there might be other places we want to override, including the tokenizer or generation config. Addressing those would be nice to have.

@DarkLight1337 (Member) commented Jun 26, 2024

Thanks for picking this up! To answer your questions:

  • How can I write another test that runs hf_kwargs through an OpenAI compatible server? I wanted to repurpose a test in test_openai_server.py, however the EngineArgs params are set before all tests are run.

You should launch a new instance of the server. Each RemoteOpenAIServer instance creates a new subprocess that invokes the OpenAI entrypoint through the command line, so you can't change the hf_config after instantiating it.

You can add pytest fixtures to use a different server for your tests.
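
For example, something along these lines might work (a sketch only: it assumes the RemoteOpenAIServer helper from tests/utils.py and an illustrative --hf-overrides flag carrying a JSON dict of config overrides; adjust the flag name and model to whatever this PR ends up exposing):

```python
# Sketch only: RemoteOpenAIServer spawns a fresh subprocess running the
# OpenAI-compatible entrypoint, so the overrides are applied at startup.
import json

import pytest

from tests.utils import RemoteOpenAIServer

MODEL_NAME = "Qwen/Qwen2-0.5B-Instruct"


@pytest.fixture(scope="module")
def server_with_overrides():
    args = [
        "--max-model-len", "2048",
        # Illustrative flag name; use whatever this PR finally adds.
        "--hf-overrides", json.dumps({"rope_theta": 1000000.0}),
    ]
    with RemoteOpenAIServer(MODEL_NAME, args) as remote_server:
        yield remote_server


def test_completion_with_overrides(server_with_overrides):
    client = server_with_overrides.get_client()
    completion = client.completions.create(model=MODEL_NAME,
                                            prompt="Hello, my name is",
                                            max_tokens=5)
    assert completion.choices[0].text
```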

  • What are some test values that I can set in hf_kwargs which will generate output detectable in a function like: completion = client.completions.create(...) (motivating example). If this is not the best approach for testing, then any recommendations?

Maybe you can set the attention implementation to eager, which from my understanding would cause some numerical inaccuracies compared to the default one. Actually, this may not work since we are running the vLLM model rather than the HF model.

Perhaps you can ask the author of the original issue what they want to accomplish using this feature that cannot otherwise be done via vLLM args. (If we don't have any situation that results in different vLLM output, what is the point of enabling this?)

@KrishnaM251 (Contributor, Author)

@DarkLight1337 I appreciate the response. I will do as you suggested.

@KrishnaM251 changed the title from "feat: passing hf_config args through openai server" to "[Frontend][Core] passing hf_config args through openai server" on Jun 27, 2024
@KrishnaM251 closed this on Jul 8, 2024
@KrishnaM251 reopened this on Jul 8, 2024
@vpellegrain

Hi @KrishnaM251,

Any news on this?

I have a specific use case in which I'd like to deploy a Phi-3.5-vision model on a vLLM OpenAI server entrypoint, but I'd like to specify the argument num_crops=16 (which defaults to 4 in the preprocessor config file).

@DarkLight1337 (Member)


@alex-jw-brooks is currently working on a PR that lets you pass options to the HF processor on demand instead of at startup time (the latter is what this PR focuses on). Stay tuned!

@alex-jw-brooks (Contributor)

Hi @vpellegrain - thought I would link this PR if you'd like to track it; it exposes num_crops as a processor kwarg for phi3v, and it can be used both for offline inference and via the CLI when starting the server entrypoint.

Happy to add a follow-up to make this configurable per-request later on, but since it was already turning into a lot of code to correctly handle processor kwargs for memory profiling etc., the current PR sets it up for init time 😄
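
Once that lands, usage would look roughly like this (a sketch assuming an mm_processor_kwargs engine argument that forwards num_crops to the HF processor at init time; exact names may differ until the PR is merged):

```python
# Sketch: assumes the mm_processor_kwargs engine argument described above.
from vllm import LLM

llm = LLM(
    model="microsoft/Phi-3.5-vision-instruct",
    trust_remote_code=True,
    max_model_len=4096,
    # Override the preprocessor default of num_crops=4.
    mm_processor_kwargs={"num_crops": 16},
)
```

The server-side equivalent would presumably be passing the same dict as JSON on the command line (e.g. an --mm-processor-kwargs flag) when launching the entrypoint.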

mergify bot commented Nov 2, 2024

This pull request has merge conflicts that must be resolved before it can be merged. @KrishnaM251 please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@DarkLight1337 (Member)

Sorry for forgetting about this! I think we now have a valid use case, which is to patch incorrect HF configs. cc @K-Mistele

@K-Mistele (Contributor)

Right, it would be good to be able to adjust RoPE/YaRN configurations in config.json at startup time. I left comments about this on some other issues; I just can't seem to find them right this second.
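
As a concrete illustration of that use case, something like the following sketch (assuming the hf_overrides engine argument this PR adds, exposed on the CLI as --hf-overrides with a JSON value; the exact rope_scaling keys depend on the model and transformers version):

```python
# Sketch: assumes the hf_overrides argument added by this PR, which merges
# the given dict into the model's loaded config.json at startup.
from vllm import LLM

llm = LLM(
    model="Qwen/Qwen2-0.5B-Instruct",
    hf_overrides={
        "rope_theta": 1000000.0,
        "rope_scaling": {
            "rope_type": "yarn",
            "factor": 4.0,
            "original_max_position_embeddings": 32768,
        },
    },
)
```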

@DarkLight1337 changed the title from "[Frontend][Core] passing hf_config args through openai server" to "[Frontend][Core] Override HF config.json via CLI" on Nov 9, 2024
@DarkLight1337 (Member)

I have updated this PR and also changed the tests to check overriding rope_scaling and rope_theta.

@DarkLight1337 (Member) left a comment

Since we now have a use case for this, I'm approving the PR.

@DarkLight1337 added the `ready` label (ONLY add when PR is ready to merge/full CI is needed) on Nov 9, 2024
@DarkLight1337 enabled auto-merge (squash) on November 9, 2024 at 14:59
@DarkLight1337 merged commit b09895a into vllm-project:main on Nov 9, 2024
53 checks passed
JC1DA pushed a commit to JC1DA/vllm that referenced this pull request Nov 11, 2024
jeejeelee pushed a commit to jeejeelee/vllm that referenced this pull request Nov 11, 2024
rickyyx pushed a commit to rickyyx/vllm that referenced this pull request Nov 13, 2024
sumitd2 pushed a commit to sumitd2/vllm that referenced this pull request Nov 14, 2024
KuntaiDu pushed a commit to KuntaiDu/vllm that referenced this pull request Nov 20, 2024
mfournioux pushed a commit to mfournioux/vllm that referenced this pull request Nov 20, 2024
tlrmchlsmth pushed a commit to neuralmagic/vllm that referenced this pull request Nov 23, 2024
sleepwalker2017 pushed a commit to sleepwalker2017/vllm that referenced this pull request Dec 13, 2024
Labels: frontend, ready (ONLY add when PR is ready to merge/full CI is needed)
Projects: None yet
6 participants