Able to set the toRender parameters dynamically #239

kerthcet · 2025-01-16T08:16:07Z

What would you like to be added:

Here's an example from Triton_TRTLLM with lws: https://github.com/triton-inference-server/tutorials/blob/main/Deployment/Kubernetes/EKS_Multinode_Triton_TRTLLM/multinode_helm_chart/chart/templates/deployment.yaml
It needs to set a bunch of parameters dynamically, see:

          - python3
          - ./server.py
          - leader
          - --triton_model_repo_dir={{ $.Values.triton.triton_model_repo_path }}
          - --namespace={{ $.Release.Namespace }}
          - --pp={{ $.Values.tensorrtLLM.parallelism.pipeline }}
          - --tp={{ $.Values.tensorrtLLM.parallelism.tensor }}
          - --gpu_per_node={{ $.Values.gpuPerNode }}
          - --stateful_set_group_key=$(GROUP_KEY)

We should support this. Basically, we can set the params in model.spec.inferenceFlavors[x].params with a prefix, like Params_GPU_PER_NODE, and when rendering we'll strip the Params_ prefix.
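A rough sketch of the idea, not the actual implementation: only the Params_ prefix comes from this issue. The `{{ .NAME }}` placeholder syntax and the helper names `strip_prefix` and `render_args` are assumptions made for illustration.

```python
import re

# Prefix named in the issue; everything else in this sketch is hypothetical.
PREFIX = "Params_"

# Assumed placeholder syntax for args that need rendering: {{ .NAME }}
_PLACEHOLDER = re.compile(r"\{\{\s*\.(\w+)\s*\}\}")


def strip_prefix(params: dict[str, str]) -> dict[str, str]:
    """Keep only keys that start with PREFIX and drop the prefix from them."""
    return {
        key[len(PREFIX):]: value
        for key, value in params.items()
        if key.startswith(PREFIX)
    }


def render_args(args: list[str], params: dict[str, str]) -> list[str]:
    """Substitute placeholders in args with the prefixed values from params."""
    values = strip_prefix(params)

    def substitute(match: re.Match) -> str:
        # Unknown placeholders are left untouched so a later step can report them.
        return values.get(match.group(1), match.group(0))

    return [_PLACEHOLDER.sub(substitute, arg) for arg in args]


flavor_params = {"Params_GPU_PER_NODE": "8", "Params_TP": "4"}
args = ["--gpu_per_node={{ .GPU_PER_NODE }}", "--tp={{ .TP }}"]
print(render_args(args, flavor_params))
```

Keys without the prefix are ignored by the renderer, so ordinary flavor params would not leak into the command line by accident.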

Why is this needed:

Completion requirements:

This enhancement requires the following artifacts:

Design doc
API change
Docs update

The artifacts should be linked in subsequent comments.


InftyAI-Agent added the needs-triage, needs-kind, and needs-priority labels on Jan 16, 2025
