Make memory request tunable for pvc download pod (red-hat-data-services#1695)



Signed-off-by: lugi0 <[email protected]>
lugi0 authored Aug 7, 2024
1 parent 3c750f7 commit 7d49d0b
Showing 4 changed files with 26 additions and 13 deletions.
15 changes: 10 additions & 5 deletions ods_ci/tests/Resources/CLI/ModelServing/llm.resource
@@ -68,12 +68,14 @@ Set Project And Runtime
[Arguments] ${namespace} ${enable_metrics}=${FALSE} ${runtime}=caikit-tgis-runtime ${protocol}=grpc
... ${access_key_id}=${S3.AWS_ACCESS_KEY_ID} ${access_key}=${S3.AWS_SECRET_ACCESS_KEY}
... ${endpoint}=${MODELS_BUCKET.ENDPOINT} ${verify_ssl}=${TRUE}
- ... ${download_in_pvc}=${FALSE} ${storage_size}=70Gi ${model_name}=${NONE} ${model_path}=${model_name} ${download_timeout}=600s
+ ... ${download_in_pvc}=${FALSE} ${storage_size}=70Gi ${model_name}=${NONE}
+ ... ${model_path}=${model_name} ${download_timeout}=600s ${memory_request}=40Gi
Set Up Test OpenShift Project test_ns=${namespace}
IF ${download_in_pvc}
- Create PVC And Download Model From S3 model_name=${model_name} namespace=${namespace} bucket_name=${MODELS_BUCKET.NAME}
- ... use_https=${USE_BUCKET_HTTPS} download_timeout=${download_timeout}
- ... storage_size=${storage_size} model_path=${model_path}
+ Create PVC And Download Model From S3 model_name=${model_name} namespace=${namespace}
+ ... bucket_name=${MODELS_BUCKET.NAME} use_https=${USE_BUCKET_HTTPS}
+ ... download_timeout=${download_timeout} storage_size=${storage_size} model_path=${model_path}
+ ... memory_request=${memory_request}
ELSE
Create Secret For S3-Like Buckets endpoint=${endpoint}
... region=${MODELS_BUCKET.REGION} namespace=${namespace}
@@ -734,7 +736,7 @@ Clean Up Test Project
Create PVC And Download Model From S3
[Arguments] ${model_name} ${bucket_name}
... ${use_https} ${namespace} ${storage_size}
- ... ${model_path} ${download_timeout}=500s
+ ... ${model_path} ${memory_request} ${download_timeout}=500s
Set Log Level NONE
Set Test Variable ${model_name}
Set Test Variable ${bucket_name}
@@ -743,10 +745,13 @@ Create PVC And Download Model From S3
Set Test Variable ${namespace}
Set Test Variable ${storage_size}
Set Test Variable ${model_path}
+ Set Test Variable ${memory_request}
Set Log Level INFO
Create File From Template ${DOWNLOAD_PVC_FILEPATH} ${DOWNLOAD_PVC_FILLED_FILEPATH}
${rc} ${out}= Run And Return Rc And Output oc -n ${namespace} apply -f ${DOWNLOAD_PVC_FILLED_FILEPATH}
Should Be Equal As Integers ${rc} ${0}
+ # No reason to keep this file around once it's applied
+ Remove File ${DOWNLOAD_PVC_FILLED_FILEPATH}
Run Keyword And Continue On Failure Wait For Pods To Be Ready label_selector=name=download-${model_name} namespace=${namespace}
Wait For Pods To Succeed label_selector=name=download-${model_name} namespace=${namespace}
... timeout=${download_timeout}
@@ -33,7 +33,7 @@ spec:
containers:
- resources:
requests:
- memory: 40Gi
+ memory: ${memory_request}
name: download-model
imagePullPolicy: IfNotPresent
image: quay.io/modh/kserve-storage-initializer@sha256:330af2d517b17dbf0cab31beba13cdbe7d6f4b9457114dea8f8485a011e3b138
@@ -312,6 +312,9 @@ Get Model Inference
END

${inference_output}= Run ${curl_cmd}
+ # Passes if file does not exist, cleans up otherwise. No point keeping these after executing the curl call.
+ Remove File openshift_ca_istio_knative.crt
+ Remove File openshift_ca.crt
RETURN ${inference_output}

Verify Model Inference
@@ -10,7 +10,7 @@ Test Tags KServe-OVMS

*** Variables ***
${TEST_NS}= ovmsmodel
- ${RUNTIME_NAME}= ovms-runtime
+ ${RUNTIME_NAME}= ovms-runtime
${USE_PVC}= ${TRUE}
${DOWNLOAD_IN_PVC}= ${TRUE}
${USE_GPU}= ${FALSE}
@@ -36,25 +36,30 @@ Verify User Can Serve And Query ovms Model
... kserve_mode=${KSERVE_MODE}
Set Project And Runtime runtime=${RUNTIME_NAME} protocol=${PROTOCOL} namespace=${test_namespace}
... download_in_pvc=${DOWNLOAD_IN_PVC} model_name=${model_name}
- ... storage_size=5Gi
- ${requests}= Create Dictionary memory=5Gi
+ ... storage_size=100Mi memory_request=100Mi
+ ${requests}= Create Dictionary memory=1Gi
Compile Inference Service YAML isvc_name=${model_name}
... sa_name=${EMPTY}
... model_storage_uri=${storage_uri}
... model_format=${MODEL_FORMAT} serving_runtime=${RUNTIME_NAME}
... limits_dict=${limits} requests_dict=${requests} kserve_mode=${KSERVE_MODE}
Deploy Model Via CLI isvc_filepath=${INFERENCESERVICE_FILLED_FILEPATH}
... namespace=${test_namespace}
+ # File is not needed anymore after applying
+ Remove File ${INFERENCESERVICE_FILLED_FILEPATH}
Wait For Pods To Be Ready label_selector=serving.kserve.io/inferenceservice=${model_name}
... namespace=${test_namespace}
- ${pod_name}= Get Pod Name namespace=${test_namespace} label_selector=serving.kserve.io/inferenceservice=${model_name}
- ${service_port}= Extract Service Port service_name=${model_name}-predictor protocol=TCP namespace=${test_namespace}
+ ${pod_name}= Get Pod Name namespace=${test_namespace}
+ ... label_selector=serving.kserve.io/inferenceservice=${model_name}
+ ${service_port}= Extract Service Port service_name=${model_name}-predictor protocol=TCP
+ ... namespace=${test_namespace}
Run Keyword If "${KSERVE_MODE}"=="RawDeployment"
... Start Port-forwarding namespace=${test_namespace} pod_name=${pod_name} local_port=${service_port}
... remote_port=${service_port} process_alias=ovms-process
Verify Model Inference With Retries model_name=${model_name} inference_input=${INFERENCE_INPUT}
- ... expected_inference_output=${EXPECTED_INFERENCE_OUTPUT} project_title=${test_namespace} deployment_mode="Cli" kserve_mode=${KSERVE_MODE}
- ... service_port=${service_port} end_point=/v2/models/${model_name}/infer retries=10
+ ... expected_inference_output=${EXPECTED_INFERENCE_OUTPUT} project_title=${test_namespace}
+ ... deployment_mode="Cli" kserve_mode=${KSERVE_MODE} service_port=${service_port}
+ ... end_point=/v2/models/${model_name}/infer retries=10

[Teardown] Run Keywords
... Clean Up Test Project test_ns=${test_namespace}