Update 服务化部署.md
karagg authored Dec 19, 2023
1 parent a289bed commit 54d1df4
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions llm/服务化部署.md
@@ -61,13 +61,13 @@ bash gen_serving_model.sh ${output_model_path} ${serving_model_path}
```bash
# 1. Pull the Docker image; this requires a CUDA driver version above 520
docker pull registry.baidubce.com/paddlepaddle/fastdeploy-llm:0.0.9
- # 2. Create the container and enter it
+ # 2. Create the container, mount the model path into it, and enter it
nvidia-docker run --name ${container_name} -v $PWD:/work --network=host --privileged --shm-size=5g -it registry.baidubce.com/paddlepaddle/fastdeploy-llm:0.0.9 /bin/bash

# 3. Inside the container, set the following environment variables and start the Triton service
export FLAGS_cache_inference_while_scope=1
export BATCH_SIZE=8 # set the batch size

export IS_PTUNING=0 # 0 for a non-P-Tuning model
# When this variable is set, received requests are dumped to the log, which helps with later troubleshooting
export ENABLE_DEBUG_LOG=1
