04-Chinese LLaMA Alpaca series models: OpenAI API implementation (hands-on) - deploying a local "chatgpt" #24

Open
opened 2024-09-06 17:34:59 +08:00 by 12535224197cs · 0 comments

Server

step1: Download the source code
wget https://file.huishiwei.top/Chinese-LLaMA-Alpaca-3-3.0.tar.gz
tar -xvf Chinese-LLaMA-Alpaca-3-3.0.tar.gz
step2: Download the model
pip install modelscope -i https://mirrors.aliyun.com/pypi/simple
modelscope download --model ChineseAlpacaGroup/llama-3-chinese-8b-instruct-v3 --cache_dir /mnt/wksp/agi/models
![X.jpg](https://cdn.nlark.com/yuque/0/2024/jpeg/44993204/1725607920439-b314ff08-0bb3-47c6-a65f-1dad37a70582.jpeg?x-oss-process=image%2Fresize%2Cw_1500%2Climit_0%2Finterlace%2C1)
step3: Create a virtual environment
conda create -n chinese_llama_alpaca_3 python=3.8.17 pip -y
conda activate chinese_llama_alpaca_3
cd Chinese-LLaMA-Alpaca-3-3.0/scripts/oai_api_demo/
pip install -r requirements.txt -i https://mirrors.aliyun.com/pypi/simple
step4: Modify the server script
Chinese-LLaMA-Alpaca-3-3.0/scripts/oai_api_demo/openai_api_server.py
Change ①: add stop tokens to prevent repeated replies (this handles Llama-3's own stop tokens; without it, the streaming endpoint keeps repeating content automatically and never stops)
Change ②: release resources when the user aborts a reply (before this change, in long-text generation scenarios the script kept holding GPU resources until the full result was generated, even if the user stopped the model's reply early)
![X.jpg](https://cdn.nlark.com/yuque/0/2024/jpeg/44993204/1725613356544-49c3ac0a-f725-4db3-9b88-cf17ea156355.jpeg?x-oss-process=image%2Fresize%2Cw_1142%2Climit_0%2Finterlace%2C1)
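As a rough illustration of what the two fixes accomplish (the real diff is in the screenshot; `LLAMA3_STOP_TOKENS`, `stream_generate`, and `is_disconnected` below are hypothetical names, not code from openai_api_server.py):

```python
# Llama-3 ends a turn with its own terminator tokens; if the server only
# checks the generic EOS token, streaming never stops and output repeats.
LLAMA3_STOP_TOKENS = {"<|eot_id|>", "<|end_of_text|>"}

def stream_generate(token_source, is_disconnected):
    """Yield tokens until a stop token appears or the client goes away."""
    for token in token_source:
        # Fix ①: honor Llama-3's own stop tokens so the stream terminates.
        if token in LLAMA3_STOP_TOKENS:
            break
        # Fix ②: abort generation (and free the GPU) once the client
        # has disconnected, instead of generating the full long text.
        if is_disconnected():
            break
        yield token

# The stream halts at <|eot_id|> instead of repeating forever:
tokens = ["你好", ",", "世界", "<|eot_id|>", "你好", ",", "世界"]
out = list(stream_generate(iter(tokens), is_disconnected=lambda: False))
print(out)  # → ['你好', ',', '世界']
```

In the actual FastAPI-based server, the disconnect check would correspond to something like `await request.is_disconnected()` inside the streaming loop.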
step5: Start the service
cd Chinese-LLaMA-Alpaca-3-3.0/scripts/oai_api_demo/
python openai_api_server.py --gpus 0,1 --base_model /mnt/wksp/agi/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct-v3/
![X.jpg](https://cdn.nlark.com/yuque/0/2024/jpeg/44993204/1725613739504-10b2a0b3-82f5-490d-9480-00d2423c42e8.jpeg?x-oss-process=image%2Fresize%2Cw_1500%2Climit_0%2Finterlace%2C1)
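Model loading can take a while, so it helps to wait until the server's port actually accepts connections before pointing a client at it. A small sketch (the port 19327 is an assumption — use whatever address openai_api_server.py prints on startup):

```python
import socket
import time

def wait_for_server(host="127.0.0.1", port=19327, timeout=120):
    """Poll until the given TCP port accepts connections, or give up."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            # A successful TCP connect means the server is listening.
            with socket.create_connection((host, port), timeout=2):
                return True
        except OSError:
            time.sleep(2)  # not up yet; retry shortly
    return False

# Example: wait_for_server() → True once the service from step5 is up.
```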

Client

step1: Client configuration
① Enter the API endpoint address of the running service
② Set a custom name for the local model
③ Select the model
![X.jpg](https://cdn.nlark.com/yuque/0/2024/jpeg/44993204/1725613772646-1e221a82-67ab-4c8e-9b14-d815baf047a1.jpeg?x-oss-process=image%2Fresize%2Cw_1500%2Climit_0%2Finterlace%2C1)
step2: Chat with the deployed local model
![X.jpg](https://cdn.nlark.com/yuque/0/2024/png/44993204/1725613807020-97e11d22-5841-4f90-9e0f-f7fd201cfc65.png?x-oss-process=image%2Fresize%2Cw_1500%2Climit_0)
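Besides a GUI client, the service can be called like any OpenAI-compatible chat-completions endpoint. A minimal sketch, assuming the server from step5 listens on 127.0.0.1:19327 (substitute the address it actually prints) and using the model name chosen in client step1 as an illustrative value:

```python
import json
import urllib.request

# Assumed endpoint; replace host/port with what your server reports.
API_URL = "http://127.0.0.1:19327/v1/chat/completions"

# Standard OpenAI chat-completions request body.
payload = {
    "model": "llama-3-chinese-8b-instruct-v3",  # the custom local model name
    "messages": [{"role": "user", "content": "你好,请介绍一下你自己。"}],
    "stream": False,
}

req = urllib.request.Request(
    API_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Uncomment once the server from step5 is running:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

The same endpoint also works with the official `openai` Python SDK by pointing its `base_url` at the local server.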

12535224197cs changed title from Chinese LLaMA Alpaca系列模型OpenAI API调用实现(跟练)-部署本地“chatgpt” to 04-Chinese LLaMA Alpaca系列模型OpenAI API调用实现(跟练)-部署本地“chatgpt” 2024-09-28 17:06:12 +08:00
Reference: HswOAuth/llm_course#24