Local Model Deployment with Ollama [Reproduction] #350

Closed
opened 2024-11-05 18:09:03 +08:00 by GANGUAGUA · 0 comments

Experiment Environment

Operating system: a remote Linux server, accessed from a local machine

GPU: RTX 3080

Offline Installation

Download the install script

curl -fsSL https://ollama.com/install.sh -o install.sh

Download the release package

Download the package that matches your hardware; at the time of writing the latest version was v0.3.8.

Ollama v0.3.8 download page: https://github.com/ollama/ollama/tags?after=v0.3.11-rc2

*Since this experiment runs on a remote server, I download the package to my local machine and then upload it to the server.

Modify the script

Make sure the files from the previous steps are now on the server.

Edit install.sh and replace the curl download portion with the path to the local package.

After editing, run the installer:

sh install.sh
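The edit to install.sh boils down to replacing the remote download with a local extraction. The sketch below demonstrates the idea with a throwaway tarball; the exact curl line and variable names differ between versions of the script, and the archive path is an assumption:

```shell
# Demonstration with a dummy tarball; in practice TARBALL would be the
# package you uploaded (e.g. /root/ollama-linux-amd64.tgz).
set -e
workdir=$(mktemp -d)
mkdir -p "$workdir/pkg/bin" && echo dummy > "$workdir/pkg/bin/ollama"
tar -czf "$workdir/ollama-linux-amd64.tgz" -C "$workdir/pkg" .

OLLAMA_INSTALL_DIR="$workdir/install"   # install.sh normally targets /usr/local
mkdir -p "$OLLAMA_INSTALL_DIR"

# In install.sh, a line roughly like
#   curl ... https://ollama.com/download/ollama-linux-${ARCH}.tgz | tar -xzf - -C "$OLLAMA_INSTALL_DIR"
# becomes a direct extraction of the uploaded package:
tar -xzf "$workdir/ollama-linux-amd64.tgz" -C "$OLLAMA_INSTALL_DIR"
ls "$OLLAMA_INSTALL_DIR/bin/ollama"
```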

Verify the installation:

ollama run qwen2

*After installation the Ollama server starts automatically, listening on port 11434 by default.

Start Model Inference

Single-GPU inference

ollama run qwen2

Multi-GPU inference

Since I am using a Linux server, the environment variables must be changed before starting multi-GPU inference.

/etc/systemd/system/ollama.service.d is a systemd drop-in directory, so edit a .conf file inside it (or run systemctl edit ollama):

vim /etc/systemd/system/ollama.service.d/override.conf

Add the following to that file:

[Service]
Environment="OLLAMA_HOST=0.0.0.0:11434"
Environment="CUDA_VISIBLE_DEVICES=0,1"
Environment="OLLAMA_SCHED_SPREAD=1"

Restart the Ollama service:

systemctl daemon-reload     # reload systemd unit files
systemctl restart ollama    # restart the ollama service

Check Ollama's status to confirm it is running.

Interaction Methods

Calling the Ollama-hosted model with the OpenAI Python SDK

First make sure the file call_ollama.py exists; view its contents with:

cat call_ollama.py

Make sure the server address inside it is correct.
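The course file itself is not reproduced here, but a minimal sketch of what such a script might contain follows. The model name and host are assumptions to replace with your own; Ollama exposes an OpenAI-compatible endpoint under /v1 and accepts any non-empty API key:

```python
# A sketch of a call_ollama.py-style script, not the course's actual file.
OLLAMA_BASE_URL = "http://192.168.31.143:11434/v1"  # replace with your server's IP

def build_messages(question: str) -> list[dict]:
    """Assemble a chat request in the OpenAI messages format."""
    return [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": question},
    ]

def main() -> None:
    from openai import OpenAI  # pip install openai
    # Ollama ignores the key, but the SDK requires a non-empty one.
    client = OpenAI(base_url=OLLAMA_BASE_URL, api_key="ollama")
    resp = client.chat.completions.create(
        model="qwen2",
        messages=build_messages("Why is the sky blue?"),
    )
    print(resp.choices[0].message.content)

if __name__ == "__main__":
    main()
```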

Then change into the file's directory:

cd ollama_anzhuang

Run the script:

python call_ollama.py

Test function calling:

python call_ollama_tools.py
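call_ollama_tools.py is not shown either; the gist is passing OpenAI-style tool definitions with the request. The tool name and schema below are illustrative assumptions:

```python
# Hypothetical tool definition in the OpenAI function-calling format.
# It would be passed as tools=[weather_tool()] to
# client.chat.completions.create(...); the model's reply then carries
# tool_calls with the arguments for you to execute.
def weather_tool() -> dict:
    return {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
```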

Calling via curl

Invoke the model with the following curl command:

curl http://192.168.31.143:11434/api/chat -d '{
  "model": "llama3.1:8b",
  "messages": [
  {
     "role": "system",
     "content": "Answer briefly in the voice of a pirate."
  },
  {
    "role": "user",
    "content": "Why is the sky blue?"
  }
 ],
  "stream": false
 }'

*Note: replace the IP address with your own.
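The same /api/chat request can be issued from Python with only the standard library, which is handy when the OpenAI SDK is not installed (the host is again an assumption to replace):

```python
import json
import urllib.request

def build_chat_payload(model: str, system: str, user: str) -> bytes:
    """Encode an Ollama /api/chat request body (non-streaming)."""
    return json.dumps({
        "model": model,
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
        "stream": False,
    }).encode("utf-8")

def chat(host: str, payload: bytes) -> str:
    req = urllib.request.Request(
        f"http://{host}/api/chat", data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["message"]["content"]

if __name__ == "__main__":
    body = build_chat_payload(
        "llama3.1:8b",
        "Answer briefly in the voice of a pirate.",
        "Why is the sky blue?",
    )
    print(chat("192.168.31.143:11434", body))  # replace with your IP
```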

Command-line mode

Interactive mode

ollama run qwen2  # enter interactive mode

Non-interactive mode

ollama run qwen2 "Introduce yourself" # add --verbose to see performance stats

Using Open WebUI

Run the following command to start Open WebUI:

sudo docker run -d -p 28080:8080 \
        -e OLLAMA_BASE_URL=http://192.168.31.143:11434 \
        -v open-webui:/app/backend/data \
        --name open-webui \
        --restart always \
        ghcr.io/open-webui/open-webui:main

Open a browser and enter:

192.168.31.143:28080

*Note: change the IP address to your own.

Select a model:

Chat with the model:

Integrating with FastGPT

For details, see: Building a RAG Knowledge QA System with FastGPT (https://docs.qq.com/doc/p/0555f97a90b6f2f9f039bd0c9251f5d7d1e7a506)

Make sure the environment from the previous steps is set up.

Confirm Docker is installed

Check the Docker version with:

docker --version

Install Docker

# Remove existing Docker packages and container runtimes
apt remove docker-ce docker-ce-cli containerd.io docker-compose-plugin docker docker-engine docker.io containerd runc

# Switch to a domestic (Aliyun) apt mirror
sed -i -r 's#http://(archive|security).ubuntu.com#https://mirrors.aliyun.com#g' /etc/apt/sources.list && apt update -y

### Install Docker
# Install ca-certificates, curl, gnupg, lsb-release
apt install ca-certificates curl gnupg lsb-release -y
# Download Docker's GPG key and add it via apt-key
curl -fsSL http://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | apt-key add -
# Add the Aliyun repository for Docker
add-apt-repository "deb [arch=amd64] http://mirrors.aliyun.com/docker-ce/linux/ubuntu $(lsb_release -cs) stable"
# Refresh the package index
apt -y update
# Install the Docker packages
apt install docker-ce docker-ce-cli containerd.io docker-compose-plugin -y

# Verify the installation
docker --version

Install docker-compose

apt-get install docker-compose-plugin

# Verify the installation
docker compose version

Confirm FastGPT + OneAPI is deployed

FastGPT and OneAPI were deployed in an earlier lesson, so the relevant folders already exist on my server.

If you have not deployed them, do so now.

Deploy FastGPT and OneAPI

Two files are needed: docker-compose.yml and config.json; config.json is FastGPT's configuration file.

First create a directory so that all FastGPT-related files live in one new folder:

mkdir fastgpt
cd fastgpt

Prepare docker-compose.yml:

# The database's default credentials are only applied on first run.
# If you change them, update both the database and the project's connection settings, not just one.
# This file is for quick-start testing only. For real use, be sure to change the credentials and tune the knowledge-base parameters, shared memory, etc.
# If Docker Hub and GitHub are unreachable, use the Aliyun registry (Aliyun has no arm images).

version: '3.3'
services:
  # db
  pg:
    #image: pgvector/pgvector:0.7.0-pg15 # docker hub
    image: registry.cn-hangzhou.aliyuncs.com/fastgpt/pgvector:v0.7.0 # Aliyun
    container_name: pg
    restart: always
    ports: # avoid exposing this in production
      - 5432:5432
    networks:
      - fastgpt
    environment:
      # These settings only take effect on first run. Changing them and restarting
      # the container has no effect; delete the persisted data and restart instead.
      - POSTGRES_USER=username
      - POSTGRES_PASSWORD=password
      - POSTGRES_DB=postgres
    volumes:
      - ./pg/data:/var/lib/postgresql/data
  mongo:
    #image: mongo:5.0.18 # dockerhub
    image: registry.cn-hangzhou.aliyuncs.com/fastgpt/mongo:5.0.18 # Aliyun
    # image: mongo:4.4.29 # use when the CPU does not support AVX
    container_name: mongo
    restart: always
    ports:
      - 27017:27017
    networks:
      - fastgpt
    command: mongod --keyFile /data/mongodb.key --replSet rs0
    environment:
      - MONGO_INITDB_ROOT_USERNAME=myusername
      - MONGO_INITDB_ROOT_PASSWORD=mypassword
    volumes:
      - ./mongo/data:/data/db
    entrypoint:
      - bash
      - -c
      - |
        openssl rand -base64 128 > /data/mongodb.key
        chmod 400 /data/mongodb.key
        chown 999:999 /data/mongodb.key
        echo 'const isInited = rs.status().ok === 1
        if(!isInited){
          rs.initiate({
              _id: "rs0",
              members: [
                  { _id: 0, host: "mongo:27017" }
              ]
          })
        }' > /data/initReplicaSet.js
        # Start the MongoDB service
        exec docker-entrypoint.sh "$$@" &

        # Wait for the MongoDB service to come up
        until mongo -u myusername -p mypassword --authenticationDatabase admin --eval "print('waited for connection')" > /dev/null 2>&1; do
          echo "Waiting for MongoDB to start..."
          sleep 2
        done

        # Run the replica-set initialization script
        mongo -u myusername -p mypassword --authenticationDatabase admin /data/initReplicaSet.js

        # Wait on the MongoDB server process started by docker-entrypoint.sh
        wait $$!

  # fastgpt
  sandbox:
    container_name: sandbox
    # image: ghcr.io/labring/fastgpt-sandbox:latest # git
    image: registry.cn-hangzhou.aliyuncs.com/fastgpt/fastgpt-sandbox:latest # Aliyun
    networks:
      - fastgpt
    restart: always
  fastgpt:
    container_name: fastgpt
    # image: ghcr.io/labring/fastgpt:v4.8.8-fix2 # git
    image: registry.cn-hangzhou.aliyuncs.com/fastgpt/fastgpt:v4.8.8-fix2 # Aliyun
    ports:
      - 3000:3000
    networks:
      - fastgpt
    depends_on:
      - mongo
      - pg
      - sandbox
    restart: always
    environment:
      # root password; the username is root. To change it, edit this variable and restart.
      - DEFAULT_ROOT_PSW=1234
      # Base URL of the AI model API. Be sure to append /v1. Defaults to the OneAPI address.
      - OPENAI_BASE_URL=http://oneapi:3000/v1
      # API key for the AI model. (Defaults to OneAPI's quick-start key; change it once testing passes.)
      - CHAT_API_KEY=sk-fastgpt
      # Maximum number of database connections
      - DB_MAX_LINK=30
      # Login token secret
      - TOKEN_KEY=any
      # root key, mainly used for initialization requests during upgrades
      - ROOT_KEY=root_key
      # File-access token encryption key
      - FILE_TOKEN_KEY=filetoken
      # MongoDB connection string: username myusername, password mypassword
      - MONGODB_URI=mongodb://myusername:mypassword@mongo:27017/fastgpt?authSource=admin
      # PostgreSQL connection string
      - PG_URL=postgresql://username:password@pg:5432/postgres
      # sandbox address
      - SANDBOX_URL=http://sandbox:3000
      # Log levels: debug, info, warn, error
      - LOG_LEVEL=info
      - STORE_LOG_LEVEL=warn
    volumes:
      - ./config.json:/app/data/config.json

  # oneapi
  mysql:
    image: registry.cn-hangzhou.aliyuncs.com/fastgpt/mysql:8.0.36 # Aliyun
    # image: mysql:8.0.36
    container_name: mysql
    restart: always
    ports:
      - 3306:3306
    networks:
      - fastgpt
    command: --default-authentication-plugin=mysql_native_password
    environment:
      # Default root password; effective only on first run
      MYSQL_ROOT_PASSWORD: oneapimmysql
      MYSQL_DATABASE: oneapi
    volumes:
      - ./mysql:/var/lib/mysql
  oneapi:
    container_name: oneapi
    # image: ghcr.io/songquanpeng/one-api:v0.6.7
    image: registry.cn-hangzhou.aliyuncs.com/fastgpt/one-api:v0.6.6 # Aliyun
    ports:
      - 3001:3000
    depends_on:
      - mysql
    networks:
      - fastgpt
    restart: always
    environment:
      # MySQL connection string
      - SQL_DSN=root:oneapimmysql@tcp(mysql:3306)/oneapi
      # Session encryption secret
      - SESSION_SECRET=oneapikey
      # In-memory cache
      - MEMORY_CACHE_ENABLED=true
      # Enable batched updates to reduce database round-trips
      - BATCH_UPDATE_ENABLED=true
      # Batch update interval
      - BATCH_UPDATE_INTERVAL=10
      # Initial root token (change it after deployment, or it may leak)
      - INITIAL_ROOT_TOKEN=fastgpt
    volumes:
      - ./oneapi:/data
networks:
  fastgpt:

Prepare the FastGPT configuration file config.json:

{
  "feConfigs": {
    "lafEnv": "https://laf.dev"
  },
  "systemEnv": {
    "vectorMaxProcess": 15,
    "qaMaxProcess": 15,
    "pgHNSWEfSearch": 100
  },
  "llmModels": [
    {
        "model": "llama-3.1-instruct",
        "name": "llama-3.1-instruct",
        "maxContext": 128000,
        "maxResponse": 128000,
        "quoteMaxToken": 32000,
        "maxTemperature": 1.2,
      "charsPointsPrice": 0,
      "censor": false,
      "vision": false,
      "datasetProcess": false,
      "usedInClassify": true,
      "usedInExtractFields": true,
      "usedInToolCall": true,
      "usedInQueryExtension": true,
      "toolChoice": false,
      "functionCall": false,
      "customCQPrompt": "",
      "customExtractPrompt": "",
      "defaultSystemChatPrompt": "",
      "defaultConfig": {}
    },
    {
        "model": "qwen2-instruct",
        "name": "qwen2-instruct",
        "avatar": "/imgs/model/qwen.svg",
        "maxContext": 128000,
        "maxResponse": 128000,
        "quoteMaxToken": 32000,
        "maxTemperature": 1.2,
        "charsPointsPrice": 0,
        "censor": false,
        "vision": false,
        "datasetProcess": true,
        "usedInClassify": true,
        "usedInExtractFields": true,
        "usedInToolCall": true,
        "usedInQueryExtension": true,
        "toolChoice": false,
        "functionCall": false,
        "customCQPrompt": "",
        "customExtractPrompt": "",
        "defaultSystemChatPrompt": "",
        "defaultConfig": {}
    },
    {
        "model": "glm-4v",
        "name": "glm-4v",
        "avatar": "/imgs/model/chatglm.svg",
        "maxContext": 128000,
        "maxResponse": 128000,
        "quoteMaxToken": 32000,
        "maxTemperature": 1.2,
        "charsPointsPrice": 0,
        "censor": false,
        "vision": true,
        "datasetProcess": true,
        "usedInClassify": true,
        "usedInExtractFields": true,
        "usedInToolCall": true,
        "usedInQueryExtension": true,
        "toolChoice": true,
        "functionCall": false,
        "customCQPrompt": "",
        "customExtractPrompt": "",
        "defaultSystemChatPrompt": "",
        "defaultConfig": {}
    },
    {
        "model": "ERNIE-Speed-128K",
        "name": "ERNIE-Speed-128K",
        "avatar": "/imgs/model/ernie.svg",
        "maxContext": 128000,
        "maxResponse": 128000,
        "quoteMaxToken": 32000,
        "maxTemperature": 1.2,
        "charsPointsPrice": 0,
        "censor": false,
        "vision": false,
        "datasetProcess": true,
        "usedInClassify": true,
        "usedInExtractFields": true,
        "usedInToolCall": true,
        "usedInQueryExtension": true,
        "toolChoice": true,
        "functionCall": false,
        "customCQPrompt": "",
        "customExtractPrompt": "",
        "defaultSystemChatPrompt": "",
        "defaultConfig": {}
    },
    {
      "model": "gpt-4o-mini",
      "name": "gpt-4o-mini",
      "avatar": "/imgs/model/openai.svg",
      "maxContext": 125000,
      "maxResponse": 4000,
      "quoteMaxToken": 120000,
      "maxTemperature": 1.2,
      "charsPointsPrice": 0,
      "censor": false,
      "vision": true,
      "datasetProcess": true,
      "usedInClassify": true,
      "usedInExtractFields": true,
      "usedInToolCall": true,
      "usedInQueryExtension": true,
      "toolChoice": true,
      "functionCall": false,
      "customCQPrompt": "",
      "customExtractPrompt": "",
      "defaultSystemChatPrompt": "",
      "defaultConfig": {}
    },
    {
      "model": "gpt-4o",
      "name": "gpt-4o",
      "avatar": "/imgs/model/openai.svg",
      "maxContext": 125000,
      "maxResponse": 4000,
      "quoteMaxToken": 120000,
      "maxTemperature": 1.2,
      "charsPointsPrice": 0,
      "censor": false,
      "vision": true,
      "datasetProcess": false,
      "usedInClassify": true,
      "usedInExtractFields": true,
      "usedInToolCall": true,
      "usedInQueryExtension": true,
      "toolChoice": true,
      "functionCall": false,
      "customCQPrompt": "",
      "customExtractPrompt": "",
      "defaultSystemChatPrompt": "",
      "defaultConfig": {}
    }
  ],
  "vectorModels": [
    {
      "model": "m3e-base",
      "name": "m3e-base",
      "charsPointsPrice": 0,
      "defaultToken": 256,
      "maxToken": 512,
      "weight": 100,
      "defaultConfig": {},
      "dbConfig": {},
      "queryConfig": {}
    },
    {
      "model": "text-embedding-ada-002",
      "name": "Embedding-2",
      "avatar": "/imgs/model/openai.svg",
      "charsPointsPrice": 0,
      "defaultToken": 700,
      "maxToken": 3000,
      "weight": 100,
      "defaultConfig": {},
      "dbConfig": {},
      "queryConfig": {}
    },
    {
      "model": "text-embedding-3-large",
      "name": "text-embedding-3-large",
      "avatar": "/imgs/model/openai.svg",
      "charsPointsPrice": 0,
      "defaultToken": 512,
      "maxToken": 3000,
      "weight": 100,
      "defaultConfig": {
        "dimensions": 1024
      }
    },
    {
      "model": "embeding3",
      "name": "embeding3",
      "avatar": "/imgs/model/chatglm.svg",
      "charsPointsPrice": 0,
      "defaultToken": 512,
      "maxToken": 3000,
      "weight": 100
    }
  ],
  "reRankModels": [
    {
      "model": "bge-reranker-v2-m3",
      "name": "bge-reranker-v2-m3"
    }
  ],
  "audioSpeechModels": [
    {
      "model": "tts-1",
      "name": "OpenAI TTS1",
      "charsPointsPrice": 0,
      "voices": [
        { "label": "Alloy", "value": "alloy", "bufferId": "openai-Alloy" },
        { "label": "Echo", "value": "echo", "bufferId": "openai-Echo" },
        { "label": "Fable", "value": "fable", "bufferId": "openai-Fable" },
        { "label": "Onyx", "value": "onyx", "bufferId": "openai-Onyx" },
        { "label": "Nova", "value": "nova", "bufferId": "openai-Nova" },
        { "label": "Shimmer", "value": "shimmer", "bufferId": "openai-Shimmer" }
      ]
    }
  ],
  "whisperModel": {
    "model": "whisper-1",
    "name": "Whisper1",
    "charsPointsPrice": 0
  }
}
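Before restarting FastGPT, it can be worth a quick sanity check that config.json is valid JSON and the model ids are unique. This helper is not part of the course files, just a small convenience:

```python
import json

def check_config(text: str) -> list[str]:
    """Parse a FastGPT-style config and return its llmModels ids.

    Raises on JSON syntax errors or duplicate model ids, two easy
    mistakes to make when hand-editing the file.
    """
    cfg = json.loads(text)  # raises ValueError on syntax errors
    ids = [m["model"] for m in cfg.get("llmModels", [])]
    if len(ids) != len(set(ids)):
        raise ValueError("duplicate model ids in llmModels")
    return ids

if __name__ == "__main__":
    with open("config.json", encoding="utf-8") as f:
        print(check_config(f.read()))
```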

Start the containers

cd fastgpt # change into the directory first

In the fastgpt directory, run the following command to start the system:

docker-compose up -d

Use the following command to restart FastGPT and apply earlier changes:

sudo docker restart fastgpt # restart to apply the changed configuration

Adding the model in OneAPI

Open a browser and enter:

192.168.31.143:3001

First confirm that Ollama is running; check its status with:

systemctl status ollama

Then list the models available in Ollama:

ollama list

Under 'Channels', add a new channel: set the type to Ollama and fill in the models that Ollama actually has.

Test the newly added channel.

Create a simple chat app in FastGPT

Enter the following address in the browser:

192.168.31.143:3000

Select the model just added through OneAPI:

Model context lengths can be looked up in the Ollama docs: https://ollama.com/blog
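Note that Ollama runs models with a fairly small default context window (commonly 2048 tokens) regardless of the maximum the model supports. If FastGPT needs a longer context, one option is to register a derived model via a Modelfile and `ollama create qwen2-32k -f Modelfile`; the model name and value below are illustrative assumptions:

```
# Hypothetical Modelfile raising the runtime context window
FROM qwen2
PARAMETER num_ctx 32768
```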

Chat with the model:
