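[Dify] Pitfalls when migrating the vector database to Milvus
How to resolve "Create dataset index error: MilvusException" when migrating Dify's vector database to Milvus
Dify version: 0.7.2
Deployment: docker
The official documentation describes a vector database migration procedure, but it appears to target an older version and has not yet been updated for the current one. The overall approach is to change the environment variables and then run the vdb-migrate command provided by dify-api.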
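I. Backup
Back up the mounted directory (it mainly holds the database contents). The -z flag is needed so that the archive is actually gzip-compressed, as the .tgz suffix implies:
tar -czvf volumes-$(date +%s).tgz volumes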
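Before changing anything else, it may be worth confirming that the archive can be read back. The following is a minimal sketch, assuming a POSIX shell and a tar that supports -z; the function name backup_volumes is hypothetical and not part of Dify.

```shell
#!/bin/sh
# Hypothetical helper (not part of Dify): archive a directory and verify it.
# Usage: backup_volumes [dir]   (dir defaults to "volumes")
backup_volumes() {
    src="${1:-volumes}"
    archive="${src}-$(date +%s).tgz"
    # -c create, -z gzip (matches the .tgz suffix), -f output file
    tar -czf "$archive" "$src" || return 1
    # Listing the archive fails if it is truncated or corrupt
    tar -tzf "$archive" > /dev/null || return 1
    # Print the archive name so the caller can record it
    echo "$archive"
}
```

Run it from dify/docker, for example as backup_volumes volumes. It prints the archive name only if both the creation and the read-back succeed.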
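II. Modify the environment variables
Open dify/docker/.env and change the value of VECTOR_STORE to milvus (the default is weaviate).
(PS: I also changed the Milvus version in compose-config.yml and added Attu, a GUI management tool for Milvus. The full configuration is at the end of this post.)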
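The edit to .env can also be scripted. This is a minimal sketch, assuming a POSIX shell; set_env_var is a hypothetical helper, and it assumes the value contains no | or & characters. Note that the variable is spelled VECTOR_STORE.

```shell
#!/bin/sh
# Hypothetical helper: set KEY=VALUE in an env file, replacing an existing
# line or appending one if the key is absent.
# Assumes VALUE contains no '|' or '&' characters (they are special to sed).
# Usage: set_env_var docker/.env VECTOR_STORE milvus
set_env_var() {
    file="$1"; key="$2"; value="$3"
    tmp="${file}.tmp.$$"
    if grep -q "^${key}=" "$file"; then
        # Rewrite only the matching line. A temp file is used because
        # in-place sed flags differ between GNU and BSD sed.
        sed "s|^${key}=.*|${key}=${value}|" "$file" > "$tmp" && mv "$tmp" "$file"
    else
        echo "${key}=${value}" >> "$file"
    fi
}
```

Afterwards, grep '^VECTOR_STORE=' dify/docker/.env should show the new value.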
III. Restart
docker-compose down
docker-compose up -d
After startup, the Milvus-related containers should be running.
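A quick way to confirm this is to compare the running container names against the ones expected. The sketch below assumes the container_name values used in this post; missing_milvus_containers is a hypothetical helper. The Milvus services are declared under the milvus profile in the compose file, so Docker Compose only starts them when that profile is active (for example through the COMPOSE_PROFILES variable or the --profile flag). If a name is reported missing, that is the first thing to check.

```shell
#!/bin/sh
# Hypothetical helper: read container names on stdin (one per line) and print
# any expected Milvus container that is absent. Prints nothing if all are up.
# The expected names are the container_name values used in this post.
missing_milvus_containers() {
    running="$(cat)"
    for name in milvus-etcd milvus-minio milvus-standalone; do
        # -x matches the whole line, so "milvus-etcd-old" would not count
        echo "$running" | grep -qx "$name" || echo "$name"
    done
}
# Usage: docker ps --format '{{.Names}}' | missing_milvus_containers
```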
IV. Run the migration command
docker exec -it docker-api-1 flask vdb-migrate
The command failed with a connection error:
2024-09-02 02:11:43,655.655 ERROR [MainThread] [milvus_client.py:655] - Failed to create new connection using: 5993ac3c136a4864a15513e4d0f4f35a
Create dataset index error: MilvusException <MilvusException: (code=2, message=Fail connecting to server on 127.0.0.1:19530, illegal connection params or server unavailable)>
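I found an issue saying that this happens because the Milvus port is not exposed, so I applied the change it suggests: "Forgot to expose Milvus standalone port in yaml", Issue #7653, langgenius/dify (github.com).
Replace the milvus-standalone service in compose-config.yml with: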
  milvus-standalone:
    container_name: milvus-standalone
    image: milvusdb/milvus:v2.3.1
    profiles:
      - milvus
    command: [ "milvus", "run", "standalone" ]
    environment:
      ETCD_ENDPOINTS: ${ETCD_ENDPOINTS:-etcd:2379}
      MINIO_ADDRESS: ${MINIO_ADDRESS:-minio:9000}
      common.security.authorizationEnabled: ${MILVUS_AUTHORIZATION_ENABLED:-true}
    volumes:
      - ./volumes/milvus/milvus:/var/lib/milvus
    healthcheck:
      test: [ "CMD", "curl", "-f", "http://localhost:9091/healthz" ]
      interval: 30s
      start_period: 90s
      timeout: 20s
      retries: 3
    ports:
      - "19530:19530" # Milvus gRPC port
      - "9091:9091" # Milvus HTTP port
    depends_on:
      - "etcd"
      - "minio"
    networks:
      - milvus
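After restarting, the migration still failed with the same error.
I then tried connecting through Attu, and it connected to Milvus without any problem (see the screenshot).
After more than ten attempts, I arrived at the fix:
In compose-config.yml, add the Milvus settings to the environment section of the api service.
      MILVUS_HOST: milvus-standalone # container name of the Milvus service
      MILVUS_PORT: 19530 # port of the Milvus service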
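This fix makes sense given the error message: inside the api container, 127.0.0.1 refers to the api container itself, so publishing port 19530 on the host does not help, whereas Attu worked because it was pointed at milvus-standalone:19530. To check which endpoint the api container will actually use, its environment can be inspected. The sketch below is a hypothetical helper, not part of Dify. Treating 127.0.0.1:19530 as the fallback is an assumption taken from the error log above, and resolving the container name also requires that both containers share a Docker network.

```shell
#!/bin/sh
# Hypothetical helper: read `env` output on stdin and print the Milvus
# endpoint it implies. The 127.0.0.1:19530 fallback mirrors the address in
# the error log; treating it as the default is an assumption.
milvus_target() {
    host="127.0.0.1"; port="19530"
    # Split each line at the first '='; the rest of the line is the value
    while IFS='=' read -r key value; do
        case "$key" in
            MILVUS_HOST) host="$value" ;;
            MILVUS_PORT) port="$value" ;;
        esac
    done
    echo "${host}:${port}"
}
# Usage: docker exec docker-api-1 env | milvus_target
```

If the output still shows 127.0.0.1, the variables have not reached the container, and the migration will likely fail in the same way.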
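V. Related configuration changes
1. compose-config.yml content for Attu and the newer Milvus version. (Note: port 3000 was already in use on my host, so I mapped host port 18000 to Attu's port 3000. Adjust as needed.)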
  # Milvus vector database services
  etcd:
    container_name: milvus-etcd
    image: quay.io/coreos/etcd:v3.5.5
    profiles:
      - milvus
    environment:
      - ETCD_AUTO_COMPACTION_MODE=${ETCD_AUTO_COMPACTION_MODE:-revision}
      - ETCD_AUTO_COMPACTION_RETENTION=${ETCD_AUTO_COMPACTION_RETENTION:-1000}
      - ETCD_QUOTA_BACKEND_BYTES=${ETCD_QUOTA_BACKEND_BYTES:-4294967296}
      - ETCD_SNAPSHOT_COUNT=${ETCD_SNAPSHOT_COUNT:-50000}
    volumes:
      - ./volumes/milvus/etcd:/etcd
    command: etcd -advertise-client-urls=http://127.0.0.1:2379 -listen-client-urls http://0.0.0.0:2379 --data-dir /etcd
    healthcheck:
      test: [ "CMD", "etcdctl", "endpoint", "health" ]
      interval: 30s
      timeout: 20s
      retries: 3
    networks:
      - milvus

  minio:
    container_name: milvus-minio
    image: minio/minio:RELEASE.2023-03-20T20-16-18Z
    profiles:
      - milvus
    environment:
      MINIO_ACCESS_KEY: ${MINIO_ACCESS_KEY:-minioadmin}
      MINIO_SECRET_KEY: ${MINIO_SECRET_KEY:-minioadmin}
    ports:
      - "9001:9001"
      - "9000:9000"
    volumes:
      - ./volumes/milvus/minio:/minio_data
    command: minio server /minio_data --console-address ":9001"
    healthcheck:
      test: [ "CMD", "curl", "-f", "http://localhost:9000/minio/health/live" ]
      interval: 30s
      timeout: 20s
      retries: 3
    networks:
      - milvus

  milvus-standalone:
    container_name: milvus-standalone
    image: milvusdb/milvus:v2.4.9
    profiles:
      - milvus
    command: [ "milvus", "run", "standalone" ]
    environment:
      ETCD_ENDPOINTS: ${ETCD_ENDPOINTS:-etcd:2379}
      MINIO_ADDRESS: ${MINIO_ADDRESS:-minio:9000}
      common.security.authorizationEnabled: ${MILVUS_AUTHORIZATION_ENABLED:-true}
    volumes:
      - ./volumes/milvus/milvus:/var/lib/milvus
    healthcheck:
      test: [ "CMD", "curl", "-f", "http://localhost:9091/healthz" ]
      interval: 30s
      start_period: 90s
      timeout: 20s
      retries: 3
    ports:
      - "19530:19530"
      - "9091:9091"
    depends_on:
      - "etcd"
      - "minio"
    networks:
      - milvus

  attu:
    container_name: attu
    profiles:
      - milvus
    image: zilliz/attu:v2.4.7
    environment:
      MILVUS_URL: milvus-standalone:19530
    ports:
      - "18000:3000"
    depends_on:
      - "milvus-standalone"
    networks:
      - milvus
2. Environment variable settings for the api, worker, and web services (later testing showed that the variables must be set on all of them for things to work properly)
  # API service
  api:
    image: langgenius/dify-api:0.7.2
    restart: always
    environment:
      # Use the shared environment variables.
      <<: *shared-api-worker-env
      # Startup mode, 'api' starts the API server.
      MODE: api
      MILVUS_HOST: milvus-standalone # container name of the Milvus service
      MILVUS_PORT: 19530 # port of the Milvus service
    depends_on:
      - db
      - redis
    volumes:
      # Mount the storage directory to the container, for storing user files.
      - ./volumes/app/storage:/app/api/storage
    networks:
      - ssrf_proxy_network
      - default

  # worker service
  # The Celery worker for processing the queue.
  worker:
    image: langgenius/dify-api:0.7.2
    restart: always
    environment:
      # Use the shared environment variables.
      <<: *shared-api-worker-env
      # Startup mode, 'worker' starts the Celery worker for processing the queue.
      MODE: worker
      MILVUS_HOST: milvus-standalone # container name of the Milvus service
      MILVUS_PORT: 19530 # port of the Milvus service
    depends_on:
      - db
      - redis
    volumes:
      # Mount the storage directory to the container, for storing user files.
      - ./volumes/app/storage:/app/api/storage
    networks:
      - ssrf_proxy_network
      - default

  # Frontend web application.
  web:
    image: langgenius/dify-web:0.7.2
    restart: always
    environment:
      CONSOLE_API_URL: ${CONSOLE_API_URL:-}
      APP_API_URL: ${APP_API_URL:-}
      SENTRY_DSN: ${WEB_SENTRY_DSN:-}
      NEXT_TELEMETRY_DISABLED: ${NEXT_TELEMETRY_DISABLED:-0}
      MILVUS_HOST: milvus-standalone # container name of the Milvus service
      MILVUS_PORT: 19530 # port of the Milvus service