Dify Platform Load Testing
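Background and Objectives
Before the Dify agent platform goes into production use, load testing is an essential step. By simulating extreme conditions, the test evaluates how the platform performs under high request concurrency and many simultaneous users, in order to verify its stability and reliability.
This load test has the following objectives:
- Measure the TPS of a minimal 8-core / 16 GB deployment
- Measure the TPS achievable with 16 cores / 32 GB and identify the best deployment architecture for that budget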
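Test Scenarios
Scenario 1: simple Chatflow
Load-test the chat-messages endpoint of the simplest possible Dify Chatflow application.
Scenario 2: complex Chatflow
Load-test the chat-messages endpoint of a Dify application that touches every service.
Scenario 3: file retrieval
Load-test Dify's retrieve endpoint.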
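Load Testing Tool
The tool used for this test is Locust, an open-source performance and load testing tool for HTTP and other protocols, whose test scripts are written in Python.
Reason for the choice: Dify's chat-messages endpoint returns a streaming response. After comparing Locust with other tools (such as JMeter, k6, and wrk), Locust was chosen because a streaming endpoint can be called simply by setting a parameter in the test script.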
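Test Preparation
Configuring Dify
Simple Chatflow scenario
Create a new application of type Chatflow whose orchestration contains only two nodes, Start and Direct Reply, and set any reply content in the Direct Reply node.

Complex Chatflow scenario
Create a new application of type Chatflow whose orchestration contains nodes that involve every related service.

File retrieval scenario
Create a new knowledge base with Rerank unchecked, and upload one file to it. The file contains text and related images.
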
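Installing Locust
Make sure Python is installed on the load generator machine, then install Locust with the following command:
pip install locust
Writing the Locust script
The script is as follows: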
    from locust import HttpUser, TaskSet, task, between
    import time


    class ChatMessages(TaskSet):
        @task
        def chat_messages(self):
            url = "/chat-messages"
            headers = {
                "Authorization": "Bearer app-vUKfsDmFlRJCLkQcUSniuUYX",
                "Content-Type": "application/json",
            }
            payload = {
                "inputs": {},
                "query": "压测",
                "response_mode": "streaming",
                "user": "压测"
            }
            # Record the start time
            start_time = time.time()
            # Send the POST request; stream=True leaves the body open for incremental reads
            with self.client.post(url, json=payload, headers=headers, stream=True, catch_response=True, timeout=60) as response:
                try:
                    if response.status_code != 200:
                        response.failure(f"Unexpected status code: {response.status_code}")
                        return
                    # Time until the response headers arrive (reported here as TTFB)
                    ttfb = time.time() - start_time
                    print(f"TTFB: {ttfb:.2f}s")
                    # Read the streamed response line by line
                    chunk_count = 0
                    for line in response.iter_lines(decode_unicode=True):
                        if line.startswith("data:"):
                            chunk_count += 1
                            total_time = time.time() - start_time
                            print(f"Received chunk #{chunk_count}: {line} :{total_time:.2f}s")
                            if "message_end" in line:
                                total_time = time.time() - start_time
                                print(f"Total response time: {total_time:.2f}s")
                                break
                except Exception as e:
                    # Mark the request as failed while the response context is still open
                    response.failure(f"Request failed: {str(e)}")


    class Retrieve(TaskSet):
        @task
        def retrieve(self):
            url = "/datasets/66fb8951-bdfb-457c-8267-66b9e822a4ee/retrieve"
            headers = {
                "Authorization": "Bearer dataset-pqrDBWoy9UILq7zbHnCkN3dY",
                "Content-Type": "application/json",
            }
            payload = {
                "query": "流程审批是什么,如果有图片,请一起返回",
                "retrieval_model": {
                    "search_method": "hybrid_search",
                    "reranking_enable": False,
                    "reranking_mode": None,
                    "reranking_model": {
                        "reranking_provider_name": "",
                        "reranking_model_name": ""
                    },
                    "weights": None,
                    "score_threshold_enabled": False,
                    "score_threshold": None
                }
            }
            # Record the start time
            start_time = time.time()
            try:
                # Send the POST request
                response = self.client.post(url, json=payload, headers=headers, timeout=60)
                print(response.text)
                if response.status_code != 200:
                    print(f"Unexpected status code: {response.status_code}")
                    return
                # Record the request time (labelled TTFB in the output)
                ttfb = time.time() - start_time
                print(f"TTFB: {ttfb:.2f}s")
            except Exception as e:
                print(f"Request failed: {str(e)}")


    class ChatMessagesTest(HttpUser):
        # Task set this user class runs
        tasks = [ChatMessages]
        # Wait time between tasks
        # wait_time = between(1, 2)

        # Per-user cleanup, equivalent to teardown in pytest or unittest
        def on_stop(self):
            self.client.close()


    class RetrieveTest(HttpUser):
        # Task set this user class runs
        tasks = [Retrieve]
        # Wait time between tasks
        # wait_time = between(1, 2)

        # Per-user cleanup, equivalent to teardown in pytest or unittest
        def on_stop(self):
            self.client.close()
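The script decides that a stream has finished by searching each line for the substring "message_end", which would also match an answer that happens to contain that text. A stricter check could parse the JSON payload of each data line and read its event field. The following is a minimal sketch under the assumption that every data line carries a JSON object with an "event" key; the helper name sse_event is hypothetical and is not part of Locust or Dify.

```python
import json
from typing import Optional


def sse_event(line: str) -> Optional[str]:
    """Return the "event" field of one SSE data line, or None if there is none."""
    if not line.startswith("data:"):
        return None  # blank keep-alive lines, comments, other SSE fields
    body = line[len("data:"):].strip()
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return None  # payload is not JSON
    if not isinstance(payload, dict):
        return None
    return payload.get("event")


if __name__ == "__main__":
    samples = [
        'data: {"event": "message", "answer": "message_end is only text here"}',
        'data: {"event": "message_end"}',
    ]
    for raw in samples:
        print(sse_event(raw))
```

In the task, the condition on "message_end" could then become a comparison of sse_event(line) with "message_end", if the payload format matches this assumption.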
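Test Procedure
Starting Locust
Start Locust with the following command (test.py is the test script from the previous section):
locust -f test.py
Running the test
Open the Locust web UI in a browser, enter the parameters, and start the test (concurrency levels of 50, 100, and 150 users in turn, each lasting 5 minutes).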
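If the three concurrency levels were chained into a single run instead of being entered by hand, the schedule could be described as a list of stages. The sketch below only computes the target user count for a given elapsed time; the names STAGES and users_at are hypothetical. Locust's LoadTestShape class, whose tick() method returns a (user_count, spawn_rate) tuple or None to stop, is one place such a function could be called from, but that wiring is left out here.

```python
from typing import Optional

# (concurrent users, stage duration in seconds), taken from the text above.
STAGES = [(50, 300), (100, 300), (150, 300)]


def users_at(elapsed_s: float, stages=STAGES) -> Optional[int]:
    """Return the target user count at elapsed_s, or None once all stages are over."""
    stage_end = 0
    for users, duration in stages:
        stage_end += duration
        if elapsed_s < stage_end:
            return users
    return None


if __name__ == "__main__":
    for t in (0, 299, 300, 650, 900):
        print(t, users_at(t))
```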

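Metric tuning
During each run, check whether the performance metrics of the dify-api, dify-plugin-daemon, xinference, and ollama services look abnormal. Adjust whatever is abnormal and repeat the run to determine which metrics are Dify's performance bottlenecks. The metrics monitored are:
- CPU usage
- Memory usage
- Network I/O
- Disk I/O
- Thread count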
Resource and Configuration Tuning (8 cores / 16 GB)
Agent configuration:

Concurrent users: 100
Test duration: 3 minutes
First run
Service deployment details:
| Service | CPU (cores) | Memory (MB) |
| --- | --- | --- |
| xinference | 1 | 2048 |
| ollama | 1 | 2048 |
| dify-sandbox | 0.8 | 1024 |
| dify-nginx | 0.5 | 1024 |
| dify-web | 0.5 | 1024 |
| dify-ssrf | 0.3 | 300 |
| dify-plugin-daemon | 0.8 | 2772 |
| dify-worker | 0.8 | 1024 |
| dify-api | 0.8 | 2048 |
| dify-weaviate | 0.5 | 1024 |
| dify-redis | 0.5 | 1024 |
| dify-postgres | 0.5 | 1024 |
| Total | 8 | 16384 |
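Every later run reshuffles this allocation, so it helps to check the totals against the budget mechanically. The sketch below sums the per-service values from the table above and compares them with 8 cores and 16 GB; it assumes the memory column is in MB, and the names ALLOCATION, totals, and within_budget are hypothetical.

```python
# (CPU cores, memory in MB) per service, copied from the first-run table.
ALLOCATION = {
    "xinference": (1.0, 2048),
    "ollama": (1.0, 2048),
    "dify-sandbox": (0.8, 1024),
    "dify-nginx": (0.5, 1024),
    "dify-web": (0.5, 1024),
    "dify-ssrf": (0.3, 300),
    "dify-plugin-daemon": (0.8, 2772),
    "dify-worker": (0.8, 1024),
    "dify-api": (0.8, 2048),
    "dify-weaviate": (0.5, 1024),
    "dify-redis": (0.5, 1024),
    "dify-postgres": (0.5, 1024),
}

CPU_BUDGET = 8.0           # cores
MEM_BUDGET_MB = 16 * 1024  # 16 GB expressed in MB (assumed unit)


def totals(allocation):
    """Return (total CPU cores, total memory in MB) for an allocation."""
    cpu = round(sum(c for c, _ in allocation.values()), 2)  # round away float noise
    mem = sum(m for _, m in allocation.values())
    return cpu, mem


def within_budget(allocation, cpu_budget=CPU_BUDGET, mem_budget=MEM_BUDGET_MB):
    """True if the allocation fits inside both the CPU and the memory budget."""
    cpu, mem = totals(allocation)
    return cpu <= cpu_budget and mem <= mem_budget


if __name__ == "__main__":
    print(totals(ALLOCATION), within_budget(ALLOCATION))
```

Changing a single entry, for example raising dify-api to 1 core, makes it immediately visible when an experiment temporarily exceeds the budget.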
Results
| Name | # Requests | # Fails | Median (ms) | 95%ile (ms) | 99%ile (ms) | Average (ms) | Min (ms) | Max (ms) | Average size (bytes) | Current RPS | Current Failures/s |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /v1/chat-messages | 5319 | 0 | 2200 | 4100 | 9000 | 2262.98 | 37 | 10843 | 0 | 21.9 | 0 |
During the run, the CPU of the dify-api service was saturated.
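Current RPS is a snapshot of recent throughput, so it can differ from the average over the whole run. A rough whole-run figure can be derived from the request count, as sketched below. This assumes the 3-minute duration stated earlier and that the request count covers the entire run; the function name average_rps is hypothetical.

```python
def average_rps(total_requests: int, duration_s: float) -> float:
    """Requests per second averaged over the whole run."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return total_requests / duration_s


if __name__ == "__main__":
    # 5319 requests from the results table, 3-minute run as stated earlier.
    print(round(average_rps(5319, 3 * 60), 2))
```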
Second run
The CPU of the dify-api service was increased. Configuration:
| Service | CPU (cores) | Memory (MB) |
| --- | --- | --- |
| xinference | 1 | 2048 |
| ollama | 1 | 2048 |
| dify-sandbox | 0.8 | 1024 |
| dify-nginx | 0.5 | 1024 |
| dify-web | 0.5 | 1024 |
| dify-ssrf | 0.3 | 300 |
| dify-plugin-daemon | 0.8 | 2772 |
| dify-worker | 0.8 | 1024 |
| dify-api | 1 | 2048 |
| dify-weaviate | 0.5 | 1024 |
| dify-redis | 0.5 | 1024 |
| dify-postgres | 0.5 | 1024 |
| Total | 8.2 | 16384 |
Results
| Name | # Requests | # Fails | Median (ms) | 95%ile (ms) | 99%ile (ms) | Average (ms) | Min (ms) | Max (ms) | Average size (bytes) | Current RPS | Current Failures/s |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /v1/chat-messages | 5932 | 0 | 1500 | 3400 | 7900 | 1636.67 | 40 | 13931 | 0 | 31.2 | 0 |
RPS rose from 21.9 to 31.2, which indicates that giving dify-api more CPU raises RPS.
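To put the two changes side by side, the relative change in dify-api CPU (0.8 to 1 core) can be compared with the relative change in Current RPS (21.9 to 31.2). The helper below is a small sketch for that arithmetic; the name relative_change is hypothetical, and since Current RPS is a point-in-time reading the result is indicative only.

```python
def relative_change(before: float, after: float) -> float:
    """Percentage change from before to after."""
    if before == 0:
        raise ValueError("before must be non-zero")
    return (after - before) / before * 100.0


if __name__ == "__main__":
    print(f"dify-api CPU: {relative_change(0.8, 1.0):+.1f}%")
    print(f"Current RPS:  {relative_change(21.9, 31.2):+.1f}%")
```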
Third run
The dify-api CPU was increased again, and the other services were rebalanced so that the total stays within 8 cores / 16 GB. Configuration:
| Service | CPU (cores) | Memory (MB) | Environment variables |
| --- | --- | --- | --- |
| xinference | 0.5 | 2048 | |
| ollama | 0.5 | 2048 | |
| dify-sandbox | 0.6 | 1024 | |
| dify-nginx | 0.5 | 1024 | |
| dify-web | 0.5 | 1024 | |
| dify-ssrf | 0.3 | 300 | |
| dify-plugin-daemon | 0.8 | 2772 | |
| dify-worker | 0.8 | 1024 | |
| dify-api | 2 | 2048 | SERVER_WORKER_AMOUNT=2 |
| dify-weaviate | 0.5 | 1024 | |
| dify-redis | 0.5 | 1024 | |
| dify-postgres | 0.5 | 1024 | |
| Total | 8 | 16384 | |
Results
| Name | # Requests | # Fails | Median (ms) | 95%ile (ms) | 99%ile (ms) | Average (ms) | Min (ms) | Max (ms) | Average size (bytes) | Current RPS | Current Failures/s |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /v1/chat-messages | 6741 | 0 | 1600 | 4700 | 9000 | 1675.86 | 39 | 9897 | 0 | 49.5 | 0 |
RPS rose to 49.5.
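Fourth run
Under the 8-core / 16 GB limit there are no resources left to reallocate. The runs also showed that the endpoint ultimately reads and writes its data through PostgreSQL, so the PostgreSQL configuration was tuned as follows:
# Memory settings
shared_buffers = 256MB
work_mem = 16MB
work_mem = 17MB
work_mem = 18MB
# Concurrent connections
max_connections = 200
# Checkpoints
checkpoint_timeout = 15min
# Log slow statements
log_min_duration_statement = 200ms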
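When a postgresql.conf file sets the same parameter more than once, PostgreSQL ignores all but the last entry, so it is worth checking which value is actually in effect. The sketch below parses simple key = value lines and reports the effective values and any repeated keys. It is a simplified reader, not PostgreSQL's own parser (it ignores quoting and include directives), and the names CONF and parse_conf are hypothetical.

```python
from collections import defaultdict

# Settings as listed above.
CONF = """\
shared_buffers = 256MB
work_mem = 16MB
work_mem = 17MB
work_mem = 18MB
max_connections = 200
checkpoint_timeout = 15min
log_min_duration_statement = 200ms
"""


def parse_conf(text):
    """Return (effective, duplicates) for simple 'key = value' lines."""
    seen = defaultdict(list)
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        seen[key].append(value)
    effective = {key: values[-1] for key, values in seen.items()}  # last entry wins
    duplicates = {key: values for key, values in seen.items() if len(values) > 1}
    return effective, duplicates


if __name__ == "__main__":
    effective, duplicates = parse_conf(CONF)
    print("effective work_mem:", effective["work_mem"])
    print("repeated keys:", duplicates)
```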
Results
| Name | # Requests | # Fails | Median (ms) | 95%ile (ms) | 99%ile (ms) | Average (ms) | Min (ms) | Max (ms) | Average size (bytes) | Current RPS | Current Failures/s |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /v1/chat-messages | 6688 | 0 | 1700 | 4600 | 13000 | 1708.36 | 40 | 18824 | 0 | 54.1 | 0 |
RPS did not improve noticeably (49.5 to 54.1).
Fifth run
PostgreSQL was given more resources. Configuration:
| Service | CPU (cores) | Memory (MB) | Environment variables |
| --- | --- | --- | --- |
| xinference | 0.5 | 2048 | |
| ollama | 0.5 | 2048 | |
| dify-sandbox | 0.6 | 1024 | |
| dify-nginx | 0.5 | 1024 | |
| dify-web | 0.5 | 1024 | |
| dify-ssrf | 0.3 | 300 | |
| dify-plugin-daemon | 0.8 | 2772 | |
| dify-worker | 0.8 | 1024 | |
| dify-api | 2 | 2048 | SERVER_WORKER_AMOUNT=2 |
| dify-weaviate | 0.5 | 1024 | |
| dify-redis | 0.5 | 1024 | |
| dify-postgres | 1 | 2048 | |
| Total | 8.5 | 17408 | |
Results
| Name | # Requests | # Fails | Median (ms) | 95%ile (ms) | 99%ile (ms) | Average (ms) | Min (ms) | Max (ms) | Average size (bytes) | Current RPS | Current Failures/s |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /v1/chat-messages | 7748 | 0 | 1500 | 3900 | 6700 | 1437.3 | 41 | 7148 | 0 | 48.4 | 0 |
RPS did not improve noticeably. The PostgreSQL slow log showed a large number of slow queries and slow transactions.

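The slow query is one that the tested endpoint executes. The table definition confirms that the query can use an index, and the table contains no data, so the query itself should not be slow. The NFS volume backing PostgreSQL in the Kubernetes cluster was therefore checked with the nfsiostat command: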
192.168.83.248:/mnt/nfs_share/dify-postgres mounted on /var/lib/kubelet/pods/4e22098e-0df1-4a90-9126-bf91e937772f/volumes/kubernetes.io~nfs/postgres-data:
op/s rpc bklog
28.87 0.00
read: ops/s kB/s kB/op retrans avg RTT (ms) avg exe (ms)
4.509 62.329 13.823 0 (0.0%) 1.007 5.010
write: ops/s kB/s kB/op retrans avg RTT (ms) avg exe (ms)
9.524 140.775 14.781 0 (0.0%) 5.562 160.093
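The average execution time of write operations (about 160 ms) is far higher than that of reads (about 5 ms). This can hurt PostgreSQL performance, especially at transaction commit (COMMIT), so the lack of RPS improvement after giving PostgreSQL more resources is most likely caused by NFS performance.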
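The read/write comparison can be extracted from the nfsiostat output rather than read off by eye. The sketch below assumes the two-line layout shown above, where each read: or write: header is followed by a line of values whose last column is avg exe (ms); the names SAMPLE and avg_exe_ms are hypothetical.

```python
import re

# The read/write part of the nfsiostat output shown above.
SAMPLE = """\
read: ops/s kB/s kB/op retrans avg RTT (ms) avg exe (ms)
4.509 62.329 13.823 0 (0.0%) 1.007 5.010
write: ops/s kB/s kB/op retrans avg RTT (ms) avg exe (ms)
9.524 140.775 14.781 0 (0.0%) 5.562 160.093
"""


def avg_exe_ms(text):
    """Map 'read' and 'write' to the avg exe (ms) value on the line after each header."""
    result = {}
    lines = text.splitlines()
    for header, values in zip(lines, lines[1:]):
        match = re.match(r"\s*(read|write):", header)
        if match:
            result[match.group(1)] = float(values.split()[-1])  # last column
    return result


if __name__ == "__main__":
    exe = avg_exe_ms(SAMPLE)
    print(exe, f"write/read ratio: {exe['write'] / exe['read']:.1f}")
```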
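Sixth run
PostgreSQL storage was switched to a directly attached SSD.
Results
RPS did not improve noticeably, but the performance curve was smoother.

The number and duration of slow transactions in the database slow log decreased.

The CPU of the dify-api service was again found to be saturated.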
Tuning conclusion
Within the 8-core / 16 GB limit, no further resource or configuration tuning is possible, so the best deployment configuration under this limit is:
| Service | CPU (cores) | Memory (MB) | Environment variables |
| --- | --- | --- | --- |
| xinference | 0.5 | 2048 | |
| ollama | 0.5 | 2048 | |
| dify-sandbox | 0.6 | 1024 | |
| dify-nginx | 0.5 | 1024 | |
| dify-web | 0.5 | 1024 | |
| dify-ssrf | 0.3 | 300 | |
| dify-plugin-daemon | 0.8 | 2772 | |
| dify-worker | 0.8 | 1024 | |
| dify-api | 2 | 2048 | SERVER_WORKER_AMOUNT=2 |
| dify-weaviate | 0.5 | 1024 | |
| dify-redis | 0.5 | 1024 | |
| dify-postgres | 0.5 | 1024 | |
| Total | 8 | 16384 | |
Endpoint test results
Scenario 1
Results
| Name | # Requests | # Fails | Median (ms) | 95%ile (ms) | 99%ile (ms) | Average (ms) | Min (ms) | Max (ms) | Average size (bytes) | Current RPS | Current Failures/s |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /v1/chat-messages | 7748 | 0 | 1500 | 3900 | 6700 | 1437.3 | 41 | 7148 | 0 | 48.4 | 0 |
Scenario 2
Results
| Name | # Requests | # Fails | Median (ms) | 95%ile (ms) | 99%ile (ms) | Average (ms) | Min (ms) | Max (ms) | Average size (bytes) | Current RPS | Current Failures/s |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /v1/chat-messages | 2752 | 0 | 4900 | 11000 | 14000 | 5688.9 | 3552 | 14582 | 0 | 20.7 | 0 |
Scenario 3
Results
| Name | # Requests | # Fails | Median (ms) | 95%ile (ms) | 99%ile (ms) | Average (ms) | Min (ms) | Max (ms) | Average size (bytes) | Current RPS | Current Failures/s |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /v1/datasets/da0bcf35-abc5-4c77-8e2b-4e890b93b61c/retrieve | 3342 | 0 | 5300 | 6700 | 7300 | 5287.1 | 450 | 8038 | 2625 | 17.6 | 0 |

Load test conclusions
Best deployment configuration for 8 cores / 16 GB
| Service | CPU (cores) | Memory (MB) | Environment variables |
| --- | --- | --- | --- |
| xinference | 0.5 | 2048 | |
| ollama | 0.5 | 2048 | |
| dify-sandbox | 0.6 | 1024 | |
| dify-nginx | 0.5 | 1024 | |
| dify-web | 0.5 | 1024 | |
| dify-ssrf | 0.3 | 300 | |
| dify-plugin-daemon | 0.8 | 2772 | |
| dify-worker | 0.8 | 1024 | |
| dify-api | 2 | 2048 | SERVER_WORKER_AMOUNT=2 |
| dify-weaviate | 0.5 | 1024 | |
| dify-redis | 0.5 | 1024 | |
| dify-postgres | 0.5 | 1024 | |
| Total | 8 | 16384 | |
TPS
- Simple chatflow scenario TPS: 48.4
- Complex chatflow scenario TPS: 20.7
- File retrieval scenario TPS: 17.6
Resource and Configuration Tuning (16 cores / 32 GB)
First run
The resources of the dify-api service were increased. Configuration:
| Service | CPU (cores) | Memory (MB) | Environment variables |
| --- | --- | --- | --- |
| xinference | 0.5 | 2048 | |
| ollama | 0.5 | 2048 | |
| dify-sandbox | 0.6 | 1024 | |
| dify-nginx | 0.5 | 1024 | |
| dify-web | 0.5 | 1024 | |
| dify-ssrf | 0.3 | 300 | |
| dify-plugin-daemon | 0.8 | 2772 | |
| dify-worker | 0.8 | 1024 | |
| dify-api | 4 | 4096 | SERVER_WORKER_AMOUNT=4 |
| dify-weaviate | 0.5 | 1024 | |
| dify-redis | 0.5 | 1024 | |
| dify-postgres | 0.5 | 1024 | |
| Total | 10 | 18422 | |
Results
| Name | # Requests | # Fails | Median (ms) | 95%ile (ms) | 99%ile (ms) | Average (ms) | Min (ms) | Max (ms) | Average size (bytes) | Current RPS | Current Failures/s |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /v1/chat-messages | 3537 | 0 | 3800 | 7700 | 11000 | 4120.93 | 1518 | 12092 | 0 | 22.9 | 0 |
RPS dropped to 22.9. The dify-plugin-daemon log showed longer request response times and a large number of slow SQL statements.

Second run
PostgreSQL was given more resources. Configuration:
| Service | CPU (cores) | Memory (MB) | Environment variables |
| --- | --- | --- | --- |
| xinference | 0.5 | 2048 | |
| ollama | 0.5 | 2048 | |
| dify-sandbox | 0.6 | 1024 | |
| dify-nginx | 0.5 | 1024 | |
| dify-web | 0.5 | 1024 | |
| dify-ssrf | 0.3 | 300 | |
| dify-plugin-daemon | 0.8 | 2772 | |
| dify-worker | 0.8 | 1024 | |
| dify-api | 4 | 4086 | SERVER_WORKER_AMOUNT=4 |
| dify-weaviate | 0.5 | 1024 | |
| dify-redis | 0.5 | 1024 | |
| dify-postgres | 1 | 2048 | |
| Total | 10.5 | 19446 | |
Results
| Name | # Requests | # Fails | Median (ms) | 95%ile (ms) | 99%ile (ms) | Average (ms) | Min (ms) | Max (ms) | Average size (bytes) | Current RPS | Current Failures/s |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /v1/chat-messages | 14167 | 0 | 910 | 1800 | 2000 | 1032.79 | 759 | 2807 | 0 | 91.3 | 0 |
RPS rose to 91.3. Response times in the dify-plugin-daemon log became shorter and no slow SQL was logged.

Third run
The resources per dify-api instance were reduced and the number of instances was increased to 2. Configuration:
| Service | CPU (cores) | Memory (MB) | Environment variables |
| --- | --- | --- | --- |
| xinference | 0.5 | 2048 | |
| ollama | 0.5 | 2048 | |
| dify-sandbox | 0.6 | 1024 | |
| dify-nginx | 0.5 | 1024 | |
| dify-web | 0.5 | 1024 | |
| dify-ssrf | 0.3 | 300 | |
| dify-plugin-daemon | 0.8 | 2772 | |
| dify-worker | 0.8 | 1024 | |
| dify-api | 2 | 2048 | SERVER_WORKER_AMOUNT=2 |
| dify-api | 2 | 2048 | |
| dify-weaviate | 0.5 | 1024 | |
| dify-redis | 0.5 | 1024 | |
| dify-postgres | 1 | 2048 | |
| Total | 10.5 | 19456 | |
Results
| Name | # Requests | # Fails | Median (ms) | 95%ile (ms) | 99%ile (ms) | Average (ms) | Min (ms) | Max (ms) | Average size (bytes) | Current RPS | Current Failures/s |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /v1/chat-messages | 14222 | 0 | 940 | 2800 | 3800 | 1067.39 | 37 | 5519 | 0 | 95.1 | 0 |
RPS was 95.1, with peaks above 100 during the run, but the performance curve showed pronounced spikes.

Fourth run
SERVER_WORKER_AMOUNT for dify-api was changed to 3.
Results
| Name | # Requests | # Fails | Median (ms) | 95%ile (ms) | 99%ile (ms) | Average (ms) | Min (ms) | Max (ms) | Average size (bytes) | Current RPS | Current Failures/s |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /v1/chat-messages | 12241 | 0 | 1000 | 2600 | 2800 | 1098.56 | 38 | 4751 | 0 | 59.9 | 0 |
RPS was 59.9, with a peak of 85, and the performance curve was smooth.

Fifth run
The number of dify-api instances was changed to 3. Configuration:
| Service | CPU (cores) | Memory (MB) | Environment variables |
| --- | --- | --- | --- |
| xinference | 0.5 | 2048 | |
| ollama | 0.5 | 2048 | |
| dify-sandbox | 0.6 | 1024 | |
| dify-nginx | 0.5 | 1024 | |
| dify-web | 0.5 | 1024 | |
| dify-ssrf | 0.3 | 300 | |
| dify-plugin-daemon | 0.8 | 2772 | |
| dify-worker | 0.8 | 1024 | |
| dify-api | 2 | 2048 | SERVER_WORKER_AMOUNT=2 |
| dify-api | 2 | 2048 | |
| dify-api | 2 | 2048 | |
| dify-weaviate | 0.5 | 1024 | |
| dify-redis | 0.5 | 1024 | |
| dify-postgres | 1 | 2048 | |
| Total | 12.5 | 21504 | |
Results
| Name | # Requests | # Fails | Median (ms) | 95%ile (ms) | 99%ile (ms) | Average (ms) | Min (ms) | Max (ms) | Average size (bytes) | Current RPS | Current Failures/s |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /v1/chat-messages | 4980 | 0 | 2000 | 5300 | 9000 | 2423.31 | 126 | 12645 | 0 | 26.3 | 0 |
RPS dropped to 26.3. The dify-plugin-daemon log showed longer request response times and a large number of slow SQL statements.

Sixth run
PostgreSQL was given more resources. Configuration:
| Service | CPU (cores) | Memory (MB) | Environment variables |
| --- | --- | --- | --- |
| xinference | 0.5 | 2048 | |
| ollama | 0.5 | 2048 | |
| dify-sandbox | 0.6 | 1024 | |
| dify-nginx | 0.5 | 1024 | |
| dify-web | 0.5 | 1024 | |
| dify-ssrf | 0.3 | 300 | |
| dify-plugin-daemon | 0.8 | 2772 | |
| dify-worker | 0.8 | 1024 | |
| dify-api | 2 | 2048 | SERVER_WORKER_AMOUNT=2 |
| dify-api | 2 | 2048 | |
| dify-api | 2 | 2048 | |
| dify-weaviate | 0.5 | 1024 | |
| dify-redis | 0.5 | 1024 | |
| dify-postgres | 2 | 3072 | |
| Total | 13.5 | 22528 | |
Results
| Name | # Requests | # Fails | Median (ms) | 95%ile (ms) | 99%ile (ms) | Average (ms) | Min (ms) | Max (ms) | Average size (bytes) | Current RPS | Current Failures/s |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /v1/chat-messages | 18552 | 0 | 410 | 2000 | 2500 | 667.39 | 34 | 3910 | 0 | 98.4 | 0 |
RPS was 98.4, with a peak of 124. The tuning target for scenario 1 was reached.

Test results
Scenario 1
| Name | # Requests | # Fails | Median (ms) | 95%ile (ms) | 99%ile (ms) | Average (ms) | Min (ms) | Max (ms) | Average size (bytes) | Current RPS | Current Failures/s |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /v1/chat-messages | 18552 | 0 | 410 | 2000 | 2500 | 667.39 | 34 | 3910 | 0 | 98.4 | 0 |

Scenario 2
Results
| Name | # Requests | # Fails | Median (ms) | 95%ile (ms) | 99%ile (ms) | Average (ms) | Min (ms) | Max (ms) | Average size (bytes) | Current RPS | Current Failures/s |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /v1/chat-messages | 5648 | 0 | 350 | 4200 | 5500 | 979.49 | 36 | 7128 | 0 | 61.2 | 0 |
Scenario 3
Results
| Name | # Requests | # Fails | Median (ms) | 95%ile (ms) | 99%ile (ms) | Average (ms) | Min (ms) | Max (ms) | Average size (bytes) | Current RPS | Current Failures/s |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /v1/datasets/da0bcf35-abc5-4c77-8e2b-4e890b93b61c/retrieve | 7312 | 0 | 1900 | 6100 | 8300 | 2411.12 | 91 | 15778 | 2625 | 44.3 | 0 |

Load test conclusions
Best deployment configuration for 16 cores / 32 GB
| Service | CPU (cores) | Memory (MB) | Environment variables |
| --- | --- | --- | --- |
| xinference | 1 | 2048 | |
| ollama | 1 | 2048 | |
| dify-sandbox | 0.6 | 1024 | |
| dify-nginx | 0.5 | 1024 | |
| dify-web | 0.5 | 1024 | |
| dify-ssrf | 0.3 | 300 | |
| dify-plugin-daemon | 0.8 | 2772 | |
| dify-worker | 0.8 | 1024 | |
| dify-api | 2 | 2048 | SERVER_WORKER_AMOUNT=2 |
| dify-api | 2 | 2048 | |
| dify-api | 2 | 2048 | |
| dify-weaviate | 1 | 2048 | |
| dify-redis | 0.5 | 1024 | |
| dify-postgres | 2 | 3072 | |
| Total | 15 | 24576 | |
TPS
- Simple chatflow scenario TPS: 98.4
- Complex chatflow scenario TPS: 61.2
- File retrieval scenario TPS: 44.3
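The TPS figures for the two budgets can be compared per scenario. The sketch below divides the 16-core figure by the 8-core figure for each scenario, using the values reported in the two conclusion sections; the names TPS_8C, TPS_16C, and speedups are hypothetical.

```python
# TPS per scenario as reported in the 8-core and 16-core conclusions.
TPS_8C = {"simple chatflow": 48.4, "complex chatflow": 20.7, "file retrieval": 17.6}
TPS_16C = {"simple chatflow": 98.4, "complex chatflow": 61.2, "file retrieval": 44.3}


def speedups(small, large):
    """Return the large/small TPS ratio per scenario, rounded to two decimals."""
    return {name: round(large[name] / small[name], 2) for name in small}


if __name__ == "__main__":
    for name, ratio in speedups(TPS_8C, TPS_16C).items():
        print(f"{name}: x{ratio}")
```

A ratio above 2 means the scenario gained more than the doubling of the nominal budget alone would suggest, which fits the observation that throughput depended heavily on how resources were split between dify-api and PostgreSQL.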
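Recommendations for higher TPS requirements
- For the chatflow endpoint, try increasing the number of dify-api instances and improving the performance of PostgreSQL (deployed separately)
- For complex orchestrations, identify which service is the bottleneck in the specific scenario and tune that service
- For the file retrieval endpoint, try improving the performance of weaviate (deployed separately)
- These results do not cover model deployment; if models are deployed, increase the xinference and ollama resource allocations as needed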
Author: 道一云低代码