How can a Query Node be replaced quickly? #39057
-
When too much data is released/loaded in one go, the usual concern is memory pressure or query jitter. That said, this really should be improvable. @weiliu1031 please help on it
-
Milvus uses a Balancer to migrate data from QueryNodes that are about to go offline. It provides the following parameters to control the frequency and batch size of balancing operations, ensuring minimal impact on query performance:
By default, Milvus controls the workload of balancing operations primarily through batch size limits to reduce their impact on query performance. However, in your case, the Solutions:
It is recommended to select the appropriate adjustment strategy based on your specific business requirements and sensitivity to query performance.
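To make the trade-off between balancing frequency and batch size concrete, here is a rough back-of-the-envelope sketch. It is not Milvus code: the function name, the segment count, and the per-round batch size are hypothetical values chosen for illustration, and it assumes the first batch moves immediately and every later batch waits one full check interval. Only the 600000 ms interval comes from the question.

```python
import math


def estimate_drain_wait_ms(segments_to_move: int,
                           segments_per_round: int,
                           check_interval_ms: int) -> int:
    """Estimate the time spent waiting between balance rounds.

    Assumption (not taken from Milvus source): one batch of segments is
    moved per balance check, the first batch starts right away, and each
    later batch waits one full check interval.
    """
    if segments_per_round <= 0:
        raise ValueError("segments_per_round must be positive")
    if segments_to_move <= 0:
        return 0
    rounds = math.ceil(segments_to_move / segments_per_round)
    return (rounds - 1) * check_interval_ms


# Hypothetical workload: 1000 segments on the node being drained, 50 per round.
baseline = estimate_drain_wait_ms(1000, 50, 600_000)       # 10-minute interval
shorter_interval = estimate_drain_wait_ms(1000, 50, 60_000)  # 1-minute interval
larger_batch = estimate_drain_wait_ms(1000, 200, 600_000)    # 4x batch size

for label, ms in [("baseline", baseline),
                  ("shorter interval", shorter_interval),
                  ("larger batch", larger_batch)]:
    print(f"{label}: {ms / 3_600_000:.2f} h")
```

Under these assumptions, either shortening the interval or enlarging the batch reduces the total drain time roughly proportionally. Which one is preferable depends on whether the coordinator load or the per-round query impact is the bigger concern.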
-
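Cluster 2.5.2, 600k partitions. For various reasons we need to replace query nodes, for example for a version upgrade or a config change.
The replacement process is: start a new query node, then terminate an existing query node (that node keeps offloading data), while the newly started query node keeps loading the data.
But this process is very slow, taking roughly several hours. Looking at the query node log, it just keeps printing "migrate data...".
Later, by chance, I noticed that the terminated query node offloads one batch of data every 10 minutes, and the newly started query node likewise loads one batch every 10 minutes. Then everything stops, and it continues ten minutes later.
Meanwhile, my checkBalanceInterval is 600000, which is exactly ten minutes.
Is the "migrate data" logic carried out by the balancer? (That is, segments are labeled first, and then each time the balancer runs it creates tasks to release / load segments.)
If so, is there any option other than lowering checkBalanceInterval (making the check more frequent)? For example, can more segments be released / loaded per round? Ideally all segments would be released and loaded in one go, rather than batch by batch, so that shutting down a single query node does not take several hours.
Also, I set checkBalanceInterval high on purpose, because I don't want to overload the coordinator and the query nodes.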