How does the partition key work and how do I use it? #39446
-
The main purpose of a partition is to split the data across different physical storage paths and narrow the search scope, which can improve search performance significantly. Say we have 1000 students whose exam scores range from 0 to 100, and we want to find a student whose score is 70. If the scores are split into partitions, only the partition that holds 70 needs to be searched. Will adding more scores change the number of partitions? No. If we split the 1000 scores into 10 partitions and then add 500 more scores, all of them still fall in the range [0, 100], so they land in the existing 10 partitions.
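To make the score example concrete, here is a minimal Python sketch. It assumes a simple range rule (`score // 10`) for assigning scores to partitions; that rule and the names `partition_of` and `build` are made up for illustration and are not how Milvus assigns data.

```python
import random

NUM_PARTITIONS = 10  # fixed up front, as in the example above


def partition_of(score: int) -> int:
    """Hypothetical rule: map a score in [0, 100] to one of 10 buckets."""
    return min(score // 10, NUM_PARTITIONS - 1)


def build(scores):
    """Distribute scores into a fixed set of partitions."""
    partitions = {i: [] for i in range(NUM_PARTITIONS)}
    for s in scores:
        partitions[partition_of(s)].append(s)
    return partitions


rng = random.Random(0)
scores = [rng.randint(0, 100) for _ in range(1000)]
partitions = build(scores)

# Looking for score 70 only requires scanning one partition, not all 1000 rows.
target = partition_of(70)
scanned = len(partitions[target])

# Adding 500 more scores grows the partitions but does not create new ones.
more = [rng.randint(0, 100) for _ in range(500)]
partitions_after = build(scores + more)

print(f"rows scanned for score 70: {scanned} of {len(scores)}")
print(f"partitions before: {len(partitions)}, after: {len(partitions_after)}")
```

The number of partitions is a property of the layout, not of how much data is inserted.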
num_partitions defines how many partitions are created, and all data is split across these partitions. You can change the num_partitions value to set the number of partitions. The default value is 16. There is no need to set it to a large value, since too many partitions might add overhead to the Milvus service. In my opinion, a value between 16 and 64 is recommended.
-
Currently, each collection supports a maximum of 1,024 partitions. For multi-tenancy scenarios, you can create multiple collections, each with multiple partitions. However, creating a collection or partition consumes significant resources, so it is not feasible to create an unlimited number of physical partitions. For the current release, we recommend that a cluster have no more than 100K in total for collections × partitions. We did some optimization in the master branch, so this limit could be larger in the future. Partition keys provide a balanced approach: data is first divided into a limited number of physical partitions (e.g., the 16 partitions you see) and then further segmented logically within each partition using filters. Since segments are logically isolated, a large number of tenants can be supported efficiently.
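As a rough illustration of why 10k distinct key values still show up as only 16 partitions, here is a small Python sketch. It assumes key values are hashed onto a fixed number of physical partitions and then narrowed with a filter; the MD5-based `physical_partition` function is a stand-in chosen for the example, not the hash Milvus actually uses.

```python
import hashlib
from datetime import date, timedelta

NUM_PARTITIONS = 16  # the default mentioned in this thread


def physical_partition(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Stand-in hash (an assumption): map a key value to a fixed partition."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


# 10k distinct partition key values, e.g. one per day.
start = date(2000, 1, 1)
keys = [(start + timedelta(days=i)).isoformat() for i in range(10_000)]

# Many key values share each physical partition.
rows_by_partition = {}
for k in keys:
    rows_by_partition.setdefault(physical_partition(k), []).append(k)
used = set(rows_by_partition)

# A lookup goes to one physical partition, then filters on the key value.
wanted = keys[1234]
candidates = rows_by_partition[physical_partition(wanted)]
hits = [k for k in candidates if k == wanted]

print(f"distinct keys: {len(set(keys))}, physical partitions used: {len(used)}")
print(f"rows in the target partition: {len(candidates)}, matches: {len(hits)}")
```

Under this model, the number of tenants or key values is bounded by the data, while the number of physical partitions stays at whatever num_partitions was set to.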
-
Thank you @xiaofan-luan and @yhmo for your really fast replies! I have a better understanding of partitions now, but I still don't quite understand the term And in the table from the article "Multi-tenancy strategies", what does 10M+ tenants for "Partition-key-based" actually mean? Thank you once again!
-
Hi, I am reading this page and have a few questions about Partition-key-based multi-tenancy: https://milvus.io/docs/multi_tenancy.md
The article "Multi-tenancy strategies" mentions that when using the Partition-key-based strategy, Milvus automatically manages partitions and supports up to 10M+ partitions. However, when I tried it, only 16 partitions were created. I also tried using Ask AI and read some other documentation pages, and found that I can specify the number of partitions, with the default being 16, but the maximum seems to be only 1024 or 4096 rather than 10M+.
This confuses me a bit, because I thought that with the Partition-key-based strategy the number of partitions would grow gradually with the number of distinct values in the partition field.
In my case, my collection has a date field, and I want to partition the data based on this field, with each day being a separate partition. However, when I created the collection with the following schema:
I then tried inserting 10k test data points (10k different values), and upon checking, only 16 partitions were created instead of the expected 10k partitions.