Qdrant memory and specs requirement for billions of vectors #2607
The article only speaks about speed of retrieval; it does not make any statement about search quality, since that mostly depends on the model you are using. In fact, there is no mention of "good" in it. Regarding speed, there is a table with estimates in this section: https://qdrant.tech/articles/memory-consumption/#how-to-speed-up-the-search
Yes, for vector dimensionality the scaling is close to linear.
Yes, one machine of that size is enough to serve 1 billion 128d vectors; we checked it in our experiments. Of course, if you have additional requirements, such as constant insertion of new vectors, high RPS, or millisecond latency, it might require more resources.
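For a rough sanity check, here is a minimal back-of-envelope sketch that simply extrapolates linearly from the figure quoted in the article (~135 MB of RAM per 1 million 128-dimensional vectors, with the full vectors on SSD), as described above. The baseline numbers come from the article; the linear scaling in count and dimensionality is the approximation discussed in this thread, so treat the result as a rough starting point rather than a capacity plan.

```python
# Linear extrapolation from the article's measured baseline.
# Actual usage also depends on HNSW parameters, payload indexes,
# insertion load, and latency requirements.

BASELINE_MB = 135            # ~135 MB of RAM per 1M vectors (from the article)
BASELINE_VECTORS = 1_000_000
BASELINE_DIM = 128

def estimate_ram_mb(num_vectors: int, dim: int) -> float:
    """Scale the baseline linearly in vector count and dimensionality."""
    return BASELINE_MB * (num_vectors / BASELINE_VECTORS) * (dim / BASELINE_DIM)

if __name__ == "__main__":
    print(f"{estimate_ram_mb(1_000_000, 512):,.0f} MB")       # ~540 MB for 1M x 512d
    print(f"{estimate_ram_mb(1_000_000_000, 128):,.0f} MB")   # ~135,000 MB (~135 GB) for 1B x 128d
```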
Qdrant doesn't yet have dynamic re-sharding, meaning that you can't yet change the number of shards in an existing collection.
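In practice this means the shard count has to be chosen when the collection is created. Below is a minimal sketch using the Python qdrant-client; the parameter names (`shard_number`, `on_disk`) are my reading of the client API and the collection name is made up, so double-check against the client version you install.

```python
# Sketch: create a collection with a fixed shard count, since re-sharding
# an existing collection is not supported; changing it later would mean
# creating a new collection and re-uploading the vectors.
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams

client = QdrantClient(url="http://localhost:6333")

client.create_collection(
    collection_name="images",          # hypothetical collection name
    vectors_config=VectorParams(
        size=128,                      # vector dimensionality
        distance=Distance.COSINE,
        on_disk=True,                  # keep full vectors on SSD, as in the article's setup
    ),
    shard_number=4,                    # fixed at creation time
)
```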
I came across this article describing Qdrant's RAM and other configuration benchmarking experiments, and it looked pretty cool. I have some questions before we start using it for our project. We are all fairly new to this, and the answers may help us evaluate our choices and configuration.
The article mentions roughly 135 MB of RAM for 1 million vectors with good performance when serving vectors from SSD. It does not quantify "good". How good is "good" here? Is that production level, like what one sees on a Google or Bing image search results page, where a query such as "tiger cubs playing with each other" immediately returns hundreds of matching images?
The above example uses 128 dimensions. Can we assume that, for 512 (128 * 4) dimensions, equivalent performance will need approximately 135 * 4 = 540 MB of RAM? Will the calculation be linear like that, or are there thresholds, as the number of vectors (or their dimensionality) grows, where the requirement spikes up or plateaus?
If 135 MB is indeed enough for 1 million vectors, then extrapolating to 1 billion vectors gives 135 GB. That is just about a single EC2 node with 150 GB of RAM. Will 135 GB of RAM, coupled with a sufficiently large SSD, be enough to serve production-level vector similarity queries at runtime over 1 billion vectors? Milvus, for example, suggests here an entire arsenal of resources required to cater to 1 billion vectors, whereas here just one machine would suffice. Am I missing something?
This page, although authored by Zilliz and therefore prone to bias, suggests that Milvus can dynamically scale and shard up to billion-plus vectors while Qdrant cannot, whereas your page suggests that Qdrant is better at everything. Which one is true?