Qdrant memory and specs requirement for billions of vectors #2607
The article only speaks about speed of retrieval; it does not make any statement about search quality, since that mostly depends on the model you are using. In fact, there is no mention of "good" in it. Regarding speed, there is a table with estimates in this section: https://qdrant.tech/articles/memory-consumption/#how-to-speed-up-the-search
Yes, for vector dimensionality the scaling is close to linear.
Yes, one machine of that size is enough to serve 1 billion 128d vectors; we checked it in our experiments. Of course, if you have additional requirements, such as constant insertion of new vectors, high RPS, or millisecond latency, it might require more resources.
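For a rough sanity check, here is a minimal back-of-envelope sketch that simply extrapolates linearly from the figure quoted in the article (~135 MB of RAM per 1 million 128-dimensional vectors, with the full vectors on SSD), as described above. The baseline numbers come from the article; the linear scaling in count and dimensionality is the approximation discussed in this thread, so treat the result as a rough starting point rather than a capacity plan.

```python
# Linear extrapolation from the article's measured baseline.
# Actual usage also depends on HNSW parameters, payload indexes,
# insertion load, and latency requirements.

BASELINE_MB = 135            # ~135 MB of RAM per 1M vectors (from the article)
BASELINE_VECTORS = 1_000_000
BASELINE_DIM = 128

def estimate_ram_mb(num_vectors: int, dim: int) -> float:
    """Scale the baseline linearly in vector count and dimensionality."""
    return BASELINE_MB * (num_vectors / BASELINE_VECTORS) * (dim / BASELINE_DIM)

if __name__ == "__main__":
    print(f"{estimate_ram_mb(1_000_000, 512):,.0f} MB")       # ~540 MB for 1M x 512d
    print(f"{estimate_ram_mb(1_000_000_000, 128):,.0f} MB")   # ~135,000 MB (~135 GB) for 1B x 128d
```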
Qdrant doesn't yet have dynamic re-sharding, meaning that you can't yet change the number of shards in an existing collection.
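In practice this means the shard count has to be chosen when the collection is created. Below is a minimal sketch using the Python qdrant-client; the parameter names (`shard_number`, `on_disk`) are my reading of the client API and the collection name is made up, so double-check against the client version you install.

```python
# Sketch: create a collection with a fixed shard count, since re-sharding
# an existing collection is not supported; changing it later would mean
# creating a new collection and re-uploading the vectors.
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams

client = QdrantClient(url="http://localhost:6333")

client.create_collection(
    collection_name="images",          # hypothetical collection name
    vectors_config=VectorParams(
        size=128,                      # vector dimensionality
        distance=Distance.COSINE,
        on_disk=True,                  # keep full vectors on SSD, as in the article's setup
    ),
    shard_number=4,                    # fixed at creation time
)
```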
I came across this article describing Qdrant's RAM and other configuration benchmarking experiments, and it looked pretty cool. I have some questions before we start using it for our project. We are all fairly new to this, and the answers may help us evaluate our choices and configuration.
The article mentions roughly 135 MB of RAM for 1 million vectors with good performance when serving vectors from SSD. It does not quantify "good". How good is "good" here? Is that production level, like what one sees on a Google or Bing image search results page, where a query such as "tiger cubs playing with each other" immediately returns hundreds of matching images?
The above example uses 128 dimensions. Can we assume that, for 512 (128 * 4) dimensions, equivalent performance will need approximately 135 * 4 = 540 MB of RAM? Will the calculation be linear like that, or are there thresholds, as the number of vectors (or their dimensionality) grows, where the requirement spikes up or plateaus?
If 135 MB is indeed enough for 1 million vectors, then extrapolating to 1 billion vectors gives 135 GB. That is just about a single EC2 node with 150 GB of RAM. Will 135 GB of RAM, coupled with a sufficiently large SSD, be enough to serve production-level vector similarity queries at runtime over 1 billion vectors? Milvus, for example, suggests here an entire arsenal of resources required to cater to 1 billion vectors, whereas here just one machine would suffice. Am I missing something?
This page, although authored by Zilliz and therefore prone to bias, suggests that Milvus can dynamically scale and shard up to billion-plus vectors while Qdrant cannot, whereas your page suggests that Qdrant is better at everything. Which one is true?