In the OpenSearch k-NN plugin, we integrate faiss into the Lucene index structure, which results in several IVFPQ indices living on a machine at the same time. Each index is initialized from the same empty trained faiss index, and unique data is then added to it during ingestion.
One problem we are running into is that each index precomputes its own table for IVFPQ with L2 distance, which leads to redundant allocations on the machine and a significant buildup in memory. For example, the following configuration leads to a memory consumption of 100 * 4 * 2048 * 64 * 2^8 bytes = 12.5 GB (4 bytes per float, times nlist, pq_m, and 2^bits_per_subvec table entries, times 100 indices):
nlist = 2048
pq_m = 64
bits_per_subvec = 8
100 indices using the same trained template on a machine
We want to change this behavior so that the indices can share a single precomputed_table, which would reduce consumption to 4 * 2048 * 64 * 2^8 bytes = 0.125 GB.
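The arithmetic above can be checked with a short sketch. The helper name `precomputed_table_bytes` is hypothetical and is not part of faiss. It only encodes the size formula stated in this issue (nlist * pq_m * 2^bits_per_subvec floats per table) and reads GB as 2^30 bytes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not a faiss function): bytes used by one precomputed
// table holding nlist * pq_m * 2^bits_per_subvec floats.
constexpr std::uint64_t precomputed_table_bytes(
        std::uint64_t nlist,
        std::uint64_t pq_m,
        unsigned bits_per_subvec) {
    return nlist * pq_m * (std::uint64_t{1} << bits_per_subvec) *
            sizeof(float);
}

static_assert(sizeof(float) == 4, "the estimate assumes 4-byte floats");

// Configuration from the issue: nlist = 2048, pq_m = 64, 8 bits per subvector.
constexpr std::uint64_t kOneTable = precomputed_table_bytes(2048, 64, 8);

// One shared table: 134,217,728 bytes = 0.125 * 2^30.
static_assert(kOneTable == 134217728ULL, "one table is 0.125 GB");

// 100 indices, each with its own copy: 13,421,772,800 bytes = 12.5 * 2^30.
static_assert(100 * kOneTable == 13421772800ULL, "100 copies are 12.5 GB");
```

Sharing the table makes the cost independent of the number of indices built from the same trained template, which is where the 100x saving comes from.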
I built out a proof of concept (jmazanec15@3083614) where I just changed precomputed_table to a pointer and added the necessary set/dealloc logic and it worked fine.
However, this solution has the downside that it would break backwards compatibility for users expecting precomputed_table to be an AlignedTable<float> and not an AlignedTable<float> *.
One way I could think of that would not break the interface would be to switch precomputed_table from a value to a reference (i.e. AlignedTable<float> &), and then handle the lifecycle of precomputed_table by adding state that records whether the instance owns the table, allocating and deallocating it manually in the constructors and destructor. Before I go too much further on it, I wanted to get initial feedback from the faiss team to see if this approach would be acceptable.
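A minimal sketch of the reference-plus-ownership idea, to make the lifecycle concrete. This is not faiss code: `Table` is a stand-in for `AlignedTable<float>`, and `IndexSketch` and `owns_table` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <vector>

// Stand-in for faiss's AlignedTable<float> (assumption, for illustration).
using Table = std::vector<float>;

// Hypothetical index-like class; only the table lifecycle is modeled.
struct IndexSketch {
    bool owns_table;          // true if this instance allocated the table
    Table& precomputed_table; // call sites still write idx.precomputed_table

    // Owning mode: allocate a private table, as a value member would.
    IndexSketch() : owns_table(true), precomputed_table(*new Table()) {}

    // Sharing mode: borrow a table that must outlive this instance.
    explicit IndexSketch(Table& shared)
            : owns_table(false), precomputed_table(shared) {}

    // Copying would need an explicit ownership policy, so it is disabled here.
    IndexSketch(const IndexSketch&) = delete;
    IndexSketch& operator=(const IndexSketch&) = delete;

    ~IndexSketch() {
        if (owns_table) {
            delete &precomputed_table; // free only what this instance allocated
        }
    }
};

// Usage: two indices borrow the same table, so a write through one is
// visible through the other and no second copy is allocated.
inline bool shares_storage() {
    Table shared(4, 1.0f);
    IndexSketch a(shared);
    IndexSketch b(shared);
    a.precomputed_table[0] = 2.0f;
    return &a.precomputed_table == &b.precomputed_table &&
            b.precomputed_table[0] == 2.0f;
}
```

One consequence of this design is that a C++ reference cannot be reseated after construction, so the choice between a private table and a shared one has to be made when the instance is created. The pointer-based proof of concept does not have that restriction.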
Alternatives considered
Introduce a new variable in the class so that we don't have to change the existing behavior of precomputed_table, and add logic to manage the duplication - this seems like it would be harder to maintain
Maintain a patch externally that switches the type to a reference
Disable the table - I ran some experiments doing this and it led to a noticeable query performance degradation, so I would like to avoid it
I see...
Option 4: better separation of the "empty index" and the invlist storage. There is a recently introduced "context" parameter that can be used to switch between different storages for the same index, see #3247. An additional benefit is that it also deduplicates the quantization index.
Thanks @mdouze. So to use the context parameter, we would need to create a custom InvertedLists implementation, pass the table at search time via the SearchParameters, and then integrate the functionality into overrides of the Scanner and anything else below it that eventually computes the distance, correct?
Yes that's the idea. I would be happy to make a demo for that because it's a common use case (and would be interested if I'd run into unexpected limitations) but I don't have too much time these days.