Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ability to share precomputed_table amongst separate IVFPQ-l2 indices #3271

Closed
jmazanec15 opened this issue Feb 29, 2024 · 3 comments
Closed

Comments

@jmazanec15
Copy link
Contributor

Summary

In OpenSearch k-NN plugin, we integrate faiss into the Lucene index structure which results in having several IVFPQ indices on a machine at a time. Each index is initialized with the same empty trained faiss index and then unique data is added to it during ingestion.

One problem we are running into is that each index will precompute the table for IVFPQ-l2, leading to several redundant allocations on the machine, and causing a significant build up in memory. For example, the following configuration will lead to memory consumption of : 100*4*2048*64*2^8 = 12.5 GB

  1. nlist = 2048
  2. pq_m = 64
  3. bits_per_subvec = 8
  4. 100 indices using the same trained template on a machine

We want to change this behavior to be able to share the precomputed_table, which would lead to a consumption of
4*2048*64*2^8 = 0.125 GB

I built out a proof of concept (jmazanec15@3083614) where I just changed precomputed_table to a pointer and added the necessary set/dealloc logic and it worked fine.

However, this solution has the downside that it would break backwards compatibility for users expecting precomputed_table to be a AlignedTable<float> and not a AlignedTable<float> *.

One way I could think of that would not break the interface would be to switch precomputed_table from
a value to a reference (i.e. AlignedTable<float> &) and then handle the lifecycle of the precomputed_table by adding state about whether the instance owns the table, and then manually allocate and deallocate in the constructor and destructors. Before I go to much further on it, I wanted to get initial feedback from faiss team to see if this approach would be acceptable.

Alternatives considered

  1. Introduce a new variable in the class so that we dont have to change existing behavior of precomputed_table and add logic to manage the duplication - this seems like it would be harder to maintain
  2. Maintain a patch externally that switches the type to a reference
  3. Disable the table - I ran some experiments doing this and it led to a noticeable query performance degradation, so would like to avoid
@mdouze
Copy link
Contributor

mdouze commented Mar 1, 2024

I see...
Option 4: better separation of the "empty index" and the invlist storage. There is a recently introduced "context" parameter that can be used to switch between different storages for the same index, see #3247. An additional benefit is that it also deduplicates the quantization index.

@mdouze mdouze added the question label Mar 1, 2024
@jmazanec15
Copy link
Contributor Author

Thanks @mdouze, so to use the context parameter, we would need to create a custom InvertedLists implementation and then pass at search time the table via the SearchParameters and then integrate the functionality into overrides of the Scanner and anything else below that would eventually compute the distance, correct?

@mdouze
Copy link
Contributor

mdouze commented Mar 6, 2024

Yes that's the idea. I would be happy to make a demo for that because it's a common use case (and would be interested if I'd run into unexpected limitations) but I don't have too much time these days.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants
@asadoughi @jmazanec15 @mdouze and others