
How to transfer the IVF index from one PQ compression setting to another? #2455

Closed
jrcavani opened this issue Sep 5, 2022 · 6 comments


jrcavani commented Sep 5, 2022

Hello,

I have trained an OPQ128_512,IVF262144_HNSW32,PQ128x6, and it took some time! Would it be possible to avoid training again, but to just transfer the IVF index to another index, but with a different PQ compression, such as OPQ128_512,IVF262144_HNSW32,PQ128, or OPQ256_1024,IVF262144_HNSW32,PQ256?

And to a somewhat related note, would it be possible to take out the PQ/SQ compression module from a trained index and use as an embedding size-reduction preprocessor? We found SQ8 yields about the same accuracy for similarity comparisons as the original float32 vectors at 1/4 the size, so that can be quite attractive for storage purposes.

Thank you!


mdouze commented Sep 6, 2022

Yes, it is possible, but somewhat error-prone. For these cases,

OPQx,IVFy,PQz -> OPQx',IVFy',PQz'

  • the trained OPQ can be transferred if x=x'

  • the trained IVF centroids can be transferred if x=x' and y=y' (and additionally even if x != x', as long as there is no dimensionality reduction, but that is black magic 🪄⚡🧹)

After the transfer is done, the remaining untrained part (e.g. the PQ) can be trained by calling train on the index.


mdouze commented Sep 6, 2022

Demo here:
demo_transfer_OPQ_IVF.ipynb


jrcavani commented Sep 6, 2022

This is fascinating. Thank you for taking the time to write up a notebook.

If I understand it correctly, the IVF coarse quantizer operates on the intermediate vectors of dimension D produced by OPQM_D. In your notation x means M, and D defaults to the original vector dimension. So the OPQ sets the starting point for the IVF quantization, and if the OPQ's D differs, the IVF centroids simply can't be transferred.

If D is the same but M (or x in your notation) is different, it's black magic because... you technically could transfer the A and b of the OPQ LinearTransform, and transfer the centroids to the destination index, but the OPQ will not work as well for the new PQM, since the old OPQ was not trained for the new M. Is that the case?

In your notebook, the OPQ trained for PQ4np can be reused for PQ4x6np. Is this the case for all PQ4 variants, such as PQ4, PQ4x12, PQ4fs, PQ4fsr?


mdouze commented Sep 9, 2022

No, in my notation x means M (or M_D; both forms work).
Of course, the M of the PQ should match that of the OPQ.

About your last question: I am not sure; it should be tried out.

mdouze added the question label Sep 9, 2022
@jrcavani

Asked in the comment section of the notebook:

For a sanity check I tried to copy everything:

# src_index pretrained, exact same type and params
dst_index = faiss.index_factory(d, "OPQ128_512,IVF2097152_HNSW32,PQ128x6", faiss.METRIC_INNER_PRODUCT)
faiss.ParameterSpace().set_index_parameters(dst_index, "nprobe=1024,quantizer_efSearch=512")

preproc_src = faiss.downcast_VectorTransform(src_index.chain.at(0))
preproc_dst = faiss.downcast_VectorTransform(dst_index.chain.at(0))

# copy preproc
preproc_dst.A = preproc_src.A
preproc_dst.b = preproc_src.b
preproc_dst.is_trained = True

# copy IVF
quantizer_src = faiss.extract_index_ivf(src_index).quantizer
quantizer_dst = faiss.extract_index_ivf(dst_index).quantizer
quantizer_dst.add(quantizer_src.reconstruct_n(0, quantizer_src.ntotal))

# copy PQ
faiss.downcast_index(dst_index.index).pq.centroids = faiss.downcast_index(src_index.index).pq.centroids
faiss.downcast_index(dst_index.index).is_trained = True

# finish training
dst_index.train(xb) # or dst_index.is_trained = True

I can confirm the PreTransforms and PQ centroids are identical, and codes computed from just the PQ modules are identical:

In [180]: faiss.downcast_index(src_index.index).pq.compute_codes(a[None, :])[0][:10]
Out[180]: array([149, 173, 176, 127,  27, 204,  45, 101,  52, 104], dtype=uint8)

In [182]: faiss.downcast_index(dst_index.index).pq.compute_codes(a[None, :])[0][:10]
Out[182]: array([149, 173, 176, 127,  27, 204,  45, 101,  52, 104], dtype=uint8)

In [193]: (faiss.downcast_index(faiss.extract_index_ivf(src_index)).pq.compute_codes(a[None, :])[0] == faiss.downcast_index(faiss.extract_index_ivf(dst_index)).pq.compute_codes(a[None, :])[0]).all()
Out[193]: True

However, the IndexIVFPQ-level sa_encode sometimes returns different results:

In [175]: faiss.downcast_index(src_index.index).sa_encode(a[None, :])[0][:10]
Out[175]: array([133,  57,  31,   7, 168, 240,  71, 174, 156,  20], dtype=uint8)

In [176]: faiss.downcast_index(dst_index.index).sa_encode(a[None, :])[0][:10]
Out[176]: array([153, 100,  24,  78, 189, 121, 116, 135, 166, 173], dtype=uint8)

And the whole-index-level sa_encode returns different results:

In [196]: src_index.sa_encode(a[None, :])[0][:10]
Out[196]: array([203, 253,  23, 231, 167, 191, 233,  54,  87, 120], dtype=uint8)

In [197]: dst_index.sa_encode(a[None, :])[0][:10]
Out[197]: array([132,   7,  12, 109,  65, 132,  27, 139,  53, 117], dtype=uint8)

Also, the reconstruction error differs when this happens, sometimes better and sometimes worse.

Here I repeatedly run the script below:


a = np.random.randn(512).astype("float32"); a /= np.linalg.norm(a)
src_index.add(a[None, :])
dst_index.add(a[None, :])

print(src_index.search(a[None, :], 1))
print(dst_index.search(a[None, :], 1))

print( (faiss.downcast_index(src_index.index).pq.compute_codes(a[None, :])[0] == faiss.downcast_index(dst_index.index).pq.compute_codes(a[None, :])[0]).all() )
print( (faiss.downcast_index(src_index.index).pq.compute_codes(a[None, :])[0] == faiss.downcast_index(dst_index.index).pq.compute_codes(a[None, :])[0]).all() )

print( (faiss.downcast_index(src_index.index).sa_encode(a[None, :])[0] == faiss.downcast_index(dst_index.index).sa_encode(a[None, :])[0]).all() )
print( (faiss.downcast_index(src_index.index).sa_encode(a[None, :])[0] == faiss.downcast_index(dst_index.index).sa_encode(a[None, :])[0]).all() )

print( (src_index.sa_encode(a[None, :])[0] == dst_index.sa_encode(a[None, :])[0]).all() )
print( (src_index.sa_encode(a[None, :])[0] == dst_index.sa_encode(a[None, :])[0]).all() )

Most of the time

(array([[0.8010769]], dtype=float32), array([[48]]))
(array([[0.8010769]], dtype=float32), array([[44]]))
True
True
True
True
True
True

Sometimes

(array([[0.77809864]], dtype=float32), array([[54]]))
(array([[0.73597413]], dtype=float32), array([[50]]))
True
True
False
False
False
False

@jrcavani

I think I have an answer to this now from reading the source code. The dst HNSW quantizer can easily end up different from the original src HNSW quantizer, even given the exact same params, because the graph structure depends on the sequence in which vectors are added. If that happens, the quantizer will assign a different coarse centroid in the sa_encode() call, so the residual will also differ, and in turn the PQ quantization code.
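This explanation can be illustrated without faiss at all: if the approximate quantizer returns a different coarse centroid than the exact assignment, the residual handed to the PQ differs, and so do the resulting codes. A minimal numpy sketch, where the second-nearest centroid stands in for a hypothetical HNSW mis-assignment:

```python
import numpy as np

rng = np.random.default_rng(0)
d, nlist = 8, 4
centroids = rng.standard_normal((nlist, d)).astype("float32")
x = rng.standard_normal(d).astype("float32")

# exact coarse assignment (what a flat quantizer would return)
dist = ((centroids - x) ** 2).sum(axis=1)
exact = int(dist.argmin())

# suppose the approximate HNSW quantizer returns a different centroid;
# here we just take the second-nearest as a stand-in
approx = int(dist.argsort()[1])

# the residual passed on to the PQ then differs, so the PQ codes differ too
res_exact = x - centroids[exact]
res_approx = x - centroids[approx]
print(np.allclose(res_exact, res_approx))  # False
```

So identical OPQ, centroids, and PQ codebooks are not enough for bit-identical codes once the coarse assignment diverges.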
