
How to transfer the IVF index from one PQ compression setting to another? #2455

Closed
jrcavani opened this issue Sep 5, 2022 · 6 comments


jrcavani commented Sep 5, 2022

Hello,

I have trained an OPQ128_512,IVF262144_HNSW32,PQ128x6, and it took some time! Would it be possible to avoid training again, but to just transfer the IVF index to another index, but with a different PQ compression, such as OPQ128_512,IVF262144_HNSW32,PQ128, or OPQ256_1024,IVF262144_HNSW32,PQ256?

And to a somewhat related note, would it be possible to take out the PQ/SQ compression module from a trained index and use as an embedding size-reduction preprocessor? We found SQ8 yields about the same accuracy for similarity comparisons as the original float32 vectors at 1/4 the size, so that can be quite attractive for storage purposes.

Thank you!


mdouze commented Sep 6, 2022

Yes, it is possible, but somewhat error-prone. For these cases,

OPQx,IVFy,PQz -> OPQx',IVFy',PQz'

  • the trained OPQ can be transferred if x=x'

  • the trained IVF centroids can be transferred if x=x' and y=y' (and additionally even if x != x', as long as there is no dimensionality reduction, but that is black magic 🪄⚡🧹)

After the transfer is done, the remaining untrained part (e.g. the PQ) can be trained by calling train on the index.


mdouze commented Sep 6, 2022

Demo here:
demo_transfer_OPQ_IVF.ipynb


jrcavani commented Sep 6, 2022

This is fascinating. Thank you for taking the time to write up a notebook.

If I understand it correctly, the IVF coarse quantizer operates on the intermediate vectors of dimension D produced by OPQM_D. In your notation x means M, and D defaults to the original vector dimension. So the OPQ sets the starting point for the IVF quantization, and if the OPQ's D differs, the IVF centroids simply can't be transferred.

If D is the same but M (or x in your notation) is different, it's black magic because... you technically could transfer the A and b of the OPQ LinearTransform, and transfer the centroids to the destination index, but the OPQ will not work as well for the new PQM, since the old OPQ was not trained for the new M. Is that the case?

In your notebook, the OPQ trained for PQ4np can be reused for PQ4x6np. Is this the case for all PQ4 variants, such as PQ4, PQ4x12, PQ4fs, PQ4fsr?


mdouze commented Sep 9, 2022

No, in my notation x means M (or M_D; both forms work).
Of course, the M of the PQ should match that of the OPQ.

About your last question: I am not sure; it should be tried out.

mdouze added the question label Sep 9, 2022
@jrcavani

Asked in the comment section of the notebook:

For a sanity check I tried to copy everything:

# src_index pretrained, exact same type and params
dst_index = faiss.index_factory(d, "OPQ128_512,IVF2097152_HNSW32,PQ128x6", faiss.METRIC_INNER_PRODUCT)
faiss.ParameterSpace().set_index_parameters(dst_index, "nprobe=1024,quantizer_efSearch=512")

preproc_src = faiss.downcast_VectorTransform(src_index.chain.at(0))
preproc_dst = faiss.downcast_VectorTransform(dst_index.chain.at(0))

# copy preproc
preproc_dst.A = preproc_src.A
preproc_dst.b = preproc_src.b
preproc_dst.is_trained = True

# copy IVF
quantizer_src = faiss.extract_index_ivf(src_index).quantizer
quantizer_dst = faiss.extract_index_ivf(dst_index).quantizer
quantizer_dst.add(quantizer_src.reconstruct_n(0, quantizer_src.ntotal))

# copy PQ
faiss.downcast_index(dst_index.index).pq.centroids = faiss.downcast_index(src_index.index).pq.centroids
faiss.downcast_index(dst_index.index).is_trained = True

# finish training
dst_index.train(xb) # or dst_index.is_trained = True

I can confirm the PreTransforms and PQ centroids are identical, and codes computed from just the PQ modules are identical:

In [180]: faiss.downcast_index(src_index.index).pq.compute_codes(a[None, :])[0][:10]
Out[180]: array([149, 173, 176, 127,  27, 204,  45, 101,  52, 104], dtype=uint8)

In [182]: faiss.downcast_index(dst_index.index).pq.compute_codes(a[None, :])[0][:10]
Out[182]: array([149, 173, 176, 127,  27, 204,  45, 101,  52, 104], dtype=uint8)

In [193]: (faiss.downcast_index(faiss.extract_index_ivf(src_index)).pq.compute_codes(a[None, :])[0] == faiss.downcast_index(faiss.extract_index_ivf(dst_index)).pq.compute_codes(a[None, :])[0]).all()
Out[193]: True

However, the IndexIVFPQ-level sa_encode sometimes returns different results:

In [175]: faiss.downcast_index(src_index.index).sa_encode(a[None, :])[0][:10]
Out[175]: array([133,  57,  31,   7, 168, 240,  71, 174, 156,  20], dtype=uint8)

In [176]: faiss.downcast_index(dst_index.index).sa_encode(a[None, :])[0][:10]
Out[176]: array([153, 100,  24,  78, 189, 121, 116, 135, 166, 173], dtype=uint8)

And the whole-index-level sa_encode returns different results:

In [196]: src_index.sa_encode(a[None, :])[0][:10]
Out[196]: array([203, 253,  23, 231, 167, 191, 233,  54,  87, 120], dtype=uint8)

In [197]: dst_index.sa_encode(a[None, :])[0][:10]
Out[197]: array([132,   7,  12, 109,  65, 132,  27, 139,  53, 117], dtype=uint8)

Also, the reconstruction error differs when this happens, sometimes better and sometimes worse.

Here I repeatedly run the script below:


a = np.random.randn(512).astype("float32"); a /= np.linalg.norm(a)
src_index.add(a[None, :])
dst_index.add(a[None, :])

print(src_index.search(a[None, :], 1))
print(dst_index.search(a[None, :], 1))

print( (faiss.downcast_index(src_index.index).pq.compute_codes(a[None, :])[0] == faiss.downcast_index(dst_index.index).pq.compute_codes(a[None, :])[0]).all() )
print( (faiss.downcast_index(src_index.index).pq.compute_codes(a[None, :])[0] == faiss.downcast_index(dst_index.index).pq.compute_codes(a[None, :])[0]).all() )

print( (faiss.downcast_index(src_index.index).sa_encode(a[None, :])[0] == faiss.downcast_index(dst_index.index).sa_encode(a[None, :])[0]).all() )
print( (faiss.downcast_index(src_index.index).sa_encode(a[None, :])[0] == faiss.downcast_index(dst_index.index).sa_encode(a[None, :])[0]).all() )

print( (src_index.sa_encode(a[None, :])[0] == dst_index.sa_encode(a[None, :])[0]).all() )
print( (src_index.sa_encode(a[None, :])[0] == dst_index.sa_encode(a[None, :])[0]).all() )

Most of the time

(array([[0.8010769]], dtype=float32), array([[48]]))
(array([[0.8010769]], dtype=float32), array([[44]]))
True
True
True
True
True
True

Sometimes

(array([[0.77809864]], dtype=float32), array([[54]]))
(array([[0.73597413]], dtype=float32), array([[50]]))
True
True
False
False
False
False

@jrcavani

I think I have an answer to this now from reading the source code. The dst HNSW quantizer can easily end up different from the original src HNSW quantizer, even given the exact same params, because the graph structure depends on the sequence in which vectors are added. If that happens, the quantizer will assign a different coarse centroid in the sa_encode() call, so the residual will also differ, and in turn the PQ quantization code.
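This explanation can be illustrated without faiss at all: if the approximate quantizer returns a different coarse centroid than the exact assignment, the residual handed to the PQ differs, and so do the resulting codes. A minimal numpy sketch, where the second-nearest centroid stands in for a hypothetical HNSW mis-assignment:

```python
import numpy as np

rng = np.random.default_rng(0)
d, nlist = 8, 4
centroids = rng.standard_normal((nlist, d)).astype("float32")
x = rng.standard_normal(d).astype("float32")

# exact coarse assignment (what a flat quantizer would return)
dist = ((centroids - x) ** 2).sum(axis=1)
exact = int(dist.argmin())

# suppose the approximate HNSW quantizer returns a different centroid;
# here we just take the second-nearest as a stand-in
approx = int(dist.argsort()[1])

# the residual passed on to the PQ then differs, so the PQ codes differ too
res_exact = x - centroids[exact]
res_approx = x - centroids[approx]
print(np.allclose(res_exact, res_approx))  # False
```

So identical OPQ, centroids, and PQ codebooks are not enough for bit-identical codes once the coarse assignment diverges.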
