Move aKNN limits enforcement into the default Codec's KnnVectorsFormat implementation #12309
Comments
As part of the same work, or as a separate change, we could add an extension of the HNSW implementation to the codecs package and provide it to users, so they don't have to maintain their own fork for that. This alternative codec would just support more dimensions to play with, without backwards compatibility constraints.
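A rough sketch of what this comment suggests, assuming each format reports its own maximum dimension. The class names below (SketchVectorsFormat, DefaultHnswSketchFormat, HighDimHnswSketchFormat) are hypothetical stand-ins rather than Lucene classes, and the 4096 limit is an arbitrary value chosen for illustration:

```java
// Hypothetical stand-in for a vectors format; this is NOT Lucene's KnnVectorsFormat.
abstract class SketchVectorsFormat {
    // 1024 is the limit the backward codecs are said to use in this thread.
    static final int DEFAULT_MAX_DIMENSIONS = 1024;

    /** Each format declares the largest vector dimension it accepts for a field. */
    abstract int getMaxDimensions(String fieldName);

    /** Rejects a field whose dimension falls outside this format's own limit. */
    final void checkDimension(String fieldName, int dimension) {
        int max = getMaxDimensions(fieldName);
        if (dimension <= 0 || dimension > max) {
            throw new IllegalArgumentException(
                "field \"" + fieldName + "\": vector dimension must be in [1, "
                    + max + "]; got " + dimension);
        }
    }
}

// Models the default format, which keeps the backwards-compatible limit.
class DefaultHnswSketchFormat extends SketchVectorsFormat {
    @Override
    int getMaxDimensions(String fieldName) {
        return DEFAULT_MAX_DIMENSIONS;
    }
}

// Models the alternative format from the comment: same idea, higher limit.
class HighDimHnswSketchFormat extends SketchVectorsFormat {
    static final int MAX_DIMENSIONS = 4096; // arbitrary, for illustration only

    @Override
    int getMaxDimensions(String fieldName) {
        return MAX_DIMENSIONS;
    }
}
```

With this shape, a 2048-dimensional field would be rejected by the default sketch format and accepted by the high-dimension one, without either needing to know about the other.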
mayya-sharipova added a commit to mayya-sharipova/lucene that referenced this issue on Jul 13, 2023
Move vector max dimension limits enforcement into the default Codec's KnnVectorsFormat implementation. This allows different implementation of knn search algorithms define their own limits of a maximum vector dimenstions that they can handle. Closes apache#12309
mayya-sharipova added a commit that referenced this issue on Jul 27, 2023
Move vector max dimension limits enforcement into the default Codec's KnnVectorsFormat implementation. This allows different implementation of knn search algorithms define their own limits of a maximum vector dimensions that they can handle. Closes #12309
mayya-sharipova added a commit to mayya-sharipova/lucene that referenced this issue on Jul 27, 2023
- Backward codecs use 1024 as max dims
- Test classes use the current KnnVectorsFormat#DEFAULT_MAX_DIMENSIONS
Relates to PR#12436
Closes apache#12309
mayya-sharipova added a commit that referenced this issue on Jul 28, 2023
- Backward codecs use 1024 as max dims
- Test classes use the current KnnVectorsFormat#DEFAULT_MAX_DIMENSIONS
Relates to PR#12436
Closes #12309
Description
[Spinoff from #12306]
There have been many discussions and polls about what to do about the existing (weakly enforced) limit of aKNN vector dimensionality in Lucene.
This issue represents Option 3 in @alessandrobenedetti's recent poll thread.
Since it is this Codec component (currently Lucene95HnswVectorsFormat) that is implementing the HNSW approach for approximate KNN, it makes sense that it should be the one to enforce any limits (dimensionality, max connections, beam width, etc.). In fact, it already seems to enforce some limits -- I see MAXIMUM_BEAM_WIDTH = 3200 and MAXIMUM_MAX_CON = 512. Once we do this, users can still fork their own Codec to change limits, or implement a different aKNN algorithm, etc., and it will be clear that they are no longer using Lucene's default Codec so index format backwards compatibility is no longer ensured.
Version and environment details
No response
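To illustrate the kind of enforcement the description says the HNSW format already performs, here is a minimal sketch of constructor-time limit checks. HnswLimitsSketch is a hypothetical class written for this example and not Lucene's implementation; only the two constant names and values are taken from the description above:

```java
// Hypothetical sketch of a format that validates its own HNSW parameters.
// Constant names and values are copied from the issue description; the
// class itself and its fields are assumptions made for illustration.
final class HnswLimitsSketch {
    static final int MAXIMUM_MAX_CON = 512;
    static final int MAXIMUM_BEAM_WIDTH = 3200;

    final int maxConn;   // max connections per graph node
    final int beamWidth; // candidate list size used while building the graph

    HnswLimitsSketch(int maxConn, int beamWidth) {
        // The format owns these limits, so it rejects bad values itself.
        if (maxConn <= 0 || maxConn > MAXIMUM_MAX_CON) {
            throw new IllegalArgumentException(
                "maxConn must be positive and at most " + MAXIMUM_MAX_CON
                    + "; got " + maxConn);
        }
        if (beamWidth <= 0 || beamWidth > MAXIMUM_BEAM_WIDTH) {
            throw new IllegalArgumentException(
                "beamWidth must be positive and at most " + MAXIMUM_BEAM_WIDTH
                    + "; got " + beamWidth);
        }
        this.maxConn = maxConn;
        this.beamWidth = beamWidth;
    }
}
```

A dimensionality limit could be enforced in the same place and in the same style, which is the move this issue proposes.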