2.23.1.0-b40

@mbautin mbautin tagged this 17 Aug 21:17
Summary:
Some utilities needed for the HNSW vector index implementation and benchmarking.

Adding a new directory, "vector", and the new library yb_vector. The namespace is called vectorindex.

benchmark_data.{h,cc} -- implements readers for the .fvec file format (see http://corpus-texmex.irisa.fr/).
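As a rough illustration of what such a reader involves, here is a minimal sketch, assuming the layout described on the TEXMEX page (files there use the .fvecs extension): each vector is stored as a little-endian int32 dimension followed by that many float32 components. The names ReadFvecs and EncodeFvec are hypothetical and are not taken from benchmark_data.{h,cc}. The sketch also assumes a little-endian host.

```cpp
#include <cstddef>
#include <cstdint>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical reader, not the actual benchmark_data API.
// Reads every vector from a stream laid out as repeated records of
// <int32 dimension><dimension x float32>. Assumes a little-endian host.
std::vector<std::vector<float>> ReadFvecs(std::istream& in) {
  std::vector<std::vector<float>> vectors;
  int32_t dim = 0;
  while (in.read(reinterpret_cast<char*>(&dim), sizeof(dim))) {
    if (dim <= 0) {
      throw std::runtime_error("invalid vector dimension");
    }
    std::vector<float> v(static_cast<std::size_t>(dim));
    const auto num_bytes =
        static_cast<std::streamsize>(v.size() * sizeof(float));
    if (!in.read(reinterpret_cast<char*>(v.data()), num_bytes)) {
      throw std::runtime_error("truncated vector data");
    }
    vectors.push_back(std::move(v));
  }
  // A clean end of input leaves gcount() at 0; anything else means the
  // stream ended in the middle of a dimension header.
  if (in.gcount() != 0) {
    throw std::runtime_error("truncated dimension header");
  }
  return vectors;
}

// Hypothetical helper used only to build in-memory test input.
std::string EncodeFvec(const std::vector<float>& v) {
  const int32_t dim = static_cast<int32_t>(v.size());
  std::string out(reinterpret_cast<const char*>(&dim), sizeof(dim));
  out.append(reinterpret_cast<const char*>(v.data()),
             v.size() * sizeof(float));
  return out;
}
```

Feeding the concatenation of EncodeFvec({1, 2}) and EncodeFvec({3, 4}) through a std::istringstream into ReadFvecs yields two 2-dimensional vectors. For a real file, a std::ifstream opened with std::ios::binary would be passed instead.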

distance.{h,cc} -- functions for distance calculation, currently only for L2 squared and cosine.
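For reference, the two distances can be sketched as follows. This is an illustration under stated assumptions, not the code in distance.{h,cc}: the function names are hypothetical, cosine distance is taken to mean 1 minus cosine similarity, and a zero-length input is arbitrarily mapped to distance 1.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical names; the real signatures in distance.h may differ.

// Squared Euclidean distance: sum of squared component differences.
// Skipping the square root preserves the ordering of distances.
float DistanceL2Squared(const std::vector<float>& a,
                        const std::vector<float>& b) {
  assert(a.size() == b.size());
  float sum = 0.0f;
  for (std::size_t i = 0; i < a.size(); ++i) {
    const float diff = a[i] - b[i];
    sum += diff * diff;
  }
  return sum;
}

// Cosine distance, assumed here to be 1 - cos(angle between a and b).
float DistanceCosine(const std::vector<float>& a,
                     const std::vector<float>& b) {
  assert(a.size() == b.size());
  float dot = 0.0f;
  float norm_a_sq = 0.0f;
  float norm_b_sq = 0.0f;
  for (std::size_t i = 0; i < a.size(); ++i) {
    dot += a[i] * b[i];
    norm_a_sq += a[i] * a[i];
    norm_b_sq += b[i] * b[i];
  }
  // Assumption: the angle is undefined for a zero vector, so this sketch
  // treats it as orthogonal to everything.
  if (norm_a_sq == 0.0f || norm_b_sq == 0.0f) {
    return 1.0f;
  }
  return 1.0f - dot / std::sqrt(norm_a_sq * norm_b_sq);
}
```

With these definitions, identical directions give a cosine distance of 0, orthogonal vectors give 1, and opposite directions give 2.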

vector_index_if.h -- intended to contain high-level interfaces exposed by a vector index such as HNSW. Currently only the reader API is included, which will be needed by the recall computation utility.

hnsw_util.{h,cc} -- various types and functions needed in the HNSW implementation: level selection, and min/max priority queues for (vector, distance) pairs.
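The following sketch shows one common way to express these pieces. It is not the code in hnsw_util.{h,cc}: all type and function names are hypothetical, queue entries are assumed to carry a vertex id rather than the vector itself, and level selection uses the formula from the HNSW paper, floor(-ln(u) * mL) with u drawn uniformly from (0, 1], clamped to an assumed maximum level.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <queue>
#include <random>
#include <vector>

// Hypothetical types; the real definitions in hnsw_util.h may differ.
struct VertexWithDistance {
  uint64_t vertex_id;
  float distance;
};

// std::priority_queue keeps the element that compares "largest" on top,
// so the min-queue comparator is inverted.
struct CloserOnTop {
  bool operator()(const VertexWithDistance& a,
                  const VertexWithDistance& b) const {
    return a.distance > b.distance;
  }
};
struct FartherOnTop {
  bool operator()(const VertexWithDistance& a,
                  const VertexWithDistance& b) const {
    return a.distance < b.distance;
  }
};

using MinDistanceQueue = std::priority_queue<
    VertexWithDistance, std::vector<VertexWithDistance>, CloserOnTop>;
using MaxDistanceQueue = std::priority_queue<
    VertexWithDistance, std::vector<VertexWithDistance>, FartherOnTop>;

// Maps a uniform sample u in (0, 1] to a level. ml is the normalization
// factor, commonly 1 / ln(M) for M neighbors per vertex.
std::size_t SelectLevel(double u, double ml, std::size_t max_level) {
  const double level = std::floor(-std::log(u) * ml);
  return static_cast<std::size_t>(
      std::min(level, static_cast<double>(max_level)));
}

// Draws the uniform sample from the given generator.
std::size_t SelectRandomLevel(std::mt19937_64& rng, double ml,
                              std::size_t max_level) {
  std::uniform_real_distribution<double> dist(0.0, 1.0);
  // dist produces values in [0, 1); flip to (0, 1] so log() stays finite.
  return SelectLevel(1.0 - dist(rng), ml, max_level);
}
```

In a typical HNSW search, the min-queue holds candidates still to be expanded (closest first), while the max-queue holds the current result set so that the farthest result can be evicted cheaply once the set is full.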

The vector_types.h header in the common directory is needed by the dockv code, so it can't be in the vector directory. The yb_dockv library is not allowed to depend on the yb_vector library.
Jira: DB-12298

Test Plan: Jenkins

Reviewers: sergei, aleksandr.ponomarenko

Reviewed By: sergei

Subscribers: ybase

Differential Revision: https://phorge.dev.yugabyte.com/D37340