Commit

move vector search docs to the tidb-cloud folder
qiancai committed Oct 23, 2024
1 parent 31964f9 commit 5f9c1f9
Showing 17 changed files with 146 additions and 420 deletions.
12 changes: 6 additions & 6 deletions tidb-cloud/tidb-cloud-release-notes.md
Original file line number Diff line number Diff line change
@@ -120,7 +120,7 @@ This page lists the release notes of [TiDB Cloud](https://www.pingcap.com/tidb-c

- [Data Service (beta)](https://tidbcloud.com/console/data-service) supports automatically generating vector search endpoints.

If your table contains [vector data types](/vector-search-data-types.md), you can automatically generate a vector search endpoint that calculates vector distances based on your selected distance function.
If your table contains [vector data types](/tidb-cloud/vector-data-types.md), you can automatically generate a vector search endpoint that calculates vector distances based on your selected distance function.

This feature enables seamless integration with AI platforms such as [Dify](https://docs.dify.ai/guides/tools) and [GPTs](https://openai.com/blog/introducing-gpts), enhancing your applications with advanced natural language processing and AI capabilities for more complex tasks and intelligent solutions.

@@ -166,12 +166,12 @@ This page lists the release notes of [TiDB Cloud](https://www.pingcap.com/tidb-c

The vector search (beta) feature provides an advanced search solution for performing semantic similarity searches across various data types, including documents, images, audio, and video. This feature enables developers to easily build scalable applications with generative artificial intelligence (AI) capabilities using familiar MySQL skills. Key features include:

- [Vector data types](/vector-search-data-types.md), [vector index](/vector-search-index.md), and [vector functions and operators](/vector-search-functions-and-operators.md).
- Ecosystem integrations with [LangChain](/vector-search-integrate-with-langchain.md), [LlamaIndex](/vector-search-integrate-with-llamaindex.md), and [JinaAI](/vector-search-integrate-with-jinaai-embedding.md).
- Programming language support for Python: [SQLAlchemy](/vector-search-integrate-with-sqlalchemy.md), [Peewee](/vector-search-integrate-with-peewee.md), and [Django ORM](/vector-search-integrate-with-django-orm.md).
- Sample applications and tutorials: perform semantic searches for documents using [Python](/vector-search-get-started-using-python.md) or [SQL](/vector-search-get-started-using-sql.md).
- [Vector data types](/tidb-cloud/vector-data-types.md), [vector index](/tidb-cloud/vector-index.md), and [vector functions and operators](/tidb-cloud/vector-functions-and-operators.md).
- Ecosystem integrations with [LangChain](/tidb-cloud/vector-integrate-with-langchain.md), [LlamaIndex](/tidb-cloud/vector-integrate-with-llamaindex.md), and [JinaAI](/tidb-cloud/vector-integrate-with-jinaai-embedding.md).
- Programming language support for Python: [SQLAlchemy](/tidb-cloud/vector-integrate-with-sqlalchemy.md), [Peewee](/tidb-cloud/vector-integrate-with-peewee.md), and [Django ORM](/tidb-cloud/vector-integrate-with-django-orm.md).
- Sample applications and tutorials: perform semantic searches for documents using [Python](/tidb-cloud/vector-get-started-using-python.md) or [SQL](/tidb-cloud/vector-get-started-using-sql.md).

For more information, see [Vector search (beta) overview](/vector-search-overview.md).
For more information, see [Vector search (beta) overview](/tidb-cloud/vector-overview.md).

- [TiDB Cloud Serverless](/tidb-cloud/select-cluster-tier.md#tidb-cloud-serverless) now offers weekly email reports for organization owners.

Original file line number Diff line number Diff line change
@@ -26,7 +26,7 @@ The following Vector data types are currently available:

Using vector data types provides the following advantages over using the [`JSON`](/data-type-json.md) type:

- Vector index support: You can build a [vector search index](/vector-search-index.md) to speed up vector searching.
- Vector index support: You can build a [vector search index](/tidb-cloud/vector-search-index.md) to speed up vector searching.
- Dimension enforcement: You can specify a dimension to forbid inserting vectors with different dimensions.
- Optimized storage format: Vector data types are optimized for handling vector data, offering better space efficiency and performance compared to `JSON` types.

@@ -65,9 +65,9 @@ In the following example, because dimension `3` is enforced for the `embedding`
ERROR 1105 (HY000): vector has 2 dimensions, does not fit VECTOR(3)
```

For available functions and operators over the vector data types, see [Vector Functions and Operators](/vector-search-functions-and-operators.md).
For available functions and operators over the vector data types, see [Vector Functions and Operators](/tidb-cloud/vector-search-functions-and-operators.md).

For more information about building and using a vector search index, see [Vector Search Index](/vector-search-index.md).
For more information about building and using a vector search index, see [Vector Search Index](/tidb-cloud/vector-search-index.md).

## Store vectors with different dimensions

@@ -83,11 +83,11 @@ INSERT INTO vector_table VALUES (1, '[0.3, 0.5, -0.1]'); -- 3 dimensions vector,
INSERT INTO vector_table VALUES (2, '[0.3, 0.5]'); -- 2 dimensions vector, OK
```

However, note that you cannot build a [vector search index](/vector-search-index.md) for this column, as vector distances can be only calculated between vectors with the same dimensions.
However, note that you cannot build a [vector search index](/tidb-cloud/vector-search-index.md) for this column, as vector distances can be only calculated between vectors with the same dimensions.
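
To illustrate why a distance is only defined between vectors of the same dimension, here is a minimal Python sketch. It is not TiDB code: `l2_distance` is a hypothetical helper written for this illustration, and it assumes Euclidean (L2) distance as the measure.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors.

    Illustrative helper only; this is not a TiDB or tidb-vector API.
    """
    if len(a) != len(b):
        # A per-element difference has no meaning when the lengths differ.
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


print(l2_distance([0.0, 3.0], [4.0, 0.0]))  # 5.0

try:
    l2_distance([0.3, 0.5, -0.1], [0.3, 0.5])
except ValueError as exc:
    print(exc)  # dimension mismatch: 3 vs 2
```

An index that organizes rows by distance would need every pair of stored vectors to be comparable in this way, which a mixed-dimension column cannot guarantee.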

## Comparison

You can compare vector data types using [comparison operators](/functions-and-operators/operators.md) such as `=`, `!=`, `<`, `>`, `<=`, and `>=`. For a complete list of comparison operators and functions for vector data types, see [Vector Functions and Operators](/vector-search-functions-and-operators.md).
You can compare vector data types using [comparison operators](/functions-and-operators/operators.md) such as `=`, `!=`, `<`, `>`, `<=`, and `>=`. For a complete list of comparison operators and functions for vector data types, see [Vector Functions and Operators](/tidb-cloud/vector-search-functions-and-operators.md).

Vector data types are compared element-wise numerically. For example:
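
As a rough mental model, and assuming the comparison is lexicographic (the first pair of differing elements decides the result), Python's built-in list comparison behaves in a similar way. This is an analogy in Python, not TiDB's implementation, and the values are made up for illustration.

```python
# Analogy only: Python compares lists element by element, left to right,
# and the first pair of unequal elements decides the result.
a = [1.0, 2.0, 3.0]
b = [1.0, 2.0, 5.0]

print(a < b)   # True: the first two elements tie, then 3.0 < 5.0
print(a == b)  # False: the third elements differ
print(a != b)  # True
```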

@@ -231,7 +231,7 @@ You can also explicitly cast a vector to its string representation. Take using t
1 row in set (0.01 sec)
```

For additional cast functions, see [Vector Functions and Operators](/vector-search-functions-and-operators.md).
For additional cast functions, see [Vector Functions and Operators](/tidb-cloud/vector-search-functions-and-operators.md).

### Cast between Vector ⇔ other data types

@@ -241,14 +241,14 @@ Note that vector data type columns stored in a table cannot be converted to othe

## Restrictions

For restrictions on vector data types, see [Vector search limitations](/vector-search-limitations.md) and [Vector index restrictions](/vector-search-index.md#restrictions).
For restrictions on vector data types, see [Vector search limitations](/tidb-cloud/vector-search-limitations.md) and [Vector index restrictions](/tidb-cloud/vector-search-index.md#restrictions).

## MySQL compatibility

Vector data types are TiDB specific, and are not supported in MySQL.

## See also

- [Vector Functions and Operators](/vector-search-functions-and-operators.md)
- [Vector Search Index](/vector-search-index.md)
- [Improve Vector Search Performance](/vector-search-improve-performance.md)
- [Vector Functions and Operators](/tidb-cloud/vector-search-functions-and-operators.md)
- [Vector Search Index](/tidb-cloud/vector-search-index.md)
- [Improve Vector Search Performance](/tidb-cloud/vector-search-improve-performance.md)
Original file line number Diff line number Diff line change
@@ -21,7 +21,7 @@ This document lists the functions and operators available for Vector data types.
## Vector functions

The following functions are designed specifically for [Vector data types](/vector-search-data-types.md).
The following functions are designed specifically for [Vector data types](/tidb-cloud/vector-search-data-types.md).

**Vector distance functions:**

@@ -43,7 +43,7 @@ The following functions are designed specifically for [Vector data types](/vecto

## Extended built-in functions and operators

The following built-in functions and operators are extended to support operations on [Vector data types](/vector-search-data-types.md).
The following built-in functions and operators are extended to support operations on [Vector data types](/tidb-cloud/vector-search-data-types.md).

**Arithmetic operators:**

@@ -52,7 +52,7 @@ The following built-in functions and operators are extended to support operation
| [`+`](https://dev.mysql.com/doc/refman/8.0/en/arithmetic-functions.html#operator_plus) | Vector element-wise addition operator |
| [`-`](https://dev.mysql.com/doc/refman/8.0/en/arithmetic-functions.html#operator_minus) | Vector element-wise subtraction operator |

For more information about how vector arithmetic works, see [Vector Data Type | Arithmetic](/vector-search-data-types.md#arithmetic).
For more information about how vector arithmetic works, see [Vector Data Type | Arithmetic](/tidb-cloud/vector-search-data-types.md#arithmetic).

**Aggregate (GROUP BY) functions:**

@@ -84,7 +84,7 @@ For more information about how vector arithmetic works, see [Vector Data Type |
| [`!=`, `<>`](https://dev.mysql.com/doc/refman/8.0/en/comparison-operators.html#operator_not-equal) | Not equal operator |
| [`NOT IN()`](https://dev.mysql.com/doc/refman/8.0/en/comparison-operators.html#operator_not-in) | Check whether a value is not within a set of values |

For more information about how vectors are compared, see [Vector Data Type | Comparison](/vector-search-data-types.md#comparison).
For more information about how vectors are compared, see [Vector Data Type | Comparison](/tidb-cloud/vector-search-data-types.md#comparison).

**Control flow functions:**

@@ -102,7 +102,7 @@ For more information about how vectors are compared, see [Vector Data Type | Com
| [`CAST()`](https://dev.mysql.com/doc/refman/8.0/en/cast-functions.html#function_cast) | Cast a value as a string or vector |
| [`CONVERT()`](https://dev.mysql.com/doc/refman/8.0/en/cast-functions.html#function_convert) | Cast a value as a string |

For more information about how to use `CAST()`, see [Vector Data Type | Cast](/vector-search-data-types.md#cast).
For more information about how to use `CAST()`, see [Vector Data Type | Cast](/tidb-cloud/vector-search-data-types.md#cast).

## Full references

@@ -289,4 +289,4 @@ The vector functions and the extended usage of built-in functions and operators

## See also

- [Vector Data Types](/vector-search-data-types.md)
- [Vector Data Types](/tidb-cloud/vector-search-data-types.md)
Original file line number Diff line number Diff line change
@@ -7,7 +7,7 @@ summary: Learn how to quickly develop an AI application that performs semantic s

This tutorial demonstrates how to develop a simple AI application that provides **semantic search** features. Unlike traditional keyword search, semantic search intelligently understands the meaning behind your query and returns the most relevant result. For example, if you have documents titled "dog", "fish", and "tree", and you search for "a swimming animal", the application would identify "fish" as the most relevant result.

Throughout this tutorial, you will develop this AI application using [TiDB Vector Search](/vector-search-overview.md), Python, [TiDB Vector SDK for Python](https://github.com/pingcap/tidb-vector-python), and AI models.
Throughout this tutorial, you will develop this AI application using [TiDB Vector Search](/tidb-cloud/vector-search-overview.md), Python, [TiDB Vector SDK for Python](https://github.com/pingcap/tidb-vector-python), and AI models.

<CustomContent platform="tidb">

@@ -69,7 +69,7 @@ pip install sqlalchemy pymysql sentence-transformers tidb-vector python-dotenv
```

- `tidb-vector`: the Python client for interacting with TiDB vector search.
- [`sentence-transformers`](https://sbert.net): a Python library that provides pre-trained models for generating [vector embeddings](/vector-search-overview.md#vector-embedding) from text.
- [`sentence-transformers`](https://sbert.net): a Python library that provides pre-trained models for generating [vector embeddings](/tidb-cloud/vector-search-overview.md#vector-embedding) from text.

### Step 3. Configure the connection string to the TiDB cluster

@@ -135,7 +135,7 @@ The following are descriptions for each parameter:

### Step 4. Initialize the embedding model

An [embedding model](/vector-search-overview.md#embedding-model) transforms data into [vector embeddings](/vector-search-overview.md#vector-embedding). This example uses the pre-trained model [**msmarco-MiniLM-L12-cos-v5**](https://huggingface.co/sentence-transformers/msmarco-MiniLM-L12-cos-v5) for text embedding. This lightweight model, provided by the `sentence-transformers` library, transforms text data into 384-dimensional vector embeddings.
An [embedding model](/tidb-cloud/vector-search-overview.md#embedding-model) transforms data into [vector embeddings](/tidb-cloud/vector-search-overview.md#vector-embedding). This example uses the pre-trained model [**msmarco-MiniLM-L12-cos-v5**](https://huggingface.co/sentence-transformers/msmarco-MiniLM-L12-cos-v5) for text embedding. This lightweight model, provided by the `sentence-transformers` library, transforms text data into 384-dimensional vector embeddings.
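
Once text has been turned into embeddings, semantic search comes down to ranking stored vectors by their distance to the query vector. The following is a minimal, self-contained Python sketch of that ranking step, assuming cosine distance as the measure. The 3-dimensional vectors are made-up toy values (the model above produces 384 dimensions), and `cosine_distance` is a hypothetical helper, not part of `tidb-vector` or `sentence-transformers`.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length, non-zero vectors.

    Illustrative helper only; not an API of any library named in this tutorial.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Toy embeddings, invented for this sketch.
documents = {
    "dog": [1, 2, 1],
    "fish": [1, 2, 4],
    "tree": [1, 0, 0],
}
query_embedding = [1, 2, 3]  # stands in for the embedding of "a swimming animal"

# Smaller distance means more similar, so sort ascending.
ranked = sorted(
    documents,
    key=lambda name: cosine_distance(documents[name], query_embedding),
)
print(ranked)  # ['fish', 'dog', 'tree']
```

In the real application, the database performs this ranking, so the client only sends the query embedding and reads back the nearest rows.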

To set up the model, copy the following code into the `example.py` file. This code initializes a `SentenceTransformer` instance and defines a `text_to_embedding()` function for later use.

@@ -247,5 +247,5 @@ Therefore, according to the output, the swimming animal is most likely a fish, o

## See also

- [Vector Data Types](/vector-search-data-types.md)
- [Vector Search Index](/vector-search-index.md)
- [Vector Data Types](/tidb-cloud/vector-search-data-types.md)
- [Vector Search Index](/tidb-cloud/vector-search-index.md)
Original file line number Diff line number Diff line change
@@ -5,7 +5,7 @@ summary: Learn how to quickly get started with Vector Search in TiDB using SQL s

# Get Started with Vector Search via SQL

TiDB extends MySQL syntax to support [Vector Search](/vector-search-overview.md) and introduce new [Vector data types](/vector-search-data-types.md) and several [vector functions](/vector-search-functions-and-operators.md).
TiDB extends MySQL syntax to support [Vector Search](/tidb-cloud/vector-search-overview.md) and introduce new [Vector data types](/tidb-cloud/vector-search-data-types.md) and several [vector functions](/tidb-cloud/vector-search-functions-and-operators.md).

This tutorial demonstrates how to get started with TiDB Vector Search just using SQL statements. You will learn how to use the [MySQL command-line client](https://dev.mysql.com/doc/refman/8.4/en/mysql.html) to complete the following operations:

@@ -90,7 +90,7 @@ mysql --comments --host 127.0.0.1 --port 4000 -u root

### Step 2. Create a vector table

When creating a table, you can define a column as a [vector](/vector-search-overview.md#vector-embedding) column by specifying the `VECTOR` data type.
When creating a table, you can define a column as a [vector](/tidb-cloud/vector-search-overview.md#vector-embedding) column by specifying the `VECTOR` data type.

For example, to create a table `embedded_documents` with a three-dimensional `VECTOR` column, execute the following SQL statements using your MySQL CLI:

@@ -113,7 +113,7 @@ Query OK, 0 rows affected (0.27 sec)

### Step 3. Insert vector embeddings to the table

Insert three documents with their [vector embeddings](/vector-search-overview.md#vector-embedding) into the `embedded_documents` table:
Insert three documents with their [vector embeddings](/tidb-cloud/vector-search-overview.md#vector-embedding) into the `embedded_documents` table:

```sql
INSERT INTO embedded_documents
@@ -134,7 +134,7 @@ Records: 3 Duplicates: 0 Warnings: 0
>
> This example simplifies the dimensions of the vector embeddings and uses only 3-dimensional vectors for demonstration purposes.
>
> In real-world applications, [embedding models](/vector-search-overview.md#embedding-model) often produce vector embeddings with hundreds or thousands of dimensions.
> In real-world applications, [embedding models](/tidb-cloud/vector-search-overview.md#embedding-model) often produce vector embeddings with hundreds or thousands of dimensions.
### Step 4. Query the vector table
@@ -191,5 +191,5 @@ Therefore, according to the output, the swimming animal is most likely a fish, o

## See also

- [Vector Data Types](/vector-search-data-types.md)
- [Vector Search Index](/vector-search-index.md)
- [Vector Data Types](/tidb-cloud/vector-search-data-types.md)
- [Vector Search Index](/tidb-cloud/vector-search-index.md)
Original file line number Diff line number Diff line change
@@ -21,11 +21,11 @@ TiDB Vector Search enables you to perform Approximate Nearest Neighbor (ANN) que
## Add vector search index for vector columns

The [vector search index](/vector-search-index.md) dramatically improves the performance of vector search queries, usually by 10x or more, with a trade-off of only a small decrease of recall rate.
The [vector search index](/tidb-cloud/vector-search-index.md) dramatically improves the performance of vector search queries, usually by 10x or more, with a trade-off of only a small decrease of recall rate.

## Ensure vector indexes are fully built

After you insert a large volume of vector data, some of it might be in the Delta layer waiting for persistence. The vector index for such data will be built after the data is persisted. Until all vector data is indexed, vector search performance is suboptimal. To check the index build progress, see [View index build progress](/vector-search-index.md#view-index-build-progress).
After you insert a large volume of vector data, some of it might be in the Delta layer waiting for persistence. The vector index for such data will be built after the data is persisted. Until all vector data is indexed, vector search performance is suboptimal. To check the index build progress, see [View index build progress](/tidb-cloud/vector-search-index.md#view-index-build-progress).

Check warning on line 28 in tidb-cloud/vector-search-improve-performance.md (GitHub Actions / vale, reported by reviewdog):

[PingCAP.Ambiguous] Consider using a clearer word than 'a large volume of' because it may cause confusion. (severity: INFO; line 28, column 18)

## Reduce vector dimensions or shorten embeddings
