Msudhir/add vector update functionality (#14)
* ci: Add bigtable cleanup script

Signed-off-by: Danny C <[email protected]>

* fix: Missing Catalog argument in athena connector (feast-dev#3661)

update Catalog argument in athena connector

Signed-off-by: Gyumin Lee <[email protected]>
Co-authored-by: Gyumin Lee <[email protected]>

* ci: Disable flaky lambda materialization test

Signed-off-by: Danny C <[email protected]>

* fix: Broken non-root path with projects-list.json (feast-dev#3665)

ensure correct precedence with the two operators

Signed-off-by: Ben Fletcher <[email protected]>

* fix: Manage redis pipe's context (feast-dev#3655)

Signed-off-by: Jiwon Park <[email protected]>

* chore: Bump tough-cookie from 4.0.0 to 4.1.3 in /sdk/python/feast/ui (feast-dev#3677)

Bumps [tough-cookie](https://github.com/salesforce/tough-cookie) from 4.0.0 to 4.1.3.
- [Release notes](https://github.com/salesforce/tough-cookie/releases)
- [Changelog](https://github.com/salesforce/tough-cookie/blob/master/CHANGELOG.md)
- [Commits](salesforce/tough-cookie@v4.0.0...v4.1.3)

---
updated-dependencies:
- dependency-name: tough-cookie
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* chore: Bump tough-cookie from 4.0.0 to 4.1.3 in /ui (feast-dev#3676)

Bumps [tough-cookie](https://github.com/salesforce/tough-cookie) from 4.0.0 to 4.1.3.
- [Release notes](https://github.com/salesforce/tough-cookie/releases)
- [Changelog](https://github.com/salesforce/tough-cookie/blob/master/CHANGELOG.md)
- [Commits](salesforce/tough-cookie@v4.0.0...v4.1.3)

---
updated-dependencies:
- dependency-name: tough-cookie
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* fix: For SQL registry, increase max data_source_name length to 255 (feast-dev#3630)

* sql.py data_sources.data_source_name String(255)

Extend the limit of the data_source_name field from 50 to 255.

Signed-off-by: Ross Donnachie <[email protected]>

* fix: Optimize bytes processed when retrieving entity df schema to 0 (feast-dev#3680)

feat: Optimize bytes processed when retrieving entity df schema to 0

Signed-off-by: Hai Nguyen <[email protected]>

* fix: Entityless fv breaks with `KeyError: __dummy` applying feature_store.plan() on python (feast-dev#3640)

* fix! KeyError: __dummy on entityless fv

Signed-off-by: williamfoschiera <[email protected]>

* fix! join_keys typing.

Signed-off-by: williamfoschiera <[email protected]>

---------

Signed-off-by: williamfoschiera <[email protected]>
Co-authored-by: williamfoschiera <[email protected]>

* chore: Bump protobufjs from 7.1.1 to 7.2.4 in /ui (feast-dev#3674)

Bumps [protobufjs](https://github.com/protobufjs/protobuf.js) from 7.1.1 to 7.2.4.
- [Release notes](https://github.com/protobufjs/protobuf.js/releases)
- [Changelog](https://github.com/protobufjs/protobuf.js/blob/master/CHANGELOG.md)
- [Commits](protobufjs/protobuf.js@protobufjs-v7.1.1...protobufjs-v7.2.4)

---
updated-dependencies:
- dependency-name: protobufjs
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* chore: Bump protobufjs from 7.1.2 to 7.2.4 in /sdk/python/feast/ui (feast-dev#3675)

Bumps [protobufjs](https://github.com/protobufjs/protobuf.js) from 7.1.2 to 7.2.4.
- [Release notes](https://github.com/protobufjs/protobuf.js/releases)
- [Changelog](https://github.com/protobufjs/protobuf.js/blob/master/CHANGELOG.md)
- [Commits](protobufjs/protobuf.js@protobufjs-v7.1.2...protobufjs-v7.2.4)

---
updated-dependencies:
- dependency-name: protobufjs
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* chore: Bump semver from 6.3.0 to 6.3.1 in /ui (feast-dev#3678)

Bumps [semver](https://github.com/npm/node-semver) from 6.3.0 to 6.3.1.
- [Release notes](https://github.com/npm/node-semver/releases)
- [Changelog](https://github.com/npm/node-semver/blob/v6.3.1/CHANGELOG.md)
- [Commits](npm/node-semver@v6.3.0...v6.3.1)

---
updated-dependencies:
- dependency-name: semver
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* chore: Bump semver from 6.3.0 to 6.3.1 in /sdk/python/feast/ui (feast-dev#3679)

Bumps [semver](https://github.com/npm/node-semver) from 6.3.0 to 6.3.1.
- [Release notes](https://github.com/npm/node-semver/releases)
- [Changelog](https://github.com/npm/node-semver/blob/v6.3.1/CHANGELOG.md)
- [Commits](npm/node-semver@v6.3.0...v6.3.1)

---
updated-dependencies:
- dependency-name: semver
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* chore: Bump google.golang.org/grpc from 1.47.0 to 1.53.0 (feast-dev#3670)

Bumps [google.golang.org/grpc](https://github.com/grpc/grpc-go) from 1.47.0 to 1.53.0.
- [Release notes](https://github.com/grpc/grpc-go/releases)
- [Commits](grpc/grpc-go@v1.47.0...v1.53.0)

---
updated-dependencies:
- dependency-name: google.golang.org/grpc
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* chore(release): release 0.32.0

# [0.32.0](feast-dev/feast@v0.31.0...v0.32.0) (2023-07-17)

### Bug Fixes

* Added generic Feature store Creation for CLI ([feast-dev#3618](feast-dev#3618)) ([bf740d2](feast-dev@bf740d2))
* Broken non-root path with projects-list.json ([feast-dev#3665](feast-dev#3665)) ([4861af0](feast-dev@4861af0))
* Clean up snowflake to_spark_df() ([feast-dev#3607](feast-dev#3607)) ([e8e643e](feast-dev@e8e643e))
* Entityless fv breaks with `KeyError: __dummy` applying feature_store.plan() on python ([feast-dev#3640](feast-dev#3640)) ([ef4ef32](feast-dev@ef4ef32))
* Fix scan datasize to 0 for inference schema ([feast-dev#3628](feast-dev#3628)) ([c3dd74e](feast-dev@c3dd74e))
* Fix timestamp consistency in push api ([feast-dev#3614](feast-dev#3614)) ([9b227d7](feast-dev@9b227d7))
* For SQL registry, increase max data_source_name length to 255 ([feast-dev#3630](feast-dev#3630)) ([478caec](feast-dev@478caec))
* Implements connection pool for postgres online store ([feast-dev#3633](feast-dev#3633)) ([059509a](feast-dev@059509a))
* Manage redis pipe's context ([feast-dev#3655](feast-dev#3655)) ([48e0971](feast-dev@48e0971))
* Missing Catalog argument in athena connector ([feast-dev#3661](feast-dev#3661)) ([f6d3caf](feast-dev@f6d3caf))
* Optimize bytes processed when retrieving entity df schema to 0 ([feast-dev#3680](feast-dev#3680)) ([1c01035](feast-dev@1c01035))

### Features

* Add gunicorn for serve with multiprocess ([feast-dev#3636](feast-dev#3636)) ([4de7faf](feast-dev@4de7faf))
* Use string as a substitute for unregistered types during schema inference ([feast-dev#3646](feast-dev#3646)) ([c474ccd](feast-dev@c474ccd))

* fix: Redshift push ignores schema (feast-dev#3671)

* Add fully-qualified-table-name Redshift prop

Signed-off-by: Robin Neufeld <[email protected]>

* pre-commit

Signed-off-by: Robin Neufeld <[email protected]>

* Docstring

Signed-off-by: Robin Neufeld <[email protected]>

* Test fully_qualified_table_name

Signed-off-by: Robin Neufeld <[email protected]>

* Simplify logic

Signed-off-by: Robin Neufeld <[email protected]>

* pre-commit

Signed-off-by: Robin Neufeld <[email protected]>

* pre-commit

Signed-off-by: Robin Neufeld <[email protected]>

* Test offline_write_batch

Signed-off-by: Robin Neufeld <[email protected]>

* Bump to trigger CI

Signed-off-by: Robin Neufeld <[email protected]>

* another bump for ci

Signed-off-by: Robin Neufeld <[email protected]>

---------

Signed-off-by: Robin Neufeld <[email protected]>

* fix: Add aws-sts dependency in java sdk so that S3 client acquires IRSA role (feast-dev#3696)

Add aws-sts dependency in java sdk

Signed-off-by: harmeet-singh-discovery <[email protected]>

* Adding initial update changes

* Added formatting changes

* Revert "Merge branch 'feast-dev:master' into msudhir/add-vector-update-functionality"

This reverts commit 8487678, reversing
changes made to 0578b9b.

* Added more tests and functionality

* updating tests

* updated functionality and added more tests

* correcting a test case

* Making formatting corrections and changing log

* Improved tests and added functionality to convert feast schema to milvus readable schema

* Added PR Review comments

* Fixed failing test

---------

Signed-off-by: Danny C <[email protected]>
Signed-off-by: Gyumin Lee <[email protected]>
Signed-off-by: Ben Fletcher <[email protected]>
Signed-off-by: Jiwon Park <[email protected]>
Signed-off-by: dependabot[bot] <[email protected]>
Signed-off-by: Ross Donnachie <[email protected]>
Signed-off-by: Hai Nguyen <[email protected]>
Signed-off-by: williamfoschiera <[email protected]>
Signed-off-by: Robin Neufeld <[email protected]>
Signed-off-by: harmeet-singh-discovery <[email protected]>
Co-authored-by: Danny C <[email protected]>
Co-authored-by: 이규민 <[email protected]>
Co-authored-by: Gyumin Lee <[email protected]>
Co-authored-by: Ben Fletcher <[email protected]>
Co-authored-by: Jiwon Park <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Ross Donnachie <[email protected]>
Co-authored-by: Harry <[email protected]>
Co-authored-by: William Foschiera <[email protected]>
Co-authored-by: williamfoschiera <[email protected]>
Co-authored-by: feast-ci-bot <[email protected]>
Co-authored-by: Robin Neufeld <[email protected]>
Co-authored-by: harmeet-singh-discovery <[email protected]>
Co-authored-by: Manisha Sudhir <[email protected]>
15 people authored Aug 11, 2023
1 parent d421016 commit 2c7b7b1
Showing 5 changed files with 551 additions and 29 deletions.
16 changes: 16 additions & 0 deletions sdk/python/docs/source/feast.protos.feast.core.rst
@@ -164,6 +164,22 @@ feast.protos.feast.core.FeatureView\_pb2\_grpc module
   :undoc-members:
   :show-inheritance:

feast.protos.feast.core.VectorFeatureView\_pb2 module
-----------------------------------------------------

.. automodule:: feast.protos.feast.core.VectorFeatureView_pb2
   :members:
   :undoc-members:
   :show-inheritance:

feast.protos.feast.core.VectorFeatureView\_pb2\_grpc module
-----------------------------------------------------------

.. automodule:: feast.protos.feast.core.VectorFeatureView_pb2_grpc
   :members:
   :undoc-members:
   :show-inheritance:

feast.protos.feast.core.Feature\_pb2 module
-------------------------------------------

165 changes: 162 additions & 3 deletions sdk/python/feast/expediagroup/vectordb/milvus_online_store.py
@@ -1,14 +1,37 @@
import logging
from datetime import datetime
from typing import Any, Callable, Dict, List, Optional, Sequence, Tuple

from pydantic.typing import Literal
from pymilvus import (
    Collection,
    CollectionSchema,
    DataType,
    FieldSchema,
    connections,
    utility,
)

from feast import Entity, RepoConfig
from feast.expediagroup.vectordb.vector_feature_view import VectorFeatureView
from feast.expediagroup.vectordb.vector_online_store import VectorOnlineStore
from feast.field import Field
from feast.protos.feast.types.EntityKey_pb2 import EntityKey as EntityKeyProto
from feast.protos.feast.types.Value_pb2 import Value as ValueProto
from feast.repo_config import FeastConfigBaseModel
from feast.types import (
    Array,
    FeastType,
    Float32,
    Float64,
    Int32,
    Int64,
    Invalid,
    String,
)
from feast.usage import log_exceptions_and_usage

logger = logging.getLogger(__name__)


class MilvusOnlineStoreConfig(FeastConfigBaseModel):
@@ -17,13 +40,47 @@ class MilvusOnlineStoreConfig(FeastConfigBaseModel):
    type: Literal["milvus"] = "milvus"
    """Online store type selector"""

    alias: str = "default"
    """ alias for milvus connection"""

    host: str
    """ the host URL """

    username: str
    """ username to connect to Milvus """

    password: str
    """ password to connect to Milvus """

    port: int = 19530
    """ the port to connect to a Milvus instance. Should be the one used for GRPC (default: 19530) """


class MilvusConnectionManager:
    def __init__(self, online_config: MilvusOnlineStoreConfig):
        self.online_config = online_config

    def __enter__(self):
        # Connecting to Milvus
        logger.info(
            f"Connecting to Milvus with alias {self.online_config.alias} and host {self.online_config.host} and port {self.online_config.port}."
        )
        # Register the connection under the configured alias and port so that
        # __exit__ disconnects the same connection that was opened here.
        connections.connect(
            alias=self.online_config.alias,
            host=self.online_config.host,
            port=self.online_config.port,
            username=self.online_config.username,
            password=self.online_config.password,
            use_secure=True,
        )

    def __exit__(self, exc_type, exc_value, traceback):
        # Disconnecting from Milvus
        logger.info("Closing the connection to Milvus")
        connections.disconnect(self.online_config.alias)
        logger.info("Connection Closed")
        if exc_type is not None:
            logger.error(f"An exception of type {exc_type} occurred: {exc_value}")


class MilvusOnlineStore(VectorOnlineStore):
    def online_write_batch(
        self,
@@ -49,6 +106,7 @@ def online_read(
            "to be implemented in https://jira.expedia.biz/browse/EAPC-7972"
        )

    @log_exceptions_and_usage(online_store="milvus")
    def update(
        self,
        config: RepoConfig,
@@ -58,9 +116,41 @@ def update(
        entities_to_keep: Sequence[Entity],
        partial: bool,
    ):
        raise NotImplementedError(
            "to be implemented in https://jira.expedia.biz/browse/EAPC-7970"
        )
        with MilvusConnectionManager(config.online_store):
            for table_to_keep in tables_to_keep:
                collection_available = utility.has_collection(table_to_keep.name)
                try:
                    if collection_available:
                        logger.info(f"Collection {table_to_keep.name} already exists.")
                    else:
                        schema = self._convert_featureview_schema_to_milvus_readable(
                            table_to_keep.schema,
                            table_to_keep.vector_field,
                            table_to_keep.dimensions,
                        )

                        collection = Collection(name=table_to_keep.name, schema=schema)
                        logger.info(f"Collection name is {collection.name}")
                        logger.info(
                            f"Collection {table_to_keep.name} has been created successfully."
                        )
                except Exception as e:
                    logger.error(f"Collection update failed due to {e}")

            for table_to_delete in tables_to_delete:
                collection_available = utility.has_collection(table_to_delete.name)
                try:
                    if collection_available:
                        utility.drop_collection(table_to_delete.name)
                        logger.info(
                            f"Collection {table_to_delete.name} has been deleted successfully."
                        )
                    else:
                        logger.warning(
                            f"Collection {table_to_delete.name} does not exist or is already deleted."
                        )
                except Exception as e:
                    logger.error(f"Collection deletion failed due to {e}")

    def teardown(
        self,
@@ -71,3 +161,72 @@ def teardown(
        raise NotImplementedError(
            "to be implemented in https://jira.expedia.biz/browse/EAPC-7974"
        )

    def _convert_featureview_schema_to_milvus_readable(
        self, feast_schema: List[Field], vector_field, vector_field_dimensions
    ) -> CollectionSchema:
        """
        Convert a schema understood by Feast to a schema that is readable by Milvus so that it
        can be used when a collection is created in Milvus.
        Parameters:
        feast_schema (List[Field]): Schema stored in VectorFeatureView.
        vector_field: Name of the field that holds the vector.
        vector_field_dimensions: Dimension of the vector field.
        Returns:
        (CollectionSchema): Schema readable by Milvus.
        """
        boolean_mapping_from_string = {"True": True, "False": False}
        field_list = []

        for field in feast_schema:
            # Only the vector field carries a dimension; recompute it per field
            # so the value does not leak into the fields that follow it.
            dimension = (
                vector_field_dimensions if field.name == vector_field else None
            )

            data_type = self._feast_to_milvus_data_type(field.dtype)

            # Fall back to defaults when a field has no tags, so description
            # and is_primary are always defined.
            tags = field.tags or {}
            description = tags.get("description", " ")
            is_primary = boolean_mapping_from_string.get(
                tags.get("is_primary", "False"), False
            )

            # Appending the above converted values to construct a FieldSchema
            field_list.append(
                FieldSchema(
                    name=field.name,
                    dtype=data_type,
                    description=description,
                    is_primary=is_primary,
                    dim=dimension,
                )
            )
        # Returning a CollectionSchema which is a list of type FieldSchema.
        return CollectionSchema(field_list)

    def _feast_to_milvus_data_type(self, feast_type: FeastType) -> DataType:
        """
        Mapping for converting a Feast data type to a data type compatible with Milvus.
        Parameters:
        feast_type (FeastType): This is a type associated with a Feature that is stored in a VectorFeatureView, readable with Feast.
        Returns:
        DataType : DataType associated with what Milvus can understand and associate its Feature types to
        """

        return {
            Int32: DataType.INT32,
            Int64: DataType.INT64,
            Float32: DataType.FLOAT,
            Float64: DataType.DOUBLE,
            String: DataType.STRING,
            Invalid: DataType.UNKNOWN,
            Array(Float32): DataType.FLOAT_VECTOR,
            # TODO: Need to think about list of binaries and list of bytes
            # FeastType.BYTES_LIST: DataType.BINARY_VECTOR
        }.get(feast_type, None)
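
The schema conversion in this file can be illustrated without a Milvus instance. The sketch below is not the commit's code: `SketchField`, `SketchFieldSchema`, `TYPE_MAP`, and `convert_schema` are hypothetical stand-ins for `feast.field.Field`, `pymilvus.FieldSchema`, the type mapping, and the conversion method, and types are represented as plain strings. It only shows the intended rule: tags supply the description and primary-key flag, and only the vector field receives a dimension.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SketchField:
    """Hypothetical stand-in for feast.field.Field (types as plain strings)."""
    name: str
    dtype: str
    tags: Dict[str, str] = field(default_factory=dict)


@dataclass
class SketchFieldSchema:
    """Hypothetical stand-in for pymilvus.FieldSchema."""
    name: str
    dtype: str
    description: str = ""
    is_primary: bool = False
    dim: Optional[int] = None


# Illustrative subset of the Feast-to-Milvus type mapping, keyed by name.
TYPE_MAP = {
    "Int64": "INT64",
    "Float32": "FLOAT",
    "Array(Float32)": "FLOAT_VECTOR",
}


def convert_schema(
    fields: List[SketchField], vector_field: str, dimensions: int
) -> List[SketchFieldSchema]:
    converted = []
    for f in fields:
        tags = f.tags or {}
        converted.append(
            SketchFieldSchema(
                name=f.name,
                dtype=TYPE_MAP.get(f.dtype, "UNKNOWN"),
                description=tags.get("description", ""),
                # Tags are strings, so the flag is compared against "True".
                is_primary=tags.get("is_primary", "False") == "True",
                # Only the vector field carries a dimension.
                dim=dimensions if f.name == vector_field else None,
            )
        )
    return converted


schema = convert_schema(
    [
        SketchField("id", "Int64", {"is_primary": "True"}),
        SketchField("embedding", "Array(Float32)"),
    ],
    vector_field="embedding",
    dimensions=128,
)
```

In this sketch, `id` becomes the primary key with no dimension, while `embedding` maps to a float vector of dimension 128 and falls back to the default tag values.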
Original file line number Diff line number Diff line change
@@ -50,6 +50,7 @@ class VectorFeatureView(BaseFeatureView):

    # inheriting from FeatureView wouldn't work due to issue with conflicting proto classes
    # therefore using composition instead
    name: str
    feature_view: FeatureView
    vector_field: str
    dimensions: int
@@ -106,7 +107,7 @@ def __init__(
            tags=tags,
            owner=owner,
        )

        self.name = name
        self.feature_view = feature_view
        self.vector_field = vector_field
        self.dimensions = dimensions
4 changes: 4 additions & 0 deletions sdk/python/feast/repo_config.py
@@ -203,6 +203,8 @@ def __init__(self, **data: Any):
                self._offline_config = "redshift"
            elif data["provider"] == "azure":
                self._offline_config = "mssql"
            elif data["provider"] == "milvus":
                self._online_config = "milvus"

        self._online_store = None
        if "online_store" in data:
@@ -216,6 +218,8 @@ def __init__(self, **data: Any):
                self._online_config = "dynamodb"
            elif data["provider"] == "rockset":
                self._online_config = "rockset"
            elif data["provider"] == "milvus":
                self._online_config = "milvus"

        self._batch_engine = None
        if "batch_engine" in data: