docs: Reorganize registry docs #4407

Merged
merged 4 commits into from
Aug 23, 2024
8 changes: 6 additions & 2 deletions docs/SUMMARY.md
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,6 @@
* [Feature view](getting-started/concepts/feature-view.md)
* [Feature retrieval](getting-started/concepts/feature-retrieval.md)
* [Point-in-time joins](getting-started/concepts/point-in-time-joins.md)
* [Registry](getting-started/concepts/registry.md)
* [Permission](getting-started/concepts/permission.md)
* [\[Alpha\] Saved dataset](getting-started/concepts/dataset.md)
* [Components](getting-started/components/README.md)
Expand All @@ -45,7 +44,6 @@
* [Real-time credit scoring on AWS](tutorials/tutorials-overview/real-time-credit-scoring-on-aws.md)
* [Driver stats on Snowflake](tutorials/tutorials-overview/driver-stats-on-snowflake.md)
* [Validating historical features with Great Expectations](tutorials/validating-historical-features.md)
* [Using Scalable Registry](tutorials/using-scalable-registry.md)
* [Building streaming features](tutorials/building-streaming-features.md)

## How-to Guides
Expand Down Expand Up @@ -114,6 +112,12 @@
* [Hazelcast (contrib)](reference/online-stores/hazelcast.md)
* [ScyllaDB (contrib)](reference/online-stores/scylladb.md)
* [SingleStore (contrib)](reference/online-stores/singlestore.md)
* [Registries](reference/registries/README.md)
* [Local](reference/registries/local.md)
* [S3](reference/registries/s3.md)
* [GCS](reference/registries/gcs.md)
* [SQL](reference/registries/sql.md)
* [Snowflake](reference/registries/snowflake.md)
* [Providers](reference/providers/README.md)
* [Local](reference/providers/local.md)
* [Google Cloud Platform](reference/providers/google-cloud-platform.md)
Expand Down
50 changes: 33 additions & 17 deletions docs/getting-started/components/registry.md
Original file line number Diff line number Diff line change
@@ -1,31 +1,47 @@
# Registry

The Feast feature registry is a central catalog of all the feature definitions and their related metadata. It allows data scientists to search, discover, and collaborate on new features.
Feast uses a registry to store all applied Feast objects (e.g. feature views, entities, etc.). It allows data scientists to search, discover, and collaborate on new features. The registry exposes methods to apply, list, retrieve, and delete these objects, and is an abstraction with multiple implementations.

Each Feast deployment has a single feature registry. Feast only supports file-based registries today, but supports four different backends.
Feast comes with built-in file-based and SQL-based registry implementations. By default, Feast uses a file-based registry, which stores the protobuf representation of the registry as a serialized file in the local file system. For more details on which registries are supported, please see [Registries](../../reference/registries/).
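
As a rough illustration of the "abstraction with multiple implementations" idea described above, here is a minimal sketch in plain Python. The names `ToyRegistry`, `InMemoryRegistry`, and their methods are hypothetical and are not Feast's actual registry classes; the sketch only shows how one interface for apply, list, retrieve, and delete could sit in front of different storage backends.

```python
from abc import ABC, abstractmethod


class ToyRegistry(ABC):
    """Hypothetical registry interface (not Feast's real API)."""

    @abstractmethod
    def apply_object(self, name: str, definition: dict) -> None: ...

    @abstractmethod
    def list_objects(self) -> list[str]: ...

    @abstractmethod
    def get_object(self, name: str) -> dict: ...

    @abstractmethod
    def delete_object(self, name: str) -> None: ...


class InMemoryRegistry(ToyRegistry):
    """One possible backend: keeps every object in a dict."""

    def __init__(self) -> None:
        self._objects: dict[str, dict] = {}

    def apply_object(self, name: str, definition: dict) -> None:
        # Applying an existing name overwrites the stored definition.
        self._objects[name] = dict(definition)

    def list_objects(self) -> list[str]:
        return sorted(self._objects)

    def get_object(self, name: str) -> dict:
        return self._objects[name]

    def delete_object(self, name: str) -> None:
        del self._objects[name]


registry = InMemoryRegistry()
registry.apply_object("driver_stats", {"entities": ["driver_id"]})
registry.apply_object("customer_stats", {"entities": ["customer_id"]})
names = registry.list_objects()
```

A file-based or SQL-based backend would implement the same four methods against a different store, which is why callers do not need to change when the backend does.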

* `Local`: Used as a local backend for storing the registry during development
* `S3`: Used as a centralized backend for storing the registry on AWS
* `GCS`: Used as a centralized backend for storing the registry on GCP
* `[Alpha] Azure`: Used as centralized backend for storing the registry on Azure Blob storage.
## Updating the registry

The feature registry is updated during different operations when using Feast. More specifically, objects within the registry \(entities, feature views, feature services\) are updated when running `apply` from the Feast CLI, but metadata about objects can also be updated during operations like materialization.
We recommend storing Feast feature definitions in a version-controlled repository that CI/CD automatically
keeps in sync with the registry. Users will often also want multiple registries that correspond to
different environments (e.g. dev vs. staging vs. prod), with write access to the staging and production registries
locked down, since they can impact real user traffic. See [Running Feast in Production](../../how-to-guides/running-feast-in-production.md#1.-automatically-deploying-changes-to-your-feature-definitions) for details on how to set this up.
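
One way to keep a separate registry per environment is to resolve the registry path from an environment name before building the configuration. The sketch below is only an assumption about how a team might do this: the bucket names, the `REGISTRY_PATHS` mapping, the `registry_path_for` helper, and the `FEAST_ENV` variable are all hypothetical and are not read by Feast itself.

```python
import os

# Hypothetical registry locations, one per environment.
REGISTRY_PATHS = {
    "dev": "data/registry.pb",
    "staging": "s3://example-feast-staging/registry.pb",
    "prod": "s3://example-feast-prod/registry.pb",
}


def registry_path_for(env: str) -> str:
    """Return the registry path for an environment, failing loudly on typos."""
    try:
        return REGISTRY_PATHS[env]
    except KeyError:
        raise ValueError(f"unknown environment: {env!r}") from None


# FEAST_ENV is an assumed variable name; default to the dev registry.
current_env = os.environ.get("FEAST_ENV", "dev")
selected_path = registry_path_for(current_env)
```

The resolved path could then be passed wherever the registry location is configured, while write permissions on the staging and production buckets are restricted to the CI/CD identity.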

Users interact with a feature registry through the Feast SDK. Listing all feature views:
## Accessing the registry from clients

Users can specify the registry through a `feature_store.yaml` config file or programmatically. Teams often
prefer the programmatic approach because it makes notebook-driven development very easy:

### Option 1: programmatically specifying the registry

```python
fs = FeatureStore("my_feature_repo/")
print(fs.list_feature_views())
from feast import FeatureStore, RepoConfig
from feast.repo_config import RegistryConfig

repo_config = RepoConfig(
    registry=RegistryConfig(path="gs://feast-test-gcs-bucket/registry.pb"),
    project="feast_demo_gcp",
    provider="gcp",
    offline_store="file",  # Could also be the OfflineStoreConfig e.g. FileOfflineStoreConfig
    online_store="null",  # Could also be the OnlineStoreConfig e.g. RedisOnlineStoreConfig
)
store = FeatureStore(config=repo_config)
```

Or retrieving a specific feature view:
### Option 2: specifying the registry in the project's `feature_store.yaml` file

```python
fs = FeatureStore("my_feature_repo/")
fv = fs.get_feature_view("my_fv1")
```yaml
project: feast_demo_aws
provider: aws
registry: s3://feast-test-s3-bucket/registry.pb
online_store: null
offline_store:
  type: file
```

{% hint style="info" %}
The feature registry is a [Protobuf representation](https://github.com/feast-dev/feast/blob/master/protos/feast/core/Registry.proto) of Feast metadata. This Protobuf file can be read programmatically from other programming languages, but no compatibility guarantees are made on the internal structure of the registry.
{% endhint %}
A `FeatureStore` object instantiated with the repository path then picks up this configuration:

```python
store = FeatureStore(repo_path=".")
```
4 changes: 0 additions & 4 deletions docs/getting-started/concepts/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -24,10 +24,6 @@
[point-in-time-joins.md](point-in-time-joins.md)
{% endcontent-ref %}

{% content-ref url="registry.md" %}
[registry.md](registry.md)
{% endcontent-ref %}

{% content-ref url="dataset.md" %}
[dataset.md](dataset.md)
{% endcontent-ref %}
Expand Down
107 changes: 0 additions & 107 deletions docs/getting-started/concepts/registry.md

This file was deleted.

23 changes: 23 additions & 0 deletions docs/reference/registries/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
# Registries

Please see [Registry](../../getting-started/components/registry.md) for a conceptual explanation of registries.

{% content-ref url="local.md" %}
[local.md](local.md)
{% endcontent-ref %}

{% content-ref url="s3.md" %}
[s3.md](s3.md)
{% endcontent-ref %}

{% content-ref url="gcs.md" %}
[gcs.md](gcs.md)
{% endcontent-ref %}

{% content-ref url="sql.md" %}
[sql.md](sql.md)
{% endcontent-ref %}

{% content-ref url="snowflake.md" %}
[snowflake.md](snowflake.md)
{% endcontent-ref %}
23 changes: 23 additions & 0 deletions docs/reference/registries/gcs.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
# GCS Registry

## Description

The GCS registry stores the protobuf representation of your feature store objects (data sources, feature views, feature services, etc.) using Google Cloud Storage.

While it can be used in production, file-based registries have inherent limitations: changing a single field in the registry requires rewriting the whole registry file. With multiple concurrent writers, this either risks data loss or bottlenecks writes to the registry, since all changes have to be serialized (e.g. when running materialization for multiple feature views or time ranges concurrently).

An example of how to configure this would be:

## Example

{% code title="feature_store.yaml" %}
```yaml
project: feast_gcp
registry:
  path: gs://[YOUR BUCKET YOU CREATED]/registry.pb
  cache_ttl_seconds: 60
online_store: null
offline_store:
  type: dask
```
{% endcode %}
23 changes: 23 additions & 0 deletions docs/reference/registries/local.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
# Local Registry

## Description

The local registry stores the protobuf representation of your feature store objects (data sources, feature views, feature services, etc.) in the local file system. It is intended only for experimentation with Feast and should not be used in production.

File-based registries have inherent limitations: changing a single field in the registry requires rewriting the whole registry file. With multiple concurrent writers, this either risks data loss or bottlenecks writes to the registry, since all changes have to be serialized (e.g. when running materialization for multiple feature views or time ranges concurrently).

An example of how to configure this would be:

## Example

{% code title="feature_store.yaml" %}
```yaml
project: feast_local
registry:
  path: registry.pb
  cache_ttl_seconds: 60
online_store: null
offline_store:
  type: dask
```
{% endcode %}
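
The concurrent-writer risk described for file-based registries can be illustrated with a small simulation. This is only a sketch under stated assumptions: it uses a JSON file in place of the protobuf registry, and the `load` and `save` helpers are hypothetical stand-ins for a read-modify-write cycle, not Feast functions. Two writers read the same snapshot, each adds one object, and each rewrites the whole file.

```python
import json
import tempfile
from pathlib import Path


def load(path: Path) -> dict:
    """Read the whole registry file into memory."""
    return json.loads(path.read_text())


def save(path: Path, registry: dict) -> None:
    """Rewrite the whole registry file, replacing whatever was there."""
    path.write_text(json.dumps(registry))


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "registry.json"
    save(path, {"feature_views": []})

    # Both writers read the same starting snapshot.
    snapshot_a = load(path)
    snapshot_b = load(path)

    # Writer A adds its feature view and rewrites the file.
    snapshot_a["feature_views"].append("driver_stats")
    save(path, snapshot_a)

    # Writer B, unaware of A's change, rewrites the file from its stale snapshot.
    snapshot_b["feature_views"].append("customer_stats")
    save(path, snapshot_b)

    final = load(path)
```

After both writes, `final` contains only writer B's feature view: writer A's update is silently lost. Avoiding this requires serializing writers, which is the bottleneck mentioned above.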
23 changes: 23 additions & 0 deletions docs/reference/registries/s3.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
# S3 Registry

## Description

The S3 registry stores the protobuf representation of your feature store objects (data sources, feature views, feature services, etc.) in Amazon S3.

While it can be used in production, file-based registries have inherent limitations: changing a single field in the registry requires rewriting the whole registry file. With multiple concurrent writers, this either risks data loss or bottlenecks writes to the registry, since all changes have to be serialized (e.g. when running materialization for multiple feature views or time ranges concurrently).

An example of how to configure this would be:

## Example

{% code title="feature_store.yaml" %}
```yaml
project: feast_aws_s3
registry:
  path: s3://[YOUR BUCKET YOU CREATED]/registry.pb
  cache_ttl_seconds: 60
online_store: null
offline_store:
  type: dask
```
{% endcode %}
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
# Snowflake registry
# Snowflake Registry

## Description

Expand Down
Original file line number Diff line number Diff line change
@@ -1,9 +1,4 @@
---
description: >-
Tutorial on how to use the SQL registry for scalable registry updates
---

# Using Scalable Registry
# SQL Registry

## Overview

Expand Down