Getting started with Feast
This guide provides the necessary resources to install Feast alongside Kubeflow, describes the usage of Feast with Kubeflow components, and provides examples that users can follow to test their setup.
For an overview of Feast, please read Introduction to Feast.
Installing Feast with Kubeflow
- This guide assumes that you have a running Kubeflow cluster already. If you don’t have Kubeflow installed, then head on over to the Kubeflow installation guide.
- This guide also assumes that you have a running online feature store that Feast supports (Redis, Datastore, DynamoDB).
- The latest version of Feast does not need to be installed into Kubernetes. It is possible to run Feast completely from CI or as a client library (during training or inference).
- Feast requires a bucket (S3, GCS, MinIO, etc.) to maintain a feature registry, an online feature store to serve feature values, and a scheduler to keep the online store up to date.
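The registry bucket and online store listed above are normally declared in a `feature_store.yaml` file at the root of the feature repository. The sketch below renders a minimal version of that file for a Redis online store. The project name, bucket URI, Redis address, and the `render_feature_store_yaml` helper are all illustrative assumptions, and the right `provider` value depends on where Feast runs, so check the result against the Feast configuration reference before using it.

```python
def render_feature_store_yaml(project, registry_uri, redis_connection, provider="local"):
    """Return the text of a minimal feature_store.yaml for a Redis online store.

    All argument values are supplied by the caller; nothing here contacts
    the bucket or Redis.
    """
    lines = [
        f"project: {project}",
        f"registry: {registry_uri}",   # bucket path that holds the feature registry
        f"provider: {provider}",
        "online_store:",
        "  type: redis",
        f'  connection_string: "{redis_connection}"',
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(
        render_feature_store_yaml(
            project="kubeflow_demo",
            registry_uri="s3://example-bucket/feast/registry.db",
            redis_connection="redis.example.svc:6379",
        )
    )
```

The scheduler is not part of this file; it is whatever job periodically loads features into the online store.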
To use Feast with Kubeflow, follow these steps:
- Install Feast into your development environment, as well as any environment where you want to register feature views or read features from the feature store.
- Create a feature repository to store your feature views and entities. Make sure to configure your feature_store.yaml to point to your online store. Please see the online store configuration reference here for more details.
- Deploy your feature store. This step configures your online store and sets up your feature registry.
- Build a training dataset. This step is typically executed from a Kubeflow Pipeline from which you’d train a model.
- Load features into the online store. This step can also be executed from a Kubernetes cron job.
- Read features from the online store. This step is typically executed from your model serving service, right before calling your model for a prediction.
- Please see this guide which provides best practices for running Feast in a production context.
- Please see this guide for upgrading from Feast 0.9 (Spark-based) to the latest Feast (0.12+).
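The dataset-building and loading steps above can be sketched with the Feast Python SDK roughly as follows. This is a minimal sketch, not a reference implementation: the feature references (`driver_hourly_stats:conv_rate` and so on) and the helper function names are hypothetical, and the code assumes a feature repository has already been created and deployed. Feast is imported lazily so the helpers can be defined in environments where it is not installed.

```python
from datetime import datetime, timezone

# Hypothetical feature references in "feature_view:feature" form.
FEATURES = [
    "driver_hourly_stats:conv_rate",
    "driver_hourly_stats:acc_rate",
]


def open_store(repo_path="."):
    """Open the feature store described by feature_store.yaml in repo_path."""
    # Imported here so this module loads even where Feast is not installed.
    from feast import FeatureStore

    return FeatureStore(repo_path=repo_path)


def build_training_dataset(store, entity_df):
    """Join historical feature values onto entity_df and return the result.

    entity_df is expected to be a pandas DataFrame with the entity key
    column(s) and an event_timestamp column.
    """
    job = store.get_historical_features(entity_df=entity_df, features=FEATURES)
    return job.to_df()


def load_online_store(store, end_date=None):
    """Load feature values up to end_date into the online store."""
    end_date = end_date or datetime.now(timezone.utc)
    store.materialize_incremental(end_date=end_date)
    return end_date
```

In a Kubeflow Pipeline step, `build_training_dataset(open_store("feature_repo/"), entity_df)` would produce the training frame (the `feature_repo/` path is an assumption), while `load_online_store` is the kind of call a Kubernetes cron job could run on a schedule.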
Accessing Feast from Kubeflow
Once Feast is installed within the same Kubernetes cluster as Kubeflow, users can access its APIs directly without any additional steps.
Feast APIs can roughly be grouped into the following sections:
Feature definition and management: Feast provides both a Python SDK and CLI for interacting with Feast Core. Feast Core allows users to define and register features and entities, along with their associated metadata and schemas. The Python SDK is typically used from within a Jupyter notebook by end users to administer Feast, but ML teams may opt to version control feature specifications in order to follow a GitOps-based approach.
Model training: The Feast Python SDK can be used to trigger the creation of training datasets. The most natural place to use this SDK is to create a training dataset as part of a Kubeflow Pipeline prior to model training.
Model serving: The Feast Python SDK can also be used for online feature retrieval. This client is used to retrieve feature values for inference with model serving systems like KFServing, TFX, or Seldon.
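As a rough illustration of the model serving case, the sketch below looks up online features for one entity and merges them with the incoming request before the model is called. The `driver_id` entity key, the feature names, and both helper functions are hypothetical; `store` is assumed to be a `feast.FeatureStore` (or any object exposing the same `get_online_features` method).

```python
def fetch_online_features(store, driver_id, features):
    """Return the latest online feature values for a single driver as a flat dict."""
    response = store.get_online_features(
        features=features,
        entity_rows=[{"driver_id": driver_id}],
    )
    # to_dict() maps each column name to a list with one value per entity row;
    # only one row was requested, so take the first value of each list.
    columns = response.to_dict()
    return {name: values[0] for name, values in columns.items()}


def build_model_input(request_payload, online_features):
    """Merge request fields with looked-up features.

    Request fields take precedence if a key appears in both dicts.
    """
    return {**online_features, **request_payload}
```

A serving handler would call `fetch_online_features` right before invoking the model, then pass the output of `build_model_input` to the predictor.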
Please see our tutorials section for a full list of examples.
For more details on Feast concepts, please see the Feast documentation.
Please use GitHub issues for any feedback, issues, or feature requests.