This guide describes the Kubeflow Operator and its currently supported releases.
The Kubeflow Operator helps deploy, monitor, and manage the lifecycle of Kubeflow. It is built using the Operator Framework, an open source toolkit to build, test, and package operators and to manage their lifecycle.
The operator is currently in the incubation phase and is based on this design doc. It is built on top of the KfDef CR and uses `kfctl` as the nucleus of its controller. The current roadmap for this operator is listed here. The operator is also published on OperatorHub.
Applications and components to be deployed as part of the Kubeflow platform are defined in the KfDef configuration manifest. Each application has a kustomize configuration with all its resource manifests. The KfDef `spec` includes the `applications` field. Applications are specified in the `kustomizeConfig` field. `overlays` may be used to provide custom settings for the application. The `repoRef` field specifies the path from which to retrieve the application’s kustomize configuration. The `spec` may also include a `plugins` field for certain cloud platforms, including AWS and GCP. The platforms use it to preprocess certain tasks before the Kubeflow deployment.
An example KfDef is as follows:
```yaml
apiVersion: kfdef.apps.kubeflow.org/v1
kind: KfDef
metadata:
  namespace: kubeflow
spec:
  applications:
  # Install Istio
  - kustomizeConfig:
      repoRef:
        name: manifests
        path: stacks/ibm/application/istio-stack
    name: istio-stack
  # Install Kubeflow applications.
  - kustomizeConfig:
      repoRef:
        name: manifests
        path: stacks/ibm
    name: kubeflow-apps
  # Other applications
  - kustomizeConfig:
      repoRef:
        name: manifests
        path: stacks/ibm/application/spark-operator
    name: spark-operator
  # Model Serving applications
  - kustomizeConfig:
      repoRef:
        name: manifests
        path: knative/installs/generic
    name: knative
  - kustomizeConfig:
      repoRef:
        name: manifests
        path: kfserving/installs/generic
    name: kfserving
  repos:
  - name: manifests
    uri: https://github.com/kubeflow/manifests/archive/master.tar.gz
  version: master
```
More KfDef examples can be found in the Kubeflow manifests repo. Users can pick one and modify it to fit their requirements. The OpenDataHub project also maintains a KfDef manifest for Kubeflow deployment on OpenShift Container Platform.
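As a rough illustration of adapting an existing KfDef, the sketch below appends one application entry to a KfDef held as a plain Python dictionary. The helpers `make_application` and `add_application` are hypothetical and are not part of `kfctl` or the operator. Only the field names (`applications`, `kustomizeConfig`, `repoRef`, `name`, `path`) and the paths are taken from the example KfDef.

```python
import json


def make_application(name, path, repo="manifests"):
    """Build one entry for spec.applications (hypothetical helper)."""
    return {
        "kustomizeConfig": {"repoRef": {"name": repo, "path": path}},
        "name": name,
    }


def add_application(kfdef, name, path, repo="manifests"):
    """Return a copy of kfdef with one more application appended.

    Raises ValueError if an application with the same name already exists.
    """
    apps = kfdef.get("spec", {}).get("applications", [])
    if any(app["name"] == name for app in apps):
        raise ValueError(f"application {name!r} already defined")
    updated = json.loads(json.dumps(kfdef))  # deep copy of plain data
    updated.setdefault("spec", {}).setdefault("applications", []).append(
        make_application(name, path, repo)
    )
    return updated


# A trimmed-down KfDef with a single application, mirroring the example.
kfdef = {
    "apiVersion": "kfdef.apps.kubeflow.org/v1",
    "kind": "KfDef",
    "metadata": {"namespace": "kubeflow"},
    "spec": {
        "applications": [make_application("kubeflow-apps", "stacks/ibm")],
        "repos": [
            {
                "name": "manifests",
                "uri": "https://github.com/kubeflow/manifests/archive/master.tar.gz",
            }
        ],
        "version": "master",
    },
}

custom = add_application(kfdef, "kfserving", "kfserving/installs/generic")
print(json.dumps(custom["spec"]["applications"], indent=2))
```

The sketch prints JSON only to stay within the standard library. Writing the result back as a YAML manifest would need a YAML library, which is left out here.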
The operator watches all KfDef instances in the cluster as custom resources (CRs) and manages them. It handles reconcile requests for all KfDef instances. To understand more about the operator controller behavior, refer to this controller-runtime link.
The Kubeflow Operator shares the same packages and functions as the `kfctl` CLI, which is the command line approach to deploying Kubeflow. The deployment flow is therefore similar, except that the `ownerReferences` metadata is added to each application’s Kubernetes objects. The KfDef CR is the parent of all these objects. As a result, the Kubeflow Operator tears down a Kubeflow deployment more cleanly than the CLI approach: when the KfDef CR is deleted, the Kubernetes garbage collection mechanism takes over and removes all of the resources deployed through this KfDef configuration, and only those.
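The ownership relation can be pictured with a small sketch. It is a simplified in-memory model rather than the operator's own code. `set_owner` and `garbage_collect` are hypothetical helpers, the object names and the `uid` value are made up, and the model assumes each object has at most one owner. The `ownerReferences` entry uses the standard Kubernetes fields `apiVersion`, `kind`, `name`, and `uid`.

```python
def set_owner(child, owner):
    """Record `owner` in the child's metadata.ownerReferences list."""
    ref = {
        "apiVersion": owner["apiVersion"],
        "kind": owner["kind"],
        "name": owner["metadata"]["name"],
        "uid": owner["metadata"]["uid"],
    }
    child["metadata"].setdefault("ownerReferences", []).append(ref)
    return child


def garbage_collect(objects, deleted_uid):
    """Keep only the objects that do not reference the deleted owner's uid."""
    return [
        obj
        for obj in objects
        if all(
            ref["uid"] != deleted_uid
            for ref in obj["metadata"].get("ownerReferences", [])
        )
    ]


# Hypothetical KfDef instance; the name and uid are illustrative only.
kfdef = {
    "apiVersion": "kfdef.apps.kubeflow.org/v1",
    "kind": "KfDef",
    "metadata": {"name": "example-kfdef", "uid": "0000-example-uid"},
}

# One object deployed through the KfDef and one unrelated object.
owned = set_owner({"kind": "Deployment", "metadata": {"name": "owned-app"}}, kfdef)
unrelated = {"kind": "ConfigMap", "metadata": {"name": "unrelated-config"}}

# Deleting the KfDef removes the owned object and nothing else.
remaining = garbage_collect([owned, unrelated], kfdef["metadata"]["uid"])
print([obj["metadata"]["name"] for obj in remaining])
```

In a real cluster the garbage collector, not the operator, performs the deletion. The sketch only shows why objects without the reference are left alone.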
One of the many good reasons to use an operator is to monitor the resources. The Kubeflow Operator also watches all child resources of the KfDef CR. Should any of these resources be deleted, the operator would try to apply the resource manifest and bring the object up again.
The operator responds to the following events:
- When a KfDef instance is created or updated, the operator’s reconciler is notified of the event and invokes the `Apply` functions provided by the `kfctl` package to deploy Kubeflow. The Kubeflow resources specified in the manifests are owned by the KfDef instance, with their `ownerReferences` set.
- When a KfDef instance is deleted, all the secondary resources it owns are deleted through garbage collection, because their owner is gone. In the meantime, the reconciler is notified of the event and removes the finalizers.
- When any resource deployed as part of a KfDef instance is deleted, the operator’s reconciler is notified of the event and invokes the `Apply` functions provided by the `kfctl` package to redeploy Kubeflow. The deleted resource is recreated from the same manifest that was specified when the KfDef instance was created.
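The three events above can be sketched as a toy in-memory model. This is not the operator's Go code: `ToyOperator` and its methods are hypothetical, application names stand in for the many Kubernetes objects a real application creates, `apply` is only a stand-in for the `kfctl` apply step, and finalizer handling is left out.

```python
class ToyOperator:
    """Toy in-memory model of the three events; not the operator's code."""

    def __init__(self):
        self.kfdefs = {}     # KfDef name -> resource names it declares
        self.resources = {}  # resource name -> owning KfDef name
        self.log = []

    def apply(self, kfdef_name):
        """Stand-in for the apply step: create any missing resource."""
        for res in self.kfdefs[kfdef_name]:
            if res not in self.resources:
                self.resources[res] = kfdef_name
                self.log.append(f"created {res}")

    def on_kfdef_created_or_updated(self, name, declared_resources):
        self.kfdefs[name] = list(declared_resources)
        self.apply(name)

    def on_kfdef_deleted(self, name):
        # Garbage collection removes everything owned by the deleted KfDef.
        del self.kfdefs[name]
        for res, owner in list(self.resources.items()):
            if owner == name:
                del self.resources[res]
                self.log.append(f"garbage-collected {res}")

    def on_resource_deleted(self, res):
        # If the owner still exists, re-apply so the resource comes back.
        owner = self.resources.pop(res, None)
        if owner in self.kfdefs:
            self.apply(owner)


op = ToyOperator()
op.on_kfdef_created_or_updated("kubeflow", ["istio-stack", "kfserving"])
op.on_resource_deleted("kfserving")  # recreated by the apply step
op.on_kfdef_deleted("kubeflow")      # owned resources are removed
print(op.log)
```

The model keeps the two deletion paths separate on purpose: deleting a child triggers a re-apply, while deleting the KfDef itself leaves cleanup to garbage collection.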
Current Tested Operators and Pre-built Images
The Kubeflow Operator controller logic is based on the `kfctl` package, so for each major release of `kfctl`, an operator image is built and tested with that version of `manifests` to deploy a KfDef instance. The following table shows which releases have been tested.
| branch tag | operator image | manifests version | kfdef example | note |
|---|---|---|---|---|
| master | aipipeline/kubeflow-operator:master | master | kfctl_ibm.yaml | as of 07/29/2020 |
Note: to build a customized operator for a specific version of Kubeflow, run `git checkout` to switch to that specific branch tag. Remember to use the matching version of manifests.