Istio Integration (for TF Serving)

Istio provides much of the functionality we want, such as metrics, authentication and quota enforcement, rollouts, and A/B testing.

Install Istio

We assume Kubeflow is already deployed in the kubeflow namespace.

kubectl apply -f https://raw.githubusercontent.com/kubeflow/kubeflow/master/dependencies/istio/install/crds.yaml
kubectl apply -f https://raw.githubusercontent.com/kubeflow/kubeflow/master/dependencies/istio/install/istio-noauth.yaml
kubectl apply -f https://raw.githubusercontent.com/kubeflow/kubeflow/master/dependencies/istio/kf-istio-resources.yaml

kubectl label namespace kubeflow istio-injection=enabled

The first command installs Istio’s CRDs.

The second command installs Istio's core components (without mTLS), with two customizations:

  1. The sidecar-injection configmap policy is changed from enabled to disabled.
  2. istio-ingressgateway is of type NodePort instead of LoadBalancer.

The third command deploys some resources for Kubeflow. The fourth command labels the kubeflow namespace for the sidecar injector.

See this table for sidecar injection behavior. We want the configmap policy disabled and the namespace label enabled, so that injection happens if and only if the pod has the injection annotation.
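Under this policy, a pod opts in to injection via an annotation on its metadata. A minimal sketch (the pod name and image below are placeholders, not part of the Kubeflow deployment):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-pod            # placeholder name
  namespace: kubeflow
  annotations:
    sidecar.istio.io/inject: "true"   # opt this pod into sidecar injection
spec:
  containers:
  - name: app
    image: example-image:latest       # placeholder image
```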

Kubeflow TF Serving with Istio

After installing Istio, we can deploy the TF Serving component as described in the README, with an additional parameter:

ks param set ${MODEL_COMPONENT} injectIstio true

This will inject an Istio sidecar into the TF Serving deployment.

Routing with Istio vs Ambassador

With the Ambassador annotation, a TF Serving deployment can be accessed at HOST/tfserving/models/MODEL_NAME. However, to use Istio's Gateway for traffic splitting, we should use the path provided by Istio routing: HOST/istio/tfserving/models/MODEL_NAME
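The /istio/ prefix is handled by the routing resources installed earlier. As a rough sketch of what such a route looks like (the resource, gateway, and service names here are illustrative, not the exact contents of kf-istio-resources.yaml):

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: tfserving-route        # illustrative name
  namespace: kubeflow
spec:
  hosts:
  - "*"
  gateways:
  - kubeflow-gateway           # illustrative gateway name
  http:
  - match:
    - uri:
        prefix: /istio/tfserving/models/mnist   # Istio-routed path
    rewrite:
      uri: /tfserving/models/mnist              # path the backing service expects
    route:
    - destination:
        host: mnist-service                     # illustrative service name
```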

Metrics

The Istio sidecar reports metrics data to Mixer. To view it in Grafana, forward the Grafana port:

kubectl -n istio-system port-forward $(kubectl -n istio-system get pod -l app=grafana -o jsonpath='{.items[0].metadata.name}') 3000:3000

Visit http://localhost:3000/dashboard/db/istio-mesh-dashboard in your web browser. Send some requests to the TF Serving service; the dashboard should then show data such as QPS, success rate, and latency.

Define and view metrics

See the Istio documentation.

Expose Grafana dashboard behind ingress/IAP

Grafana needs to be configured to work properly behind a reverse proxy. We can override the default config using environment variables: run kubectl edit deploy -n istio-system grafana and add these env vars

  - name: GF_SERVER_DOMAIN
    value: YOUR_HOST
  - name: GF_SERVER_ROOT_URL
    value: '%(protocol)s://%(domain)s:/grafana'

Rolling out a new model

A typical scenario is that we first deploy a model A. Then we develop another model B, and we want to deploy it and gradually move traffic from A to B. This can be achieved using Istio’s traffic routing.

  1. Deploy the first model as described here. Then you will have the service (Model) and the deployment (Version).

  2. Deploy another version of the model, v2. This time, there is no need to deploy the service part.

   MODEL_COMPONENT2=mnist-v2
   ks generate tf-serving-deployment-gcp ${MODEL_COMPONENT2}
   ks param set ${MODEL_COMPONENT2} modelName mnist  # modelName must be the SAME as the previous one
   ks param set ${MODEL_COMPONENT2} versionName v2   # v2 this time
   ks param set ${MODEL_COMPONENT2} modelBasePath gs://kubeflow-examples-data/mnist
   ks param set ${MODEL_COMPONENT2} gcpCredentialSecretName user-gcp-sa
   ks param set ${MODEL_COMPONENT2} injectIstio true   # This is required

   ks apply ${KF_ENV} -c ${MODEL_COMPONENT2}

  3. Update the traffic weights:

   ks param set mnist-service trafficRule v1:90:v2:10   # Routes 90% of traffic to v1 and 10% to v2
   ks apply ${KF_ENV} -c mnist-service
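Under the hood, a traffic rule like v1:90:v2:10 amounts to weighted routes in an Istio VirtualService. Roughly (the resource and subset names below are illustrative, not the exact resources generated by the component):

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: mnist-service          # illustrative name
  namespace: kubeflow
spec:
  hosts:
  - mnist-service
  http:
  - route:
    - destination:
        host: mnist-service
        subset: v1             # illustrative subset for version v1
      weight: 90               # 90% of traffic
    - destination:
        host: mnist-service
        subset: v2             # illustrative subset for version v2
      weight: 10               # 10% of traffic
```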