Getting Started with Kubeflow

The Kubeflow project is dedicated to making deployments of machine learning (ML) workflows on Kubernetes simple, portable and scalable. Our goal is not to recreate other services, but to provide a straightforward way to deploy best-of-breed open-source systems for ML to diverse infrastructures. Anywhere you are running Kubernetes, you should be able to run Kubeflow.

Who should consider using Kubeflow?

Based on the current functionality, you should consider using Kubeflow if:

  • You want to train/serve TensorFlow models in different environments (e.g. local, on-premises, and cloud)
  • You want to use Jupyter notebooks to manage TensorFlow training jobs
  • You want to launch training jobs that use resources – such as additional CPUs or GPUs – that aren’t available on your personal computer
  • You want to combine TensorFlow with other processes
    • For example, you may want to use tensorflow/agents to run simulations to generate data for training reinforcement learning models.

This list is based ONLY on current capabilities. We are investing significant resources to expand the functionality and actively soliciting help from companies and individuals interested in contributing (see Contributing).

Set up Kubernetes

This documentation assumes you have a Kubernetes cluster available. If not, set up one of these environments first:

  • Local - there are several options:
    • Minikube setup
      • Minikube uses a virtualization application such as VirtualBox or VMware Fusion to host the virtual machine, and provides a CLI that you can use from outside the VM.
      • Minikube ships a prebuilt ISO that contains a minimal operating system with Kubernetes already installed.
      • This option may be useful if you are just starting to learn and already have one of the virtualization applications installed.
    • Multipass & Microk8s setup
      • Multipass is a general-purpose CLI that launches virtual machines, with Ubuntu cloud-images already integrated. Multipass uses lightweight, native operating system mechanisms (e.g. Hypervisor Framework on macOS, Hyper-V on Windows 10, QEMU/KVM on Linux), which means you don’t need to install a virtualization application.
      • Microk8s is used to create the Kubernetes cluster inside the virtual machine. It is installed as a snap, which means it has strong isolation and update semantics: your cluster will be updated within a short period after upstream Kubernetes releases.
      • The primary benefits of this approach are that you can use the same VMs locally as you would in the cloud (i.e. cloud-images), you can use cloud-init to customize the VM (as you might in a cloud), and the Kubernetes cluster you create with Microk8s will be updated at regular intervals.
  • Cloud:

For more general information on setting up a Kubernetes cluster, please refer to Kubernetes Setup. If you want to use GPUs, be sure to follow the Kubernetes instructions for enabling GPUs.
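
Whichever environment you choose, it can help to confirm that kubectl is installed and can reach the cluster before you continue. The following is a minimal sketch of such a check, not an official Kubeflow step. The function name cluster_ready and the 5-second timeout are illustrative assumptions.

```shell
# Hypothetical helper: succeeds only if kubectl is on PATH and the
# cluster API answers within the timeout.
cluster_ready() {
  if ! command -v kubectl >/dev/null 2>&1; then
    echo "kubectl not found on PATH" >&2
    return 1
  fi
  # --request-timeout is a global kubectl flag; 5s is an arbitrary choice.
  kubectl --request-timeout=5s cluster-info >/dev/null 2>&1
}

if cluster_ready; then
  # List the nodes so you can see what the cluster offers.
  kubectl get nodes
else
  echo "No reachable Kubernetes cluster; set one up before installing Kubeflow."
fi
```

If the check fails even though a cluster is running, the usual suspect is the kubectl context, which you can inspect with kubectl config current-context.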

Kubeflow quick start

Requirements:

  • ksonnet version 0.11.0 or later
  • Kubernetes 1.8 or later
  • kubectl
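
The requirements above can be checked with a short preflight script. The sketch below reports whether the ksonnet CLI (ks) and kubectl are on your PATH, and includes a small version comparison helper. The helper name version_ge is hypothetical, and it assumes a sort command that supports the -V (version sort) option, as GNU coreutils does. Extracting the installed version from ks version or kubectl version output is left out, because the output format varies between releases.

```shell
# version_ge A B: succeeds when version A is at least version B.
# Assumes sort supports -V (version sort), as in GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Report which of the required CLIs are on PATH (never exits on failure).
for tool in ks kubectl; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "found:   $tool"
  else
    echo "missing: $tool"
  fi
done

# Example comparison against the ksonnet minimum listed above.
# The first argument is a made-up installed version.
if version_ge "0.12.0" "0.11.0"; then
  echo "0.12.0 satisfies the 0.11.0 minimum"
fi
```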

Download, set up, and deploy:

  1. Run the following script to download kfctl.sh:

    mkdir ${KUBEFLOW_SRC}
    cd ${KUBEFLOW_SRC}
    export KUBEFLOW_TAG=v0.3.0
    curl https://raw.githubusercontent.com/kubeflow/kubeflow/${KUBEFLOW_TAG}/scripts/download.sh | bash
    
    • KUBEFLOW_SRC: the directory to download the source into.
    • KUBEFLOW_TAG: a tag corresponding to the version to check out, such as master for the latest code.
    • Note that you can also just clone the repository using git.
  2. Run the following scripts to set up and deploy Kubeflow:

    ${KUBEFLOW_SRC}/scripts/kfctl.sh init ${KFAPP} --platform none
    cd ${KFAPP}
    ${KUBEFLOW_SRC}/scripts/kfctl.sh generate k8s
    ${KUBEFLOW_SRC}/scripts/kfctl.sh apply k8s
    
    • ${KFAPP}: the name of a directory to store your configs. This directory is created when you run init.
      • The ksonnet app is created in the directory ${KFAPP}/ks_app

Important: The commands above enable collection of anonymous user data to help us improve Kubeflow. For more information, including instructions for explicitly disabling it, please refer to the usage reporting guide.
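
Because the quick start commands depend on several environment variables, it can be useful to preview how they expand before running anything. The sketch below only prints the commands and does not execute them. The default paths and the helper name print_plan are illustrative assumptions, and it assumes the scripts directory sits directly under KUBEFLOW_SRC, the directory the source was downloaded into.

```shell
# Illustrative defaults only; the guide does not prescribe these paths.
KUBEFLOW_SRC="${KUBEFLOW_SRC:-${PWD}/kubeflow-src}"
KUBEFLOW_TAG="${KUBEFLOW_TAG:-v0.3.0}"
KFAPP="${KFAPP:-${PWD}/my-kubeflow}"

# Hypothetical helper: print the quick start commands with the variables
# expanded, without running any of them.
print_plan() {
  cat <<EOF
mkdir ${KUBEFLOW_SRC}
cd ${KUBEFLOW_SRC}
curl https://raw.githubusercontent.com/kubeflow/kubeflow/${KUBEFLOW_TAG}/scripts/download.sh | bash
${KUBEFLOW_SRC}/scripts/kfctl.sh init ${KFAPP} --platform none
cd ${KFAPP}
${KUBEFLOW_SRC}/scripts/kfctl.sh generate k8s
${KUBEFLOW_SRC}/scripts/kfctl.sh apply k8s
EOF
}

print_plan
```

Once the printed paths look right, run the commands by hand as described in the steps above.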

Troubleshooting

For detailed troubleshooting instructions, please refer to the troubleshooting guide.

Resources