Lightweight Python Components
The easiest way to get started authoring components is by creating a Lightweight Python Component. We saw an example of a Lightweight Python Component with
say_hello in the Hello World pipeline example. Here is another Lightweight Python Component that adds two integers together:
    from kfp import dsl

    @dsl.component
    def add(a: int, b: int) -> int:
        return a + b
Lightweight Python Components are constructed by decorating Python functions with the @dsl.component decorator. The decorator transforms your function into a KFP component that can be executed as a remote function by a KFP-conformant backend, either independently or as a single step in a larger pipeline.
Python function requirements
A function decorated with @dsl.component must meet two requirements:
Type annotations: The function inputs and outputs must have valid KFP type annotations.
There are two categories of inputs and outputs in KFP: parameters and artifacts. There are specific types of parameters and artifacts within each category. Every input and output will have a specific type indicated by its type annotation.
In the preceding add component, both inputs a and b are parameters typed int. There is one output, also typed int.
Valid parameter annotations include Python’s built-in int, float, str, bool, typing.Dict, and typing.List. Artifact annotations are discussed in detail in Data Types: Artifacts.
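As a rough illustration of the annotation requirement, the sketch below inspects a plain Python function and reports inputs that lack an annotation or use a type outside an assumed set of parameter types (int, float, str, bool, and list or dict forms). The helper check_parameter_annotations is hypothetical and is not part of the KFP SDK. It ignores artifact annotations entirely and does not reproduce KFP's own validation.

```python
import inspect
import typing

# Assumed set of parameter types for this sketch; artifact annotations are out of scope.
ALLOWED_PARAMETER_TYPES = {int, float, str, bool, dict, list}


def is_parameter_annotation(annotation) -> bool:
    """Return True if the annotation is, or is a generic alias of, an allowed type."""
    # typing.List[int] and typing.List both have list as their origin.
    base = typing.get_origin(annotation) or annotation
    return base in ALLOWED_PARAMETER_TYPES


def check_parameter_annotations(func) -> list:
    """Hypothetical helper: list the inputs that would need attention."""
    hints = typing.get_type_hints(func)
    problems = []
    for name in inspect.signature(func).parameters:
        if name not in hints:
            problems.append(f"{name}: missing annotation")
        elif not is_parameter_annotation(hints[name]):
            problems.append(f"{name}: {hints[name]!r} is not in the assumed parameter set")
    return problems


def add(a: int, b: int) -> int:
    return a + b


def scale(values: typing.List[float], factor) -> typing.List[float]:
    return [v * factor for v in values]


print(check_parameter_annotations(add))
print(check_parameter_annotations(scale))
```

Here add passes the check, while scale is reported because factor has no annotation.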
Hermetic: The Python function may not reference any symbols defined outside of its body.
For example, if you wish to use a constant, the constant must be defined inside the function:
    @dsl.component
    def double(a: int) -> int:
        """Succeeds at runtime."""
        VALID_CONSTANT = 2
        return VALID_CONSTANT * a
By comparison, the following is invalid and will fail at runtime:
    # non-example!
    INVALID_CONSTANT = 2

    @dsl.component
    def errored_double(a: int) -> int:
        """Fails at runtime."""
        return INVALID_CONSTANT * a
Imports must also be included in the function body:
    @dsl.component
    def print_env():
        import os
        print(os.environ)
For many realistic components, hermeticism can be a fairly constraining requirement. Containerized Python Components is a more flexible authoring approach that drops this requirement.
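One way to see why the hermetic requirement exists is to imitate shipping only the function's source to another process. The sketch below does this with plain Python by executing each function's source in an empty namespace. It is an illustration under that assumption, not the KFP SDK's actual extraction code, and run_in_fresh_namespace is a hypothetical helper.

```python
# Source of a function that relies on a constant defined outside its body.
NON_HERMETIC_SOURCE = '''
def errored_double(a: int) -> int:
    return INVALID_CONSTANT * a
'''

# Source of a function that defines everything it needs inside its body.
HERMETIC_SOURCE = '''
def double(a: int) -> int:
    VALID_CONSTANT = 2
    return VALID_CONSTANT * a
'''


def run_in_fresh_namespace(source: str, func_name: str, *args):
    """Execute only the function source, as if it had been shipped on its own."""
    namespace = {}
    exec(source, namespace)  # nothing from the authoring module is visible here
    return namespace[func_name](*args)


hermetic_result = run_in_fresh_namespace(HERMETIC_SOURCE, "double", 21)
print(hermetic_result)

try:
    run_in_fresh_namespace(NON_HERMETIC_SOURCE, "errored_double", 21)
    failure = None
except NameError as err:
    # The constant was never shipped with the function, so the lookup fails.
    failure = err
    print(f"NameError: {err}")
```

The function definition itself succeeds in both cases. The failure only appears when the body runs and looks up the missing name, which matches the "fails at runtime" behavior described above.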
dsl.component decorator arguments
In the above examples, we used the
@dsl.component decorator with only one argument: the Python function. The decorator accepts some additional arguments.
Most realistic Lightweight Python Components will depend on other Python libraries. You can pass a list of requirements to
packages_to_install and the component will install these packages at runtime before executing the component function.
This is similar to including requirements in a requirements.txt file.

    @dsl.component(packages_to_install=['numpy==1.21.6'])
    def sin(val: float = 3.14) -> float:
        # Imports must live inside the function body.
        import numpy as np
        return np.sin(val).item()
Note: As a production software best practice, prefer using Containerized Python Components when your component specifies
packages_to_install to eliminate installation of your dependencies at runtime.
pip_index_urls lets you pip install the packages listed in packages_to_install from package indices other than the default PyPI.org.
Take the following component:
    @dsl.component(
        packages_to_install=['custom-ml-package==0.0.1', 'numpy==1.21.6'],
        pip_index_urls=['http://myprivaterepo.com/simple', 'http://pypi.org/simple'],
    )
    def comp():
        from custom_ml_package import model_trainer
        import numpy as np
        ...
These arguments approximately translate to the following
pip install command:
    pip install custom-ml-package==0.0.1 numpy==1.21.6 kfp==2 \
        --index-url http://myprivaterepo.com/simple \
        --trusted-host http://myprivaterepo.com/simple \
        --extra-index-url http://pypi.org/simple \
        --trusted-host http://pypi.org/simple
Note that when you set pip_index_urls, KFP does not include 'https://pypi.org/simple' automatically. If you wish to pip install packages from both a private repository and the default public repository, you should include both the private and default URLs, as shown in the preceding component.
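The mapping from decorator arguments to the approximate pip command can be sketched as a small function. This is a hypothetical helper written for illustration, not the KFP SDK's implementation. It assumes the first URL becomes --index-url, every later URL becomes --extra-index-url, and each URL is repeated after --trusted-host, exactly as in the approximate command above. The kfp==2 requirement is a placeholder copied from that command.

```python
from typing import List, Optional


def approximate_pip_command(
    packages_to_install: List[str],
    pip_index_urls: Optional[List[str]] = None,
    kfp_requirement: str = "kfp==2",  # placeholder taken from the command above
) -> str:
    """Hypothetical helper: build a pip command shaped like the approximate one."""
    parts = ["pip", "install", *packages_to_install, kfp_requirement]
    for position, url in enumerate(pip_index_urls or []):
        # Assumption: the first URL is the primary index, the rest are extras.
        flag = "--index-url" if position == 0 else "--extra-index-url"
        parts += [flag, url, "--trusted-host", url]
    return " ".join(parts)


command = approximate_pip_command(
    packages_to_install=["custom-ml-package==0.0.1", "numpy==1.21.6"],
    pip_index_urls=["http://myprivaterepo.com/simple", "http://pypi.org/simple"],
)
print(command)
```

Because the helper adds no index of its own, leaving the public URL out of pip_index_urls means it never appears in the command, which mirrors the note about including both URLs.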
When you create a Lightweight Python Component, your Python function code is extracted by the KFP SDK to be executed inside a container at pipeline runtime. By default, the container image used is
python:3.7. You can override this image by providing an argument to
base_image. This can be useful if your code requires a specific Python version or other dependencies not included in the default image.
    @dsl.component(base_image='python:3.8')
    def print_py_version():
        import sys
        print(sys.version)
install_kfp_package can be used together with
pip_index_urls to provide granular control over installation of the
kfp package at component runtime.
By default, Python Components install kfp at runtime. This is required to define symbols used by your component (such as artifact annotations) and to access additional KFP library code required to execute your component remotely. If install_kfp_package is False, kfp will not be installed via the normal automatic mechanism. Instead, you can use packages_to_install and pip_index_urls to install a different version of kfp, possibly from a non-default pip index URL.
Note that setting install_kfp_package to False is rarely necessary and is discouraged for the majority of use cases.