Compile a pipeline

Compile a pipeline definition to YAML

You can compile your pipeline or component to intermediate representation (IR) YAML. The IR YAML definition preserves a static representation of the pipeline or component. You can submit the YAML definition to the KFP backend for execution, or deserialize it using the KFP SDK for integration into another pipeline. (View an example on GitHub).

Note: Both pipelines and components are authored in Python. A pipeline is a template representing a multi-step workflow, whereas a component is a template representing a single-step workflow.

Compile a pipeline

You can compile a pipeline or a component to IR YAML using the Compiler.compile method. To do this, follow these steps:

  1. Define a simple pipeline:

    from kfp import compiler
    from kfp import dsl
    
    @dsl.component
    def addition_component(num1: int, num2: int) -> int:
      return num1 + num2
    
    @dsl.pipeline(name='addition-pipeline')
    def my_pipeline(a: int, b: int, c: int = 10):
      add_task_1 = addition_component(num1=a, num2=b)
      add_task_2 = addition_component(num1=add_task_1.output, num2=c)
    
  2. Compile the pipeline to the file my_pipeline.yaml:

    cmplr = compiler.Compiler()
    cmplr.compile(my_pipeline, package_path='my_pipeline.yaml')
    
  3. Compile the component addition_component to the file addition_component.yaml:

    cmplr.compile(addition_component, package_path='addition_component.yaml')
    
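After compilation, you can submit the YAML file to a KFP backend for execution. A minimal sketch, assuming a reachable KFP deployment; the host URL below is a placeholder:

    from kfp import Client
    
    # Connect to a KFP backend. Replace the host with your deployment's endpoint.
    client = Client(host='http://localhost:8080')
    
    # Submit the compiled pipeline, supplying arguments for its parameters.
    # The parameter 'c' is omitted, so its default value of 10 is used.
    client.create_run_from_pipeline_package(
        'my_pipeline.yaml',
        arguments={'a': 1, 'b': 2},
    )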

The Compiler.compile method accepts the following parameters:

pipeline_func (function, required): Pipeline function constructed with the @dsl.pipeline decorator, or a component constructed with the @dsl.component decorator.

package_path (string, required): Output YAML file path. For example, ~/my_pipeline.yaml or ~/my_component.yaml.

pipeline_name (string, optional): If specified, sets the name of the pipeline template in the pipelineInfo.name field of the compiled IR YAML. Overrides the name given by the name parameter of the @dsl.pipeline decorator.

pipeline_parameters (Dict[str, Any], optional): Map of parameter names to argument values. This lets you provide default values for pipeline or component parameters, which you can override when you submit the pipeline.

type_check (bool, optional): Whether static type checking is enabled during compilation. For more information about type checking, see Component I/O: Component interfaces and type checking.
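
For example, the following call supplies the optional parameters; the pipeline name and default values shown here are illustrative:

    from kfp import compiler
    
    compiler.Compiler().compile(
        pipeline_func=my_pipeline,
        package_path='my_pipeline.yaml',
        pipeline_name='addition-pipeline-v2',  # overrides pipelineInfo.name
        pipeline_parameters={'a': 1, 'b': 2},  # default argument values
        type_check=True,  # enabled by default
    )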

IR YAML

The IR YAML is an intermediate representation of a compiled pipeline or component. It is an instance of the PipelineSpec protocol buffer message type, which is a platform-agnostic pipeline representation protocol. It is considered an intermediate representation because the KFP backend compiles PipelineSpec to Argo Workflow YAML as the final pipeline definition for execution.

Unlike the v1 component YAML, the IR YAML is not intended to be written directly. To learn how to author pipelines and components in KFP v2 similar to authoring component YAML in KFP v1, see Author a Pipeline: Custom Container Components.
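
You can, however, load a compiled IR YAML file back into Python with the KFP SDK and use it like any other component. A minimal sketch, assuming the addition_component.yaml file compiled earlier:

    from kfp import components, compiler, dsl
    
    # Deserialize the compiled component back into a Python object.
    addition_component = components.load_component_from_file(
        'addition_component.yaml')
    
    @dsl.pipeline(name='reuse-pipeline')
    def reuse_pipeline(x: int, y: int):
        # The loaded component keeps its original interface (num1, num2).
        addition_component(num1=x, num2=y)
    
    compiler.Compiler().compile(reuse_pipeline, package_path='reuse_pipeline.yaml')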

The compiled IR YAML file contains the following sections:

components: A map from the name of each component used in the pipeline to its ComponentSpec. ComponentSpec defines the interface of a component, including its inputs and outputs. For primitive components, ComponentSpec contains a reference to the executor containing the component implementation. For pipelines used as components, ComponentSpec contains a DagSpec instance, which includes references to the underlying primitive components.

deployment_spec: A map from executor name to ExecutorSpec. ExecutorSpec contains the implementation of a primitive component.

root: The steps of the outermost pipeline definition, also called the pipeline root definition. The root definition is the workflow executed when you submit the IR YAML. It is an instance of ComponentSpec.

pipeline_info: Pipeline metadata, including the pipelineInfo.name field, which contains the name of your pipeline template. When you upload your pipeline, a pipeline context name is created based on this template name. The pipeline context lets the backend and the dashboard associate artifacts and executions from pipeline runs that use the same pipeline template. For example, you can use a pipeline context to determine the best model by comparing metrics and artifacts from multiple runs of the same training pipeline.

sdk_version: The version of the KFP SDK used to compile the pipeline.

schema_version: The version of the PipelineSpec schema used for the IR YAML.

default_pipeline_root: The remote storage root path, such as a MinIO URI or Google Cloud Storage URI, where the pipeline output is written.
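
To inspect these sections in a compiled file, you can parse it with a YAML library. A minimal sketch, assuming PyYAML is installed; note that the serialized file uses camelCase keys (for example, deploymentSpec and pipelineInfo) rather than the snake_case proto field names listed above:

    import yaml
    
    # Parse the compiled IR YAML produced earlier.
    with open('my_pipeline.yaml') as f:
        spec = yaml.safe_load(f)
    
    # Top-level sections, e.g. ['components', 'deploymentSpec',
    # 'pipelineInfo', 'root', 'schemaVersion', 'sdkVersion'].
    print(sorted(spec))
    print(spec['pipelineInfo']['name'])  # 'addition-pipeline'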
