Skip to main content

Migration Guide: How to make a python connector use our base image

We currently enforce our certified python connectors to use our base image. This guide will help connector developers to migrate their connector to use our base image.

N.B: This guide currently only applies to Python CDK connectors.

Prerequisite

Install the airbyte-ci tool

Definition of a successful migration

  1. The connector Dockerfile is removed from the connector folder
  2. The connector metadata.yaml is referencing the latest base image in the data.connectorBuildOptions.baseImage key
  3. The connector version is bumped by a patch increment
  4. A changelog entry is added to the connector documentation file
  5. The connector is successfully built and tested by our CI
  6. If you add build_customization.py to your connector, the Connector Operations team has reviewed and approved your changes.

In order for a connector to use our base image it has to declare it in its metadata.yaml file under the data.connectorBuildOptions.baseImage key:

Example:

connectorBuildOptions:
baseImage: docker.io/airbyte/python-connector-base:3.0.0@sha256:1a0845ff2b30eafa793c6eee4e8f4283c2e52e1bbd44eed6cb9e9abd5d34d844

Why are we using long addresses instead of tags?

For build reproducibility!. Using full image address allows us to have a more deterministic build process. If we used tags our connector could get built with a different base image if the tag was overwritten. In other word, using the image digest (sha256), we have the guarantee that a build, on the same commit, will always use the same base image.

What if my connector needs specific system dependencies?

Declaring the base image in the metadata.yaml file makes the Dockerfile obselete and the connector will be built using our internal build process declared here. If your connector has specific system dependencies, or has to set environment variables, we have a pre/post build hook framework for that.

You can customize our build process by adding a build_customization.py module to your connector. This module should contain a pre_connector_install and post_connector_install async function that will mutate the base image and the connector container respectively. It will be imported at runtime by our build process and the functions will be called if they exist.

Here is an example of a build_customization.py module:

from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
# Feel free to check the dagger documentation for more information on the Container object and its methods.
# https://dagger-io.readthedocs.io/en/sdk-python-v0.6.4/
from dagger import Container


async def pre_connector_install(base_image_container: Container) -> Container:
return await base_image_container.with_env_variable("MY_PRE_BUILD_ENV_VAR", "my_pre_build_env_var_value")

async def post_connector_install(connector_container: Container) -> Container:
return await connector_container.with_env_variable("MY_POST_BUILD_ENV_VAR", "my_post_build_env_var_value")

Listing migrated / non migrated connectors:

To list all migrated certified connectors you can ran:

airbyte-ci connectors --support-level=certified --metadata-query="data.connectorBuildOptions.baseImage is not None" list

To list all non migrated certified connectors you can ran:

airbyte-ci connectors --metadata-query="data.supportLevel == 'certified' and 'connectorBuildOptions' not in data.keys()" list