Migration Guide: How to make a python connector use our base image
We currently enforce our certified python connectors to use our base image. This guide will help connector developers to migrate their connector to use our base image.
N.B: This guide currently only applies to Python CDK connectors.
Prerequisite
Definition of a successful migration
- The connector
Dockerfile
is removed from the connector folder - The connector
metadata.yaml
is referencing the latest base image in thedata.connectorBuildOptions.baseImage
key - The connector version is bumped by a patch increment
- A changelog entry is added to the connector documentation file
- The connector is successfully built and tested by our CI
- If you add
build_customization.py
to your connector, the Connector Operations team has reviewed and approved your changes.
Semi automated migration
- Run
airbyte-ci connectors --name=<my-connector> migrate_to_base_image <PR_NUMBER>
- Commit and push the changes on your PR
Manual migration
In order for a connector to use our base image it has to declare it in its metadata.yaml
file under the data.connectorBuildOptions.baseImage
key:
Example:
connectorBuildOptions:
baseImage: docker.io/airbyte/python-connector-base:1.1.0@sha256:bd98f6505c6764b1b5f99d3aedc23dfc9e9af631a62533f60eb32b1d3dbab20c
Why are we using long addresses instead of tags?
For build reproducibility!. Using full image address allows us to have a more deterministic build process. If we used tags our connector could get built with a different base image if the tag was overwritten. In other word, using the image digest (sha256), we have the guarantee that a build, on the same commit, will always use the same base image.
What if my connector needs specific system dependencies?
Declaring the base image in the metadata.yaml file makes the Dockerfile obselete and the connector will be built using our internal build process declared here. If your connector has specific system dependencies, or has to set environment variables, we have a pre/post build hook framework for that.
You can customize our build process by adding a build_customization.py
module to your connector.
This module should contain a pre_connector_install
and post_connector_install
async function that will mutate the base image and the connector container respectively.
It will be imported at runtime by our build process and the functions will be called if they exist.
Here is an example of a build_customization.py
module:
from __future__ import annotations
from typing import TYPE_CHECKING
if TYPE_CHECKING:
# Feel free to check the dagger documentation for more information on the Container object and its methods.
# https://dagger-io.readthedocs.io/en/sdk-python-v0.6.4/
from dagger import Container
async def pre_connector_install(base_image_container: Container) -> Container:
return await base_image_container.with_env_variable("MY_PRE_BUILD_ENV_VAR", "my_pre_build_env_var_value")
async def post_connector_install(connector_container: Container) -> Container:
return await connector_container.with_env_variable("MY_POST_BUILD_ENV_VAR", "my_post_build_env_var_value")
Listing migrated / non migrated connectors:
To list all migrated certified connectors you can ran:
airbyte-ci connectors --support-level=certified --metadata-query="data.connectorBuildOptions.baseImage is not None" list
To list all non migrated certified connectors you can ran:
airbyte-ci connectors --metadata-query="data.supportLevel == 'certified' and 'connectorBuildOptions' not in data.keys()" list