Monorepo Python Development
This guide contains instructions on how to set up Python with Gradle within the Airbyte Monorepo. If you are a contributor working on one or two connectors, this page is most likely not relevant to you. Instead, you should use your standard Python development flow.
Python Connector Development
Before working with connectors written in Python, we recommend running the following command from the airbyte root directory:
python3 tools/bin/update_intellij_venv.py -modules <connector directory name> --install-venv
python tools/bin/update_intellij_venv.py -modules source-stripe --install-venv
If using PyCharm or IntelliJ, you'll also want to add the interpreter to the IDE's list of known interpreters. You can do this by adding the --update-intellij flag. More details can be found here
python tools/bin/update_intellij_venv.py -modules <connector directory name> --install-venv --update-intellij
If working with many connectors, you can use the --all-modules flag to install the virtual environments for all connectors:
python tools/bin/update_intellij_venv.py --all-modules --install-venv
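The invocations above differ only in their flags, so they can be composed programmatically. The following is a minimal sketch, assuming the script path and flags exactly as they appear in the commands above. The helper name `build_venv_command` is hypothetical and is not part of the Airbyte tooling.

```python
import shlex
from typing import List, Optional

# Script path and flags are copied from the commands shown in this guide.
SCRIPT = "tools/bin/update_intellij_venv.py"


def build_venv_command(
    module: Optional[str] = None,
    install_venv: bool = True,
    update_intellij: bool = False,
    python: str = "python",
) -> List[str]:
    """Hypothetical helper: build the argument list for the venv script.

    module=None selects --all-modules; otherwise the given connector
    directory name is passed via -modules.
    """
    cmd = [python, SCRIPT]
    if module is None:
        cmd.append("--all-modules")
    else:
        cmd += ["-modules", module]
    if install_venv:
        cmd.append("--install-venv")
    if update_intellij:
        cmd.append("--update-intellij")
    return cmd


if __name__ == "__main__":
    # Print the command for a single connector as a shell-ready string.
    print(shlex.join(build_venv_command("source-stripe", update_intellij=True)))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the airbyte root directory, which avoids shell quoting issues.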
This will create a virtualenv and install dependencies for the connector you want to work on as well as any internal Airbyte Python packages it depends on.
When iterating on a single connector, you will often iterate by running
This command will:
- Install a virtual environment at
- Install local development dependencies specified in
- Run the following pip modules:
To format and lint your code before committing, you can use the Gradle command above, but for convenience we also support the pre-commit tool. To use it, you need to install it first:
pip install pre-commit
then, to install pre-commit as a git hook, run
pre-commit will format and lint the code every time you commit something. You can find more information about pre-commit here.
At Airbyte, we use IntelliJ IDEA for development. Although it is possible to develop connectors with any IDE, we typically recommend IntelliJ IDEA or PyCharm, since we actively work towards compatibility.
Install the Pydantic plugin. This will help autocompletion with some of our internal types.
PyCharm (IntelliJ IDEA)
The following setup steps are written for PyCharm but should have similar equivalents for IntelliJ IDEA:
python tools/bin/update_intellij_venv.py -modules <your-connector-dir> --update-intellij
- Restart PyCharm
- Go to File -> New -> Project...
- Select a project name like airbyte and a directory outside of the
- Go to Preferences -> Project -> Python Interpreter
- Find the gear ⚙️ button next to the Python interpreter dropdown list, click it, and select Virtual Environment -> Existing
- Set the interpreter path to the one that was created by the Python command, i.e.
- Wait for PyCharm to finish indexing and loading skeletons from the selected virtual environment.
You should now have access to code completion and proper syntax highlighting for Python projects.
If you need to work on another connector you can quickly change the current virtual environment in the bottom toolbar.
Excluding files from venv
By default, the find function in IntelliJ is not scoped and will include all files in the monorepo, including all the libraries installed as part of a connector's virtual environment. This huge volume of files makes indexing and search very slow. You can ignore files from the connectors' virtual environment with the following steps:
- Open the project structure using
- Navigate to the "Project Settings / Modules" section on the right side of the menu
- Select the top-level airbyte module so the change is applied to all submodules
- Add the following filter to the
- Press OK to confirm your options.
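Before excluding anything, it can help to see which directories are actually virtual environments. The following is a minimal sketch, assuming each connector's environment contains the `pyvenv.cfg` marker file that both the standard `venv` module and `virtualenv` write at the environment root. The function name `find_virtualenvs` is hypothetical and not part of the Airbyte tooling.

```python
import os
from pathlib import Path
from typing import List


def find_virtualenvs(root: str) -> List[Path]:
    """Return directories under root that look like virtual environments.

    A directory counts as a virtual environment if it contains a
    pyvenv.cfg file (an assumption about how the environments were made).
    """
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        if "pyvenv.cfg" in filenames:
            found.append(Path(dirpath))
            # Do not descend into the environment itself; it can hold
            # thousands of files and nothing inside it is of interest.
            dirnames[:] = []
    return sorted(found)


if __name__ == "__main__":
    # Run from the repository root to list candidate directories.
    for venv_dir in find_virtualenvs("."):
        print(venv_dir)
```

Pruning `dirnames` in place is what keeps the walk fast, since `os.walk` skips any subdirectory removed from that list.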
In some cases, IntelliJ does not apply the above setting. The exact reason is not clear to us, but as a workaround, you can:
- Open .gitignore in your IntelliJ
- There will be a banner saying "Some of the ignored directories are not excluded from indexing and search". Click on
- A tree with all the git ignored files should be displayed. You can exclude them from IntelliJ by clicking