Low-code connector development

Airbyte’s low-code framework enables you to build source connectors for REST APIs via a connector builder UI or by modifying boilerplate YAML files via terminal or text editor.

info

Developer updates will be announced via our #help-connector-development Slack channel. If you are using the CDK, please join to stay up to date on changes and issues.

note

The low-code framework is in beta, which means that while it will be backwards compatible, it’s still in active development. Share feedback and requests with us on our Slack channel or email us at [email protected]

Why low-code?

API Connectors are common and formulaic

In building and maintaining hundreds of connectors at Airbyte, we've observed that whereas API source connectors constitute the overwhelming majority of connectors, they are also the most formulaic. API connector code almost always solves small variations of these problems:

  1. Making requests to various endpoints under the same API URL, e.g. https://api.stripe.com/customers, https://api.stripe.com/transactions, etc.
  2. Authenticating using a common auth strategy such as OAuth or API keys
  3. Paginating using one of the four ubiquitous pagination strategies: limit-offset, page-number, cursor pagination, and header link pagination
  4. Gracefully handling rate limiting by implementing exponential backoff, fixed-time backoff, or variable-time backoff
  5. Describing the schema of the data returned by the API, so that downstream warehouses can create normalized tables
  6. Decoding the format of the data returned by the API (e.g. JSON, XML, CSV) and handling compression (e.g. GZIP, BZIP)
  7. Supporting incremental data exports by remembering what data was already synced, usually using date-based cursors

and so on.
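As an illustration of how formulaic these problems are, here is a minimal sketch of limit-offset pagination (item 3 above). The `fetch_page` function is a hypothetical stand-in for an HTTP call and the record shape is invented for the example; none of it is Airbyte code.

```python
from typing import Callable, Dict, Iterator, List

Record = Dict[str, object]


def fetch_page(offset: int, limit: int) -> List[Record]:
    """Hypothetical stand-in for an HTTP endpoint that returns one page of records."""
    data = [{"id": i} for i in range(7)]  # pretend the API holds 7 records
    return data[offset : offset + limit]


def read_all(fetch: Callable[[int, int], List[Record]], limit: int) -> Iterator[Record]:
    """Yield every record by advancing the offset until a short page comes back."""
    offset = 0
    while True:
        page = fetch(offset, limit)
        yield from page
        if len(page) < limit:  # a short (or empty) page means there is no more data
            break
        offset += limit


records = list(read_all(fetch_page, limit=3))
```

Nearly every limit-offset API can be read with this same loop; only the parameter names and the page size change, which is what makes the logic a good candidate for a reusable component.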

A declarative, low-code paradigm commoditizes solving formulaic problems

Because each of these problems has a small, finite set of solutions, we can remove the need to write code for these API connectors by providing configurable off-the-shelf components that solve them. Doing so significantly reduces development effort and bugs while improving maintainability and accessibility. In this paradigm, instead of writing the same lines of code to solve these problems over and over, a developer picks the solution to each problem from the available components and relies on the framework to run the logic.
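The idea of picking a component through configuration can be sketched as follows. This is only an illustration of the paradigm: the registry, the component names (`constant`, `exponential`), and the option keys are invented for the example and are not the framework's actual component names.

```python
from typing import Callable, Dict

# Hypothetical off-the-shelf components: each maps a retry attempt number to a delay in seconds.
def constant_backoff(options: dict) -> Callable[[int], float]:
    return lambda attempt: float(options["seconds"])


def exponential_backoff(options: dict) -> Callable[[int], float]:
    factor = float(options.get("factor", 1))
    return lambda attempt: factor * 2 ** attempt


# Registry keyed by the name a configuration file would use (names invented for this sketch).
COMPONENTS: Dict[str, Callable[[dict], Callable[[int], float]]] = {
    "constant": constant_backoff,
    "exponential": exponential_backoff,
}


def build(config: dict) -> Callable[[int], float]:
    """Instantiate the component named by config['type'] with the remaining options."""
    options = {key: value for key, value in config.items() if key != "type"}
    return COMPONENTS[config["type"]](options)


# The developer only writes configuration; the registry supplies the logic.
backoff = build({"type": "exponential", "factor": 5})
delays = [backoff(attempt) for attempt in range(3)]
```

Switching strategies means changing the configuration dictionary (for example to `{"type": "constant", "seconds": 3}`) rather than rewriting code.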

What connectors can I build using the low-code framework?

Refer to the REST API documentation for the source you want to build the connector for and answer the following questions:

  • Does the REST API documentation show which HTTP method to use to retrieve data, and that the response is a JSON object?
  • Do the queries return data synchronously?
  • Does the API support any of the following pagination mechanisms:
    • Offset count passed either by query params or request header
    • Page count passed either by query params or request header
    • Cursor field pointing to the URL of the next page of records
  • Does the API support any of the following authentication mechanisms:
  • Does the API support static schema?
  • Does the endpoint have a strict rate limit? Throttling is not supported, but the connector can use exponential backoff to avoid API bans if it gets rate limited. This can work for APIs with high rate limits, but not for those that enforce strict limits within a short time window.
  • Are the following features sufficient:
| Feature | Support |
|---|---|
| Resource type | Collections, Sub-collection |
| Sync mode | Full refresh, Incremental |
| Schema discovery | Static schemas |
| Incremental syncs | Sync checkpointing by date |
| Partition routing | lists, parent-resource id |
| Record transformation | Field selection, Adding fields, Removing fields, Filtering records |
| Error detection | From HTTP status code, From error message |
| Backoff strategies | Exponential, Constant, Derived from headers |

If the answer to all questions is yes, you can use the low-code framework to build a connector for the source. If not, use the Python CDK.
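To make the rate-limit question above concrete, here is a minimal sketch of retrying with exponential backoff when an API answers with HTTP 429. The `send` callable, the retry count, and the base delay are assumptions made for the example; the framework's own backoff components are configured in the manifest and may behave differently.

```python
import time
from typing import Callable, Tuple


def request_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_retries: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call send() until it stops returning HTTP 429, doubling the wait each time."""
    for attempt in range(max_retries + 1):
        status, body = send()
        if status != 429:  # 429 Too Many Requests signals rate limiting
            return body
        if attempt < max_retries:
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError("still rate limited after %d retries" % max_retries)


# Simulated API that rejects the first two calls, so no network access is needed.
responses = iter([(429, ""), (429, ""), (200, "ok")])
waits = []
result = request_with_backoff(lambda: next(responses), sleep=waits.append)
```

With a high rate limit, a few doubling waits are usually enough to recover. With a strict limit over a short window, the retries can run out before the window resets, which is why such APIs are a poor fit.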

Prerequisites

  • An API key for the source you want to build a connector for
  • Python >= 3.9
  • Docker

Overview of the process

To use the low-code framework to build a REST API source connector:

  1. Generate the API key or credentials for the source you want to build a connector for
  2. Set up the project on your local machine
  3. Set up your local development environment
  4. Use the connector builder UI to define the connector YAML manifest and test the connector
  5. Specify stream schemas
  6. Add the connector to the Airbyte platform

For a step-by-step tutorial, refer to the Getting Started tutorial or the video tutorial.

Connector Builder UI

The main concept powering the low-code connector framework is the Connector Manifest, a YAML file which describes the features and functionality of the connector. The structure of this YAML file is described in more detail here.

We recommend iterating on this YAML file via the connector builder UI, because it lets you inspect and debug your connector in greater detail than the command line allows. You can still iterate via the command line (the docs contain instructions for how to do it), but we're investing heavily in making the UI give you iteration superpowers, so we recommend you check it out!

Configuring the YAML file

The low-code framework involves editing a boilerplate YAML file. The general structure of the YAML file is as follows:

version: "0.1.0"
definitions:
  <key-value pairs defining objects which will be reused in the YAML connector>
streams:
  <list stream definitions>
check:
  <definition of connection checker>
spec:
  <connector spec>

The following table describes the components of the YAML file:

| Component | Description |
|---|---|
| version | Indicates the framework version |
| definitions | Describes the objects to be reused in the YAML connector |
| streams | Lists the streams of the source |
| check | Describes how to test the connection to the source by trying to read a record from a specified list of streams and failing if no records could be read |
| spec | A connector specification which describes the required and optional parameters which can be input by the end user to configure this connector |

tip

Streams define the schema of the data to sync, as well as how to read it from the underlying API source. A stream generally corresponds to a resource within the API. They are analogous to tables for a relational database source.
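The behavior of the check component described in the table above can be sketched in a few lines. This is one interpretation of that description (fail as soon as any listed stream yields no record), written with hypothetical in-memory streams; it is not the framework's implementation.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def check_connection(
    streams: Dict[str, Callable[[], Iterable[dict]]],
    stream_names: List[str],
) -> Tuple[bool, str]:
    """Try to read one record from each named stream; fail on the first that yields none."""
    for name in stream_names:
        try:
            first = next(iter(streams[name]()), None)
        except Exception as error:  # e.g. bad credentials or an unreachable host
            return False, f"{name}: {error}"
        if first is None:
            return False, f"{name}: no records could be read"
    return True, "ok"


# Hypothetical in-memory streams standing in for API calls.
streams = {
    "customers": lambda: [{"id": 1}],
    "transactions": lambda: [],
}

ok_result = check_connection(streams, ["customers"])
failed_result = check_connection(streams, ["customers", "transactions"])
```

Because the check fails when a stream returns nothing, it helps to list a stream that is expected to contain at least one record for any valid account.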

For each stream, configure the following components:

| Component | Sub-component | Description |
|---|---|---|
| Name | | Name of the stream |
| Primary key (Optional) | | Used to uniquely identify records, enabling deduplication. Can be a string for single primary keys, a list of strings for composite primary keys, or a list of lists of strings for composite primary keys consisting of nested fields |
| Schema | | Describes the data to sync |
| Incremental sync | | Describes the behavior of an incremental sync, which enables checkpointing and replicating only the data that has changed since the last sync to a destination |
| Data retriever | | Describes how to retrieve data from the API |
| | Requester | Describes how to prepare HTTP requests to send to the source API, and defines the base URL and path, the request options provider, the HTTP method, the authenticator, and the error handler components |
| | Pagination | Describes how to navigate through the API's pages |
| | Record Selector | Describes how to extract records from an HTTP response |
| | Partition Router | Describes how to partition the stream, enabling incremental syncs and checkpointing |
| Cursor field | | Field to use as stream cursor. Can either be a string, or a list of strings if the cursor is a nested field |
| Transformations | | A set of transformations to be applied on the records read from the source before emitting them to the destination |

For a deep dive into each of the components, refer to Understanding the YAML file or the full YAML Schema definition.
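The three accepted shapes of the primary key, and the nested form of the cursor field, can be hard to picture from the description alone. The sketch below shows one way such values could be resolved against a record. The helper names and the sample record are hypothetical and only illustrate the shapes described in the table.

```python
from typing import Any, List, Tuple, Union

PrimaryKey = Union[str, List[str], List[List[str]]]


def get_path(record: dict, path: List[str]) -> Any:
    """Follow a list of keys into nested dictionaries."""
    value: Any = record
    for key in path:
        value = value[key]
    return value


def primary_key_value(record: dict, primary_key: PrimaryKey) -> Tuple[Any, ...]:
    """Normalize the three accepted primary-key shapes to a tuple of values."""
    if isinstance(primary_key, str):  # single key, e.g. "id"
        paths = [[primary_key]]
    else:
        # Composite key: each part is a field name or a path to a nested field.
        paths = [[part] if isinstance(part, str) else part for part in primary_key]
    return tuple(get_path(record, path) for path in paths)


record = {"id": 7, "region": "eu", "owner": {"id": 42}, "meta": {"updated_at": "2023-01-01"}}

single = primary_key_value(record, "id")
composite = primary_key_value(record, ["id", "region"])
nested = primary_key_value(record, [["id"], ["owner", "id"]])
cursor = get_path(record, ["meta", "updated_at"])  # nested cursor field as a list of strings
```

Two records with the same resolved tuple would be treated as the same row during deduplication, so the key should be chosen to be unique per record.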

Tutorial

This section is a tutorial that will guide you through the end-to-end process of implementing a low-code connector.

  1. Getting started
  2. Creating a source
  3. Installing dependencies
  4. Connecting to the API
  5. Reading data
  6. Incremental reads
  7. Testing

Sample connectors

For examples of production-ready config-based connectors, refer to:

Reference

The full schema definition for the YAML file can be found here.