ClickHouse

The ClickHouse destination connector syncs data from Airbyte sources to ClickHouse, a high-performance columnar database designed for online analytical processing (OLAP). This connector writes data directly to ClickHouse tables with proper typing, enabling fast analytical queries on your replicated data.

This is a complete rewrite of the ClickHouse destination connector built on Airbyte's Bulk CDK framework, replacing the legacy v1 connector.

How version 2 improves on version 1

Version 2.0.0 represents a complete architectural redesign of the ClickHouse destination connector with significant improvements:

  • All sync modes supported: Full Refresh (Overwrite and Append) and Incremental (Append and Append + Deduped).
  • Direct Load with typed columns: Airbyte writes data directly to typed columns matching your source schema, rather than storing everything as JSON in raw tables. This improves query performance and reduces storage requirements.
  • Improved performance: The new architecture uses ClickHouse's native binary protocol and batch inserts for faster data loading.
  • Active maintenance: Built on Airbyte's modern CDK framework with ongoing development and support from the Airbyte team.

Supported sync modes

The connector supports all sync modes.

| Feature | Supported? (Yes/No) | Notes |
| --- | --- | --- |
| Full Refresh Sync | Yes | |
| Incremental - Append Sync | Yes | |
| Incremental - Append + Deduped | Yes | Leverages ReplacingMergeTree |
| Namespaces | Yes | |

Deduplication

For optimal deduplication in Incremental - Append + Deduped sync mode, use a cursor column with one of these types:

  • Integer types (Int64, etc.)
  • Date
  • Timestamp (DateTime64)

If you use a different cursor column type, like string, the connector falls back to using the _airbyte_extracted_at timestamp for deduplication ordering. This fallback may not accurately reflect the natural ordering of your source data, and you'll see a warning in the sync logs.
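
The following sketch shows how ReplacingMergeTree deduplication behaves with a timestamp cursor. The table name, the column names, and the choice of `id` as the sorting key are assumptions made for illustration; the table definition the connector actually generates may differ.

```sql
-- Hypothetical table for illustration only; not the connector's actual DDL.
CREATE TABLE example_users
(
    id Int64,
    name String,
    updated_at DateTime64(3),            -- assumed cursor column
    _airbyte_extracted_at DateTime64(3)
)
ENGINE = ReplacingMergeTree(updated_at)  -- version column: the row with the highest value is kept
ORDER BY id;                             -- rows sharing the same sorting key are deduplicated

-- Two versions of the same record, as two syncs might deliver them.
INSERT INTO example_users VALUES
    (1, 'old name', '2025-01-01 00:00:00', '2025-01-02 00:00:00'),
    (1, 'new name', '2025-01-03 00:00:00', '2025-01-04 00:00:00');

-- FINAL applies the deduplication at query time.
SELECT id, name, updated_at
FROM example_users FINAL;
```

ClickHouse deduplicates ReplacingMergeTree rows during background merges, so a plain SELECT can temporarily return more than one version of a row. Adding FINAL deduplicates at query time, at some cost in query speed.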

Requirements

To use the ClickHouse destination connector, you need:

  • A ClickHouse instance (ClickHouse Cloud or self-hosted)
  • ClickHouse server version 21.8.10.19 or later
  • Network access from Airbyte to your ClickHouse instance
  • A ClickHouse user with appropriate permissions (see below)
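
One way to confirm the version requirement is to run the following query from any ClickHouse client connected to your instance, and compare the result with the minimum version listed above:

```sql
-- Returns the server version as a string.
SELECT version();
```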

Setup guide

1. Configure network access

Ensure your ClickHouse database is accessible from Airbyte.

| Airbyte deployment | ClickHouse deployment | Do this |
| --- | --- | --- |
| Cloud | Cloud | Whitelist Airbyte Cloud's IP addresses in your ClickHouse Cloud settings. |
| Cloud | Self-managed | Configure your firewall to allow inbound connections on port 8443 (HTTPS) or 8123 (HTTP) from Airbyte Cloud's IP addresses. |
| Self-managed | Cloud | Whitelist your Airbyte server's public IP address in ClickHouse Cloud settings. |
| Self-managed | Self-managed | Ensure port 8443 (HTTPS) or 8123 (HTTP) is accessible from your Airbyte host. If both are in the same private network, configure security groups or firewall rules to allow traffic between them. |

If you can't expose ClickHouse publicly, use SSH tunneling through a bastion host that can reach ClickHouse.

2. Create a dedicated user with permissions

Tip: It's best to create a dedicated ClickHouse user for Airbyte rather than using an existing user. This improves security and makes it easier to audit Airbyte's database operations.

Create a ClickHouse user for Airbyte with the following permissions:

  • Create and manage databases
  • Create, alter, drop, and truncate tables
  • Insert and select data

To create a user with the required permissions, run the following SQL commands in your ClickHouse instance:

-- Create the user (replace 'your_password' with a secure password)
CREATE USER airbyte_user IDENTIFIED BY 'your_password';

-- Grant permissions on the target database (replace {database} with its name)
GRANT CREATE ON * TO airbyte_user;
GRANT CREATE ON {database}.* TO airbyte_user;
GRANT ALTER ON {database}.* TO airbyte_user;
GRANT TRUNCATE ON {database}.* TO airbyte_user;
GRANT INSERT ON {database}.* TO airbyte_user;
GRANT SELECT ON {database}.* TO airbyte_user;
GRANT CREATE DATABASE ON {database}.* TO airbyte_user;
GRANT CREATE TABLE ON {database}.* TO airbyte_user;
GRANT DROP TABLE ON {database}.* TO airbyte_user;

Replace {database} with the database name you configure in the connector settings. It's typically default.

If you configure custom namespaces in your Airbyte connections, grant permissions for each namespace:

GRANT CREATE ON {namespace}.* TO airbyte_user;
GRANT ALTER ON {namespace}.* TO airbyte_user;
GRANT TRUNCATE ON {namespace}.* TO airbyte_user;
GRANT INSERT ON {namespace}.* TO airbyte_user;
GRANT SELECT ON {namespace}.* TO airbyte_user;
GRANT CREATE DATABASE ON {namespace}.* TO airbyte_user;
GRANT CREATE TABLE ON {namespace}.* TO airbyte_user;
GRANT DROP TABLE ON {namespace}.* TO airbyte_user;

Replace {namespace} with each custom namespace you plan to use.
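
To check which privileges the user ended up with, you can list its grants. This sketch assumes the user name airbyte_user from the commands above, and that you run it as a user who is allowed to view other users' grants.

```sql
-- Lists every privilege currently granted to the Airbyte user.
SHOW GRANTS FOR airbyte_user;
```

Each database or namespace you granted should appear in the output with the privileges listed earlier in this step.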

3. Configure the connector

  1. In Airbyte, click Destinations > ClickHouse.

  2. Configure the destination with the following information.

    • Hostname: Your ClickHouse server hostname (without protocol prefix like http:// or https://)
    • Port: HTTP port for ClickHouse (defaults are 8123 for HTTP and 8443 for HTTPS)
    • Protocol (self-hosted only): Choose HTTP or HTTPS. In Airbyte Cloud, this option is hidden and managed by the platform.
    • Database: Target database name (default: default)
    • Username: The ClickHouse user you created (for example, airbyte_user)
    • Password: The password for the ClickHouse user
    • Enable JSON: Whether to use ClickHouse's JSON type for object fields (recommended if your ClickHouse version supports it)

4. SSH tunnel (optional)

Warning: SSH tunneling support is currently in Beta.

If your ClickHouse instance isn't directly accessible from Airbyte, you can use SSH tunneling to establish a secure connection. Configure the SSH tunnel settings in the connector configuration with your SSH host, port, username, and authentication method (password or private key).

Output schema

Airbyte writes each stream to its own table in ClickHouse. It creates each table either in the database you configure (typically default) or in a database named after the namespace you specify for the stream when you set up your connection.
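
After a sync, you can inspect what was created by querying ClickHouse's system tables. In this sketch, the database name default and the table name example_users are placeholders; substitute your configured database or namespace and one of your stream names.

```sql
-- List the tables in the target database ('default' is an assumed name).
SELECT database, name, engine
FROM system.tables
WHERE database = 'default'
ORDER BY name;

-- Show the column types chosen for one stream table
-- ('example_users' is a hypothetical stream name).
SELECT name, type
FROM system.columns
WHERE database = 'default' AND table = 'example_users'
ORDER BY position;
```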

The connector converts Airbyte data types to ClickHouse types as follows:

  • Decimal types → Decimal(38, 9) (38 digits of precision, 9 of them after the decimal point)
  • Timestamp types → DateTime64(3) (millisecond precision)
  • Object types → JSON if you enable JSON in the connector configuration, otherwise → String
  • Integer types → Int64
  • Boolean types → Bool
  • String types → String
  • Union types → String
  • Array types → String

Note: The connector converts arrays and unions to strings for compatibility. If you need to query these as structured data, use ClickHouse's JSON functions to parse the string values.
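
As a minimal sketch of that parsing, the query below uses literal strings in place of real columns, so it runs without any tables. The values and the names tags and payment are hypothetical, and the example assumes the stored strings are valid JSON.

```sql
WITH
    '["red","blue"]' AS tags,              -- stands in for an array field stored as String
    '{"type":"card","id":7}' AS payment    -- stands in for an object-like value stored as String
SELECT
    JSONExtract(tags, 'Array(String)')  AS tag_list,      -- parse into a typed array
    length(JSONExtractArrayRaw(tags))   AS tag_count,     -- count elements without typing them
    JSONExtractString(payment, 'type')  AS payment_type,  -- read one key as a string
    JSONExtractInt(payment, 'id')       AS payment_id;    -- read one key as an integer
```

Against a real table, replace the WITH clause with a FROM clause and pass the column names to the same functions. If a column is nullable, you may need to handle NULL values first, for example with ifNull.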

Reference

Config fields reference

| Property name | Type |
| --- | --- |
| database | string |
| host | string |
| password | string |
| port | string |
| protocol | string |
| username | string |
| enable_json | boolean |
| record_window_size | integer |
| tunnel_method | object |

Changelog

| Version | Date | Pull Request | Subject |
| --- | --- | --- | --- |
| 2.1.14 | 2025-11-13 | #69245 | Upgrade to CDK 0.1.78 |
| 2.1.13 | 2025-11-11 | #69116 | Upgrade to CDK 0.1.74 (internal refactor for schema evolution) |
| 2.1.12 | 2025-11-06 | #69226 | Improved additional statistics handling |
| 2.1.11 | 2025-11-05 | #69200 | Add support for observability metrics |
| 2.1.10 | 2025-11-03 | #69154 | Fix decimal validation |
| 2.1.9 | 2025-10-30 | #69100 | Upgrade to CDK 0.1.61 to fix state index bug |
| 2.1.8 | 2025-10-28 | #68186 | Upgrade to CDK 0.1.59 |
| 2.1.7 | 2025-10-21 | #67153 | Implement new proto schema implementation |
| 2.1.6 | 2025-10-16 | #68144 | Implement TableOperationsSuite component tests. |
| 2.1.5 | 2025-10-09 | #67598 | Improve handling of heavily interleaved streams. |
| 2.1.4 | 2025-09-29 | #66743 | Activate speed mode. |
| 2.1.3 | 2025-09-29 | #66743 | Promoting release candidate 2.1.3-rc.1 to a main version. |
| 2.1.3-rc.1 | 2025-09-25 | #66699 | Prepare for speed mode. Fix interleaved stream state handling. |
| 2.1.2 | 2025-09-09 | #66143 | Improve schema propagation. |
| 2.1.1 | 2025-09-09 | #66134 | Update the type we are setting for the number type to Decimal(38, 9). |
| 2.1.0 | 2025-09-03 | #65929 | Promoting release candidate 2.1.0-rc.2 to a main version. |
| 2.1.0-rc.2 | 2025-08-29 | #65626 | Pick up CDK fix for rare array OOB exception. |
| 2.1.0-rc.1 | 2025-08-21 | #65144 | Migrate to dataflow model. |
| 2.0.13 | 2025-08-20 | #65125 | Update docs permissioning advice. |
| 2.0.12 | 2025-08-20 | #65120 | Check should properly surface protocol related config errors. |
| 2.0.11 | 2025-07-23 | #65117 | Fix a bug related to the column duplicates name. |
| 2.0.10 | 2025-07-23 | #64104 | Add an option to configure the batch size (both bytes and number of records). |
| 2.0.9 | 2025-07-23 | #63738 | Set clickhouse as an airbyte connector. |
| 2.0.8 | 2025-07-23 | #63760 | Throw an error if an invalid target table exist before the first sync. |
| 2.0.7 | 2025-07-23 | #63751 | Only copy intersection columns when there is a dedup change. |
| 2.0.6 | 2025-07-22 | #63724 | Apply clickhouse column name transformation for columns. |
| 2.0.5 | 2025-07-22 | #63721 | Fix schema change with PKs. |
| 2.0.4 | 2025-07-21 | #62948 | SSH support BETA. |
| 2.0.3 | 2025-07-11 | #62946 | Publish metadata changes. |
| 2.0.2 | 2025-07-10 | #62928 | Makes json optional in spec to work around UI issue. |
| 2.0.1 | 2025-07-10 | #62906 | Adds bespoke validation for legacy hostnames that contain a protocol. |
| 2.0.0 | 2025-07-10 | #62887 | Cut 2.0.0 release. Replace existing connector. |
| 0.1.11 | 2025-07-09 | #62883 | Only set JSON properties on client if enabled to support older CH deployments. |
| 0.1.10 | 2025-07-08 | #62861 | Set user agent header for internal CH telemetry. |
| 0.1.9 | 2025-07-03 | #62509 | Simplify union stringification behavior. |
| 0.1.8 | 2025-06-30 | #62100 | Add JSON support. |
| 0.1.7 | 2025-06-24 | #62047 | Remove the use of the internal namespace. |
| 0.1.6 | 2025-06-24 | #62047 | Hide protocol option when running on cloud. |
| 0.1.5 | 2025-06-24 | #62043 | Expose database protocol config option. |
| 0.1.4 | 2025-06-24 | #62040 | Checker inserts into configured DB. |
| 0.1.3 | 2025-06-24 | #62038 | Allow the client to connect to the resolved DB. |
| 0.1.2 | 2025-06-23 | #62028 | Enable the registry in OSS and cloud. |
| 0.1.1 | 2025-06-23 | #62022 | Publish first beta version and pin the CDK version. |
| 0.1.0 | 2025-06-23 | #62024 | Release first beta version. |