Doris [ARCHIVED]

destination-doris is a destination built on Apache Doris Stream Load. It supports batch rollback and writes data using HTTP/HTTPS PUT requests.

Sync overview

Output schema

Each stream will be output into its own table in Doris. Each table will contain 3 columns:

  • _airbyte_ab_id: a UUID assigned by Airbyte to each event that is processed. The column type in Doris is VARCHAR(40).
  • _airbyte_emitted_at: a timestamp representing when the event was pulled from the data source. The column type in Doris is BIGINT.
  • _airbyte_data: a JSON blob representing the event data. The column type in Doris is String.
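
As an illustration of the three-column layout, here is a minimal Python sketch that wraps one source record into a raw row. The helper name `to_raw_row` is hypothetical, and the use of epoch milliseconds for `_airbyte_emitted_at` is an assumption; the source only states that the column is a BIGINT timestamp.

```python
import json
import time
import uuid


def to_raw_row(record: dict) -> dict:
    """Wrap one source record in the three-column raw layout (hypothetical helper)."""
    return {
        # A UUID string is 36 characters long, so it fits in VARCHAR(40).
        "_airbyte_ab_id": str(uuid.uuid4()),
        # Assumption: the BIGINT holds epoch milliseconds.
        "_airbyte_emitted_at": int(time.time() * 1000),
        # The event itself is stored as a serialized JSON string.
        "_airbyte_data": json.dumps(record),
    }


row = to_raw_row({"id": 1, "name": "alice"})
print(row)
```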

Features

| Feature | Supported? (Yes/No) | Notes |
| --- | --- | --- |
| Full Refresh Sync | Yes | |
| Incremental - Append Sync | Yes | |
| Incremental - Append + Deduped | No | Planned, not yet implemented |
| For databases, WAL/Logical replication | Yes | |

Performance considerations

Records are written in batches. Very small batches of records may hurt performance. Importing multiple tables generates multiple Doris stream load transactions, so split such imports as much as possible.
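
The batching idea can be sketched as follows. This is not the connector's code, only a minimal Python illustration of grouping records into fixed-size batches before each load; the function name `batched` and the batch size used are hypothetical.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def batched(records: Iterable[dict], batch_size: int) -> Iterator[List[dict]]:
    """Yield consecutive lists of at most batch_size records."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Five records with a batch size of 2 give batches of 2, 2 and 1 records.
batches = list(batched(({"n": i} for i in range(5)), batch_size=2))
print([len(b) for b in batches])
```

A larger batch size means fewer load transactions per table, at the cost of holding more records in memory before each write.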

Getting started

Requirements

To use the Doris destination, you'll need:

  • A Doris server, version 0.14 or above.
  • A Doris FE HTTP port that Airbyte can access.
  • A Doris database host that Airbyte can access.
  • A Doris user with read/write permissions on the target tables.

Target Database and tables

Choose a database to store the data synced from Airbyte. Prepare the destination tables in that database ahead of time, and make sure their column names and column order match the synced data as closely as possible.
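
A table prepared this way might look like the statement generated below. This is a sketch under assumptions: the three columns and their types come from the output schema, but the key model, the hash distribution, the bucket count, and the `replication_num` property are illustrative choices and not requirements stated by this connector. The helper name `raw_table_ddl` is hypothetical.

```python
def raw_table_ddl(database: str, table: str) -> str:
    """Build a CREATE TABLE statement for one raw Airbyte stream table."""
    return (
        f"CREATE TABLE IF NOT EXISTS `{database}`.`{table}` (\n"
        "  `_airbyte_ab_id` VARCHAR(40),\n"
        "  `_airbyte_emitted_at` BIGINT,\n"
        "  `_airbyte_data` STRING\n"
        ")\n"
        # Assumptions below: key model, bucketing and replication are examples only.
        "DUPLICATE KEY(`_airbyte_ab_id`)\n"
        "DISTRIBUTED BY HASH(`_airbyte_ab_id`) BUCKETS 10\n"
        'PROPERTIES ("replication_num" = "1");'
    )


ddl = raw_table_ddl("airbyte", "users")
print(ddl)
```

Adjust the distribution and properties to your cluster before running the statement.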

Set up the access parameters

  • Host
  • HttpPort
  • QueryPort
  • Username
  • Password
  • Database
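
To show how these parameters fit together, the following Python sketch builds (but does not send) a stream load PUT request against the Doris FE HTTP port. The function name, the example host, the port value 8030, the label format, and the choice of line-delimited JSON are all assumptions for illustration; the connector's actual request settings are not described here. QueryPort is not used, since it is not part of the stream load URL.

```python
import base64
import urllib.request
import uuid


def build_stream_load_request(host: str, http_port: int, username: str,
                              password: str, database: str, table: str,
                              payload: bytes, use_https: bool = False):
    """Build an HTTP PUT request for a Doris stream load (hypothetical helper)."""
    scheme = "https" if use_https else "http"
    url = f"{scheme}://{host}:{http_port}/api/{database}/{table}/_stream_load"
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    headers = {
        "Authorization": f"Basic {token}",
        "Expect": "100-continue",
        # Assumption: a unique label per load, used by Doris to identify the transaction.
        "label": f"airbyte_{table}_{uuid.uuid4().hex}",
        # Assumption: rows are sent as one JSON object per line.
        "format": "json",
        "read_json_by_line": "true",
    }
    return urllib.request.Request(url, data=payload, headers=headers, method="PUT")


req = build_stream_load_request(
    host="doris-fe.example", http_port=8030,
    username="airbyte", password="secret",
    database="airbyte", table="users",
    payload=b'{"_airbyte_ab_id": "example"}\n',
)
print(req.get_method(), req.full_url)
```

Sending the request and handling the response are left out on purpose, since they depend on the HTTP client and on how the cluster is deployed.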

Changelog

| Version | Date | Pull Request | Subject |
| --- | --- | --- | --- |
| 0.1.0 | 2022-11-14 | 17884 | Initial Commit |