Sample Data

Sync overview

The Sample Data source generates sample data using the Python mimesis library.

Output schema

This source generates an e-commerce-like dataset with three streams: users, products, and purchases.

Users

Each user record contains identity and demographic fields such as name, email, age, gender, occupation, nationality, and an embedded address object. The number of user records is controlled by the count configuration option. User records use id as their primary key and updated_at as the cursor field for incremental syncs.

Products

Product records represent vehicles with fields including make, model, year, and price. The products stream draws from a fixed catalog of 100 products. The count configuration option limits how many of those 100 products are emitted; setting count higher than 100 still produces at most 100 products. Product records use id as their primary key and updated_at as the cursor field.
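
The capping rule for products can be sketched in a few lines. This only illustrates the behavior described above and is not the connector's own code; `CATALOG_SIZE` and `products_emitted` are hypothetical names chosen for the example.

```python
CATALOG_SIZE = 100  # size of the fixed product catalog, as stated above


def products_emitted(count: int) -> int:
    """Return how many product records a sync would emit for a given count.

    Hypothetical helper: count limits the catalog but can never extend it.
    """
    return min(count, CATALOG_SIZE)


print(products_emitted(25))    # a count below the catalog size is used as-is
print(products_emitted(5000))  # a count above the catalog size is capped
```

When sizing a destination table or a test assertion, this means the products stream should be treated as bounded even when count is large.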

Purchases

Purchase records link users to products and include timestamps for added_to_cart_at, purchased_at, and returned_at. The connector generates roughly one purchase per user, so the total number of purchases scales with count. Purchase records use id as their primary key and updated_at as the cursor field.
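
Because purchases link users to products, a referential check is a natural first test to run against the generated data. The sketch below assumes purchase records carry `user_id` and `product_id` fields holding the `id` of the linked records; those two field names are an assumption and should be checked against the schema your sync actually produces. `dangling_purchases` is a hypothetical helper.

```python
def dangling_purchases(users, products, purchases):
    """Return the ids of purchases whose user or product is missing.

    Each argument is a list of dicts, one per synced record.
    The user_id and product_id keys are assumed field names.
    """
    user_ids = {u["id"] for u in users}
    product_ids = {p["id"] for p in products}
    return [
        purchase["id"]
        for purchase in purchases
        if purchase["user_id"] not in user_ids
        or purchase["product_id"] not in product_ids
    ]


# Tiny hand-written records standing in for synced data.
users = [{"id": 1}, {"id": 2}]
products = [{"id": 10}]
purchases = [
    {"id": 100, "user_id": 1, "product_id": 10},
    {"id": 101, "user_id": 3, "product_id": 10},  # user 3 does not exist
]
print(dangling_purchases(users, products, purchases))
```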

Features

| Feature | Supported? | Notes |
| --- | --- | --- |
| Full Refresh Sync | Yes | |
| Incremental Sync | Yes | |
| Namespaces | No | |

When using incremental sync, the connector maintains state between syncs. If always_updated is set to false, the connector stops emitting records after the initial sync produces count records. If always_updated is true (the default), every sync emits all records with fresh updated_at timestamps.
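
The interaction between always_updated and saved state can be modeled roughly as follows. This is a simplified sketch of the behavior described above, not the connector's implementation: `records_for_sync`, the `emitted` state key, and ids starting at 1 are all assumptions made for illustration.

```python
from datetime import datetime, timezone


def records_for_sync(count, always_updated, state):
    """Model one sync of a single stream; return (records, new_state).

    state is a dict carried between syncs. The "emitted" key is hypothetical.
    """
    # With always_updated, every sync starts over; otherwise resume from state.
    start = 0 if always_updated else state.get("emitted", 0)
    now = datetime.now(timezone.utc).isoformat()
    records = [{"id": i, "updated_at": now} for i in range(start + 1, count + 1)]
    return records, {"emitted": max(start, count)}


first, state = records_for_sync(3, always_updated=False, state={})
second, state = records_for_sync(3, always_updated=False, state=state)
print(len(first), len(second))  # the second sync has nothing left to emit
```

With `always_updated=True` the same two calls would each return three records, each batch stamped with a new updated_at value.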

You can set a specific seed value to guarantee that the same records are generated on each sync. Leave seed at its default value of -1 to generate random data on each sync.
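
The effect of a fixed seed can be illustrated with Python's standard `random` module, used here as a stand-in for the connector's generator. `make_rng` and `sample_ages` are hypothetical helpers, and the age range is arbitrary; only the convention that -1 means "no fixed seed" comes from the text above.

```python
import random


def make_rng(seed: int) -> random.Random:
    """Return a seeded generator, or an unseeded one when seed is -1."""
    return random.Random() if seed == -1 else random.Random(seed)


def sample_ages(seed: int, n: int) -> list:
    """Generate n illustrative ages from a generator built for this seed."""
    rng = make_rng(seed)
    return [rng.randint(18, 90) for _ in range(n)]


# The same seed reproduces the same values on every call.
print(sample_ages(42, 5) == sample_ages(42, 5))
```

A fixed seed is the better choice when downstream tests compare synced values against stored expectations; the default suits demos where the exact values do not matter.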

Requirements

None!

Reference

Config fields reference

| Property name | Type |
| --- | --- |
| always_updated | boolean |
| count | integer |
| parallelism | integer |
| records_per_slice | integer |
| seed | integer |
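
As a rough illustration of how these fields fit together, the sketch below builds a configuration and checks each value against the types listed above. The values for count, parallelism, and records_per_slice are arbitrary examples rather than documented defaults, and `EXPECTED_TYPES` and `config_type_errors` are hypothetical names.

```python
import json

# Property names and types taken from the reference table above.
EXPECTED_TYPES = {
    "always_updated": bool,
    "count": int,
    "parallelism": int,
    "records_per_slice": int,
    "seed": int,
}


def config_type_errors(config: dict) -> list:
    """Return a list of human-readable problems; empty means the types match."""
    errors = []
    for name, value in config.items():
        expected = EXPECTED_TYPES.get(name)
        if expected is None:
            errors.append(f"unknown field: {name}")
        elif expected is int and isinstance(value, bool):
            # bool is a subclass of int in Python, so reject it explicitly.
            errors.append(f"{name}: expected integer, got boolean")
        elif not isinstance(value, expected):
            errors.append(f"{name}: expected {expected.__name__}")
    return errors


example_config = {
    "count": 1000,             # illustrative value
    "seed": -1,                # -1 generates random data on each sync
    "always_updated": True,    # the default described above
    "records_per_slice": 500,  # illustrative value
    "parallelism": 2,          # illustrative value
}

print(json.dumps(example_config, indent=2))
print(config_type_errors(example_config))
```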

Changelog

| Version | Date | Pull Request | Subject |
| --- | --- | --- | --- |
| 7.0.0 | 2026-03-05 | 74318 | Test breaking change to validate breaking change infrastructure |
| 6.2.38 | 2025-11-12 | 69289 | Add externalDocumentationUrls field to metadata |
| 6.2.37 | 2025-10-21 | 68572 | Update dependencies |
| 6.2.36 | 2025-10-14 | 67806 | Update dependencies |
| 6.2.35 | 2025-10-07 | 67290 | Update dependencies |
| 6.2.34 | 2025-09-30 | 65779 | Update dependencies |
| 6.2.33 | 2025-09-03 | 65914 | Upgrade CDK to 6.28.0 and remove pendulum dependency |
| 6.2.32 | 2025-08-23 | 65273 | Update dependencies |
| 6.2.31 | 2025-08-16 | 65006 | Update dependencies |
| 6.2.30 | 2025-08-09 | 64799 | Update dependencies |
| 6.2.29 | 2025-07-26 | 63953 | Update dependencies |
| 6.2.28 | 2025-07-19 | 63534 | Update dependencies |
| 6.2.27 | 2025-07-17 | 63354 | Updated icon |
| 6.2.26 | 2025-07-16 | 63342 | Rendered name changed to Sample Data |
| 6.2.26-rc.1 | 2025-06-16 | 61645 | Update for testing |
| 6.2.25-rc.1 | 2025-04-07 | 57500 | Update for testing |
| 6.2.24 | 2025-04-05 | 57263 | Update dependencies |
| 6.2.23 | 2025-03-29 | 56502 | Update dependencies |
| 6.2.22 | 2025-03-22 | 46821 | Update dependencies |
| 6.2.21 | 2025-03-11 | 55705 | Promoting release candidate 6.2.21-rc.1 to a main version. |
| 6.2.21-rc.1 | 2024-11-13 | 48013 | Update for testing. |
| 6.2.20 | 2024-10-30 | 48013 | Promoting release candidate 6.2.20-rc.1 to a main version. |
| 6.2.20-rc.1 | 2024-10-21 | 46678 | Testing release candidate with RC suffix versioning. |
| 6.2.19-rc.1 | 2024-10-21 | 47221 | Testing release candidate with RC suffix versioning. |
| 6.2.18-rc.1 | 2024-10-09 | 46678 | Testing release candidate with RC suffix versioning. |
| 6.2.17 | 2024-10-05 | 46398 | Update dependencies |
| 6.2.16 | 2024-09-28 | 46207 | Update dependencies |
| 6.2.15 | 2024-09-21 | 45740 | Update dependencies |
| 6.2.14 | 2024-09-14 | 45567 | Update dependencies |
| 6.2.13 | 2024-09-07 | 45327 | Update dependencies |
| 6.2.12 | 2024-09-04 | 45126 | Test a release candidate release |
| 6.2.11 | 2024-08-31 | 45025 | Update dependencies |
| 6.2.10 | 2024-08-24 | 44659 | Update dependencies |
| 6.2.9 | 2024-08-17 | 44221 | Update dependencies |
| 6.2.8 | 2024-08-12 | 43753 | Update dependencies |
| 6.2.7 | 2024-08-10 | 43570 | Update dependencies |
| 6.2.6 | 2024-08-03 | 43102 | Update dependencies |
| 6.2.5 | 2024-07-27 | 42682 | Update dependencies |
| 6.2.4 | 2024-07-20 | 42367 | Update dependencies |
| 6.2.3 | 2024-07-13 | 41848 | Update dependencies |
| 6.2.2 | 2024-07-10 | 41467 | Update dependencies |
| 6.2.1 | 2024-07-09 | 41180 | Update dependencies |
| 6.2.0 | 2024-07-07 | 39935 | Update CDK to 2.0. |
| 6.1.6 | 2024-07-06 | 40956 | Update dependencies |
| 6.1.5 | 2024-06-25 | 40426 | Update dependencies |
| 6.1.4 | 2024-06-21 | 39935 | Update dependencies |
| 6.1.3 | 2024-06-04 | 39029 | [autopull] Upgrade base image to v1.2.1 |
| 6.1.2 | 2024-06-03 | 38831 | Bump CDK to allow and prefer versions 1.x |
| 6.1.1 | 2024-05-20 | 38256 | Replace AirbyteLogger with logging.Logger |
| 6.1.0 | 2024-04-08 | 36898 | Update car prices and years |
| 6.0.3 | 2024-03-15 | 36167 | Make 'count' an optional config parameter. |
| 6.0.2 | 2024-02-12 | 35174 | Manage dependencies with Poetry. |
| 6.0.1 | 2024-02-12 | 35172 | Base image migration: remove Dockerfile and use the python-connector-base image |
| 6.0.0 | 2024-01-30 | 34644 | Declare 'id' columns as primary keys. |
| 5.0.2 | 2024-01-17 | 34344 | Ensure unique state messages |
| 5.0.1 | 2023-01-08 | 34033 | Add standard entrypoints for usage with AirbyteLib |
| 5.0.0 | 2023-08-08 | 29213 | Change all *id fields and products.year to be integer |
| 4.0.0 | 2023-07-19 | 28485 | Bump to test publication |
| 3.0.2 | 2023-07-07 | 28060 | Bump to test publication |
| 3.0.1 | 2023-06-28 | 27807 | Fix bug with purchase stream updated_at |
| 3.0.0 | 2023-06-23 | 27684 | Stream cursor is now updated_at & remove records_per_sync option |
| 2.1.0 | 2023-05-08 | 25903 | Add user.address (object) |
| 2.0.3 | 2023-02-20 | 23259 | bump to test publication |
| 2.0.2 | 2023-02-20 | 23259 | bump to test publication |
| 2.0.1 | 2023-01-30 | 22117 | source-faker goes beta |
| 2.0.0 | 2022-12-14 | 20492 and 20741 | Decouple stream states for better parallelism |
| 1.0.0 | 2022-11-28 | 19490 | Faker uses the CDK; rename streams to be lower-case (breaking), add determinism to random purchases, and rename |
| 0.2.1 | 2022-10-14 | 19197 | Emit AirbyteEstimateTraceMessage |
| 0.2.0 | 2022-10-14 | 18021 | Move to mimesis for speed! |
| 0.1.8 | 2022-10-12 | 17889 | Bump to test publish command (2) |
| 0.1.7 | 2022-10-11 | 17848 | Bump to test publish command |
| 0.1.6 | 2022-09-07 | 16418 | Log start of each stream |
| 0.1.5 | 2022-06-10 | 13695 | Emit timestamps in the proper ISO format |
| 0.1.4 | 2022-05-27 | 13298 | Test publication flow |
| 0.1.3 | 2022-05-27 | 13248 | Add options for records_per_sync and page_size |
| 0.1.2 | 2022-05-26 | 13293 | Test publication flow |
| 0.1.1 | 2022-05-26 | 13235 | Publish for AMD and ARM (M1 Macs) & remove User.birthdate |
| 0.1.0 | 2022-04-12 | 11738 | The Faker Source is created |