
Salesforce

This page contains the setup guide and reference information for the Salesforce source connector.

Prerequisites

  • Salesforce Account with Enterprise access or API quota purchased
  • (Optional, Recommended) Dedicated Salesforce user
  • (For Airbyte Open Source) Salesforce OAuth credentials

To use this connector, you'll need at least the Enterprise edition of Salesforce or the Professional Edition with API access purchased as an add-on. Reference the Salesforce docs about API access for more information.

Setup guide

While you can set up the Salesforce connector using any Salesforce user with read permission, we recommend creating a dedicated read-only user for Airbyte. This allows you to granularly control the data Airbyte can read.

To create a dedicated read-only Salesforce user:

  1. Log in to Salesforce with an admin account.
  2. On the top right of the screen, click the gear icon and then click Setup.
  3. In the left navigation bar, under Administration, click Users > Profiles. The Profiles page is displayed. Click New profile.
  4. For Existing Profile, select Read only. For Profile Name, enter Airbyte Read Only User.
  5. Click Save. The Profiles page is displayed. Click Edit.
  6. Scroll down to the Standard Object Permissions and Custom Object Permissions and enable the Read checkbox for objects that you want to replicate via Airbyte.
  7. Scroll to the top and click Save.
  8. On the left side, under Administration, click Users > Users. The All Users page is displayed. Click New User.
  9. Fill out the required fields:
    1. For License, select Salesforce.
    2. For Profile, select Airbyte Read Only User.
    3. For Email, make sure to use an email address that you can access.
  10. Click Save.
  11. Copy the Username and keep it accessible.
  12. Log in to the email account you used above and verify your new Salesforce user. You'll need to set a password as part of this process. Keep this password accessible.

For Airbyte Open Source: Obtain Salesforce OAuth credentials

If you are using Airbyte Open Source, you will need to obtain the following OAuth credentials to authenticate:

  • Client ID
  • Client Secret
  • Refresh Token

To obtain these credentials, follow this walkthrough with the following modifications:

  1. If your Salesforce URL is not in the X.salesforce.com format, use your Salesforce domain name. For example, if your Salesforce URL is awesomecompany.force.com then use that instead of awesomecompany.salesforce.com.
  2. When running a curl command, run it with the -L option to follow any redirects.
  3. If you created a read-only user, use the user credentials when logging in to generate OAuth tokens.
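Once you have the three credentials, a quick way to sanity-check them is to exchange the refresh token for an access token. The sketch below only builds that request. It assumes the standard Salesforce OAuth 2.0 token endpoint (`/services/oauth2/token` on `login.salesforce.com`, or `test.salesforce.com` for sandboxes); the function name and the `domain` parameter are hypothetical and exist only for this illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_token_refresh_request(client_id, client_secret, refresh_token,
                                is_sandbox=False, domain=None):
    """Build (but do not send) a refresh-token request.

    `domain` is an illustrative override for accounts whose URL is not in
    the X.salesforce.com format, e.g. "awesomecompany.force.com".
    """
    host = domain or ("test.salesforce.com" if is_sandbox else "login.salesforce.com")
    url = f"https://{host}/services/oauth2/token"
    body = urlencode({
        "grant_type": "refresh_token",
        "client_id": client_id,
        "client_secret": client_secret,
        "refresh_token": refresh_token,
    }).encode("ascii")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


request = build_token_refresh_request("my-client-id", "my-client-secret", "my-refresh-token")
print(request.full_url)
```

Passing the request to `urllib.request.urlopen` sends it; a successful response should be a JSON document containing an `access_token`. Building the request separately keeps the example testable without network access.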

Step 2: Set up the Salesforce connector in Airbyte

For Airbyte Cloud:

  1. Log in to your Airbyte Cloud account.
  2. In the left navigation bar, click Sources. In the top-right corner, click + New source.
  3. Find and select Salesforce from the list of available sources.
  4. Enter a Source name of your choosing to help you identify this source.
  5. To authenticate, click Authenticate your account to authorize your Salesforce account. Airbyte will authenticate the Salesforce account you are already logged in to, so make sure you are logged in to the right account.
  6. Toggle whether your Salesforce account is a Sandbox account or a production account.
  7. (Optional) For Start Date, use the provided datepicker or enter the date programmatically in either YYYY-MM-DD or YYYY-MM-DDTHH:MM:SSZ format. The data added on and after this date will be replicated. If this field is left blank, Airbyte will replicate the data for the last two years by default. Please note that timestamps are in UTC.
  8. (Optional) In the Filter Salesforce Object section, you may choose to target specific data for replication. To do so, click Add, then select the relevant criteria from the Search criteria dropdown. For Search value, add the search terms relevant to you. You may add multiple filters. If no filters are specified, Airbyte will replicate all data.
  9. Click Set up source and wait for the tests to complete.

For Airbyte Open Source:

  1. Navigate to your Airbyte Open Source dashboard.
  2. In the left navigation bar, click Sources. In the top-right corner, click + New source.
  3. Find and select Salesforce from the list of available sources.
  4. Enter a Source name of your choosing to help you identify this source.
  5. To authenticate, enter your Client ID, Client Secret, and Refresh Token.
  6. Toggle whether your Salesforce account is a Sandbox account or a production account.
  7. (Optional) For Start Date, use the provided datepicker or enter the date programmatically in either YYYY-MM-DD or YYYY-MM-DDTHH:MM:SSZ format. The data added on and after this date will be replicated. If this field is left blank, Airbyte will replicate the data for the last two years by default. Please note that timestamps are in UTC.
  8. (Optional) In the Filter Salesforce Object section, you may choose to target specific data for replication. To do so, click Add, then select the relevant criteria from the Search criteria dropdown. For Search value, add the search terms relevant to you. You may add multiple filters. If no filters are specified, Airbyte will replicate all data.
  9. Click Set up source and wait for the tests to complete.
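The Start Date rules in the steps above (two accepted formats, UTC timestamps, and a two-year default) can be sketched as follows. This is an illustration only, not the connector's own code: the function name is hypothetical, and "two years" is approximated as 730 days.

```python
from datetime import datetime, timedelta, timezone

# The two formats accepted by the Start Date field.
START_DATE_FORMATS = ("%Y-%m-%dT%H:%M:%SZ", "%Y-%m-%d")


def resolve_start_date(value=None, now=None):
    """Return a timezone-aware UTC datetime for the configured start date."""
    now = now or datetime.now(timezone.utc)
    if not value:
        # Assumption: "the last two years" approximated as 730 days.
        return now - timedelta(days=730)
    for fmt in START_DATE_FORMATS:
        try:
            return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    raise ValueError(
        f"start_date {value!r} must be YYYY-MM-DD or YYYY-MM-DDTHH:MM:SSZ"
    )


print(resolve_start_date("2024-01-15"))
print(resolve_start_date("2024-01-15T08:30:00Z"))
```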

Supported sync modes

The Salesforce source connector supports the following sync modes:

Supported Streams

The Salesforce connector supports reading both Standard Objects and Custom Objects from Salesforce. Each object is read as a separate stream. See a list of all Salesforce Standard Objects here.

Airbyte allows exporting all available Salesforce objects dynamically based on:

  • If the authenticated Salesforce user has the Role and Permissions to read and fetch objects
  • If the Salesforce object has the queryable property set to true. Airbyte can only fetch objects which are queryable. If an object is queryable but you don't see it in Airbyte, check whether it is API-accessible to the Salesforce user you authenticated with.
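To check which objects meet the queryable condition, you can inspect the `queryable` flag that Salesforce reports for each object in its "describe global" response. The sketch below filters such a response. The sample payload is made up for illustration and the function name is hypothetical.

```python
def queryable_object_names(describe_global):
    """Return the sorted names of objects whose `queryable` flag is true."""
    return sorted(
        sobject["name"]
        for sobject in describe_global.get("sobjects", [])
        if sobject.get("queryable")
    )


# Made-up sample shaped like a describe-global response.
sample_response = {
    "sobjects": [
        {"name": "Account", "queryable": True},
        {"name": "Contact", "queryable": True},
        {"name": "ExampleNonQueryable__c", "queryable": False},
    ]
}

print(queryable_object_names(sample_response))
```

An object missing from this list cannot be offered as a stream; an object present in the list but missing in Airbyte points to a permissions issue for the authenticated user.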

Limitations & Troubleshooting


Connector limitations

Rate limiting

The Salesforce connector is restricted by Salesforce’s Daily Rate Limits. The connector syncs data until it hits the daily rate limit, then ends the sync early with a success status, and starts the next sync from where it left off. Resuming from where the previous sync ended works only for incremental syncs, which is why we recommend using the Incremental Sync - Append + Deduped sync mode.
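The following toy model shows why resuming works only with an incremental cursor. It is not the connector's implementation: the function, the `api_calls_left` budget, and the page size are all hypothetical, and records sharing a cursor value across a page boundary are ignored for simplicity.

```python
def sync_until_quota(records, cursor_field, state_cursor, api_calls_left, page_size=2):
    """Emit records newer than `state_cursor` until the call budget runs out.

    Returns the emitted records and the cursor to save for the next sync.
    """
    pending = sorted(
        (r for r in records if state_cursor is None or r[cursor_field] > state_cursor),
        key=lambda r: r[cursor_field],
    )
    emitted = []
    for start in range(0, len(pending), page_size):
        if api_calls_left == 0:
            break  # quota exhausted: stop early and keep the saved cursor
        api_calls_left -= 1
        page = pending[start:start + page_size]
        emitted.extend(page)
        state_cursor = page[-1][cursor_field]  # checkpoint after each page
    return emitted, state_cursor


records = [{"Id": i, "SystemModstamp": f"2024-04-0{i}T00:00:00Z"} for i in range(1, 6)]

# First sync: only one API call left, so it stops after one page.
first, cursor = sync_until_quota(records, "SystemModstamp", None, api_calls_left=1)
# Next sync: resumes after the saved cursor instead of starting over.
second, cursor = sync_until_quota(records, "SystemModstamp", cursor, api_calls_left=5)
print([r["Id"] for r in first], [r["Id"] for r in second])
```

Without a saved cursor (as in a full refresh), the second run would have to start from the first record again.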

A note on the BULK API vs REST API and their limitations

Syncing Formula Fields

The Salesforce connector syncs formula field outputs from Salesforce. If the formula of a field changes in Salesforce and no other field on the record is updated, you will need to reset the stream and sync a historical backfill to pull in all the updated values of the field.

Syncing Deletes

The Salesforce connector supports retrieving deleted records from the Salesforce recycle bin. For the streams which support it, a deleted record will be marked with isDeleted=true. To find out more about how Salesforce manages records in the recycle bin, please visit their docs.

Usage of the BULK API vs REST API

Salesforce allows extracting data using either the BULK API or REST API. To achieve fast performance, Salesforce recommends using the BULK API for extracting larger amounts of data (more than 2,000 records). For this reason, the Salesforce connector uses the BULK API by default to extract any Salesforce objects, unless any of the following conditions are met:

  • The Salesforce object has columns which are unsupported by the BULK API, like columns with a base64 or complexvalue type
  • The Salesforce object is not supported by BULK API. In this case we sync the objects via the REST API which will occasionally cost more of your API quota. This includes the following objects:
    • AcceptedEventRelation
    • Attachment
    • CaseStatus
    • ContractStatus
    • DeclinedEventRelation
    • FieldSecurityClassification
    • KnowledgeArticle
    • KnowledgeArticleVersion
    • KnowledgeArticleVersionHistory
    • KnowledgeArticleViewStat
    • KnowledgeArticleVoteStat
    • OrderStatus
    • PartnerRole
    • RecentlyViewed
    • ServiceAppointmentStatus
    • ShiftStatus
    • SolutionStatus
    • TaskPriority
    • TaskStatus
    • UndecidedEventRelation

More information on the differences between various Salesforce APIs can be found here.

Force Using Bulk API

If you set the Force Use Bulk API option to true, the connector will ignore unsupported properties and sync the stream using the BULK API.
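The selection rules in this section can be summarized in a short sketch. This is a reading of the rules as documented here, not the connector's source code. The function name and return shape are hypothetical, and it assumes that forcing the BULK API does not override the list of objects the BULK API cannot serve at all.

```python
# Field types the BULK API does not support, per the rules above.
UNSUPPORTED_BULK_FIELD_TYPES = {"base64", "complexvalue"}

# Objects not supported by the BULK API, per the list above.
REST_ONLY_OBJECTS = {
    "AcceptedEventRelation", "Attachment", "CaseStatus", "ContractStatus",
    "DeclinedEventRelation", "FieldSecurityClassification", "KnowledgeArticle",
    "KnowledgeArticleVersion", "KnowledgeArticleVersionHistory",
    "KnowledgeArticleViewStat", "KnowledgeArticleVoteStat", "OrderStatus",
    "PartnerRole", "RecentlyViewed", "ServiceAppointmentStatus", "ShiftStatus",
    "SolutionStatus", "TaskPriority", "TaskStatus", "UndecidedEventRelation",
}


def plan_stream(object_name, field_types, force_use_bulk_api=False):
    """Return (api, fields_to_sync) for one object.

    `field_types` maps field name to Salesforce field type.
    """
    if object_name in REST_ONLY_OBJECTS:
        # Assumption: the force option cannot help objects BULK does not serve.
        return "REST", list(field_types)
    unsupported = {
        name for name, ftype in field_types.items()
        if ftype in UNSUPPORTED_BULK_FIELD_TYPES
    }
    if unsupported and force_use_bulk_api:
        # Forced BULK: drop the unsupported properties instead of falling back.
        return "BULK", [name for name in field_types if name not in unsupported]
    if unsupported:
        return "REST", list(field_types)
    return "BULK", list(field_types)


fields = {"Id": "id", "Name": "string", "Body": "base64"}
print(plan_stream("Document", fields))
print(plan_stream("Document", fields, force_use_bulk_api=True))
```

The trade-off is visible in the two calls: without the option the stream keeps every field but uses the REST API, and with it the stream uses the BULK API but loses the unsupported field.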

Troubleshooting

Tutorials

Now that you have set up the Salesforce source connector, check out the following Salesforce tutorials:

  • Check out common troubleshooting issues for the Salesforce source connector on our Airbyte Forum.

Reference

Config fields reference

Type | Property name
string | client_id
string | client_secret
string | refresh_token
boolean | is_sandbox
"Client" (constant) | auth_type
string | start_date
boolean | force_use_bulk_api
string | stream_slice_step
array<object> | streams_criteria
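As an illustration, a configuration using these property names might look like the dictionary below, together with a small type check against the table. All values are placeholders. The ISO 8601 duration used for `stream_slice_step` and the `criteria`/`value` keys inside `streams_criteria` are assumptions not stated on this page, so verify them against the connector specification before relying on them.

```python
import json

# Placeholder values only; property names and types come from the table above.
config = {
    "client_id": "<your-client-id>",
    "client_secret": "<your-client-secret>",
    "refresh_token": "<your-refresh-token>",
    "auth_type": "Client",
    "is_sandbox": False,
    "start_date": "2024-01-01T00:00:00Z",
    "force_use_bulk_api": False,
    "stream_slice_step": "P30D",  # assumption: ISO 8601 duration
    "streams_criteria": [  # assumption: item keys are "criteria" and "value"
        {"criteria": "starts with", "value": "Account"},
    ],
}

EXPECTED_TYPES = {
    "client_id": str,
    "client_secret": str,
    "refresh_token": str,
    "auth_type": str,
    "is_sandbox": bool,
    "start_date": str,
    "force_use_bulk_api": bool,
    "stream_slice_step": str,
    "streams_criteria": list,
}


def check_config_types(cfg):
    """Return a list of human-readable type problems (empty if none)."""
    problems = []
    for name, expected in EXPECTED_TYPES.items():
        if name in cfg and not isinstance(cfg[name], expected):
            problems.append(
                f"{name}: expected {expected.__name__}, got {type(cfg[name]).__name__}"
            )
    if cfg.get("auth_type", "Client") != "Client":
        problems.append('auth_type: expected the constant "Client"')
    return problems


print(json.dumps(config, indent=2))
print(check_config_types(config))
```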

Changelog

Version | Date | Pull Request | Subject
2.5.5 | 2024-04-18 | 37392 | Ensure python return code != 0 in case of error
2.5.4 | 2024-04-18 | 37392 | Update CDK version to have partitioned state fix
2.5.3 | 2024-04-17 | 37376 | Improve rate limit error message during check command
2.5.2 | 2024-04-15 | 37105 | Raise error when schema generation fails
2.5.1 | 2024-04-11 | 37001 | Update airbyte-cdk to flush print buffer for every message
2.5.0 | 2024-04-11 | 36942 | Move Salesforce to partitioned state in order to avoid stuck syncs
2.4.4 | 2024-04-08 | 36901 | Upgrade CDK for empty internal_message when ExceptionWithDisplayMessage raised
2.4.3 | 2024-04-08 | 36885 | Add missing retry on REST API
2.4.2 | 2024-04-05 | 36862 | Upgrade CDK for updated error messaging regarding missing streams
2.4.1 | 2024-04-03 | 36385 | Retry HTTP requests and jobs on various cases
2.4.0 | 2024-03-12 | 35978 | Upgrade CDK to start emitting record counts with state and full refresh state
2.3.3 | 2024-03-04 | 35791 | Fix memory leak (OOM)
2.3.2 | 2024-02-19 | 35421 | Add Stream Slice Step option to specification
2.3.1 | 2024-02-12 | 35147 | Manage dependencies with Poetry.
2.3.0 | 2023-12-15 | 33522 | Sync streams concurrently in all sync modes
2.2.2 | 2024-01-04 | 33936 | Prepare for airbyte-lib
2.2.1 | 2023-12-12 | 33342 | Added new ContentDocumentLink stream
2.2.0 | 2023-12-12 | 33350 | Sync streams concurrently on full refresh
2.1.6 | 2023-11-28 | 32535 | Run full refresh syncs concurrently
2.1.5 | 2023-10-18 | 31543 | Base image migration: remove Dockerfile and use the python-connector-base image
2.1.4 | 2023-08-17 | 29538 | Fix encoding guess
2.1.3 | 2023-08-17 | 29500 | Handle expired refresh token error
2.1.2 | 2023-08-10 | 28781 | Fix pagination for BULK API jobs; Add option to force use BULK API
2.1.1 | 2023-07-06 | 28021 | Several Vulnerabilities Fixes; switched to use alpine instead of slim, CVE-2022-40897, CVE-2023-29383, CVE-2023-31484, CVE-2016-2781
2.1.0 | 2023-06-26 | 27726 | License Update: Elv2
2.0.14 | 2023-05-04 | 25794 | Avoid pandas inferring wrong data types by forcing all data type as object
2.0.13 | 2023-04-30 | 25700 | Remove pagination and query limits
2.0.12 | 2023-04-25 | 25507 | Update API version to 57
2.0.11 | 2023-04-20 | 25352 | Update API version to 53
2.0.10 | 2023-04-05 | 24888 | Add more frequent checkpointing
2.0.9 | 2023-03-29 | 24660 | Set default start_date. Sync for last two years if start date is not present in config
2.0.8 | 2023-03-30 | 24690 | Handle rate limit for bulk operations
2.0.7 | 2023-03-14 | 24071 | Remove regex pattern for start_date, use format validation instead
2.0.6 | 2023-03-03 | 22891 | Specified date formatting in specification
2.0.5 | 2023-03-01 | 23610 | Handle different Salesforce page size for different queries
2.0.4 | 2023-02-24 | 22636 | Turn on default HttpAvailabilityStrategy for all streams that are not of class BulkSalesforceStream
2.0.3 | 2023-02-17 | 23190 | In case properties are chunked, fetch primary key in every chunk
2.0.2 | 2023-02-13 | 22896 | Count the URL length based on encoded params
2.0.1 | 2023-02-08 | 22597 | Make multiple requests if a REST stream has too many properties
2.0.0 | 2023-02-02 | 22322 | Remove ActivityMetricRollup stream
1.0.30 | 2023-01-27 | 22016 | Set AvailabilityStrategy for streams explicitly to None
1.0.29 | 2023-01-05 | 20886 | Remove ActivityMetric stream
1.0.28 | 2022-12-29 | 20927 | Fix tests; add expected records
1.0.27 | 2022-11-29 | 19869 | Remove AccountHistory from unsupported BULK streams
1.0.26 | 2022-11-15 | 19286 | Bugfix: fallback to REST API if entity is not supported by BULK API
1.0.25 | 2022-11-13 | 19294 | Use the correct encoding for non UTF-8 objects and data
1.0.24 | 2022-11-01 | 18799 | Update list of unsupported Bulk API objects
1.0.23 | 2022-11-01 | 18753 | Add error_display_message for ConnectionError
1.0.22 | 2022-10-12 | 17615 | Make paging work, if cursor_field is not changed inside one page
1.0.21 | 2022-10-10 | 17778 | Add EventWhoRelation to the list of unsupported Bulk API objects.
1.0.20 | 2022-09-30 | 17453 | Check objects that are not supported by the Bulk API (v52.0)
1.0.19 | 2022-09-29 | 17314 | Fixed bug with decoding response
1.0.18 | 2022-09-28 | 17304 | Migrate to per-stream states.
1.0.17 | 2022-09-23 | 17094 | Tune connection check: fetch a list of available streams
1.0.16 | 2022-09-21 | 17001 | Improve writing file of decode
1.0.15 | 2022-08-30 | 16086 | Improve API type detection
1.0.14 | 2022-08-29 | 16119 | Exclude KnowledgeArticleVersion from using bulk API
1.0.13 | 2022-08-23 | 15901 | Exclude KnowledgeArticle from using bulk API
1.0.12 | 2022-08-09 | 15444 | Fixed bug when Bulk Job was timed out by the connector, but remained running on the server
1.0.11 | 2022-07-07 | 13729 | Improve configuration field descriptions
1.0.10 | 2022-06-09 | 13658 | Correct logic to sync stream larger than page size
1.0.9 | 2022-05-06 | 12685 | Update CDK to v0.1.56 to emit an AirbyteTraceMessage on uncaught exceptions
1.0.8 | 2022-05-04 | 12576 | Decode responses as utf-8 and fallback to ISO-8859-1 if needed
1.0.7 | 2022-05-03 | 12552 | Decode responses as ISO-8859-1 instead of utf-8
1.0.6 | 2022-04-27 | 12335 | Adding fixtures to mock time.sleep for connectors that explicitly sleep
1.0.5 | 2022-04-25 | 12304 | Add Describe stream
1.0.4 | 2022-04-20 | 12230 | Update connector to use a spec.yaml
1.0.3 | 2022-04-04 | 11692 | Optimised memory usage for BULK API calls
1.0.2 | 2022-03-01 | 10751 | Fix broken link anchor in connector configuration
1.0.1 | 2022-02-27 | 10679 | Reorganize input parameter order on the UI
1.0.0 | 2022-02-27 | 10516 | Speed up schema discovery by using parallelism
0.1.23 | 2022-02-10 | 10141 | Processing of failed jobs
0.1.22 | 2022-02-02 | 10012 | Increase CSV field_size_limit
0.1.21 | 2022-01-28 | 9499 | If a sync reaches daily rate limit it ends the sync early with success status. Read more in Performance considerations section
0.1.20 | 2022-01-26 | 9757 | Parse CSV with "unix" dialect
0.1.19 | 2022-01-25 | 8617 | Update connector fields title/description
0.1.18 | 2022-01-20 | 9478 | Add available stream filtering by queryable flag
0.1.17 | 2022-01-19 | 9302 | Deprecate API Type parameter
0.1.16 | 2022-01-18 | 9151 | Fix pagination in REST API streams
0.1.15 | 2022-01-11 | 9409 | Correcting the presence of an extra else handler in the error handling
0.1.14 | 2022-01-11 | 9386 | Handling 400 error, while sobject doesn't support query or queryAll requests
0.1.13 | 2022-01-11 | 8797 | Switched from authSpecification to advanced_auth in specification
0.1.12 | 2021-12-23 | 8871 | Fix examples for new field in specification
0.1.11 | 2021-12-23 | 8871 | Add the ability to filter streams by user
0.1.10 | 2021-12-23 | 9005 | Handling 400 error when a stream is not queryable
0.1.9 | 2021-12-07 | 8405 | Filter 'null' byte(s) in HTTP responses
0.1.8 | 2021-11-30 | 8191 | Make start_date optional and change its format to YYYY-MM-DD
0.1.7 | 2021-11-24 | 8206 | Handling 400 error when trying to create a job for sync using Bulk API.
0.1.6 | 2021-11-16 | 8009 | Fix retrying of BULK jobs
0.1.5 | 2021-11-15 | 7885 | Add Transform for output records
0.1.4 | 2021-11-09 | 7778 | Fix types for anyType fields
0.1.3 | 2021-11-06 | 7592 | Fix getting anyType fields using BULK API
0.1.2 | 2021-09-30 | 6438 | Annotate Oauth2 flow initialization parameters in connector specification
0.1.1 | 2021-09-21 | 6209 | Fix bug with pagination for BULK API
0.1.0 | 2021-09-08 | 5619 | Salesforce Airbyte-Native Connector