Salesforce Incremental Load

This article is part of the series on Incremental Load Tools and Shared Jobs

 

Overview

The Salesforce Incremental Load is a tool designed to allow users to easily configure a pre-existing (read-only) Shared Job that will perform an incremental load from Salesforce. This will generate a unique component on the canvas, with a custom configuration given by the user during the wizard setup.

The Salesforce Incremental Load wizard can be found in the Tools section of the Components panel.

Below is an explanation of what users can expect to find on each section of the wizard, and what each option entails.

 

OAuth and Connection Options

On the following page, users can specify a Salesforce OAuth to use in order for Matillion ETL to access a Salesforce database, which must be set up in advance.

Connection options can also be specified here but are not strictly required.

 

Data Sources

Following this, users can select the data they wish to load by choosing from a selection of Data Sources.

Since we are setting up an incremental load, all Data Sources are available to choose from here and can all be included in the load. Columns for each Data Source are available on the next screen.

Data Sources in the left-hand list can be included. Data sources in the right-hand list are chosen to be included in the load. Data Sources can be selected and moved from list to list using the arrow buttons.

 

Columns

From here, users can select which data columns are included in the load. The options for data columns are dependent on which data sources were selected.

Each Data Source selected on the above page has a cogwheel beside it that can be used to specify which columns are loaded from that Data Source, and a summary of each is then given under Columns.

 

Staging and Target properties

A variety of configuration options are available to users to customise their load, allowing them to specify where and how data is staged and loaded into tables.

Property Setting Description
Staging Bucket Select The Bucket used to stage the data.
Staging Table Prefix Text The prefix given to tables this component will create.
Stage Schema Select The name of the Schema used to stage the data.
Target Table Prefix Text The prefix common to tables that will be created from the load. Default is empty (no prefix).
Target Schema Select The name of the Schema used to create the target tables.
Stage Style Select Choose between loading data into external tables or standard tables.
Concurrency Select The number of concurrent queries. This value can be changed using the Alter WLM Slots component.
Distribution Style Select Even - The default option, distribute rows around the Redshift Cluster evenly.

All - Copy rows to all nodes in the Redshift Cluster.

Key - Distribute rows around the Redshift cluster according to the value of a key column.

Table distribution is critical to good performance - see the Amazon Redshift documentation for more information.
Handle Deletions Checkbox Synchronise record deletions from source data to the target table.
Property Setting Description
Staging Bucket Select The bucket used to stage the data.
Staging Table Prefix Text The prefix given to any tables that this component will create.
Staging Warehouse Select Choose a Snowflake warehouse that will stage the load.
Staging Database Select Choose a database to create each new table in.
Staging schema Select The name of the schema used to stage the data.
Target Table Prefix Text The prefix common to tables that will be created from the load. Default is empty (no prefix).
Target Warehouse Select Choose a Snowflake warehouse to load the data in.
Target database Select Decide which database will hold your new tables.
Target Schema Select The name of the schema used to create the target tables.
Concurrency Select Select between concurrent and sequential queries.
Sync Deleted Records Checkbox Decide whether you want to synchronise record deletions from source data to the target table.

Most options can be selected from a drop-down list that Matillion ETL generates based on the current available environments and S3 Buckets.

Salesforce Staging & Targeting Wizard Redshift  

Here is part one of the configuration.

Wizard Setup Part One

And here is part two.

Salesforce Incremental Load Target Window SF
 

Completion

Once the tool has been configured, a Salesforce Incremental Load component will be present on the canvas. This is not identical to the Salesforce Query component, but it does utilise the same ideas to perform an incremental load from Salesforce.

Clicking on the new component reveals several of the options given during the setup that can be easily changed at the user's discretion (Warning: This may cause the component to stop working and should never be done during a running workflow). This component can be used in all of the ways that any other component can be.

Salesforce Incremental Load Properties Redshift