Nested Data Load Component

Nested Data Load

This component takes an External Table that contains a nested data structure and converts it to a standard table, unpacking the nested structure into a more practical structure.

Users should be aware of the Create External Table component and its Metadata property that allows for nested external tables to be created.

 

Properties

Property Setting Description
Name Text The descriptive name for the component. This is automatically determined from the table name when the Table Name property is first set.
Input Schema Select Select the schema for the nested external table to be used as a source. Note that external tables require external schemas and regular schemas will not work. To learn more about external schemas, please consult the 'Configuring The Matillion ETL Client' section of the Getting Started With Amazon Redshift Spectrum documentation. The special value, [Environment Default] will use the schema defined in the environment. For more information on using multiple schemas, see Schema Support.
Input Table Text The name of the source external standard table to create or replace.
Target Schema Select Select the schema for the target table. This is a non-external (standard) schema.
New Table Name Text The name of the target table to create or replace.
Column Names Select Multiple A representation of the nested data structure that allows users to check boxes for the structures, arrays and fields that they wish to load.
Column Aliases Select/Text Select a Column Name from the ones included in the Column Names property and choose an arbitrary new name for that column as it will appear in the target table.
Data Source Filter Input Column The input column to filter rows to be loaded by.
Qualifier Is - Compares the column to the value using the comparator.
Not - Reverses the effect of the comparison, so "equals" becomes "not equals", "less than" becomes "greater than or equal to", etc.
Comparator Choose a method of comparing the column to the value. Possible comparators include: 'Equal To', 'Greater than', 'Less than', 'Greater than or equal to', 'Less than or equal to', 'Like', 'Null'.
'Equal To' can match exact strings and numeric values while other comparators such as 'Greater than' will work only with numerics. The 'Like' operator allows the wildcard character (%) to be used at the start and end of a string value to match a column. The Null operator matches only Null values, ignoring whatever the value is set to.
Not all data sources support all comparators, thus it is likely only a subset of the above comparators will be available to choose from.
Value The value to be compared.
Limit Number Limits the number of rows that are loaded from file.
Joins Name Array data is loaded as though it were a table in its own right and thus must be joined to the other incoming data. These joins are LEFT OUTER by default but may be specified here by selecting an Array name.
Type The Join type specifies how the join is to be performed.
INNER – Performs an SQL Inner Join.
LEFT OUTER – Performs an SQL Left Outer Join. (Default)
Create/Replace Select Create - the default option, creates a new table. This will generate an error if a table with the same name already exists, but will never destroy existing data.
Create if not exists - This will only create a new table if a table of the same name does not already exist. It will not destroy existing data.
If the schema of the existing table does not match the schema defined in this component, no attempt is made to fix or correct it, which could lead to errors later in the job if you did not expect an existing table to exist, or to have a different schema to the one defined in this component.
Replace - This drops any existing table of the same name, and then creates a new table. This guarantees that after the component succeeds the table matches the schema defined in this component, however any existing data in an existing table will be lost.
Note: Since other database objects may depends upon this table,
drop ... cascade
is used which may actually remove many other database objects.
Distribution Style Select Even - the default option, distribute rows around the Redshift Cluster evenly.
All - copy rows to all nodes in the Redshift Cluster.
Key - distribute rows around the Redshift cluster according to the value of a key column.
Table-distribution is critical to good performance - see the Amazon Redshift documentation for more information.
Sort Key Select This is optional, and specifies the columns from the input that should be set as the table's sort-key.
Sort-keys are critical to good performance - see the Amazon Redshift documentation for more information.
Primary Key Select Optionally select one or more columns to be designated as primary keys in the target table. Columns fit for this purpose should contain data that is unique between records and should not contain Null values.
Backup Table Select Specify whether the created table is to be included in automated and manual cluster snapshots. The BACKUP NO setting has no effect on automatic replication of data to other nodes within the cluster, so tables with BACKUP NO specified are restored in a node failure. The default is BACKUP YES.