Rewrite External Table
The Rewrite External Table component uses SQL provided by the input connection and writes the result out to a new External Table.
Note: This component overwrites any existing data at the chosen S3 location. Writing to the same location that the source data is read from is generally not recommended. The Matillion ETL instance must have access to the chosen bucket and location.
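As a rough guard against the situation described in the note, a job could compare the source and target locations before running. The following is a minimal sketch only; the helper name `locations_overlap` and the example paths are hypothetical and not part of Matillion ETL.

```python
def locations_overlap(source_location, target_location):
    """Return True if either S3 location equals, or is nested inside, the other."""

    def normalise(location):
        # A trailing slash stops "data" from matching "data2".
        return location.rstrip("/") + "/"

    source = normalise(source_location)
    target = normalise(target_location)
    return source.startswith(target) or target.startswith(source)


if __name__ == "__main__":
    # Hypothetical paths, used only for illustration.
    print(locations_overlap("s3://example-bucket/data", "s3://example-bucket/data/"))
    print(locations_overlap("s3://example-bucket/data", "s3://example-bucket/data2"))
```

A prefix comparison is used rather than strict equality because overwriting a parent directory also affects anything nested beneath it.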
External tables are part of Amazon Redshift Spectrum and may not be available in all regions. For a list of supported regions see the Amazon documentation.
For full information on working with external tables, see the official documentation here.
|Property||Setting||Description|
|Name||Text||The descriptive name for the component. This is automatically determined from the table name when the Table Name property is first set.|
|Schema||Select||Select the table schema. Note that an external schema must be used. For more information on using multiple schemas, see Schema Support.|
|Target Table||Text||The name of the newly created External Table.|
|Location||Select||The file target location, including S3 bucket path. The Matillion instance must have access to this data (typically, access is granted according to the AWS credentials on the instance or if the bucket is public). A directory named after the Target Table will be created at this location and then populated with files.|
|Partition||Select Multiple||(Optional) Select source columns to be partitions when writing data. Chosen columns will be queried for distinct values and partitioned file directories will be created (if they don't exist) for those values.|
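The Location, Target Table, and Partition properties together determine where files are written. The sketch below illustrates the idea by computing the directories a write might produce. It assumes Hive-style `column=value` directory names, which is an assumption rather than documented behaviour of the component; the function name `partition_directories`, the bucket, and the sample rows are all hypothetical.

```python
def partition_directories(location, target_table, rows, partition_columns):
    """Return the sorted directory prefixes a partitioned write might create.

    Assumes Hive-style `column=value` directory names; the component's
    actual naming may differ.
    """
    # A directory named after the target table is created under the location.
    base = f"{location.rstrip('/')}/{target_table}/"
    if not partition_columns:
        return [base]

    # Collect the distinct value combinations of the chosen columns.
    distinct = {tuple(row[col] for col in partition_columns) for row in rows}

    prefixes = []
    for values in sorted(distinct):
        parts = "/".join(
            f"{col}={val}" for col, val in zip(partition_columns, values)
        )
        prefixes.append(f"{base}{parts}/")
    return prefixes


if __name__ == "__main__":
    sample_rows = [
        {"project": "a", "region": "eu"},
        {"project": "b", "region": "us"},
        {"project": "c", "region": "eu"},
    ]
    for prefix in partition_directories(
        "s3://example-bucket/projects", "project_data", sample_rows, ["region"]
    ):
        print(prefix)
```

With the sample rows, two directories result (one per distinct `region`), even though three rows are written.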
In this example, we have a regular table that holds the latest project data. After some transformation, we want to write the resulting data to an external table so that it can be queried occasionally without the data being held on Redshift. Because the external table may already exist, we use the Rewrite External Table component. The job is shown below.
We want the new data to overwrite the old table, so we have chosen the old table's name in the Target Table property and the same Location in the S3 bucket.
When run, this component overwrites the existing data in the S3 bucket and rewrites the external table. We can check the data by sampling the component.
Finally, we can check the S3 bucket for the files in several ways. Below we use the S3 Object Put component to see files within the bucket and confirm that our data is there.
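Another way to confirm the files is to list the keys under the table's directory from a script. The sketch below assumes an S3 client object exposing `list_objects_v2` (for example, one created with `boto3.client("s3")`); the helper name `list_table_files`, the bucket, and the prefix are hypothetical.

```python
def list_table_files(s3_client, bucket, prefix):
    """Return every object key under `prefix`, following pagination."""
    keys = []
    kwargs = {"Bucket": bucket, "Prefix": prefix}
    while True:
        response = s3_client.list_objects_v2(**kwargs)
        # "Contents" is absent when no objects match the prefix.
        keys.extend(obj["Key"] for obj in response.get("Contents", []))
        if not response.get("IsTruncated"):
            return keys
        # Request the next page of results.
        kwargs["ContinuationToken"] = response["NextContinuationToken"]


# Example usage (requires boto3 and AWS credentials with access to the bucket):
#   import boto3
#   files = list_table_files(boto3.client("s3"), "example-bucket", "projects/project_data/")
```

Passing the client in as an argument keeps the helper independent of how credentials are configured on the instance.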