Add Partition

This component lets users define the S3 directory structure for partitioned external table data by assigning a value to each partition column on the table. On S3, a single folder is created for each partition value and is named according to the corresponding partition key and value.

For example, the partition value for the key 'Salesdate' might be '2016-01-07'. A directory named 'Salesdate=2016-01-07' would then be created on S3 to contain this partition's data.
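
The naming rule can be sketched in a few lines of Python. This is only an illustration of the key=value convention described above, not the component's own code. The function names and the bucket URL are hypothetical, and nesting one folder per key when a table has several partition columns is an assumption.

```python
def partition_folder(key: str, value: str) -> str:
    """Return the folder name for one partition, in key=value form."""
    return f"{key}={value}"


def partition_location(base_url: str, partitions: dict[str, str]) -> str:
    """Append one folder per partition key to a base S3 URL.

    Assumption: multiple partition keys are nested in the order given.
    """
    folders = "/".join(partition_folder(k, v) for k, v in partitions.items())
    return f"{base_url.rstrip('/')}/{folders}/"


# Hypothetical bucket and path, used only for illustration.
print(partition_location("s3://example-bucket/sales", {"Salesdate": "2016-01-07"}))
# s3://example-bucket/sales/Salesdate=2016-01-07/
```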

Note that this component only adds physical partitions on S3; the table itself must already have partition columns defined in the Create External Table component.

Properties

Property | Setting | Description
Name | Text | The descriptive name for the component.
Schema | Select | Select the table schema. The special value, [Environment Default], will use the schema defined in the environment. For more information on using multiple schemas, see this article.
Table | Text | The table to add partitions to. This should be an external table that already has partitions defined through the Create External Table component.
Partition Values | Text | The value to assign to each of the table's partition columns. Each value is used to name the S3 folder that holds that partition's data.
Location | Text | The URL of the S3 bucket to load the partition data into.

Example

In this example we will be creating an external table and adding partitions to it. The orchestration job is shown below.

The Create External Table component is set up as shown below. We add table metadata through the component so that all expected columns are defined. Using these definitions, columns can then be assigned as partitions through the 'Partition' property. At least one column must remain unpartitioned, but any single column can be a partition. An S3 bucket location is also chosen to host the external table data.
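
The rule that at least one column must remain unpartitioned can be expressed as a small check. The sketch below is a hypothetical helper, not part of the component; the column names are invented for illustration.

```python
def split_columns(columns: list[str], partition_columns: list[str]) -> tuple[list[str], list[str]]:
    """Separate a table's columns into regular and partition columns.

    Raises ValueError if a partition column is not defined on the table,
    or if partitioning would leave no regular column.
    """
    unknown = [c for c in partition_columns if c not in columns]
    if unknown:
        raise ValueError(f"Partition columns not defined on the table: {unknown}")

    regular = [c for c in columns if c not in partition_columns]
    if not regular:
        raise ValueError("At least one column must remain unpartitioned")

    return regular, list(partition_columns)


# Hypothetical column names.
regular, partitions = split_columns(["id", "amount", "Salesdate"], ["Salesdate"])
print(regular)     # ['id', 'amount']
print(partitions)  # ['Salesdate']
```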

Now the Add Partition component can be configured as shown below. Although not strictly necessary, we select the same S3 bucket that holds our table data to also hold our partition data. In the 'Table' property, the same table that we just created is chosen. If this option does not appear, try right-clicking on the Create External Table component and selecting 'Run Component' to create the table.

When a table is chosen, its partition columns are automatically loaded and can be assigned values using the 'Partition Values' property, shown below. Each column has a value assigned that is used to name a folder on the S3 bucket that will contain partition data.
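
For readers who want to relate these settings to SQL, the sketch below builds the kind of statement Amazon Redshift Spectrum uses to register a partition on an external table (ALTER TABLE ... ADD PARTITION ... LOCATION). It is an assumption that the target is Redshift Spectrum and that the component issues something equivalent. The schema, table, and bucket names are hypothetical, and the helper does no escaping, so it is not suitable for untrusted input.

```python
def add_partition_sql(schema: str, table: str, partitions: dict[str, str], location: str) -> str:
    """Build an ALTER TABLE ... ADD PARTITION statement for an external table.

    Illustration only: values are interpolated without escaping.
    """
    # One column='value' pair per partition column, comma separated.
    spec = ", ".join(f"{col}='{val}'" for col, val in partitions.items())
    return (
        f"ALTER TABLE {schema}.{table} "
        f"ADD IF NOT EXISTS PARTITION ({spec}) "
        f"LOCATION '{location}';"
    )


# Hypothetical schema, table, and bucket.
print(add_partition_sql(
    "spectrum",
    "sales",
    {"Salesdate": "2016-01-07"},
    "s3://example-bucket/sales/Salesdate=2016-01-07/",
))
```

Here the dictionary plays the role of the 'Partition Values' property and the final argument plays the role of 'Location'.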