Delete Partition

The Delete Partition component deletes the S3 directory structure created for partitioned external table data. On S3, one folder is created for each partition value, named after the corresponding partition key and value. To remove these folders, specify the partition values in the Delete Partition component.

For example, the partition value for the key 'Salesdate' might be '2016-01-07'. A directory named 'Salesdate=2016-01-07' would then be created on S3 to hold this partition's data.
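The naming rule above can be sketched in a few lines of Python. This is only an illustration: the function names and the 'sales-data' base prefix are hypothetical, and nesting one folder per partition column in the order given is an assumption, not something this page confirms.

```python
def partition_folder(key: str, value: str) -> str:
    """Return the folder name for one partition key/value pair."""
    return f"{key}={value}"


def partition_path(base_prefix: str, partition: dict[str, str]) -> str:
    """Build the folder path for a partition under a base prefix.

    Assumption: with several partition columns, one folder per column
    is nested in the order the columns are given.
    """
    folders = [partition_folder(k, v) for k, v in partition.items()]
    return "/".join([base_prefix.strip("/")] + folders) + "/"


# 'sales-data' is a hypothetical base prefix used only for illustration.
print(partition_folder("Salesdate", "2016-01-07"))
print(partition_path("sales-data", {"Salesdate": "2016-01-07"}))
```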

Properties

| Property | Setting | Description |
|---|---|---|
| Name | Text | The descriptive name for the component. |
| Schema | Select | Select the table schema. The special value [Environment Default] uses the schema defined in the environment. For more information on using multiple schemas, see this article. |
| Table | Text | The table that has partitions to be deleted. This should be an external table that already has partitions defined through the Create External Table component. |
| Partition Values | Text | The values for each partition column that are to be deleted. These correspond to folder names in the S3 bucket that the external table references. |
| Ignore Missing | Select | If 'No', the component fails when a partition to be deleted does not exist. If 'Yes', such errors are ignored and the workflow continues regardless. |
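The effect of 'Ignore Missing' can be modelled with a small in-memory sketch. It is not the component's implementation: the set of folder names stands in for the partitions on S3, and `MissingPartitionError` and `delete_partition` are hypothetical names introduced for illustration.

```python
class MissingPartitionError(Exception):
    """Hypothetical error for a partition that does not exist."""


def delete_partition(existing: set[str], folder: str, ignore_missing: bool) -> bool:
    """Remove a partition folder from the in-memory set.

    Returns True if the partition was deleted, False if it was missing
    and ignore_missing is set. Raises otherwise.
    """
    if folder in existing:
        existing.remove(folder)
        return True
    if ignore_missing:
        # 'Yes': skip the missing partition and carry on.
        return False
    # 'No': a missing partition is treated as an error.
    raise MissingPartitionError(f"partition not found: {folder}")


partitions = {"Salesdate=2016-01-07"}
print(delete_partition(partitions, "Salesdate=2016-01-07", ignore_missing=False))
print(delete_partition(partitions, "Salesdate=2016-01-07", ignore_missing=True))
```

The second call targets a partition that is already gone, so it only succeeds because `ignore_missing` is set.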

Example

In this example, we delete partitions from an external table. To better illustrate the process, we have included the steps for creating the external table and adding partitions to it. The orchestration job is shown below.

The Create External Table component is set up as shown below. We add table metadata through the component so that all expected columns are defined. Using these definitions, columns can be assigned as partitions through the 'Partition' property. Any column can be a partition, but at least one column must remain unpartitioned. An S3 bucket location is also chosen to host the external table data.
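The rule that at least one column must remain unpartitioned can be expressed as a simple check. This is a minimal sketch, not the component's own validation; the function name and the 'Amount' and 'Region' columns are hypothetical.

```python
def validate_partition_columns(columns: list[str], partition_columns: list[str]) -> None:
    """Raise ValueError if the partition selection breaks the rule above."""
    unknown = [c for c in partition_columns if c not in columns]
    if unknown:
        raise ValueError(f"not defined in the table metadata: {unknown}")
    if len(set(partition_columns)) >= len(set(columns)):
        raise ValueError("at least one column must remain unpartitioned")


# Hypothetical table metadata: three columns, one chosen as a partition.
validate_partition_columns(["Salesdate", "Amount", "Region"], ["Salesdate"])
print("partition selection is valid")
```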

Now the Add Partition component can be configured as shown below. Although not strictly necessary, we select the same S3 bucket that holds our table data to also hold our partition data. In the 'Table' property, we choose the table that we just created. If this option does not appear, try right-clicking the Create External Table component and selecting 'Run Component' to create the table.

When a table is chosen, its partition columns are automatically loaded and can be assigned values using the 'Partition Values' property. Each column is assigned a value that is used to name a folder on the S3 bucket that will contain the partition data. To remove these partitions, a Delete Partition component is used and set up as below.

It is usually best to set 'Ignore Missing' to 'Yes' to avoid errors that are unlikely to affect the workflow. In the 'Partition Values' property, we define the same values that we used to add partitions, as shown below. This way, the folders on S3 that contain the partition data are deleted and the partitions are removed.
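On S3, a folder is really a shared key prefix, so deleting a partition folder amounts to removing every object whose key starts with that prefix. The sketch below models this on a plain list of object keys rather than a real bucket. The key layout, file names, and function name are hypothetical, and it is not meant to represent how the component itself is implemented.

```python
def delete_partition_objects(object_keys: list[str], partition_prefix: str) -> list[str]:
    """Return the keys that remain after removing one partition folder."""
    # Normalise to a trailing slash so 'Salesdate=2016-01-07' does not
    # also match a sibling such as 'Salesdate=2016-01-070'.
    prefix = partition_prefix.rstrip("/") + "/"
    return [key for key in object_keys if not key.startswith(prefix)]


# Hypothetical bucket contents after two partitions were added.
keys = [
    "sales-data/Salesdate=2016-01-07/part-0000.csv",
    "sales-data/Salesdate=2016-01-07/part-0001.csv",
    "sales-data/Salesdate=2016-01-08/part-0000.csv",
]
remaining = delete_partition_objects(keys, "sales-data/Salesdate=2016-01-07")
print(remaining)
```

Using the same partition values that were used to add the partition yields the same prefix, which is why the example reuses them.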