Calculator Component

Calculator Component

Adds new columns by performing calculations. Each row in produces one row of output. Any pre-existing columns that share a name with a created column will be overwritten.

By default, all input columns are mapped into output columns of the same name. You can then use the Expression editor to add new columns with a user defined calculation.

Properties

Property Setting Description
Name Text The descriptive name for the component.
Include Input Column Yes (default)/No When Yes all input columns are mapped without modification. When No the component will output only columns that have been mapped manually via the Expression editor.
Calculations List of Expression
A list of expressions calculated per row of data. The name of the expression becomes the output Column. Each expression must be valid SQL and can use the all the built in Redshift Functions.

Expressions can refer to an earlier expression by enclosing the existing expression name in double quotes.
A list of expressions calculated per row of data. The name of the expression becomes the output Column. Each expression must be valid SQL and can use the all the built in BigQuery Functions.

Expressions can refer to an earlier expression by enclosing the existing expression name in double quotes.
A list of expressions calculated per row of data. The name of the expression becomes the output Column. Each expression must be valid SQL and can use the all the built in Snowflake Functions.

Expressions can refer to an earlier expression by enclosing the existing expression name in double quotes.

Note that data types of columns may be expressed as icons as explained in the Variables documentation.

Strategy

Generates a select query where the calculated expressions are in-line after the select statement.

Example

In this example, we use the Calculator component to check flight delay times and flag delays as 'long' accordingly. The job layout is shown below.

In the Table Input component (here named 'flights'), we have loaded a table of flight data and included columns for the flight number, arrival and departure delays and actual elapsed time of the flight. If we go to the 'Sample' tab of the Table Input component we can see what this data looks like.

Although the columns and data are loaded correctly, we note that there are undesirable NULL values in the data. We remove these with the addition of a 'Filter' component.

The Filter component has a single filter that is set up as below, only allowing rows where the 'actualelapsedtime' column value is not NULL.

Now turning our attention toward the Calculator component, we choose to include all columns that are passed into it.

Additionally, the Calculator component has 2 'calculations' set. The first expression "Long Delay Ratio" adds together the departure delay, the arrival delay and divides by the flight time. Thus, we judge delays as relative to the flight's total time. The flight time is cast to a decimal to ensure the output is decimal (all input columns are integer).

The second expression uses a CASE function to add a flag. Flights with a delay of more than 20% of the flight time are flagged as "long" delay, which is either True or False.

Note that each 'Expression' in the Calculator component constitutes a unique output column. In this case we have 2 expressions and thus, along with our original data, we will have 2 additional output columns that are the results of the calculations in each expression.

We can see that each row is now correctly flagged when the delay time is 0.2x (or more) of the actual flight time. Also note that there are no empty fields left in our data due to the filter we set up.

Video