Using Variables

Overview

Variables are name-value pairs stored within each environment.

Variables can be used in all sorts of parameters and expressions to allow the user to pass and centralise environment specific configuration. Variables are substituted at run time and are referenced with the syntax:

${<variable name>}

All alpha-numberic characters as well as underscores are valid for variable names. Note, however, that a variable name cannot begin with a digit.

${my_table_82}        Is a valid variable name.

${82_my_table}        Is not a valid variable name.

In addition, variables can be javascript expressions. Such as:

${new Date().getFullYear()}
 

Types

​The value of any given variable must be one of the following types. This type defines how a variable can be used and is important to consider when passing the variable to components and functions.
 

Type Description
Text This type can hold any type of data, subject to a maximum size. More...
Integer This type is suitable for whole-number types (no decimals). More...
Numeric This type is suitable for numeric types, with or with decimals. More...
Boolean This type is suitable for data that is either 'true' or 'false'. More...
Date This type is suitable for dates without times. More...
DateTime This type is suitable for dates, times and timestamps. More...
Type Description
String This type can hold any type of data, subject to a maximum size. More...
Integer This type is suitable for whole-number types (no decimals). More...
Float This type is suitable for numeric types, with or with decimals. More...
Boolean This type is suitable for data that is either 'true' or 'false'. More...
Timestamp This type is suitable for dates without times. More...
Type Description
VARCHAR: This type can hold any type of data, subject to a maximum size. More...
NUMBER: This type is suitable for whole-number types (no decimals). More...
FLOAT: This type is suitable for numeric types, without or with decimals. More...
BOOLEAN: This type is suitable for data that is either 'true' or 'false'. More...
DATE: This type is suitable for dates without times. More...
DATETIME: This type is suitable for dates, times or timestamps (both date and time). More...
TIMESTAMP: This type is suitable for timestamps. More...
TIME: This type is suitable for times only. More...

For Date and DateTime variables, Matillion ETL also comes with some useful proprietary methods. See Date and Time Methods for more information...

 

Declaring Variables

All variables to be used in your job must be declared before the job begins. You set a variable by selecting Project -> Manage Environment Variables


A variable can be added by clicking the + icon and edited by clicking on a field. Each variable has a name and type as documented above. Additionally, each variable has a scope and a default value for each environment. In the example below, we have a single environment named 'Test'.

The scope of a variable determines its behaviour in iterator components but otherwise cannot extend beyond a single task and can usually be set to 'global'.

NOTE: Any local variables that exist in the workflow when going through an Or Component will be reset to their default values. This is to prevent the need to resolve two potentially different values of a local variable when two parallel workflows meet the Or component. If this is not the case for your workflow, consider using global variables instead.
 

Setting Variable Values

When a job begins, all variables are initialised with their default value that is set in the 'Manage Environment Variables' menu. The real power of a variable is that its value can be updated through the following methods:

 

Iteration Components

Iteration components work by setting variables to a new value for each iteration. So you must define variables you wish to iterate in advance, and then use them when configuring iteration components. Iterations can be run in parallel too - in that case the variable scope must be set to local to ensure each parallel components sees its own local copy of the variable.

 
Python Scripts

Python scripts can push new values into variable using their built-in context object. See the python component documentation for an example of this.

 
SQS Runs

Job executions triggered by SQS messages can also set variable values using the optional 'variables' item when sending the message. The syntax shows in the SQS Integration page shows where variables can be set on the incoming message.
 

See the Integration with Amazon SQS documentation for more information.

 

 
Variable Exports

The "Export" tab allows you to edit mappings between run time information that the component makes available, and variables you have already defined.

All orchestration components and some transformation components support exporting runtime information into variables during job execution. A list of common exports are given below.
 

Exported Value Type  Description
Component Text Name of the component.
Status Text Status of the component.
Started At DateTime Time the component began running in the job.
Completed At DateTime Time the component finished running in the job.
Duration Numerical Wall-clock time the component ran for.
Row Count Numerical Number of rows the component has run over.
Message Text Output message from component (usually empty). 
Iteration Count Numeric Number of iterations from an Iterator component.


Component export values are also used to populate the Tasks view and the Run history. To export a value, it must be mapped to an existing variable through the component's 'Export' tab. It is important to map the value to a variable of the correct type according to the above table.

 

Automatic Variables

The following variables are automatically available without first needing to be defined:
 
Variable Description
project_group_name Name of the current project group. Can be set through Project → Rename Project Group.
project_name Name of the current project. Can be set through Project → Manage Project.
version_name Name of the current version. Versions can be renamed through Project → Rename Version unless locked. See Version Control for more...
environment_name Name of the current environment. Can be set by right-clicking the environment in the Environments panel and selecting 'Edit Environment'. See Managing Environments for more...
job_name Name of the current job. Can be set by right-clicking the job in the Explorer panel and selecting 'Manage Job'.
component_name The name of the current component, as defined by the user. Components can be renamed by selecting them and editing the 'Name' property.
job_id The ID of a given job in Matillion. All jobs have a unique ID that can be used to prefer to it within a project. Note that this is not the ID of a particular run of a job (see run_history_id).
run_history_id The ID of a particular run of a particular job. Since all tasks have a unique ID, this can be used to keep track of a specific job run. Can be viewed manually through Project Task → History. See Task Management for more...
 

All of the above also have an id variable that is an internally generated ID and should be avoided in most cases.
 

Example 1

As a most basic example, we can use a defined variable to perform a simple calculation in the following Transformation job:

The job consists of a 'Generate Sequence' component linked to a 'Calculator' component. We can use the 'Generate Sequence' component to create a column of numbers that are passed to the Calculator. First, select the 'Generate Sequence' component and set its Starting, Increment and End values to 1, 1 and 10, respectively (below, left). This will create a column of data that counts from 1 to 10 (below, right).


  

Selecting the Calculator component, we can enter the Calculation Editor, Adding a new calculation, we can multiple values in the 'sequence' row (output by the 'Generate Sequence' component) and multiply that value by a number. In this case, we multiply each value in the 'sequence' column by 4 using the following calculation:

"sequence" * 4

The Calculator's output can be viewed by selecting the 'Sample' tab while the Calculator component is selected and clicking 'Retrieve Sample'. Clearly, the sequence column is being multiplied by the number specified in the calculation.


However, we could have used a variable in this calculation, instead. First a variable must be declared in the 'Project' drop down menu and clicking 'Manage Environment Variables'. A new variable can be declared here and given a type, value and scope. Since the variable for this example will be used in a simple calculation, the 'Numeric' type is appropriate. The Scope is of little consequence for this job and can be set to 'Global'. Finally, the default value can be set to any number - we choose 6 in this example.

Finally, the Calculator component must be instructed to use this variable in its calculation. Reentering the Calculation Editor, the constant multiplier can be replaced with the newly-declared variable using the following calculation: 

"sequence" * ${example_var}

Checking the sample output confirms that the Calculator component has correctly used the variable's default value in the calculation.

 


Example 2

In this example we have two environments, 'Test' and 'Live'. A common use case would be to work on a more limited data set in 'Test' but have a full data set in 'Live'. We have a starting_year variable that will define how much data we will use in the 'Test' and 'Live' environments when applied in a filter.

First notice we have configured two environments.

Variables must be predefined and we must have a default value for each environment. It is not necessarily true that two environments have different default values. However, in this example, we wish to set our variable starting_year to an earlier value for use with the 'Live' environment. In the 'Test' environment, we require less data and can thus start collecting it from a much later year.

Now that we have defined it, our variable can normally be used in all expression editors and free text fields. In this example we use a Filter Component to limit our data set by the variable.

The filter is set up to filter any table rows that have value greater than ${starting_year) in the 'year' column. Due to how we defined default values for starting_year, if we run this job in the 'Test' environment, only flights from 2016 onward will remain. In the 'Live' environment, flights from 2006 onward will remain.



The SQL generated by this component and its settings can be viewed in the 'SQL' tab. The example below shows the SQL for the 'Live' environment.



Instead of relying on variables with default values, you can also use a Javascript expression. Javascript expressions do not require defining any variables in the environment. An example for the 'Live' environment is given below.

  

Which would generate the following SQL, given that the current year generated from the Javascript Date() function is 2016.