Name is required.
Email address is required.
Invalid email address
Answer is required.
Exceeding max length of 5KB

Setting variables


I would like to have a variable containing the date that a job was last run. I intend to use this variable in the data source filter of a MongoDB query component the next time the job runs to avoid pulling the same records multiple times. How could I do that with Matillion?

Thank you,

3 Community Answers

Matillion Agent  

David Lipowitz —

Hi Julian,

I suggest using the Python Script component here combined with a table where you can persist your last_run date (hypothetically we might call this table “run_log” and create it with a date column called “last_run”).

Documentation on the python component can be found here.

Before running this job you will need to configure a ${last_run} environment variable from the Profile pull down menu using a DateTime data type. Then you can design a job that does the following steps:

1) A Python Script component that sets the ${last_run} environment variable based on the results of a cursor. Something like this, for example:

cur = context.cursor()
cur.execute('select max(last_run) from run_log')
rec = cur.fetchone()
context.updateVariable('last_run', rec[0])

The code above will set the ${last_run} environment variable to the max of the run_log.last_run column.

2) Execute your MongoDB Query component filtering for documents that have a last_run attribute >= ${last_run} variable.

3) Insert a new record with the current date into the run_log table for use in the next iteration of this job.

I hope that makes sense and is helpful. Please let us know if you have more questions about this.

Best Regards,

Matillion Agent  

David Lipowitz —

Hi Julian,

One more thing on this. Depending on your data structure and design, you may be able to set up your job flow without persisting the last_run date in your database at all. Please see the following article and note the video at the end which offers a good demonstration of what article discusses:

Incremental or High Water Mark data loading

I hope that’s useful.

Best Regards,

Julian Konomi —

Hi David,

The python script component that you suggested above seems to be working.

Thank you for your help,

Post Your Community Answer

To add an answer please login