I would like to load a fact table 1 day at a time, from the latest day of data in the fact table until the current day. What is the best pattern to do this?
Having trouble loading a max value from a redshift table to a variable, then using this date(day) variable in IF logic and inside sql / rewrite table components.
1 Community Answers
Laura Malins —
You can use a python script to set the variable (latestdatevar) to the max value from the table. The code you need in the script is below:
cursor = context.cursor()
cursor.execute(‘select max(modifieddate) from tablename’)
latestdatevar = cursor.fetchone()0
You can then use a loop iterator to iterate round the values to pull the data from the source in the fact. The loop iterator works by iterating through incremental values. To determine how many days worth of data you need to iterate through, you can use python to determine the number of days between the latest date in the table and the current day and write this to another variable, let’s call it difference variable. This difference variable should be the end value of the iterator.
The iterator then needs to be connected to an Orchestration job which loads the data from the source for the date = latestdatevar + difference variable days and then it needs to copy it to a redshift table.