I am loading files from S3 into a temp table on redshift, after which using a table iterator to load data into my fact tables.
I want to do a row count check between data loaded into temp table from s3 with data loaded into all fact tables combined with the iterator.
I am able to export the count from S3 Load component, is it possible to get the total rows loaded by the iterator ?
one potential solution is to create a variable and as part of the iterator job have a python script that sums up the current value and saves to the total, but can it be done in a simper way.
1 Community Answers
Kalyan Arangam —
Row counts are exposed by Load components or transformation jobs. Iterator does not fall into either category.
There may be many ways of getting the final count.
Here’s one -
declare a Shared, Project-level variable (row_count_final) and initialise it to Zero (0) using a python script component
Add a child job to the iterator which contains the S3 load component.
In your child job, Declare a job level variable row_count
Configure the variable row_count on the export tab of S3 Load component
Add a python script component after S3 Load and add teh following lines
row_count_final = row_count_final + row_count
context.updateVariable( 'row_count_final', row_count_final )Next time you load files using the iterator, variable row_count_final should have the total rows loaded via the S3 Load component through multiple iterations.A simpler approach may be to capture a row count into a variable before before you load the files and then get a final figure after you finish processing. the difference may give you the expected number.https://support.matillion.com/agent#Hope this helps.Best