I am using the "Excel Query" component, to load data from and Excel file I have stored on S3.
When I go into S3 in the AWS console, I see each file has a "Last modified" timestamp.
In Matillion, I would like to run the "Excel Query" component if the Excel file was last modified in the last, say, 24 hours. (If the source Excel file has not been modified recently, then I don't want to bother loading it again, during my scheduled job.)
Is there any way I can do this?
(For example, from Matillion, can I somehow access the "Last modified" timestamp of a file in S3? If so, then I suppose I could store the timestamp value in an environment variable, then use an "If" component to run the "Excel Query" only if the timestamp meets my criteria.)
2 Community Answers
Craig Rouse —
You could used a bash or python script to access the AWS S3 bucket (aws s3 ls s3://myBucket) and retrieve the filenames and timestamps, then loop through the files and write out another file to this (or another) S3 containing filenames that fit the criteria.
Then use this new file in a file iterator over the Excel component, passing in the filename as a variable.
That’s one way, I’m sure there are others.