I am loading a file from S3 into a single-column table; the file contains several file names.
In the second step, I have to check, one by one, whether each file exists in a certain S3 folder. If it does, I have to mark Yes against it, perhaps with an Update script.
Is there a file-checking mechanism here that would be particularly useful in this scenario?
2 Community Answers
Nachiket Mehendale —
So basically, I will check the column value against the files stored in a folder of an S3 bucket. If a file is found, I have to flag it as Yes. I will be creating and updating the SQL table through a SQL object. The challenge is how to check whether the file exists and then update the status. I can use the Table Iterator to run through the table that holds the file names, but how can we check for the existence of files in a particular folder of the S3 bucket?
You’re right, Matillion doesn’t have any component other than the File Iterator for checking if a file exists.
If you want to do this without the Manifest Builder, one solution might be to have a Bash Script containing just
aws s3 ls s3://your-bucket/your-file
and use the fact that the script will fail if it can’t read the file. Connect the green (success) output to the job that marks the file as present. Connect the red (failure) output to the job that marks the file as missing.
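If it is easier to check the whole list in one script, the same idea can be sketched as a loop. This is only a rough sketch under some assumptions: the bucket name `your-bucket`, the folder `your-folder/`, and the input file `file_names.txt` are placeholders, and the function names are made up for illustration. It uses `aws s3api head-object` rather than `aws s3 ls`, because `ls` matches on prefix (so `report.csv` would also match `report.csv.bak`), whereas `head-object` looks up one exact key.

```shell
#!/usr/bin/env bash
# Sketch only: bucket, folder, and function names are hypothetical placeholders.

# Print "Yes" if the exact key exists in the bucket, otherwise "No".
# Note: any failure (missing key, no permission, network error) is reported as "No".
s3_file_status() {
  local bucket="$1" key="$2"
  if aws s3api head-object --bucket "$bucket" --key "$key" >/dev/null 2>&1; then
    echo "Yes"
  else
    echo "No"
  fi
}

# Read file names (one per line) from stdin and print "name,status" rows.
check_file_list() {
  local bucket="$1" folder="$2" name
  while IFS= read -r name; do
    # Skip blank lines in the input.
    [ -n "$name" ] || continue
    printf '%s,%s\n' "$name" "$(s3_file_status "$bucket" "${folder}${name}")"
  done
}

# Example call (assumes file_names.txt holds one file name per line):
#   check_file_list your-bucket your-folder/ < file_names.txt
```

The `name,status` rows it prints could then be loaded back and used to drive the Update step, instead of wiring a success and a failure connector for every single file.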