
File iterator duplicating data


I needed to record the source filename for files uploaded to S3, and I found a very good article on adding a filename to a table, which helped me immensely. However, my data is being duplicated in the target table.

I have 2 files uploaded to S3, and I need to load the data from those files into a target table along with the filename. I used a File Iterator in my orchestration job. It now loads the data from both files with the filename, but in the last iteration it also duplicates the data from the last loaded file. I am not able to figure out the exact problem. I am using Table Output with append. I also tried Table Update with insert, but the result is the same. I even added a filter condition in my SQL script to skip a file if it already exists in the target table, but it still loads again. I am storing the filename in a variable and passing it into my job.

Can somebody help me debug this problem? I am new to Matillion and can't find many resources online.


2 Community Answers

Matillion Agent  

Paul Johnson —

Hi Shweta,

When you say duplicated data from the last update – are you obtaining this info from the sample tab on your table output component?
I would actually query the table either using a SQL client, or adding a new table input component and doing the sample there.

If you check the SQL generated by the table output component, you can see that the select statement uses the input table in the FROM clause, which can sometimes be a bit misleading.

There is also an article on how to do what you are looking for.

shweta sagar —

Hi Paul, thanks for your reply! I figured out the problem. It was a silly mistake on my side. I forgot to truncate the staging table after every load, so it was loading the same data again. The issue is resolved now!
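
For anyone who lands here with the same symptom, the cause described above can be sketched outside Matillion. The following is only an illustration, assuming a staging table that every iteration appends to and a target table that receives the staging rows tagged with the current filename. It uses an in-memory SQLite database; the table names, the `load_files` helper, and the sample rows are hypothetical and are not taken from the original job.

```python
import sqlite3


def load_files(files, truncate_staging):
    """Simulate a file-iterator loop: stage each file, then append staging to target.

    `files` maps a filename to its list of row values (hypothetical sample data).
    Returns the final number of rows in the target table.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE staging (value TEXT)")
    conn.execute("CREATE TABLE target (value TEXT, filename TEXT)")

    for filename, rows in files.items():
        if truncate_staging:
            # Clear rows left over from the previous iteration.
            conn.execute("DELETE FROM staging")

        # Load the current file into staging.
        conn.executemany(
            "INSERT INTO staging (value) VALUES (?)", [(r,) for r in rows]
        )

        # Append everything currently in staging to the target,
        # tagging each row with the current filename.
        conn.execute(
            "INSERT INTO target (value, filename) SELECT value, ? FROM staging",
            (filename,),
        )

    count = conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
    conn.close()
    return count


files = {"a.csv": ["a1", "a2"], "b.csv": ["b1", "b2", "b3"]}

print(load_files(files, truncate_staging=False))  # staging keeps growing
print(load_files(files, truncate_staging=True))   # one copy of each row
```

With five source rows in total, the run without truncation ends with more than five rows in the target, because rows staged in the first iteration are appended again in the second. Clearing the staging table at the start of each iteration (or after each load, as described above) leaves exactly one copy of each row.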

