Name is required.
Email address is required.
Invalid email address
Answer is required.
Exceeding max length of 5KB

Data-staging components, and the DATE type

I was very glad to see that with the new version 1.29, "All data-staging components now create a target table with a wider range of target data types." Particularly for BOOLEAN types, which now seem to work "as expected." Nice job Matillion team -- thanks!

However, these components still do not seem to handle DATE types "as expected." Here are two examples, which I am mentioning because not only is their behavior "unexpected," but each has a DIFFERENT unexpected behavior!

1. One of my sources is an on-premises SQL Server database. When using the Database Query component to extract a DATE column, the target table ends up with a VARCHAR(10) column. Populated with strings, like '2017-10-02'.

2. Another one of my sources is a PostgreSQL database on RDS. When I use the RDS Query component to extract a DATE column, the target table ends up with a TIMESTAMP column. Populated with timestamps, all having 00:00:00 in the "time" portion.

Right? What is the story with this -- why wouldn't these components simply create a DATE-type column in Redshift?

And, Matillion team: might you ever change this, in the future, so that these components will create DATE-type columns?

1 Community Answers

Matillion Agent  

Ian Funnell —

Hi Kevin,

Dates and timestamps are always tricky because the intermediate representation (CSV file) does not distinguish between real dates and strings that look like dates. Matillion calculates the target structure by querying the JDBC metadata, and there’s always a mapping onto Redshift’s own datatypes, and sometimes a dependency on which version of a JDBC driver is being used.

There are two main approaches for gaining more explicit control over the target datatypes when using the Database Query:

  1. Cast at source (i.e. alter the SQL query to convert the column into a specific datatype)
  2. Use the new feature (added in v1.29) where Matillion does not drop and re-create the target Redshift table if you switch off the Load Option “Recreate Target Table”. This also applies to other load components: not just the Database Query.

If you do take advantage of not recreating the target table, Matillion will re-use the existing target table, which you have created manually beforehand with exactly the datatypes you want, instead of dropping and re-creating it (with perhaps different datatypes). There will still be some implicit conversions during the Redshift bulk upload, but you are guaranteed to have the datatypes you want.

Best regards,
Ian

Post Your Community Answer

To add an answer please login