I'm working on data transfer between two databases, one Oracle 12c and one Snowflake Data Warehouse. In the Oracle DB, there is a CLOB field which needs to be loaded into the Snowflake DB. The values in the CLOB field are larger than 4000 bytes, and therefore too large to be converted to the Oracle VARCHAR2 data type. But Snowflake has no equivalent of CLOB, and expects VARCHAR.
I am using a Database Query component to load the Snowflake table, and have tried various methods (via PL/SQL) of getting the CLOB data into a format that Snowflake will accept. But no matter what I try, Matillion returns Oracle errors (from the Oracle JDBC Thin driver). The errors are related to buffer size, such as:
ORA-06502: PL/SQL: numeric or value error: character string buffer too small ORA-64203: Destination buffer too small to hold CLOB data after character set conversion.
I'm wondering if there is a setting within Matillion (or perhaps the JDBC driver) related to buffer size which might affect this? The Snowflake VARCHAR data type can store large amounts of data, so I know that's not the issue.
7 Community Answers
Kalyan Arangam —
Unfortunately, I am not proficient on these settings for oracle. Oracle support or Oracle forums may be able to advice on overcoming buffer issues with the JDBC driver and converting CLOB to VARCHAR types.
The Database Query component exposes additional options supported by your oracle JSBC driver under Connection Options property. These are additional properties supported by the Oracle JDBC driver which may help you here.
One recommendation I have is to test your queries outside matillion so you are sure these work. I recommend using a PL-SQL editor and writing the query result to disk. Once these succeed then you may try those in matillion.
SQL> spool c:\temp\out.txt
SQL> SELECT * FROM USERS;
SQL> spool off
Also, how do you intend to over come the 4k limit on varchar datatype? Whats the maximum with of content in your CLOB field?
Would splitting them into two or more fields be an option?
Thanks for the quick response...the query does execute fine when run via Oracle SQL Developer, and returns the expected results. The maximum length is 479083 characters.
"How do you intend to over come the 4k limit on varchar datatype? Whats the maximum with of content in your CLOB field? Would splitting them into two or more fields be an option? "
Splitting is not an option unfortunately. I decided to try loading the data to Snowflake without casting the CLOB as a VARCHAR, which worked. (I can do the necessary transformations in Snowflake instead of Oracle.) And, I thought that in doing so, the 4K limit would not be an issue. However, it appears that some implicit conversion is being done, because once the data is loaded in Snowflake, querying it reveals it's truncated to 3999 characters. So I'm now back to square one.
I built an orchestration job which has a SQL Script task to create (or replace) the destination table in Snowflake. From there it goes to a Database Query task, with a SQL command run against the Oracle database, to load into the new destination table. I am actually relatively new to Matillion, could you point me to some info on how to the generate metadata?
I was able to generate the metadata outside of Matillion and found that the column in question is a VARCHAR(4000). It was not specified as such in the SQL code to create the table...first it was specified simply as VARCHAR (automatically determines max size), then I tried forcing it to VARCHAR(16777216), which is the actual max size allowed. But in both cases, Snowflake adjusts the column to 4000. That said, it seems this may be an issue with Snowflake, so I have raised a support question with their support community as well.
You mention that you pre-created the target table in Snowflake. As such, have you set the following Load Option in the Database Query component:
Option = Recreate Target Table
Value = Off
The default behavior of the Database Query component is to drop and recreate the target table. So, if you have pre-created the target table, you need to override this default behavior. I’m wondering if that’s why you are seeing the table in Snowflake getting set with the field defined as VARCHAR?