Adding A 3rd Party JDBC Driver
Matillion ETL’s Database Query orchestration component uses JDBC to query relational databases that are acting as data sources for your application.
A number of commonly-used JDBC drivers are provided with Matillion ETL, and can be used out of the box. Some others, notably Oracle, Teradata, DB2 and MySQL, have licensing restrictions that require you to upload your own copy of the JDBC driver into Matillion ETL, using the Project / Manage Database Drivers menu.
Many other JDBC drivers can be used with Matillion ETL, with a couple of extra steps that are described in this document.
Downloading your JDBC driver.
The first step is to acquire a suitable JDBC driver for your source database. This will be in the form of one or more .jar files that you can download. The download package may actually be a .zip or .tgz file, in which case you will need to un-zip or decompress it to extract the individual .jar files.
RDBMS vendors often offer many different drivers, in which case you should look for:
- JDBC Type 4 (sometimes known as “Pure Java” or “Direct to Database”). Other types of driver require additional client software and will not work.
- Compatible with Java 7 (sometimes known as JDBC version 4) Some drivers are simply provided in the form of a single .jar file, which is ideal.
Some drivers are implemented as multiple .jar files, and you will need to download all of them to use the driver.
If a driver has further non-JDBC dependencies, such as logging libraries, the RDBMS vendor should provide documentation for these cases. Users are also required to download all of the dependency .jar files before continuing.
The RDBMS vendor should also provide documentation on some of the main features of the driver, such as the Java class name, and the URL format. You will need this information in order to configure Matillion ETL.
Configuring Matillion ETL
The list of JDBC drivers that you can see in the “Database Type” property of the Database Query component is governed by a single configuration file.
To add a new entry to this list, you will need to log onto the running Matillion ETL instance and edit the file.
Use an SSH client (such as PuTTY) to connect to the instance, sudo to the root user, and locate the file: /usr/share/emerald/WEB-INF/classes/jdbc-providers.properties
sudo su - cd /usr/share/emerald/WEB-INF/classes vi jdbc-providers.properties
The file is made up of a single JSON array, with one entry per supported database. Carefully add a new entry, using any text editor, remembering to end the new entry with a comma if it’s partway down the list.
Follow the syntax of one of the existing entries and set values for:
- name - the display name.
- driver - the Java class name of the driver, which should be in the vendor’s documentation.
- url - a sample URL, used for documentation only.
- allowUpload - always set this to true.
- fetchSize - set this to 500 initially, but the value can be tuned.
- limit - this is a string which Matillion uses internally to create an SQL command for sampling. The allowable values are:
- fetch-first-n - uses a JDBC loop (this is the best default option)
- limit-inline - uses the LIMIT keyword
- limit-outer - uses LIMIT outside a nested SELECT
- rownum - uses a ROWNUM rowstop (Oracle style)
- top-n - uses the TOP keyword (SQL Server style)
Depending on your driver, you can optionally also set additional properties such as:
- autocommit - normally true, but can be set to false.
- defaultProperties - another JSON array with properties that are meaningful to this particular driver.
Once you have finished editing the file, it’s worth checking that the JSON syntax is still:
cat jdbc-providers.properties | jq "."
The command will fail with an error message if the JSON is invalid. Note that syntax errors in the file will prevent Matillion from re-starting.
After editing the jdbc-providers.properties file, you must restart Matillion from the admin screen. Wait a few minutes while the new configuration comes into effect.
Upload the JAR files into Matillion ETL
Once Matillion has restarted, go to the Project / Manage Database Drivers menu. This should open a dialog which allows you to upload the .jar files for all of the JDBC drivers which have their allowUpload property set true.
Upload your .jar file (or files) one at a time, remembering to also include any dependency .jar files.
Once all the .jar files are uploaded, you now have everything in place to begin to use the new JDBC driver.
Test the Database Query component
Drag a Database Query component onto an Orchestration job, and open its Database Type dialog. You should find your new driver available to select.
The example URL should help you to configure the correct Connection URL for your database, normally by configuring the host, port and database name.
The SQL syntax of the SQL Query statement is in the dialect of your new source database.
There are three common symptoms which are often caused by driver configuration problems:
- Database Query component won’t accept your selection - it just reverts back to the default setting
- The component complains that “Parameters contain errors”
- Errors occur during execution of the component, even though the SQL is valid
- If you are receiving the "Unable to find driver of database type [x]" error, ensure that the URL in the jdbc-providers.properties file is a valid connection string pattern. e.g. Using 1234 for a placeholder port number instead of <port>.
If these errors occur, get hold of your Matillion ETL Server Log and contact support. The Server Log is a file called catalina.out, and can be obtained from the Matillion ETL admin screen under the menu More / Download Server Log.