Matillion ETL for Redshift Release Notes
Important Notice: The preferred (and safest) way to upgrade is now to launch a new copy of Matillion ETL running the latest version, use the Migration Tool to move your assets across, and validate the new instance before deleting the existing one. In-place upgrade may be removed in future versions.
Matillion ETL for Redshift 1.39 (Features Video)
- S3 Load Generator tool interface has been reworked for a smoother experience.
- New Data Staging components to quickly and easily bring your data into Redshift:
- New streamlined connectors are now available to easily pull data from your services into Amazon Redshift Spectrum external tables:
- Lead/Lag Component now has the option of ignoring Null values.
- The Matillion ETL API (v1) now includes an endpoint that allows creating Versions.
- Database Query and RDS Query components now have Basic Mode similar to other Data Staging components that offers a codeless alternative to using these services.
- Additional endpoints available for the Zuora Query component.
- “Auto” Distribution style option added to many components to make use of Redshift’s automatic table distribution.
- A Default S3 Bucket may now be specified when managing an Environment, allowing users to skip configuring a staging area in many components.
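As an illustration of what ignoring Null values means for the Lead/Lag Component noted above, here is a minimal Python sketch of a lag calculation. It is not Matillion's or Redshift's implementation; the function name `lag` and its parameters are hypothetical and only model the semantics on an in-memory list.

```python
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def lag(values: Sequence[Optional[T]], offset: int = 1,
        ignore_nulls: bool = False) -> List[Optional[T]]:
    """Return, for each position, the value `offset` rows earlier.

    With ignore_nulls=True, None entries are skipped when counting back,
    so each result is the offset-th most recent non-null earlier value.
    (Hypothetical helper for illustration only.)
    """
    if offset < 1:
        raise ValueError("offset must be at least 1")
    result: List[Optional[T]] = []
    for i in range(len(values)):
        earlier = list(values[:i])
        if ignore_nulls:
            # Only earlier non-null rows are candidates.
            earlier = [v for v in earlier if v is not None]
        result.append(earlier[-offset] if len(earlier) >= offset else None)
    return result


readings = [10, None, 30, None, 50]
print(lag(readings))                     # nulls are carried through
print(lag(readings, ignore_nulls=True))  # nulls are skipped over
```

With nulls respected, a null in the previous row produces a null result; with nulls ignored, the most recent non-null value is returned instead.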
Matillion ETL for Redshift 1.38.8
- Updated Magento Query driver.
- Updated Google Analytics driver.
- New method of single-account authentication for Stripe Query.
- Improved handling of cluster synchronisation.
- Reworked logged diagnostic information.
- Fixed an error where jobs could hang indefinitely due to a problem with the handling of temporary files.
Matillion ETL for Redshift 1.38 (Features Video)
- Configuring Environments now uses a multi-step wizard configuration complete with tooltips to help guide you through creating and editing Environments.
- CDC task creation is simplified with a new wizard UI and can pull data from two new source databases - PSQL and MSSQL.
- Reintroducing a dialog option that allows “Text Mode” configuration of Variables (namely, Manage Environment Variables and Manage Job Variables). Use the Text Mode to enter or copy variable data in a plain text format.
- S3 Unload Component now allows unloading to other AWS Regions, and can insert header rows into unloaded files, perform BZip2 compression and add source data types to the generated S3 Manifests.
- Calculator Component has been updated to include additional JSON and Regex functionality.
- Product Improvement Metrics can now be gathered within Matillion ETL. When you update your instance of Matillion ETL for Amazon Redshift you will be prompted to share anonymized Product Improvement Metrics data. Matillion will use this information to build you a better product. Absolutely no personal data is collected through this service.
- Create Project now has a multi-step wizard configuration complete with tooltips to help guide you through creating a Matillion ETL for Amazon Redshift Project.
- A new Sharepoint Query component allows users to connect to their Sharepoint account to bring data into Amazon Redshift and transform it using Matillion.
- New DynamoDB Query component loads data from Amazon’s DynamoDB.
Matillion ETL for Redshift 1.37
Note: The Data Transfer component is now the preferred way to move files/objects between storage providers. Existing components such as S3 Get/Put and GCS Get will continue to work in existing jobs, but new jobs should use Data Transfer.
- New Data Loading orchestration components:
- All Data Loading orchestration components can now display data directly sampled from their source, allowing users to easily check their current configuration.
- All Data Loading orchestration components can now display the SQL generated by the component.
- CDC based on DMS gives users the power to automatically keep their Redshift tables in-sync with their source database tables using Matillion ETL.
- Support for creating External Tables over semi-structured external data hosted on S3.
- A new Nested Data Load component can flatten incoming nested data according to a user-defined structure. Unlike nested data, flattened data is fully functional on Redshift.
- A new JDBC Table Metadata to Grid Component connects to many types of JDBC database and can export the metadata from a source table into a Matillion ETL Grid Variable.
- A new Migration Tool can help move any number of Matillion ETL assets directly from one Matillion ETL instance to another.
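To make the idea behind the Nested Data Load component more concrete, the following Python sketch flattens a nested record according to a user-defined structure. It is only an illustration of the concept, not the component's actual behaviour; the names `extract` and `flatten` and the shape of the `structure` mapping are assumptions.

```python
from typing import Any, Dict, Mapping, Sequence


def extract(record: Any, path: Sequence[str]) -> Any:
    """Follow `path` through nested dicts; return None if any key is missing."""
    current = record
    for key in path:
        if isinstance(current, dict) and key in current:
            current = current[key]
        else:
            return None
    return current


def flatten(record: Mapping[str, Any],
            structure: Mapping[str, Sequence[str]]) -> Dict[str, Any]:
    """Build one flat row: each output column is read from its nested path."""
    return {column: extract(record, path) for column, path in structure.items()}


# Hypothetical nested input and user-defined structure.
order = {"id": 1, "customer": {"name": "Ada", "address": {"city": "Leeds"}}}
structure = {
    "id": ("id",),
    "customer_name": ("customer", "name"),
    "city": ("customer", "address", "city"),
}
print(flatten(order, structure))
```

Each flat row then has one scalar value per column, which is the form a regular Redshift table expects.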
Matillion ETL for Redshift 1.36
- A selection of new components to connect to various services:
- Shopify Query connects to the user’s Shopify account.
- Survey Monkey Query loads data from a SurveyMonkey database.
- Zoho CRM Query retrieves Zoho CRM data.
- Dynamics 365 Sales Query connects to the Sales service in Dynamics 365.
- Dynamics 365 Business Central Query connects to the Business Central services in Dynamics 365.
- Many UX improvements including automatically connecting components on the canvas, improved variables workflow and new keyboard shortcuts.
- ORC and PARQUET file formats now supported in S3 Load.
- Selected Environments are now user-specific. Users can now specify their environments independently of one another.
- Users can now freely copy, cut, and paste jobs within a project.
- Autocompletion prompts now appear in many places when using Matillion ETL variables in code.
- Users can now search their environment tree to quickly find tables and views.
Matillion ETL for Redshift 1.35
- New “Data Transfer” Component that boasts all the functionality of the existing S3 Get, S3 Put and Cloud Storage Put components, plus additional source and target destinations (Azure Blob Storage).
- In addition to AWS and GCP credentials, environments can now reference Azure credentials to interact with Azure services such as Blob Storage.
- A new “Apache Hive Query” component connects to your Apache Hive data warehouse.
- A new “LinkedIn Query” component connects to your company’s LinkedIn apps.
- A new “Bing Search Query” component connects to the Bing Search API.
- A new “Bing Ads Query” component connects to the Bing Ads service.
- A new “Dynamics 365 Sales Query” component connects to the Dynamics 365 API.
- Allow uploading the native Microsoft SQL Server JDBC Driver (the bundled jTDS driver is often the fastest in scenarios where it works)
- New ‘Extract To New Job’ function available by left-clicking a selection of multiple components on the canvas. Allows users to instantly create new jobs from a group of components, tidying up workflows and helping to create reusable jobs.
- A new Table Properties panel (accessible via a right-click on a table in the Environment tree) shows table metadata. For key distributed tables, a chart showing data skew is shown. For external (Spectrum) tables, file locations and partition information are shown.
- The new “Salesforce Incremental Load” Wizard allows users to quickly and easily set up incremental loads from Salesforce.
- The new “JDBC Incremental Load” Wizard allows users to quickly and easily create incremental loads from a variety of popular database types.
Matillion ETL for Redshift 1.34.5
Matillion ETL for Redshift 1.34
Important Notice: On upgrade, a background task will restructure the task history. During this time not all historic tasks will be available to view in the UI or API. The process only takes a few minutes in the general case but can take several hours if you have millions of run history items. This will require additional disk space (either on the instance or on RDS depending on your setup) so ensure you have at least 50% free space before attempting the upgrade.
- Shared Jobs:
- You can now turn your reusable orchestration jobs into their own components with their own parameters, help and Icon.
- Shared jobs can be packaged and distributed across multiple ETL instances with Import and Export.
- Historic Task Viewer:
- Previously completed tasks can be viewed on the canvas along with any parameter errors.
- You can understand the canvas state of a job and also see the jobs contained in a Shared Job.
- An “Unconditional” connector:
- It's now simpler to build orchestrations where the next orchestration step is run regardless of the success or failure of the prior step. This avoids use of extra “and” and “or” components to achieve the same thing.
- “Auto Debug” for all Data Loading components:
- Data Loaders come with the Auto Debug property. When switched on, it allows users to choose between 5 levels of Debug Logging verbosity.
- Makes it easier to retrieve logging information without console access to the Matillion ETL Instance. Include these logs in your support requests for much faster turnaround!
- Warning: Can potentially consume large amounts of disk space. Do not leave this switched on unless directly in need of it!
- It is now possible to import, export and modify permissions via the API.
- Window Calculation component now supports “Standard Deviation” and “Standard Deviation Population”.
- Data Lineage:
- Allows you to understand the effect that your complex transformation jobs will have on your data. Track a column backwards to its source to determine where and how calculations are applied.
- This is an Enterprise Feature and thus is available to customers using large and xlarge instance types.
- OpenID Connect support for third party login providers:
- You can now configure Matillion ETL to authenticate with any OpenID Connect provider.
- Default support for Google, Microsoft and Okta plus a “Generic” option.
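The two Window Calculation options added in this release, “Standard Deviation” and “Standard Deviation Population”, differ only in the divisor used. The Python sketch below shows that difference with the standard library. It assumes that “Standard Deviation” refers to the sample form (dividing by n - 1), which is how Redshift treats STDDEV and STDDEV_SAMP; the sample data is made up.

```python
import statistics

# Hypothetical values from one window partition.
values = [2, 4, 4, 4, 5, 5, 7, 9]

# Sample standard deviation: squared deviations are divided by n - 1.
sample_sd = statistics.stdev(values)

# Population standard deviation: squared deviations are divided by n.
population_sd = statistics.pstdev(values)

print(f"sample: {sample_sd:.4f}, population: {population_sd:.4f}")
```

The sample form is always the larger of the two for the same data, and the gap shrinks as the window grows.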
Matillion ETL for Redshift 1.33.10
- Hotfixes for Salesforce out-of-memory (OOM) errors.
Matillion ETL for Redshift 1.33
Important: Queries using the Advanced Mode of the Google BigQuery Query Component will pass the SQL directly to BigQuery without any interpretation. If this causes any problem, please set the Connection Option 'Query Passthrough' to FALSE
Important: Users with the API role bypass some permissions checks (reads) during API calls.
- Open Exchange Rates Query component connects to the Open Exchange Rates API.
- Grid Iterator allows iterating over the values of a Grid Variable, similar to iterating through a table of values.
- SQL Editor (in all Query components) now shows available Tables/Columns and Variables to help you author and test SQL queries from source systems.
- A new “Notices” V1 API endpoint allows you to query the current system notifications and post new messages which notify all users.
- New “User Configuration” and "Permission" V1 API endpoints allow user management via the Matillion API.
- Matillion no longer requires “listAllBuckets” permission (although this is still recommended)
- Job Variables (scalar and grid) now have a “Visibility” that determines how they are used elsewhere.
- All variables now have a description.
- 100+ bug fixes across all areas of Matillion ETL.
- When browsing recent STL Load Errors, more details are provided on the exact cause of parsing problems.
Matillion ETL for Redshift 1.32.8
- Change: Allow Tomcat to start even if there is a corrupt schedule attached to a job. A bug fix to prevent the corrupt schedules attached to jobs in Matillion ETL 1.32 is currently in progress.
- Bug Fix: Prevented inappropriate authentication errors when using basic OAuth in the Zuora Query component.
IMPORTANT: Ensure you have a backup before you upgrade. Security configuration changes are applied on upgrade. These changes cannot be reversed, so do not use “yum downgrade” (or similar) to attempt to get back to versions prior to 1.32.
Matillion ETL for Redshift 1.32
Enterprise Only: This version of Matillion introduces a new Permissions system that allows users to:
- Set up users with fine-grained permission sets that can limit the 100+ core functions of the tool
- Use the default permission groups:
- Reader - Read only user who can’t modify a project
- Reader with Comments - Reader with ability to add notes to jobs
- Runner - A user who can execute but not modify jobs
- Scheduler - A user who can execute, schedule and change related config
- Writer - A user who can create ETL jobs but not delete projects
- Additional permission groups can be added at any time and are organised hierarchically making them easy to set up.
- A new suite of Grid Variable components is now included to make populating and manipulating them simpler - often without requiring any scripting:
- A new “SendGrid Query” component connects to the SendGrid email delivery platform
- A new “ElasticSearch Query” component connects to the Elasticsearch search engine
- A new “Magento Query” component connects to the Magento content eCommerce system
- A new “Zuora Query” component connects to the Zuora subscription software platform
- A new “GMail Query” component connects to Google’s email service
- A new “Run Now” action has been added when defining a schedule
- Double-clicking a component on the canvas now opens the component's “default” editor, if it has one. For example, double-clicking a Bash Script component will begin editing the script.
- Internal User: When using the “internal” security option, Tomcat user passwords are now hashed when stored on disk.
- Alter WLM Slots Component
- This allows queries run by Matillion to make use of multiple WLM slots to give very complex queries additional memory without having to spill intermediate results to disk. You can determine which queries spill to disk with SVL_QUERY_SUMMARY and similar system views.
- This can be increased and decreased during a flow, so only those parts of the ETL that require it consume additional resources
- External (Domain-based) Login: You can now encrypt your Realm Password with the AWS Key Management Service (KMS)
Matillion ETL for Redshift 1.31.8
The "Google AdWords Query" component has been updated to support the latest Google AdWords APIs.
Matillion ETL for Redshift 1.31.7
Important (possible breaking change): API Profiles ("RSDs") that handle paging may need to be tweaked to disable “auto” paging. Please see here for more details.
Important (possible breaking change): API profile limits are now applied. Where the default of 100 is set it will now be applied. This could affect API Query Components which previously ignored that limit.
- Zendesk Query orchestration component for loading data from the Zendesk customer relationship system.
- Mixpanel Query orchestration component for loading data from Mixpanel product analytics system.
- Xero Query orchestration component for loading data from the Xero accounting system.
- Dynamics 365 Query orchestration component for loading data from Microsoft Dynamics CRM/ERP.
- API Profile RSD Generator
- Accelerate the development of API Profiles using a new tool that automatically generates a basic XML “RSD descriptor” for any API endpoint, based on a sample of data returned.
- REST API Version 1 - Matillion ETL now has full API coverage:-
- You can now read/write more assets (JDBC Drivers, credentials, SQS configuration) and have finer control over which resources to include.
- A map of the v1 API is available here.
- The “v0” API is still available and unchanged.
- Grid Variables System
- In addition to “scalar” (single-valued) variables, you can now define grid variables to hold lists and grids of values; use them wherever a compatible list or grid of values is required.
- Grid variables can be manipulated/modified in Python.
- You can pass values for grid variables when starting a job via SQS and/or the V1 API.
- You can now disable parts of an Orchestration job.
- Improved Matching in column mappings - Many transformation component “Column Mapping” parameters can now be automatically mapped, even when the input and output column names are similar but not identical.
- You can now delete tables/views directly from the Environment tree.
- External Tables based on Amazon Redshift Spectrum now support skipping header rows.
Matillion ETL for Redshift 1.30.6
- Redshift now supports Real/Double data types
- The External Table Output component for Redshift Spectrum now has partitioning support.
- New "Copy Table to External Schema" GUI tool generates new "Table Input/External Table Output" components with full partitioning support.
- Perfect for customers who want to try out Redshift Spectrum with their existing data.
- Redshift now supports "late binding" views on input components and view creation.
- The Google BigQuery Query component now supports standard SQL.
- New Server Migration Tool makes it easy to migrate all configuration including OAuth, API Profiles and Drivers in addition to projects to a new Matillion Cluster.
- Redesigned "Scheduler" user interface to simplify the management of scheduled orchestration jobs.
- New "Task Info" panel and "Task" panel make it much easier to understand complex tasks both at run time and after job execution.
- Matillion variables can be defined and scoped at job level making jobs much more reusable. Variables can now be passed to and returned from jobs.
- New Quickbooks Online Query component to connect to the popular online accounting system.
- New Square Query component to connect to the payment system.
- New Google Custom Search component allows Google search data to be ingested.
- All data-staging components can append rows to an existing table as well as creating new tables.
Matillion ETL for Redshift 1.29.9
- Matillion ETL for Redshift introduces the ability to configure Matillion ETL in a highly available topology with a fully active-active cluster. This feature is only available on large and xlarge instance types.
- Jobs run from SQS, the API or the built-in Scheduler will now fail-over in the event of an instance failure.
- Scheduled runs missed because a server is offline will be run when it becomes available again.
- Once two or more members are in the cluster, a Cluster Info tab shows membership status and activity.
- OAuth tokens, Database Drivers and RSD Profiles are replicated via the persistence database (PostgreSQL).
- Logging from each node is sent to CloudWatch.
- CloudFormation Templates help you get started with a clustered Matillion.
- New Jira Query component loads data from Atlassian's popular Software Development Platform.
- New PayPal Query component can load payment and other data from PayPal Business accounts.
- New ServiceNow Query component loads data from ServiceNow’s IT Service Management (ITSM) platform.
- New Stripe Query component loads data from Stripe’s payment platform.
- New Email Query component can query an IMAP based email system.
- New YouTube Analytics component can query data from the YouTube Analytics API.
- Excel Query can now load files from Google Cloud Storage, as well as Amazon S3
- You only see S3 and/or GCS when you have credentials in the environment, otherwise they are hidden.
- New option to drop a schema from the Environment Tree.
- Specify a region in S3 Unload (to allow writing to buckets outside of the Redshift cluster's region)
- S3 / Google Cloud Storage file browser enhancements.
- Set advanced connection options during OAuth flow (e.g. to connect to a Salesforce Sandbox)
- Warning: Manage Backups and View Audit haven't been removed; they have been moved to the Admin menu.
- New External Table Output component - similar to Table Output but creates an Amazon Redshift Spectrum table over S3 data.
- New "Add Partition" component (Amazon Redshift Spectrum only).
- New "Delete Partition" component (Amazon Redshift Spectrum only).
- Other Amazon Redshift Spectrum components have new Table Partitioning parameters.
IMPORTANT Upgrade Notes: All data-staging components now create a target table with a wider range of target data types. Mostly this will be transparent, however if your source data contains variables with the Boolean type, these will now be Boolean in Redshift too (previously, they were varchar true/false strings). This may have an impact on downstream logic so please test jobs after upgrade.
Matillion ETL for Redshift 1.28.7
- Amazon Redshift Spectrum support. You can now run SQL queries in Redshift directly against data sets in your S3 data lake in Text, Parquet, SequenceFile and other formats. Matillion ETL 1.28 introduces first-class support for all key Redshift Spectrum features and will allow users to combine Amazon Redshift Spectrum data with regular Redshift data in transformations. These include:
- Components for creating External Tables over S3 data.
- Rewrite External Table writes Redshift data into S3 and defines an external table to reference it.
- All data-staging orchestration components (all components ending in "query") can write data to S3 and generate a compatible External Table to reference it. This will allow users to keep both small data sets and very large data sets in S3.
- Amazon Redshift Spectrum schemas and tables are displayed in the Environment Tree.
- Matillion no longer relies on creating views in Redshift to represent components.
- Post upgrade we recommend using the “Delete Views” function on the Environment to remove existing views generated by Matillion. These will not be recreated and any other v_xxxxxxxxxx_xxxxxxxxxx views can be safely, manually removed.
- A new Admin Menu allows administrators to:
- Get the server log.
- Update the Matillion server version.
- Configure users (using either an internal user database or external directory server).
- Configure SSL.
- Note: This will replace the existing /admin application. This is currently retained on upgrade but will be removed in a future update. Please use the new Admin Menu where possible.
- All transformation components support multiple outputs.
- Separate Replicate component no longer mandatory.
- Enterprise Features (these features are only enabled for users running m4.large or m4.xlarge instance types).
- Automatic Job Documentation. Matillion ETL can automatically generate documentation for your ETL process. This tool will recursively search your jobs and include all job detail including linked notes and descriptions.
- Auditing of User Actions with searchable Audit Log provides fine grained audit of every change to an ETL process.
- Ability to use Matillion ETL with an external PostgreSQL repository on RDS. Allows you to externalise all your Matillion Job and configuration data to RDS and take advantage of RDS features such as backups and point-in-time recovery. Please contact Matillion Support if you wish to take advantage of this.
- Database Query now supports IBM Netezza data warehouses via JDBC.
- S3 Server Side Encryption. Data written to S3 from any Query component, the S3 Put component or the S3 Unload component can now apply Server Side Encryption (SSE-S3 or SSE-KMS)
Matillion ETL for Redshift 1.27.4
- The Python Script component has been upgraded and now supports use of Jython (the default), Python 2 and Python 3. This is useful for customers who wish to use pip modules that are not pure python.
- New Search system. Find jobs, notes and component properties anywhere in a project via a new Search tab.
- An upgraded UI toolkit delivers a smoother, faster user experience.
- Upgrades to the Task Panel add the ability for the user to:-
- Multi-select tasks to cancel in the task panel.
- Collapse all expanded items.
- Remove all completed tasks.
- Export Jobs now allows you to multi-select a choice of jobs in the job tree and export them.
- The S3 Put component now allows the user to grab data from a self-signed HTTPS endpoint.
- Google Cloud users - Also check out new Matillion ETL for BigQuery.
Matillion ETL for Redshift 1.26.9-2
Matillion ETL for Redshift 1.26.9
- A new Youtube Query Component.
- A new EMR Load Component to make it easier to natively load EMR data sets.
- Environment Explorer Tree shows UDFs, Primary Keys, Sort Keys, Distribution Keys.
- Validation of Orchestration tasks now runs in the background and appears in the Task panel (in the same way as Transformation tasks). This is more predictable, particularly for components that take longer to validate against 3rd party APIs.
- A new Connection Manager allows you to see and control connected sessions. This will also prevent users from being locked out when they hit their connection limit.
- All data-loading components now support a "Load Options" parameter to control:-
- keeping the objects in S3 after the load completes for archive purposes.
- Turning off automatic compression analysis.
- Turning off automatic statistics gathering.
- Users (using internal security) can be added/removed without requiring a restart.
- The If component now logs the decision taken to the task panel. This will help users diagnose decision logic problems.
- Matillion now runs on Java 8.
Matillion ETL for Redshift 1.25.3
New Data Connectors
- New Text Output orchestration component simplifies export to CSV and other Text based formats.
- Similar to S3 Unload, with support for headers
- File Iterator now supports S3; you can loop over a list of files in an S3 bucket.
- KMS Encryption option in password manager allows you to use AWS managed encryption keys to encrypt passwords in Matillion ETL.
- Run Transformation / Run Orchestration components now support variable overrides to make it easier to run jobs in a reusable manner.
- Added support for the boolean data type.
- The scheduling test will check your maintenance window and warn of possible overlaps.
- Some orchestration components such as Create Table have an SQL Tab so it is easy to understand the generated SQL.
- Additional methods available on Matillion date variables will simplify using dates in variables.
- New cleaned up and simplified sample tab.
- Hundreds of other tweaks, minor improvements and bug fixes.
Matillion ETL for Redshift 1.24.6
New Data Connectors
- Delete Tables - Remove tables, such as temp tables, as part of an Orchestration.
- S3 Get Object - Get S3 Objects and push them to SFTP, HDFS and Windows File Shares.
- File Iterator - Iterate over a list of pattern matched objects in an FTP, SFTP, HDFS or Windows Fileshare.
S3 Load Generator
- This tool helps generate compatible "Create Table" and "S3 Load" components by sampling delimited data files on S3 and guessing the layout.
- Private projects can be created
- Projects have an owner who controls which other users can collaborate
- You may enable automated daily backups of the Matillion ETL for Redshift instance root volume
New Chat and Presence Features
- You can see who else is collaborating with you, and chat to them. Chats are persisted to provide context on your project
- Create Table and Fixed Flow components support additional data types (Integer, Date). More to follow.
- S3 Put Object now supports S3 as a source (in case you have ZIP files on S3 that need unpacking before loading to Redshift).
- The SQL component can now be used at the beginning of a flow.
- We now include an API profile for Matillion's API to copy the run history to Redshift. The API Query component can be used to query this data and import to Redshift.
Plus hundreds of minor improvements and bug fixes.
Matillion ETL for Redshift 1.23.6
- Updated Google Adwords driver to support latest API versions.
- Fixed an issue with environment explorer showing tables and views.
- Fixed an issue with join component validation.
Matillion ETL for Redshift 1.23.5
- The Sample tab now allows filtering to assist debugging complex transformations.
- Real-time validation of expressions in the Expression Editor
- Your syntax is checked by Redshift as you type.
- Jobs and folders in the explorer can be moved and copied in bulk.
- Improved editor windows. You can see available variables and test your code without leaving the editor when writing Python and SQL Scripts.
- Notes can now include bold, underlined and italic text, as well as hyperlinks.
- The Task History is now searchable, and opens in a separate tab.
- In the environment navigation browse the available tables, views and columns within each environment; drag and drop them into a Transformation.
- On the Table Output Component, "Analyze Compression" now supports an "If not compressed already" setting.
- Python modules can now be installed with 'pip', and the latest boto3 API is now included by default for interaction with AWS services.
- The S3 Load Component can specify an IAM Role ARN that is attached to your Redshift cluster.
- The RDS Bulk Output Component now supports output to Postgresql databases.
- The S3 Put Component can now read directly from HDFS.
Matillion ETL for Redshift 1.22.5-2
- Minor fixes for some customer issues
Matillion ETL for Redshift 1.22.5
- Please backup the instance before upgrading! https://redshiftsupport.matillion.com/customer/en/portal/articles/1991953-upgrading-matillion-etl-for-redshift?b_id=8915
- If you have version 1.21.5 and have the admin app set up you can upgrade using the Admin App.
- If for any reason the upgrade fails, restore from backup and please contact support.
- Non-blocking task queue allows users to collaborate more seamlessly without being blocked by each other's requests.
- Multiple runs of the same job will queue.
- All other runs may happen concurrently, regardless of the environment.
- New Components:-
- Load data from Hubspot with the HubSpot Query Component.
- Load OData Sources with the OData Query Component.
- Load Microsoft Excel Spreadsheets with the Excel Query component.
- Load Google AdWords data with the Google AdWords Query component.
- SFTP Put Object component will allow you to write transformed data from Redshift back to an SFTP server.
- Retry Component allows automatic retrying and backoff, which is most useful for 3rd party APIs that are not 100% reliable.
- S3 Put Object now supports copying a file from a Windows File Share.
- You can now run an orchestration job from part way through.
- Profile editor for building data profiles that describe how APIs map to tables and columns, which can then be queried from the API Query Component.
- Import/Export can now include details of Variables and Environments
- Notices/warnings/errors are now displayed on a new "Notices" tab.
- Preview API to import/export entire projects, run jobs, monitor running jobs.
- Can be used for integration to 3rd party source control management systems.
- Ask support for more details on how to get started with this
- Plus hundreds of performance improvements and minor features.
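The retry-and-backoff pattern that the Retry Component provides can be sketched in plain Python as follows. This is a generic illustration under stated assumptions, not the component's implementation: the function name `retry`, its parameters and the exponential delay schedule are all hypothetical.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(action: Callable[[], T], attempts: int = 3,
          initial_delay: float = 1.0, backoff: float = 2.0,
          sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `action` until it succeeds or `attempts` calls have failed.

    The wait between attempts starts at `initial_delay` seconds and is
    multiplied by `backoff` after every failure. The last failure is re-raised.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the original error
            sleep(delay)
            delay *= backoff
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter is a design choice that lets the waiting be replaced in tests; a caller would normally leave it at the default and wrap a single unreliable API call in `action`.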
Matillion ETL for Redshift 1.21.5
- Please backup the instance before upgrading! https://redshiftsupport.matillion.com/customer/en/portal/articles/1991953-upgrading-matillion-etl-for-redshift?b_id=8915
- If you are upgrading Matillion to 1.21.5 from a previous release, once you apply the updates and restart the application server (tomcat), the internal job repository will be re-written in a new format. Please be patient and do not restart the server (or tomcat) during this period.
- If for any reason the upgrade fails, restore from backup and please contact support.
- New Components
- Google Spreadsheets Query
- Marketo Query
- RDS Bulk Output (for Aurora, MySQL and MariaDB)
- Bash Script
- CloudWatch Publish
- Project Selection and Organisation
- Open a job directly from the project chooser
- Recently opened jobs are tracked
- Project Selection is searchable
- Create your own folder structure in the project tree
- Other Enhancements
- Centrally Managed Passwords
- Manage Users, Software Upgrades and more through a new Admin screen
- Copy/Paste settings between spreadsheets/text files and the Grid Editor
- SNS/SQS/RDS Components will offer Topics, Queues and Endpoints to choose from (if the given credentials allow it)
- Plus dozens of other minor improvements and fixes
Matillion ETL for Redshift 1.20.4
- If you upgrade an existing instance, you may get errors running Python scripts that worked previously. See here.
- If you previously committed changes made in the Python component (e.g. via cursor.connection.commit()) this is no longer necessary and may actually fail - please remove any such commits from scripts.
- Concurrent execution of Orchestration Tasks
- If your existing jobs make any assumptions about the order that components run in, be careful! Use the And component to ensure all of the components before it complete before the orchestration continues.
- Scoped Environment Variables.
- Variables can now be local or global (default). Concurrently running jobs see their own copies of local variables, but share global variables. This is useful if you re-use the same variables concurrently. For more information, see Using Variables.
- Connectors for:
- Dynamics CRM
- Google BigQuery
- Create View component
- End a transformation job with a view definition instead of a table
- Database Query now supports Teradata
- S3 Put can unpack a zip file, and place the contents onto S3
- This will use local storage on the instance temporarily
- Table Input, S3 Unload and Table Iterator can all use a View as well as a table
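The difference between local and global variables described above can be illustrated with a minimal Python sketch. This is not Matillion's implementation; the names GLOBAL_VARS, LOCAL_DEFAULTS, run_job and run_concurrently are hypothetical and exist only to show copy-per-job versus shared semantics.

```python
import copy
import threading

# Hypothetical illustration only: none of these names are part of Matillion ETL.
GLOBAL_VARS = {"run_count": 0}            # shared by every concurrently running job
LOCAL_DEFAULTS = {"current_file": None}   # each job receives its own copy
_lock = threading.Lock()


def run_job(file_name, results):
    """Simulate one job: write to a private local copy and to the shared globals."""
    local_vars = copy.deepcopy(LOCAL_DEFAULTS)  # private copy, invisible to other jobs
    local_vars["current_file"] = file_name
    with _lock:                                 # shared state needs coordination
        GLOBAL_VARS["run_count"] += 1
        results[file_name] = local_vars["current_file"]


def run_concurrently(file_names):
    """Run one simulated job per file name and return what each job saw locally."""
    results = {}
    threads = [
        threading.Thread(target=run_job, args=(name, results))
        for name in file_names
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Each simulated job sees only its own current_file value, while run_count accumulates across all jobs because it is shared.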
Matillion ETL for Redshift 1.19.4
Major new features:
- Ingest data from Twitter, Facebook and Google Analytics into Redshift
- Twitter, Facebook and Google Analytics all support OAuth authentication to keep your data secure
- New components to provide full transaction control - Begin, Commit and Rollback
- These will allow you to guarantee the consistency of your ETL jobs' output.
- Improved task cancellation will now also cancel RDS Load, Database Query and S3 Put, as well as give you a way of checking for cancellation within custom Python scripts.
- New Detect Changes component compares two input flows and detects whether rows are identical, new, deleted or changed. This component supports a number of data warehousing use cases, such as simplifying development of slowly changing dimensions.
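As an illustration of the kind of comparison Detect Changes performs, here is a minimal Python sketch that classifies rows from two inputs by key. The function name, its parameters and the label strings are assumptions made for this example; they are not the component's actual properties or output values.

```python
def detect_changes(master, second, key, compare):
    """Classify rows by comparing two lists of dict rows on a key column.

    Hypothetical helper, not Matillion's implementation. Returns a dict that
    maps each key value to 'identical', 'changed', 'deleted' or 'new'.
    """
    master_by_key = {row[key]: row for row in master}
    second_by_key = {row[key]: row for row in second}

    result = {}
    for k, row in master_by_key.items():
        other = second_by_key.get(k)
        if other is None:
            result[k] = "deleted"       # present in master only
        elif all(row[c] == other[c] for c in compare):
            result[k] = "identical"     # all compared columns match
        else:
            result[k] = "changed"       # key matches, a compared column differs
    for k in second_by_key:
        if k not in master_by_key:
            result[k] = "new"           # present in second input only
    return result
```

For a slowly changing dimension, rows marked "changed" would typically close the current dimension record and open a new one, while "new" rows are simply inserted.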
Minor features and fixes:
- Support for EXPLICIT_IDS, which allows you to override ID fields when ingesting data from S3.
- ALL incoming SQS messages get a response to the success or fail queue, even if the project/version/job was not found
Plus over a hundred other tweaks and improvements.
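For context on the EXPLICIT_IDS item above: EXPLICIT_IDS is an option of the Redshift COPY command that loads source values into identity columns instead of auto-generating them. The sketch below only assembles the statement text; the helper name and its arguments are hypothetical, and it does not show how Matillion ETL generates its own SQL.

```python
def build_copy_statement(table, s3_url, iam_role, explicit_ids=False):
    """Assemble a Redshift COPY statement as text (hypothetical helper).

    Values are interpolated directly, so this is only suitable for trusted,
    hard-coded inputs in an example like this one.
    """
    parts = [
        f"COPY {table}",
        f"FROM '{s3_url}'",
        f"IAM_ROLE '{iam_role}'",
    ]
    if explicit_ids:
        # Keep the ID values found in the source data rather than
        # letting the identity column generate new ones.
        parts.append("EXPLICIT_IDS")
    return "\n".join(parts) + ";"
```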
Matillion ETL for Redshift 1.18.5
- S3 Manifest File Writer - This component adds all S3 objects matching a regular expression to a manifest file, ready to use in the S3 Load component.
- Transpose Rows - Aggregate data into delimited lists, a.k.a. List Aggregate
- If Component - Evaluates variables in an Orchestration job to conditionally execute parts of the orchestration flow
- Create Table now allows specification of identity (auto-increment) columns, primary keys and sort-key style
- Components that create tables (Create Table, Rewrite Table, RDS Load, Database Load) can now specify whether the sort key is Compound or Interleaved.
- S3 Load now supports the AVRO file format.
- S3 Load and S3 Unload now support a master symmetric key for client-side encryption.
- Aggregate component now supports an Approximate Count
- Many editors now support syntax-highlighting and auto-completion, including the Expression Editor.
- User Defined Functions are now listed in the Expression Editor along with all the built in Redshift functions
- Enable SSL between your AMI and the Redshift Cluster
- Each component can export runtime information into a user-defined variable. That variable can then be used in other components, including the new If component, to direct the flow of a job.
- Support for DB2 in the Database Query component (subject to uploading your own driver)
- Dozens of minor enhancements and bug fixes to components, memory management, performance and documentation
- Revalidation may be required before new component properties appear
- A forced refresh of the page may be required - this is usually [Ctrl-F5] in modern browsers.
- You will not be able to 'undo' changes to before the time of upgrade.
- After doing the upgrade, restarting tomcat and force-reloading your browser, if you have any issues please contact support: https://redshiftsupport.matillion.com/customer/portal/articles/2282577-getting-support
Matillion ETL for Redshift 1.17.2
- You can reference input/output data from other schemas in your Redshift database.
- Iterate the values of a table, a fixed list of values, or an integer sequence.
- S3 Put Component - Upload data from HTTP/HTTPS/FTP/SFTP to S3 as part of orchestration.
- Python Component - Run simple python scripts to interact with your AWS infrastructure.
- Manifest support in the S3 loader
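For reference, a Redshift COPY manifest is a JSON document with an "entries" list, where each entry gives the S3 URL of one object to load and a "mandatory" flag. The following is a minimal sketch that builds such a document from a list of object keys filtered by a regular expression; the function name, bucket and keys are hypothetical, and listing the bucket or uploading the manifest is left out.

```python
import json
import re


def build_manifest(bucket, keys, pattern):
    """Return manifest JSON for the keys that match the given regular expression.

    Hypothetical helper: `keys` is a plain list of object keys supplied by the
    caller, so no AWS calls are made here.
    """
    regex = re.compile(pattern)
    entries = [
        {"url": f"s3://{bucket}/{key}", "mandatory": True}
        for key in keys
        if regex.search(key)   # keep only the objects matching the pattern
    ]
    return json.dumps({"entries": entries}, indent=2)
```

Setting "mandatory" to true makes the load fail if a listed object is missing, which is usually the safer default for scheduled loads.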
Matillion ETL for Redshift 1.16.2
- SQS Queue Integration (https://redshiftsupport.matillion.com/customer/en/portal/articles/2144265-integration-with-amazon-sqs?b_id=8915)
- Orchestration Jobs can now be nested
- You can now specify column encoding in Create Table, or automatically analyze and compress columns on Table Output.
- Cancel Task
- SQS Message - post a message to an SQS Queue.
- SNS Message - post a message to an SNS Topic.
- Schema Copy - copy a number of tables from one schema to another in a transaction.
- Database Query - Read Data from JDBC sources.
Matillion ETL for Redshift 1.15.2
- Job scheduler. Launch orchestration jobs on a regular basis and configure scheduled data transformations.
- UI Improvements including Snap-to-Grid, Zoom
- RDS Query - Read data from RDS sources.
- SNS Message - post a message to an SNS Topic.
- Analyze - Initiate a Redshift Analyze on a table
- Vacuum - Initiate a Redshift Vacuum on a table
- Truncate - Truncate a table
Matillion ETL for Redshift 1.14.5
- New S3 Load/Unload components to help get data into and out of Redshift
- Ability to manually define AWS credentials after AMI Launch
- Internal caching makes running the same jobs repeatedly much faster.
- New components can (i) update table rows, (ii) delete table rows, (iii) split a field on a delimiter
- 72 other bug fixes and minor improvements