API Extract

    Article Summary

    This article is specific to the following platforms: Snowflake, Redshift, and BigQuery.

    API Extract

    The API Extract component lets users build a custom Matillion ETL connector that extracts data from an API of their choice and loads it into a target table, ready for transformation.

    This component may return structured data that requires flattening. For help with flattening such data, we recommend using the Nested Data Load Component for Amazon Redshift and the Extract Nested Data Component for Snowflake or Google BigQuery.

    The API Extract component requires at least one configured Extract Profile. Read our Manage Extract Profiles guide for details on setting up an extract profile, including how to add a new endpoint. When adding an endpoint in the wizard, users can choose to authenticate with either a username and password or an API token. Enabling authentication makes the corresponding properties available (either Username and Password, or API token).
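
    As a rough illustration of the two authentication options above, the sketch below builds the standard HTTP Authorization headers for a username/password pair (Basic) and for an API token (Bearer). The function names are hypothetical, and this is only an assumption about how such credentials are typically sent over HTTP, not a description of what Matillion ETL does internally.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict:
    """Build a standard HTTP Basic Authorization header (hypothetical helper)."""
    # Basic auth encodes "username:password" as base64.
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def bearer_auth_header(api_token: str) -> dict:
    """Build a standard HTTP Bearer Authorization header (hypothetical helper)."""
    return {"Authorization": f"Bearer {api_token}"}
```

    Either dictionary could then be merged into the headers of a request made with an HTTP client of your choice.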

    The Manage Extract Profiles wizard can only send and receive JSON objects. Other formats, such as XML, are not supported.

    For guidance when using variables with API profiles, read Using Variables with Parameters.

    An error will occur if an endpoint name is any one of the following:

    • datasourcelists
    • environments
    • versionlists
    • versions

    Avoid using any of these names for an endpoint. Users specify the endpoint's name on page 1 of the Configure Extract Connector wizard. To learn more about this wizard, read Manage Extract Profiles.
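
    If endpoint names are generated or reviewed outside the wizard, a small pre-check along the following lines may help catch the reserved names listed above before they cause an error. This is a minimal sketch: the function name is hypothetical, and the case-insensitive comparison is an assumption, since the documentation does not state whether the check is case-sensitive.

```python
# Endpoint names that cause an error, as listed above.
RESERVED_ENDPOINT_NAMES = frozenset(
    {"datasourcelists", "environments", "versionlists", "versions"}
)


def is_reserved_endpoint_name(name: str) -> bool:
    """Return True if the proposed endpoint name is one of the reserved names."""
    # Assumption: compare case-insensitively and ignore surrounding whitespace.
    return name.strip().lower() in RESERVED_ENDPOINT_NAMES
```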


    Properties

    Snowflake Properties

    Property | Setting | Description
    Name | String | A human-readable name for the component.
    API | Select | Select the API extract profile. To manage extract profiles, click Project → Manage Extract Profiles. Read our Manage Extract Profiles guide for more information.
    Data Source | Select | Select the data source.
    URI Params | Parameter Name | Any parameters that are configured for this endpoint in the wizard will be displayed here. URI parameters cannot be set as constants. Constants will not appear here. Users can toggle the Text Mode checkbox to navigate between grid mode and text mode. Can use Grid Variables.
    URI Params | Parameter Value | Specify the values for any added parameters. Can use Grid, Job, and Environment Variables.
    Query Params | Parameter Name | Specify any query parameters. Any parameters that are configured for this endpoint in the wizard will be displayed here. Constants will not appear here. Users can toggle the Text Mode checkbox to navigate between grid mode and text mode. Can use Grid, Job, and Environment Variables.
    Query Params | Parameter Value | Specify the values for any added parameters. Can use Grid, Job, and Environment Variables.
    Header Params | Parameter Name | Specify any header parameters. Any parameters that are configured for this endpoint in the wizard will be displayed here. Constants will not appear here. Users can toggle the Text Mode checkbox to navigate between grid mode and text mode. Can use Grid, Job, and Environment Variables.
    Header Params | Parameter Value | Specify the values for any added parameters. Can use Grid, Job, and Environment Variables.
    Post Body | JSON | A request body for the post.
    User | String | The username used to authenticate the endpoint. Only available when the Extract Profile's Auth type is set to Basic Auth.
    Password | String | The password used to authenticate the endpoint. Only available when the Extract Profile's Auth type is set to Basic Auth.
    Bearer Token | String | The API bearer token used to authenticate the endpoint. Only available when the Extract Profile's Auth type is set to Bearer Token.
    OAuth | Select | Select an OAuth entry to authenticate this component. An OAuth entry must be set up in advance. For more information, read Manage OAuth. Only available when the Extract Profile's Auth type is set to OAuth.
    Page Limit | Integer | Specify the maximum number of pages to stage.
    Location | Storage Location | Provide an S3 bucket path (AWS only), GCS bucket path (GCP only), or Azure Blob Storage path (Azure only) that will be used to store the data. A folder will be created at this location with the same name as the target table.
    Integration | Select | Choose your Google Cloud Storage Integration. Integrations are required to permit Snowflake to read data from and write to a Google Cloud Storage bucket. Integrations must be set up in advance of selecting them in Matillion ETL. To learn more about setting up a storage integration, read our Storage Integration Setup Guide.
    Warehouse | Select | Select the Snowflake warehouse. The special value, [Environment Default], will use the warehouse defined in the Matillion ETL environment. For more information, read Virtual Warehouses.
    Database | Select | Select the Snowflake database. The special value, [Environment Default], will use the database defined in the Matillion ETL environment. For more information, read Databases, Tables, & Views.
    Schema | Select | Select the Snowflake schema. The special value, [Environment Default], will use the schema defined in the Matillion ETL environment. For more information, read Database, Schema, & Share DDL.
    Target Table | String | Specify the table to be used. Warning: This table will be recreated and any existing table of the same name will be dropped.

    Redshift Properties

    Property | Setting | Description
    Name | String | A human-readable name for the component.
    API | Select | Select the API extract profile. To manage extract profiles, click Project → Manage Extract Profiles. Read our Manage Extract Profiles guide for more information.
    Data Source | Select | Select the data source. The Data Source property is only displayed when there is more than one data source (endpoint) configured for the selected API profile. If there is only one endpoint configured for this API profile, then the Data Source property is automatically configured and hidden.
    URI Params | Parameter Name | Any parameters that are configured for this endpoint in the wizard will be displayed here. URI parameters cannot be set as constants. Constants will not appear here. Users can toggle the Text Mode checkbox to navigate between grid mode and text mode. Can use Grid Variables.
    URI Params | Parameter Value | Specify the values for any added parameters. Can use Grid, Job, and Environment Variables.
    Query Params | Parameter Name | Specify any query parameters. Any parameters that are configured for this endpoint in the wizard will be displayed here. Constants will not appear here. Users can toggle the Text Mode checkbox to navigate between grid mode and text mode. Can use Grid, Job, and Environment Variables.
    Query Params | Parameter Value | Specify the values for any added parameters. Can use Grid, Job, and Environment Variables.
    Header Params | Parameter Name | Specify any header parameters. Any parameters that are configured for this endpoint in the wizard will be displayed here. Constants will not appear here. Users can toggle the Text Mode checkbox to navigate between grid mode and text mode. Can use Grid, Job, and Environment Variables.
    Header Params | Parameter Value | Specify the values for any added parameters. Can use Grid, Job, and Environment Variables.
    Post Body | JSON | A request body for the post.
    User | String | The username used to authenticate the endpoint. Only available when the Extract Profile's Auth type is set to Basic Auth.
    Password | String | The password used to authenticate the endpoint. Only available when the Extract Profile's Auth type is set to Basic Auth.
    Bearer Token | String | The API bearer token used to authenticate the endpoint. Only available when the Extract Profile's Auth type is set to Bearer Token.
    OAuth | Select | Select an OAuth entry to authenticate this component. An OAuth entry must be set up in advance. For more information, read Manage OAuth. Only available when the Extract Profile's Auth type is set to OAuth.
    Page Limit | Integer | Specify the maximum number of pages to stage.
    Location | File Structure or String | Specify the S3 bucket. Users can click through the tree structure to locate the preferred S3 bucket, or specify the URL of the S3 bucket in the URL field, following the template: s3://<bucket>/<path>
    Type | Dropdown | Select between a standard table and an external table.
    Standard Schema | Dropdown | Select the Redshift schema. The special value, [Environment Default], will use the schema defined in the Matillion ETL environment.
    External Schema | Select | Select the table's external schema. To learn more about external schema, please consult the Configuring The Matillion ETL Client section of the Getting Started With Amazon Redshift Spectrum documentation. For more information on using multiple schema, see Schema Support.
    Target Table | String | Specify the external table to be used. Warning: This table will be recreated and any existing table of the same name will be dropped.

    BigQuery Properties

    Property | Setting | Description
    Name | String | A human-readable name for the component.
    API | Select | Select the API extract profile. To manage extract profiles, click Project → Manage Extract Profiles. Read our Manage Extract Profiles guide for more information.
    Data Source | Select | Select the data source.
    URI Params | Parameter Name | Any parameters that are configured for this endpoint in the wizard will be displayed here. URI parameters cannot be set as constants. Constants will not appear here. Users can toggle the Text Mode checkbox to navigate between grid mode and text mode. Can use Grid Variables.
    URI Params | Parameter Value | Specify the values for any added parameters. Can use Grid, Job, and Environment Variables.
    Query Params | Parameter Name | Specify any query parameters. Any parameters that are configured for this endpoint in the wizard will be displayed here. Constants will not appear here. Users can toggle the Text Mode checkbox to navigate between grid mode and text mode. Can use Grid, Job, and Environment Variables.
    Query Params | Parameter Value | Specify the values for any added parameters. Can use Grid, Job, and Environment Variables.
    Header Params | Parameter Name | Specify any header parameters. Any parameters that are configured for this endpoint in the wizard will be displayed here. Constants will not appear here. Users can toggle the Text Mode checkbox to navigate between grid mode and text mode. Can use Grid, Job, and Environment Variables.
    Header Params | Parameter Value | Specify the values for any added parameters. Can use Grid, Job, and Environment Variables.
    Post Body | JSON | A request body for the post.
    User | String | The username used to authenticate the endpoint. Only available when the Extract Profile's Auth type is set to Basic Auth.
    Password | String | The password used to authenticate the endpoint. Only available when the Extract Profile's Auth type is set to Basic Auth.
    Bearer Token | String | The API bearer token used to authenticate the endpoint. Only available when the Extract Profile's Auth type is set to Bearer Token.
    OAuth | Select | Select an OAuth entry to authenticate this component. An OAuth entry must be set up in advance. For more information, read Manage OAuth. Only available when the Extract Profile's Auth type is set to OAuth.
    Page Limit | Integer | Specify the maximum number of pages to stage.
    Table Type | Select | Select whether the table is Native (the default in BigQuery) or External.
    Project | Select | Select the Google BigQuery project. The special value, [Environment Default], will use the project defined in the environment. For more information, refer to the BigQuery documentation.
    Dataset | Select | Select the Google BigQuery dataset to load data into. The special value, [Environment Default], will use the dataset defined in the environment. For more information, refer to the BigQuery documentation.
    Target Table | String | A name for the table. Warning: This table will be recreated and any existing table of the same name will be dropped. Only available when the table type is Native.
    New Target Table | String | A name for the new external table. Only available when the table type is External.
    Cloud Storage Staging Area | Cloud Storage Bucket | Specify the target Google Cloud Storage bucket to be used for staging the queried data. Users can either (1) input the URL string of the Cloud Storage bucket following the template provided: gs://<bucket>/<path>, or (2) navigate through the file structure to select the target bucket. Only available when the table type is Native.
    Location | Cloud Storage Bucket | Specify the target Google Cloud Storage bucket to be used for staging the queried data. Users can either (1) input the URL string of the Cloud Storage bucket following the template provided: gs://<bucket>/<path>, or (2) navigate through the file structure to select the target bucket. Only available when the table type is External.
    Load Options | Multiple Select | Clean Cloud Storage Files: Destroy staged files on Cloud Storage after loading data. Default is On. Cloud Storage File Prefix: Give staged file names a prefix of your choice. The default setting is an empty field. Recreate Target Table: Choose whether the component recreates its target table before the data load. If Off, the component will use an existing table or create one if it does not exist. Default is On. Use Grid Variable: Check this checkbox to use a grid variable. This box is unchecked by default.
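
    The Location and staging properties above expect bucket URLs that follow the templates s3://<bucket>/<path> and gs://<bucket>/<path>. The sketch below shows one way to check a URL against those templates before entering it. The function name is hypothetical, the check is an illustration rather than the validation Matillion ETL itself performs, and Azure Blob Storage paths are not covered because no template for them is given here.

```python
from urllib.parse import urlparse


def parse_bucket_url(url: str) -> tuple:
    """Split 's3://<bucket>/<path>' or 'gs://<bucket>/<path>' into its parts.

    Returns (scheme, bucket, path). Hypothetical helper for illustration only.
    """
    parsed = urlparse(url)
    if parsed.scheme not in ("s3", "gs"):
        raise ValueError(f"unsupported scheme: {parsed.scheme!r}")
    if not parsed.netloc:
        raise ValueError("missing bucket name")
    # The path component keeps its leading slash; strip it for readability.
    return parsed.scheme, parsed.netloc, parsed.path.lstrip("/")
```

    For example, parse_bucket_url("gs://my-bucket/staging") separates the bucket name my-bucket from the path staging.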


    Using Variables with Parameters

    The API Extract component supports the use of variables with parameters.

    Grid Variables in Matillion ETL can now be used for both the Parameter Name and Parameter Value in the following properties:

    • URI Params
    • Query Params
    • Header Params

    Job Variables and Environment Variables can be used for the Parameter Value (but not Parameter Name) in the following property:

    • URI Params

    Job Variables and Environment Variables can be used for the Parameter Name and/or Parameter Value in the following properties:

    • Query Params
    • Header Params
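
    The rules above can be summarised as a small lookup table. The sketch below is only a restatement of those rules in code; the dictionary and function names are hypothetical and are not part of Matillion ETL.

```python
# Variable kinds permitted for each (property, field), restating the rules above.
ALLOWED_VARIABLES = {
    ("URI Params", "Parameter Name"): {"grid"},
    ("URI Params", "Parameter Value"): {"grid", "job", "environment"},
    ("Query Params", "Parameter Name"): {"grid", "job", "environment"},
    ("Query Params", "Parameter Value"): {"grid", "job", "environment"},
    ("Header Params", "Parameter Name"): {"grid", "job", "environment"},
    ("Header Params", "Parameter Value"): {"grid", "job", "environment"},
}


def can_use_variable(kind: str, prop: str, field: str) -> bool:
    """Return True if a variable of this kind may be used in the given field."""
    return kind in ALLOWED_VARIABLES.get((prop, field), set())
```

    The only restriction is that Job and Environment Variables cannot supply the Parameter Name of a URI parameter.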

    Users, therefore, can set up a Job or Environment Variable in the format ${variableName} in place of the Parameter Value for a URI, Header, or Query parameter within API Extract.

    Please Note

    Variable names are not replaced with their literal values at validation; the variable name continues to be displayed in the component. At runtime, the value the variable holds at that time is used.

    If the default value for the variable is empty, the component will report a validation error.
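
    To make the ${variableName} behaviour more concrete, the sketch below substitutes variable references in a parameter value and rejects empty values, mirroring the validation rule described above. It is a simplified model under stated assumptions: the function name, the accepted variable-name pattern, and the error type are all hypothetical, and Matillion ETL's own resolution logic may differ.

```python
import re

# Matches ${variableName}; the allowed name characters are an assumption.
_VARIABLE = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve_parameter_value(value: str, variables: dict) -> str:
    """Replace each ${name} in value with its current value from variables."""

    def replace(match):
        name = match.group(1)
        resolved = variables.get(name, "")
        # Mirror the documented rule: an empty value is treated as an error.
        if resolved == "":
            raise ValueError(f"variable {name!r} is empty or undefined")
        return resolved

    return _VARIABLE.sub(replace, value)
```

    Text without any ${...} reference is returned unchanged, so literal parameter values pass through as they are.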

    For more information about variables, please read the following: