Using incron to automatically copy data to S3

    Overview

    This is not strictly related to Matillion ETL; however, getting data into S3 easily from an on-premises or other system is a common task. This approach is especially useful in a micro-batching scenario.

    The instructions below should be enough to get you started with incron; however, it is a powerful tool capable of much more than this article can cover.

    Installation

    To begin, incron must be installed. This daemon watches a directory for changes and runs a command when they occur. To install it on Amazon Linux, run:

    sudo yum install incron

    Once installed, delete the /etc/incron.allow file if it exists. This file is used to whitelist the users who can use the tool.

    sudo rm /etc/incron.allow

    Alternatively, keep the file and whitelist your user in it.
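
    As a rough sketch of the whitelisting alternative, the helper below appends a user name to the allow file only if it is not already listed. It assumes the file holds one user name per line (the same convention as cron.allow). The function name allow_user is made up for this illustration, and the optional second argument exists only so the helper can be tried against a scratch file.

```shell
#!/bin/sh
# Illustrative helper, not part of incron itself.
# Assumption: the allow file lists one permitted user name per line.
# Usage: allow_user <user> [allow_file]
allow_user() {
    user="$1"
    file="${2:-/etc/incron.allow}"
    # -q quiet, -x match the whole line, -F treat the name as a fixed string
    if ! grep -qxF -- "$user" "$file" 2>/dev/null; then
        printf '%s\n' "$user" >> "$file"
    fi
}
```

    Writing to /etc/incron.allow normally requires root, so the function would be called from a root shell, for example: allow_user ed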

    Setting up aws-cli

    This assumes that the aws-cli is already installed, as it is on an Amazon Linux machine. If not, use the instructions here. Run:
    aws configure

    Follow the on-screen instructions. Once the aws-cli is configured, create the incron entry that will copy your file. Run:

    incrontab -e

    This opens your incron table for editing. Add a line like so:

    <path to local watched directory> IN_CLOSE_WRITE /usr/local/bin/aws s3 cp $@/$# s3://<bucket name><prefix path>

    Example:

    /home/ed/lntest IN_CLOSE_WRITE /usr/local/bin/aws s3 cp $@/$# s3://matillion/delme/
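
    To make the two wildcards concrete: incron replaces $@ with the watched directory and $# with the name of the file that triggered the event. The sketch below only imitates that substitution with sed so the resulting command can be inspected. The function expand_incron_command and the file name data.csv are invented for the illustration, and this is not how incron works internally.

```shell
#!/bin/sh
# Imitate incron's wildcard expansion for one event (illustration only).
# $1 = watched directory, $2 = file name from the event, $3 = command template
# Limitation: values containing '|', '&' or backslashes would confuse sed.
expand_incron_command() {
    printf '%s\n' "$3" | sed -e 's|\$@|'"$1"'|g' -e 's|\$#|'"$2"'|g'
}

expand_incron_command /home/ed/lntest data.csv \
    '/usr/local/bin/aws s3 cp $@/$# s3://matillion/delme/'
# prints: /usr/local/bin/aws s3 cp /home/ed/lntest/data.csv s3://matillion/delme/
```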

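    To test, add a file to the watched directory; it should appear in S3 shortly afterwards. If nothing happens at all, check that the incrond service is running. If anything goes wrong, the error is written to the system log: /var/log/syslog, or on Red Hat-style systems such as Amazon Linux, typically /var/log/cron or /var/log/messages.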
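
    Because errors only surface in the system log, one option is to point the incrontab entry at a small wrapper script instead of calling aws directly. The following is a minimal sketch under stated assumptions: the log path /tmp/incron-s3.log and the variable names AWS_BIN, DEST and LOG_FILE are hypothetical, and the destination simply reuses the example bucket from this article.

```shell
#!/bin/sh
# Hypothetical wrapper for incron: copy one file to S3 and log the result.
# All three defaults are assumptions for this sketch; adjust to your system.
AWS_BIN="${AWS_BIN:-/usr/local/bin/aws}"
DEST="${DEST:-s3://matillion/delme/}"
LOG_FILE="${LOG_FILE:-/tmp/incron-s3.log}"

upload_file() {
    src="$1"
    # Capture the CLI's own output in the log as well.
    if "$AWS_BIN" s3 cp "$src" "$DEST" >> "$LOG_FILE" 2>&1; then
        echo "$(date) OK $src" >> "$LOG_FILE"
    else
        status=$?
        echo "$(date) FAILED ($status) $src" >> "$LOG_FILE"
        return "$status"
    fi
}

# When the incrontab entry passes $@/$#, the file path arrives as $1.
if [ "$#" -gt 0 ]; then
    upload_file "$1"
fi
```

    With the script saved under a name of your choosing, say /usr/local/bin/upload_to_s3.sh (a hypothetical path), and made executable, the incrontab line would become:

    /home/ed/lntest IN_CLOSE_WRITE /usr/local/bin/upload_to_s3.sh $@/$#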

    The full incron documentation is worth a read and is available here.