Snowflake Connection: Deep Dive

This article provides deeper technical details about authenticating, configuring, and running the Snowflake Connection.

Follow the steps in the primary connection article to authenticate and configure the connection. Use the information on this page to inform your decisions and to understand how the connection works under the hood.

Snowflake Connection

For instructions on configuring and running the Snowflake Connection, see the main Snowflake Connection page.

Authentication for the Snowflake Connection

On the connection's Set up and run page, enter your Snowflake credentials and your Amazon S3 or Google Cloud Storage credentials to authenticate the connection with those two systems. This section contains tips for authenticating with both systems.

[Image: Snowflake authentication]

Snowflake authentication

  • You’ll need your Snowflake account name, username, and password.

  • You can select a storage provider, either Google Cloud Storage or Amazon Web Services (S3).

    [Image: BlueConic Snowflake authentication options]
  • The Snowflake user can be a regular business user with sufficient access, but consider creating a dedicated user or role for external integrations with Snowflake, so that the connection does not need to be reconfigured if a specific user account is removed.

  • These authentication details are used to retrieve metadata on databases, schemas, and tables, and to run the SQL commands that import and export data.

  • For all databases that will be accessed via the BlueConic Snowflake Connection, the Snowflake account must have the following grants (see the example after this list):

    • SELECT

    • INSERT

    • UPDATE

    • DELETE

    • CREATE STAGE

    • CREATE FILE FORMAT

    • CREATE OR REPLACE FILE FORMAT
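
For reference, these grants could be assigned with statements along the following lines. This is only a sketch: the role name BLUECONIC_ROLE and the database and schema names ANALYTICS and PUBLIC are placeholders, and the USAGE grants shown are a common prerequisite rather than part of the list above.

    -- Placeholder role, database, and schema names; grant to the role used by the connection's Snowflake user.
    GRANT USAGE ON DATABASE ANALYTICS TO ROLE BLUECONIC_ROLE;
    GRANT USAGE ON SCHEMA ANALYTICS.PUBLIC TO ROLE BLUECONIC_ROLE;
    -- Table-level privileges used for import and export:
    GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA ANALYTICS.PUBLIC TO ROLE BLUECONIC_ROLE;
    -- Schema-level privileges needed to create stages and file formats:
    GRANT CREATE STAGE, CREATE FILE FORMAT ON SCHEMA ANALYTICS.PUBLIC TO ROLE BLUECONIC_ROLE;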

Google Cloud Storage authentication

  • If you choose Google Cloud Storage as the storage provider for the data exchange, the Snowflake Connection requires your Google Cloud Storage credentials: the client ID, client secret, project ID, and the bucket used to retrieve and store files.

  • You must be connected to your Google Cloud Storage account first.

Amazon Web Services (S3) authentication

  • The Snowflake Connection requires your S3 key, secret key, and bucket name.

  • The AWS IAM user can be a regular business user with sufficient access, but it is also common to create a dedicated user for external integrations with AWS, so that the connection does not need to be reconfigured if a specific user account is removed.

  • These credentials are used to stage CSV files in S3 for both import and export; a sketch of how Snowflake uses them appears after the list below.

  • IAM access should allow the following actions for the bucket where the files are staged:

    • s3:PutObject

    • s3:GetObject

    • s3:GetObjectVersion

    • s3:DeleteObject

    • s3:DeleteObjectVersion
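
To illustrate how the S3 credentials and bucket come together, the statements below sketch the kind of file format and external stage the connection creates in Snowflake; the actual templates appear in the Snowflake queries section further down. The object names, bucket URL, and compression value are placeholders.

    -- Placeholder names and values; the connection generates its own.
    CREATE OR REPLACE FILE FORMAT BLUECONIC_CSV_FORMAT
      TYPE = 'CSV' COMPRESSION = 'GZIP' FIELD_DELIMITER = ','
      FIELD_OPTIONALLY_ENCLOSED_BY = '0x22' NULL_IF = ('', ' ') SKIP_HEADER = 1;
    CREATE OR REPLACE STAGE BLUECONIC_S3_STAGE
      URL = 's3://your-bucket/blueconic/'
      CREDENTIALS = (AWS_KEY_ID = '<your S3 key>' AWS_SECRET_KEY = '<your secret key>')
      FILE_FORMAT = BLUECONIC_CSV_FORMAT;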

Importing data from Snowflake to BlueConic

Follow the configuration steps outlined in the Snowflake Connection page to set up your import goals. The image below shows the execution flow for importing data from Snowflake to BlueConic via Amazon Web Services (S3), but you can also use Google Cloud Storage.

[Image: Snowflake import batch flow]
  1. The connection authenticates with Snowflake and with Amazon S3.

  2. It executes a COPY INTO SQL statement that maps the columns configured in the connection to CSV columns, so Snowflake generates CSV staging files (see the sketch after this list). Note: the entire table is copied with this command; no deltas are available.

  3. It downloads the CSV stage files from S3 to BlueConic. This is done to prevent timeouts: processing the CSV data to update profiles takes longer than downloading it, so streaming the files directly from S3 would likely hit timeouts for large files.

  4. The connection processes the CSV file based on the configuration set up in the connection (data processors, mappings, etc.), just as any other CSV import connection would.

  5. Profiles are updated or created with Snowflake data.
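
The COPY INTO statement from step 2 unloads the table into the S3 (or Google Cloud Storage) stage as CSV. A minimal sketch, assuming a hypothetical table CUSTOMERS with mapped columns CUSTOMER_ID and EMAIL and the stage from the authentication example above:

    -- Unload the mapped columns to a CSV file in the stage.
    -- Table, column, stage, and file names are placeholders.
    COPY INTO @BLUECONIC_S3_STAGE/import_batch.csv
      FROM (SELECT "CUSTOMER_ID", "EMAIL" FROM "CUSTOMERS")
      OVERWRITE = TRUE HEADER = TRUE MAX_FILE_SIZE = 268435456;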

Execution flow for exporting via the Snowflake Connection

Follow the standard configuration steps in the Snowflake Connection article to set up your data export. The steps below show the execution flow for exporting data from BlueConic to Snowflake.

  1. The connection reads the configuration from BlueConic, authenticates with Snowflake and AWS, and loads profiles from the database based on the segment selected.

  2. If they do not already exist, the connection creates the Snowflake database, schema, and table for the export, using "CREATE TABLE IF NOT EXISTS"-style statements.

  3. The connection creates an Amazon S3 (or Google Cloud Storage) staging area from which Snowflake will import the CSV file exported from BlueConic.

  4. The connection generates the CSV based on the mappings configured in the connection and uploads it to an S3 staging area.

  5. The connection creates a temp table in Snowflake into which the file data is loaded.

  6. The connection runs SQL to load the CSV file into the temporary Snowflake table based on the configured mappings, then merges the data from the temporary table into the target Snowflake table (see the sketch below).

The image below shows the execution flow for exporting data from BlueConic to Snowflake via Amazon Web Services (S3), but you can also use Google Cloud Storage.

[Image: Snowflake batch export flow]
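
The load and merge from steps 5 and 6 could look roughly like the following. This is a sketch only: the table names, columns, stage, and file name are placeholders, and the actual statements are generated from the connection's mappings (see the Snowflake queries section below).

    -- Temporary table matching the configured mappings (placeholder columns):
    CREATE TEMPORARY TABLE "BLUECONIC_EXPORT_TMP" ("PROFILE_ID" VARCHAR, "EMAIL" VARCHAR);
    -- Load the staged CSV file into the temporary table:
    COPY INTO "BLUECONIC_EXPORT_TMP" ("PROFILE_ID", "EMAIL")
      FROM @BLUECONIC_S3_STAGE/export_batch.csv ON_ERROR = CONTINUE;
    -- Merge the temporary table into the target table on the configured identifier:
    MERGE INTO "BLUECONIC_PROFILES" USING "BLUECONIC_EXPORT_TMP"
      ON "BLUECONIC_PROFILES"."PROFILE_ID" = "BLUECONIC_EXPORT_TMP"."PROFILE_ID"
      WHEN MATCHED THEN UPDATE SET "EMAIL" = "BLUECONIC_EXPORT_TMP"."EMAIL"
      WHEN NOT MATCHED THEN INSERT ("PROFILE_ID", "EMAIL")
        VALUES ("BLUECONIC_EXPORT_TMP"."PROFILE_ID", "BLUECONIC_EXPORT_TMP"."EMAIL");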

Snowflake queries

The BlueConic Snowflake Connection uses the following query templates (a filled-in example follows the list):

  • USE DATABASE "${database}"

  • USE SCHEMA ${schema}

  • SHOW GRANTS TO USER "${username}"

  • SHOW DATABASES

  • SHOW SCHEMAS IN ${database}

  • SHOW TABLES

  • DESCRIBE TABLE "${table}"

  • CREATE DATABASE IF NOT EXISTS "${database}"

  • CREATE SCHEMA IF NOT EXISTS "${schema}"

  • CREATE TABLE IF NOT EXISTS "${table}"

  • CREATE TEMPORARY TABLE "${temporaryTableName}" + column map from config

  • ALTER TABLE "${table}" ADD COLUMN ${columnStatement}

  • CREATE OR REPLACE FILE FORMAT ${formatName} TYPE = 'CSV' COMPRESSION = ${compression} FIELD_DELIMITER = ',' FIELD_OPTIONALLY_ENCLOSED_BY ='0x22' NULL_IF = ('', ' ') SKIP_HEADER = 1

  • CREATE OR REPLACE STAGE ${stageName} url=? CREDENTIALS = (AWS_KEY_ID = ? AWS_SECRET_KEY = ?) FILE_FORMAT = ${formatName}

  • COPY INTO "${table}" ${fields} FROM ${stagepath} ON_ERROR = ${CONTINUE}

  • MERGE INTO "${targetTable}" USING "${sourceTable}" ON ${joinStatement} etc...

  • COPY INTO @${S3_STAGE_NAME_COMPRESSED}/${filename} FROM (SELECT ${columnStatement} from "${table}") OVERWRITE = TRUE HEADER = TRUE MAX_FILE_SIZE = ${maxFileSize}
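
In these templates, ${...} denotes a value filled in from the connection configuration or credentials. For example, with a hypothetical database ANALYTICS, schema PUBLIC, and table CUSTOMERS, the metadata queries expand to something like:

    -- Placeholder database, schema, and table names.
    USE DATABASE "ANALYTICS";
    USE SCHEMA PUBLIC;
    SHOW SCHEMAS IN ANALYTICS;
    SHOW TABLES;
    DESCRIBE TABLE "CUSTOMERS";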

Tips for using the Snowflake Connection

  • The Snowflake Connection does not support importing groups. To import groups from Snowflake, export a CSV file from Snowflake and use a universal CSV import connection to load the data into BlueConic.

  • Exporting data via the Snowflake Connection always exports all profiles in a segment. There is no option for delta export.

  • You may prefer to stage CSV files in Amazon S3 or Google Cloud Storage yourself so that BlueConic does not run queries that incur costs in Snowflake. If you are concerned about costs and control, consider exporting the data using one of the universal CSV connections in BlueConic instead.

  • The most frequent run schedule for the Snowflake Connection is once every 3 minutes, although running it that often is not recommended.
