Blogs / Snowflake / Azure Data Factory – Snowflake Connector: An Overview

Azure Data Factory – Snowflake Connector: An Overview

Jan 4, 2021

SHARE

Azure Data Factory (ADF) is a cloud-based data integration solution that offers 90+ built-in connectors to orchestrate the data from different sources like Azure SQL database, SQL Server, Snowflake and API’s, etc.  Azure Data Factory has recently added the Snowflake Connector to extract/load data from Snowflake with any of your existing legacy or modern Database/Datawarehouse.  In this blog, we will outline the functionalities, architecture, features, use cases, limitations, and roadmap of this newly available Snowflake Connector.

The Snowflake Connector supports both Azure as well as the Self-hosted Integration Runtime. It supports the following ADF activities in the pipeline,

  • Copy Activity
    • Source – Snowflake on any Cloud Providers
    • Sink – Snowflake on Azure
  • Lookup Activity

Functionalities

  • Snowflake as a source, you can load/integrate data to any of the ADF supported sinks, which utilizes Snowflake’s COPY INTO <location> statement to achieve the best performance.
  • Snowflake as a target, you can ingest data from any of the ADF supported sources, which utilizes Snowflake’s COPY INTO <table> statement to achieve the best performance.
  • You can also lookup the data from Snowflake table/view for the control flows in ADF.

Architecture

Figure 1 depicts the data flow from Snowflake to any other target system using Snowflake Connector in Azure Data Factory.

Azure Data Factory – Snowflake Connector: An Overview
Figure 1: Snowflake as a Source architecture

Figure 2 depicts the data flow from any source system to Snowflake using Snowflake Connector in Azure Data Factory.

Azure Data Factory – Snowflake Connector: An Overview
Figure 2: Snowflake as a Target architecture

Features

  • When you are using Snowflake as a source, you can configure a table as a source, or can write a Select Query to read data from the existing tables or views in Snowflake.
Azure Data Factory – Snowflake Connector: An Overview
Figure 3: Snowflake as a Source setting
  • When you are using Snowflake as a sink, you can specify a SQL query for the COPY activity to run before writing data into Snowflake in each run. One available use case is to clean up the pre-loaded data.
Azure Data Factory – Snowflake Connector: An Overview
Figure 4: Snowflake as a Sink setting.
  • Additional Snowflake Copy Options- you can configure the Copy Options that are available in Snowflake’s COPY INTO <tale/location> statement. Examples: MAX_FILE_SIZE, OVERWRITE, ON_ERROR, FORCE, LOAD_UNCERTAIN_FILES, etc.,
  • Additional Snowflake Format Options- you can configure the Format Type Options that are available in Snowflake’s COPY INTO <table/location> statement. Examples: DATE_FORMAT, TIME_FORMAT, TIMESTAMP_FORMAT, etc.,

Use Cases

  • The integration of Snowflake with any of the legacy or modern databases and data warehouses becomes simple with the help of ADF’s snowflake connector.
  • Using ADF’s Snowflake connector, semi-structured data can be loaded from JSON files into the Snowflake table’s VARIANT column.
  • The ADF’s Snowflake connector uses COPY INTO <table/location> statement, which is the Snowflake’s most recommended and performant way to load/read data and can scale for a large volume of data as well.

Limitations

  • Intermediate staging in Azure Blob Storage is mandatory while using Snowflake connector. However, the automatic deletion of the intermediate stage will happen after each run.
  • Authentication of the intermediate Stage (Azure Blob Storage) using Shared Access Signature (provides secure delegated access to resources in your storage account) is needed, which will be associated with a time span. So, after the expiry, we may need to regenerate it every time.
  • Calling a Snowflake’s stored procedure is not possible in the source query or the pre-copy script section, in case of a target.

Roadmap

  • Support for Snowflake in Mapping Data Flows as a source and sink of transformations to provide seamless ETL. However, now you can use Copy Activity in conjunction with Data Flow.
  • Support for Snowflake Stored Procedures.

References

Learn more about Visual BI’s Microsoft Azure offerings here.


Corporate HQ:
5920 Windhaven Pkwy, Plano, TX 75093

+1 888-227-2794

+1 972-232-2233

+1 888-227-7192

solutions@visualbi.com


Copyright © Visual BI Solutions Inc.

Subscribe To Our Newsletter

Subscribe To Our Newsletter

Join our mailing list to receive the latest news and updates from our team.

You have Successfully Subscribed!

Share This!

Share this with your friends and colleagues!