Blogs / Snowflake / DBT / dbt-Snowflake Project – Tips and Tricks

dbt-Snowflake Project – Tips and Tricks

Apr 20, 2021

SHARE

dbt is  a leading transformation tool used to develop data transformations in Snowflake. Over the last few years of working with many different types of companies, we have learned a lot about how to approach both common and uncommon scenarios. This blog will walk through a few of those scenarios in a dbt-Snowflake project and some tips and tricks to handle them. 

Scenario 1: Renaming tables loaded using dbt seed 

In dbt, you can load CSV files present in the data directory as tables into Snowflake using dbt seed. However, dbt seed will create tables in Snowflake with the same name as the file name present in the data directory. 

Consider a scenario where you have three versions of the same file in the data directory, one each for development, quality, and production, as seen below. 

dbt-Snowflake Project – Tips and Tricks
Figure 1: CSV files in dbt directory

If you run the following command in the production environment, it will create a table in the name country_prod as per the dbt seed command’s default behavior. 

dbt seed –select country_prod 

If you would like to have the name of the target table as COUNTRY, you may have to do some workaround for the same irrespective of the environment. The following macro and the post hook settings for seeds in the file dbt_project.yml will ensure that the table name is always COUNTRY. 

dbt-Snowflake Project – Tips and Tricks
Figure 2: Macro to handle the seed requirement

Settings to be done in dbt_project.yml

dbt-Snowflake Project – Tips and Tricks
Figure 3: Configuration of Seeds in dbt_project.yml

Scenario 2: Creating transient tables in development and permanent table in production 

In a development environment where no business-critical data is present, it is preferable to create transient tables in Snowflake instead of permanent tables since transient tables are more cost-effective. However, you may want to create permanent tables in the production environment to store business-critical data since Snowflake offers more capabilities like Time Travel Retention and Fail-Safe periods. Depending on the target environment, you can choose whether a table must be transient or permanent. This requirement can be handled by setting up the dbt_project.yml file, as seen below. 

dbt-Snowflake Project – Tips and Tricks
Figure 4: Configuration of dbt_project.yml

Here, the string ‘prod’ refers to the target name of the production environment set up in the file dbt_project.yml

Scenario 3: Reduce run time of dbt run 

As the number of objects in the dbt project grows, you may run into scenarios where the dbt run command runs for a long time before firing commands to the database. This is because, on every dbt run, all the objects inside the project are parsed to create the Directed Acyclic Graph (DAG). The time it takes for parsing is directly proportional to the number of objects and the complexity of the dbt project. This can be avoided by the following setup in profiles.yml

dbt-Snowflake Project – Tips and Tricks
Figure 5: Configuration in projects.yml

When the above configuration is used, dbt will not parse objects on every dbt run unless there is a change in files in the dbt project after the dbt run command’s last invocation. This will save a lot of time, especially in projects with many objects with complex dependencies. 

Scenario 4: Environment specific sequence names 

At times, it is necessary to use SQL sequences for surrogate keys. However, you must not directly mention the name of the sequence in the model, as seen below, as the model design must be environment-independent. 

dbt-Snowflake Project – Tips and Tricks
Figure 6: Model that directly invokes a sequence

Like how you declare tables as Sources, sequence names can also be declared as Sources in .yml file.

dbt-Snowflake Project – Tips and Tricks
Figure 7: Declaring a sequence inside .yml file

After setting up the sources in the .yml file, the model can be designed as seen below. 

dbt-Snowflake Project – Tips and Tricks
Figure 8: Sequence being called as a Source

We hope this blog gave you an idea of some real-time scenarios in a dbt-Snowflake project and how to use available dbt features to meet the requirements.  

Please visit the link to explore more blogs on Snowflake and dbt. 

Are you looking forward to implementing a dbt-Snowflake project for your organization? Click here to reach out to us and know more about our Snowflake offerings. 


Corporate HQ:
5920 Windhaven Pkwy, Plano, TX 75093

+1 888-227-2794

+1 972-232-2233

+1 888-227-7192

solutions@visualbi.com


Copyright © Visual BI Solutions Inc.

Subscribe To Our Newsletter

Subscribe To Our Newsletter

Join our mailing list to receive the latest news and updates from our team.

You have Successfully Subscribed!

Share This!

Share this with your friends and colleagues!