Blogs / Self-Service BI / Alteryx / Performance enhancement in Alteryx: Significance of Calgary DB

Performance enhancement in Alteryx: Significance of Calgary DB

Nov 12, 2019

SHARE

For Alteryx developer, it is good to know the advantage of using Calgary DB over other flat files. It will ramp up your workflow performance. As mentioned in the Alteryx forum, Calgary is a data retrieval engine to perform analysis on a large dataset using indexing methodology. This blog compares Alteryx DB, Calgary DB and CSV in varies scenarios. To demonstrate each scenario, we have taken dataset which contains eleven million records.

Scenarios to Compare

We have taken the below scenarios based on the tools available in Calgary

  • Creation
    1. Creating CSV
    2. Creating Alteryx DB
    3. Creating Calgary DB
  • Data retrieval
    1. CSV
    2. Alteryx DB
    3. Calgary DB
  • Data Joining
    1. CSV
    2. Alteryx DB
    3. Calgary DB
  • Record Counting
    1. CSV
    2. Alteryx DB
    3. Calgary DB

1. Time-consuming for creation

In this scenario, we have taken our dataset as CSV format, it will act as Source. Then our workflow will create CSV, Calgary DB(CYDB), Alteryx DB(YXDB) from the source file. Here we’re capturing workflow execution time in order to determine best file format to create a large dataset.

Performance enhancement in Alteryx: Significance of Calgary DB
CSV
Performance enhancement in Alteryx: Significance of Calgary DB
Alteryx DB
Performance enhancement in Alteryx: Significance of Calgary DB
Calgary DB

2. Time-consuming for Data retrieval

In the previous workflow, we just converted our source file as Alteryx DB, Calgary DB, CSV. Then our workflow will filter records where a street column contains string ACR in all three formats. To determine, best file format to data retrieval from the large dataset we need to capture workflow execution time. Check the workflows below:

Performance enhancement in Alteryx: Significance of Calgary DB
CSV
Performance enhancement in Alteryx: Significance of Calgary DB
Alteryx DB
Performance enhancement in Alteryx: Significance of Calgary DB
Calgary data

3. Time-consuming for Data joining

So far, we’ve created and retrieved data from all three formats. How about performing joining across three formats? Let’s create the workflow to determine the best file format to perform the join operation.

Performance enhancement in Alteryx: Significance of Calgary DB
CSV
Performance enhancement in Alteryx: Significance of Calgary DB
Alteryx DB
Performance enhancement in Alteryx: Significance of Calgary DB
Calgary DB

4. Time-Consuming for Record Counting

In this scenario, we’ll perform aggregation on a large dataset using all three format to conclude the best file format for finding a record count of a specific field.

Performance enhancement in Alteryx: Significance of Calgary DB
CSV
Performance enhancement in Alteryx: Significance of Calgary DB
Alteryx DB
Performance enhancement in Alteryx: Significance of Calgary DB
Calgary DB

Performance Summarization

ProcessCSVAlteryx DBCalgary DB
Creation11.3 Seconds10 Seconds2:34 Minutes
Data Retrieval10.2 Seconds3.4 Seconds0.2 Seconds
Joining17.8 Seconds12.7 Seconds1.2 Seconds
Counting14.6 Seconds9.3 Seconds0.6 Seconds

The significant difference observed between creation for CSV, Alteryx DB Vs Calgary DB is mainly because, with Calgary, a separate index file is generated for every dimension considered which adds up the additional delay in performance.

With the dataset in consideration, it can be inferred that the performance of Calgary DB is significantly better than CSV and Alteryx DB in terms of aggregation like counting records, performing joins or retrieval of data.

Reference

https://help.alteryx.com/2018.3/calgaryindex.htm
Using Calgary in Alteryx

Read more about similar Self Service BI topics here and read more blogs from Alteryx category here.


Corporate HQ:
5920 Windhaven Pkwy, Plano, TX 75093

+1 888-227-2794

+1 972-232-2233

+1 888-227-7192

solutions@visualbi.com


Copyright © Visual BI Solutions Inc.

Subscribe To Our Newsletter

Subscribe To Our Newsletter

Join our mailing list to receive the latest news and updates from our team.

You have Successfully Subscribed!

Share This!

Share this with your friends and colleagues!