
Exploring Native Tableau connector for Azure Data Lake Storage Gen2

May 7, 2021


With the release of Tableau Desktop 2021.1, users get a highly requested feature: a native connector for Azure Data Lake Storage.

It is now possible to connect directly to Azure Data Lake Storage Gen2 and Azure Blob Storage from Tableau. We can access the files stored in them and visualize the data in Tableau without using a data warehouse or database as an intermediary. This post walks through connecting to Azure Data Lake Storage with the new connector in detail and outlines the considerations to keep in mind while using it.

Note: This connector uses only the default tenant for the Azure account. To use Azure Data Lake Storage Gen2 resources with Tableau, they must be associated with the default tenant. 

Version information: Tableau Desktop 2021.1, Azure storage account kind StorageV2 (general purpose v2)

Establishing connection

1. Open Tableau Desktop and select the Azure Data Lake Storage Gen2 connector from the list of installed connectors:

Fig 1: Selecting the Azure Data Lake Storage Gen2 connector

2. In version 2021.1, all of Tableau's Azure connectors support Azure Active Directory authentication, so Microsoft account credentials can be used. On selecting the connector, a browser window opens asking for Microsoft account credentials. Provide or select the necessary credentials:

Fig 2: Providing the necessary Microsoft account credentials

3. After the credentials are provided, the browser displays the message “Tableau created this window to authenticate. It is now safe to close it”. Close the browser tab and return to Tableau Desktop. The following connection window is displayed, asking for the Azure Data Lake Storage Endpoint URL:

Fig 3: Window asking for Azure Data Lake Storage Endpoint URL

4. The Azure Data Lake Storage Endpoint URL can be obtained from the ‘Properties’ pane of the required storage account in the Azure portal. It has the format: https://YourStorageAccountName.dfs.core.windows.net/

Fig 4: Obtaining the Endpoint URL from ‘Properties’ of Storage account in Azure Portal

5. Paste the Endpoint URL in the connection window inside Tableau and select ‘Use Endpoint’: 

Fig 5: Pasting the Endpoint URL in the connection window and selecting ‘Use Endpoint’

6. After selecting ‘Use Endpoint’, the connection to Azure Data Lake Storage will be established and the list of Containers in the storage account will be displayed: 

Fig 6: List of Containers in Storage account being displayed after establishing a connection.
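
The endpoint URL format from step 4 can be sketched in a few lines of Python. This is only an illustration for readers who want to prepare the URL ahead of time; the function name `adls_gen2_endpoint` and the sample account name are hypothetical, and the sketch assumes the public Azure cloud domain shown in step 4. The name check reflects Azure's storage account naming rule (3 to 24 lowercase letters and digits).

```python
import re

# Azure storage account names: 3-24 characters, lowercase letters and digits only.
_ACCOUNT_NAME = re.compile(r"[a-z0-9]{3,24}")


def adls_gen2_endpoint(account_name: str) -> str:
    """Build the Data Lake Storage Gen2 (dfs) endpoint URL for a storage account.

    Hypothetical helper; assumes the public Azure domain core.windows.net.
    """
    if not _ACCOUNT_NAME.fullmatch(account_name):
        raise ValueError(f"not a valid storage account name: {account_name!r}")
    return f"https://{account_name}.dfs.core.windows.net/"


if __name__ == "__main__":
    # "mystorageaccount" is a placeholder, not a real account.
    print(adls_gen2_endpoint("mystorageaccount"))
```

The resulting string is what gets pasted into the connection window in step 5.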

Accessing Files

1. The file types supported by this connector can be seen in the ‘File URL’ tab of the connection window. Clicking on the ‘File extension’ dropdown will show the list of supported file types:

Fig 7: Dropdown list displaying the supported file types

2. To access the files, the list of Container names in the ‘Browse’ tab can be used. Clicking on the required Container name will display the files and folders stored in that Container: 

Fig 8: List of files and folders inside the selected Container being displayed

3. Select the required file and click on ‘Connect’: 

Fig 9: Clicking on ‘Connect’ after selecting the required file

4. The data inside the file will now be available in the Data Source pane: 

Fig 10: Data inside the file being displayed in the Tableau Data Source pane

5. If the ‘Browse’ tab does not display the Containers, or the navigation tree does not work after the connection to Azure Data Lake Storage is established, the files inside the Containers can be accessed through the ‘File URL’ tab instead. The URL of the required file can be obtained from the ‘Properties’ pane of the file in the Azure portal:

Fig 11: Getting the File URL from the ‘Properties’ pane of the File in Azure portal

6. The copied URL will be of the format: https://YourStorageAccountName.blob.core.windows.net/YourContainerName/YourFileName.csv 

One important change is needed here: replace the word ‘blob’ in the URL with ‘dfs’, so that the URL looks like: https://YourStorageAccountName.dfs.core.windows.net/YourContainerName/YourFileName.csv

Copy this modified URL, paste it in the ‘File URL’ tab of the connection window and set the correct ‘File Extension’. Click on ‘Connect’: 

Fig 12: Pasting the modified File URL and clicking on ‘Connect’

7. After connecting, the data in the file will be brought inside Tableau: 

Fig 13: Data inside the file being displayed in the Tableau Data Source pane

8. Similarly, data stored in other supported file types, such as JSON and XLS, can also be brought into Tableau. For the JSON file type, a ‘Schema Levels’ selection is also available. This gives control over which dimensions and measures are brought into Tableau:

Fig 14: Selecting ‘Schema Levels’ for JSON file type

9. Once the ‘Schema Levels’ are selected, the JSON data is automatically flattened and displayed inside Tableau as rows and columns: 

Fig 15: Flattened JSON data displayed inside Tableau

10. Thus, the data stored in Azure Data Lake Storage can easily be visualized in Tableau.
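
To make the flattening in steps 8 and 9 more concrete, here is a minimal Python sketch of how a nested JSON record can turn into rows and columns. It is not Tableau's actual implementation, only an illustration of the general idea: nested objects become dotted column names, and each element of a selected list becomes its own row with the parent fields repeated. The function names and the sample `order` record are hypothetical.

```python
def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts into one level, joining keys with `sep`."""
    flat = {}
    for key, value in record.items():
        column = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, column, sep))
        else:
            flat[column] = value
    return flat


def flatten_rows(record, list_key):
    """Return one flat row per dict in record[list_key], repeating parent fields."""
    parent = {k: v for k, v in record.items() if k != list_key}
    rows = []
    for item in record.get(list_key, []):
        row = flatten(parent)
        row.update(flatten(item, list_key))
        rows.append(row)
    return rows


if __name__ == "__main__":
    # Hypothetical sample record with one nested object and one nested list.
    order = {
        "id": 1,
        "customer": {"name": "Ana", "city": "Plano"},
        "items": [{"sku": "A", "qty": 2}, {"sku": "B", "qty": 1}],
    }
    for row in flatten_rows(order, "items"):
        print(row)
```

In this sketch, choosing `list_key` plays a role loosely similar to choosing a schema level: it decides which nested level is expanded into rows.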

In this blog, we have explored the capabilities of the native Tableau connector for Azure Data Lake Storage. Note that file types such as XML, Parquet and ORC are not yet supported by this connector. In the next blog of our Tableau – Azure series, we’ll look at bringing data from such file types into Tableau as well.
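
Two checks from this post can be sketched in Python before a File URL is pasted into the connector: the ‘blob’ to ‘dfs’ swap from step 6 of ‘Accessing Files’, and a test for the file types named above as not yet supported. This is a hedged sketch; the function names are hypothetical, the unsupported set contains only the three types this post mentions (the connector's full list is in its ‘File extension’ dropdown), and the sample URLs use placeholder account names.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit, urlunsplit

# Only the file types this post names as not yet supported by the connector.
UNSUPPORTED = {".xml", ".parquet", ".orc"}


def to_dfs_url(file_url: str) -> str:
    """Swap the 'blob' service label in the host for 'dfs', leaving the path alone."""
    parts = urlsplit(file_url)
    labels = parts.netloc.split(".")
    # Host is expected to look like <account>.blob.core.windows.net.
    if len(labels) > 1 and labels[1] == "blob":
        labels[1] = "dfs"
    return urlunsplit(parts._replace(netloc=".".join(labels)))


def is_named_unsupported(file_url: str) -> bool:
    """True if the file extension is one the post lists as unsupported."""
    suffix = PurePosixPath(urlsplit(file_url).path).suffix.lower()
    return suffix in UNSUPPORTED


if __name__ == "__main__":
    url = "https://mystorageaccount.blob.core.windows.net/mycontainer/sales.csv"
    print(to_dfs_url(url))
    print(is_named_unsupported(url))
```

Working on the host label rather than doing a plain string replace avoids accidentally rewriting a container or file name that happens to contain the word ‘blob’.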

Using an older version of Tableau and still want to connect to Azure Data Lake and Azure Blob Storage? Check out this blog.


