With the release of Tableau Desktop 2021.1, users can now enjoy a highly requested feature: a native connector for Azure Data Lake Storage.
It is now possible to directly connect to Azure Data Lake Storage Gen2 and Azure Blob Storage from Tableau. We can access the files stored in them and visualize the data in Tableau without having to use any data warehouse or database as an intermediary. This blog will walk through the process of connecting to Azure Data Lake Storage using the new connector in detail and outline the considerations to be kept in mind while using it.
Note: This connector uses only the default tenant for the Azure account. To use Azure Data Lake Storage Gen2 resources with Tableau, they must be associated with the default tenant.
Version information: Tableau Desktop 2021.1, Azure Data Lake Storage Gen2 (StorageV2, general purpose v2 account)
1. Open Tableau Desktop and select the Azure Data Lake Storage Gen2 connector from the list of installed connectors:
2. In version 2021.1, all Tableau connectors for Azure support Azure Active Directory authentication, so Microsoft account credentials can be used. On selecting the connector, a browser window will pop up asking for Microsoft account credentials. Provide or select the necessary credentials:
3. After the credentials are provided, the browser will display the message “Tableau created this window to authenticate. It is now safe to close it”. Close the browser tab and return to Tableau Desktop. The following connection window will be displayed, asking for the Azure Data Lake Storage Endpoint URL:
4. The Azure Data Lake Storage Endpoint URL can be obtained from the ‘Properties’ pane of the required storage account in the Azure portal. It has the format: https://YourStorageAccountName.dfs.core.windows.net/
5. Paste the Endpoint URL in the connection window inside Tableau and select ‘Use Endpoint’:
6. After selecting ‘Use Endpoint’, the connection to Azure Data Lake Storage will be established and the list of Containers in the storage account will be displayed:
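If the storage account name is already known, the Endpoint URL from step 4 can also be assembled by hand. The snippet below is a minimal sketch of that format and not part of Tableau or the Azure SDK. The function name `adls_endpoint_url` is hypothetical, and the name check assumes the usual Azure rule that storage account names are 3 to 24 characters of lowercase letters and digits.

```python
import re

# Assumption: Azure storage account names are 3-24 characters long and
# contain only lowercase letters and digits.
_ACCOUNT_NAME = re.compile(r"[a-z0-9]{3,24}")


def adls_endpoint_url(account_name: str) -> str:
    """Build the Data Lake Storage Gen2 endpoint URL for a storage account.

    Hypothetical helper for illustration: it only formats the URL shown in
    step 4 and does not check that the account actually exists.
    """
    if not _ACCOUNT_NAME.fullmatch(account_name):
        raise ValueError(f"not a valid storage account name: {account_name!r}")
    return f"https://{account_name}.dfs.core.windows.net/"


if __name__ == "__main__":
    # "mystorageacct" is a made-up account name used only as an example.
    print(adls_endpoint_url("mystorageacct"))
```

The resulting string is what gets pasted into the connection window in step 5. Copying the URL from the ‘Properties’ pane remains the safer option, since it reflects the account as it is actually configured.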
1. The file types supported by this connector can be seen in the ‘File URL’ tab of the connection window. Clicking on the ‘File extension’ dropdown will show the list of supported file types:
2. To access the files, the list of Container names in the ‘Browse’ tab can be used. Clicking on the required Container name will display the files and folders stored in that Container:
3. Select the required file and click on ‘Connect’:
4. The data inside the file will now be available in the Data Source pane:
5. If the Browse tab is not displaying the Containers or the navigation tree is not working after establishing a connection with Azure Data Lake Storage, an alternative method can be used to access the files stored inside the Containers, using the ‘File URL’. The URL of the required file can be obtained from the ‘Properties’ pane of the file from the Azure portal:
6. The copied URL will be of the format: https://YourStorageAccountName.blob.core.windows.net/YourContainerName/YourFileName.csv
One important change must be made to this URL: replace the word ‘blob’ in the host name with ‘dfs’. The URL should then look like: https://YourStorageAccountName.dfs.core.windows.net/YourContainerName/YourFileName.csv
Copy this modified URL, paste it in the ‘File URL’ tab of the connection window and set the correct ‘File Extension’. Click on ‘Connect’:
7. After connecting, the data in the file will be brought inside Tableau:
8. Similarly, data stored in other supported file types such as JSON and XLS can also be brought into Tableau. For JSON files, a ‘Schema Levels’ selection is also available. This gives control over which dimensions and measures are brought into Tableau:
9. Once the ‘Schema Levels’ are selected, the JSON data is automatically flattened and displayed inside Tableau as rows and columns:
10. Thus, the data stored in Azure Data Lake Storage can easily be visualized in Tableau:
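The ‘blob’ to ‘dfs’ substitution described in step 6 is easy to get wrong by hand, for example when a container or file name itself contains the word ‘blob’. The following is a minimal sketch of the rewrite using only the Python standard library. The function name `blob_url_to_dfs` is hypothetical, and the sketch assumes the URL copied from the Azure portal uses the standard `.blob.core.windows.net` host suffix.

```python
from urllib.parse import urlsplit, urlunsplit

BLOB_SUFFIX = ".blob.core.windows.net"
DFS_SUFFIX = ".dfs.core.windows.net"


def blob_url_to_dfs(url: str) -> str:
    """Rewrite a Blob Storage file URL into its Data Lake (dfs) form.

    Hypothetical helper for illustration. Only the host name is changed,
    so a container or file called 'blob' is left untouched.
    """
    parts = urlsplit(url)
    host = parts.netloc
    if not host.lower().endswith(BLOB_SUFFIX):
        raise ValueError(f"not a blob endpoint URL: {url!r}")
    # Keep the storage account part of the host and swap only the suffix.
    new_host = host[: -len(BLOB_SUFFIX)] + DFS_SUFFIX
    return urlunsplit(parts._replace(netloc=new_host))


if __name__ == "__main__":
    copied = (
        "https://YourStorageAccountName.blob.core.windows.net/"
        "YourContainerName/YourFileName.csv"
    )
    print(blob_url_to_dfs(copied))
```

The output is the URL to paste into the ‘File URL’ tab. Working on the parsed host name rather than running a plain text replace over the whole URL is a deliberate choice, so that paths containing ‘blob’ are not altered.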
In this blog, we have explored the capabilities of the native Tableau connector for Azure Data Lake Storage. Note that file types such as XML, Parquet and ORC are not yet supported by this connector. In the next blog of our Tableau – Azure series, we’ll look at bringing data from those file types into Tableau as well.
Using an older version of Tableau and still want to connect to Azure Data Lake and Azure Blob Storage? Check out this blog.
To learn more about Visual BI’s Tableau and Microsoft Azure consulting services, you can contact us here.