The volume of data being generated and analysed is growing rapidly, which increases the need for data analysis algorithms that cater to user requirements in self-service analytics tools. Tableau provides Automated Data Discovery as a subset of Smart Analytics. It packages powerful algorithms and statistical models in an approachable way, making the analysis of complex datasets easier and more efficient.
Explain Data and Clustering are the two features provided for automated data discovery. This blog gives a brief overview of both and walks through the steps for using them.
Explain Data is an AI-powered feature that helps users understand the values in their data. Identifying the ‘why’ behind a value by coming up with potential explanations and then validating each one against the data is time-consuming and limited by the user’s own perceptions. Explain Data overcomes this by using Bayesian statistical methods to generate explanations for a data point selected in a visualization. These explanations are delivered as a combination of visualizations and descriptions, helping users gain insights specific to the data being analysed.
How to use Explain Data for insights?
Select the data point for which the ‘whys’ should be identified and click the lightbulb icon in the tooltip to open Explain Data.
Explain Data provides AI-driven explanations through a combination of visuals and descriptions.
Any of these visuals can be opened as a new worksheet to explore the data further using the full capabilities of Tableau.
Similar data points can be grouped through Automatic Clustering, a drag-and-drop function in Tableau that uses the k-means clustering algorithm to help users find significant groupings in the data. These groupings make it possible to compare results across the groups present in the dataset. The desired number of clusters can be specified, or Tableau can test different values and suggest an optimal one, using the Calinski-Harabasz criterion to assess cluster quality.
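To make the idea concrete, here is a minimal Python sketch of k-means combined with the Calinski-Harabasz criterion for picking the number of clusters. It is an illustration under stated assumptions, not Tableau's implementation: the function names, the farthest-point seeding and the sample points are all hypothetical, chosen only to keep the example deterministic and short.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))

def init_centers(points, k):
    """Farthest-point seeding (an assumption made here for determinism)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers

def kmeans(points, k, max_iter=100):
    """Lloyd-style k-means: assign points to the nearest center, then update centers."""
    centers = init_centers(points, k)
    labels = []
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster ends up empty
                centers[c] = centroid(members)
    return labels, centers

def calinski_harabasz(points, labels, centers):
    """Between-cluster variance over within-cluster variance; higher is better."""
    n, k = len(points), len(centers)
    overall = centroid(points)
    within = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    between = sum(labels.count(c) * sq_dist(centers[c], overall) for c in range(k))
    if within == 0:
        return float("inf")
    return (between / (k - 1)) / (within / (n - k))

def choose_k(points, max_k):
    """Try k = 2..max_k and keep the k with the highest Calinski-Harabasz score."""
    best = None
    for k in range(2, max_k + 1):
        labels, centers = kmeans(points, k)
        score = calinski_harabasz(points, labels, centers)
        if best is None or score > best[1]:
            best = (k, score, labels)
    return best

# Hypothetical two-measure data with three visually obvious groups.
points = [
    (1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.2),
    (5.0, 5.0), (5.2, 4.9), (4.8, 5.1), (5.1, 5.2),
    (9.0, 1.0), (9.2, 0.9), (8.8, 1.1), (9.1, 1.2),
]
best_k, best_score, best_labels = choose_k(points, max_k=5)
print(best_k, best_labels)
```

The criterion rewards clusterings whose centers are far apart relative to how spread out the points are inside each cluster, which is why it can be used to compare different values of k on the same data.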
How to perform Clustering?
Connect to any data source (for example, the Sample Superstore data). Let us create clusters based on the sales amount of the various product sub-categories.
Add the measures to be plotted and clustered (Sales and Discount) to the Columns and Rows shelves.
Categorize them by dragging ‘Sub-Category’ to ‘Detail’ under Marks.
Drag the Cluster option from the Analytics pane into the view.
A dialog box appears in which the variables used to generate the clusters and the number of clusters are defined. The variables are the values on which the grouping of data points is based. Here we use Sales as the variable, since the clustering is to be based on the sales amount.
Once this is done, Tableau creates clusters based on the parameters provided and classifies the points in the view as highlighted in the image below:
This makes it easier to group the data points based on any field for analysis. Clusters can also be created from other values, such as discount amount or profit, which gives a better understanding of the data and supports efficient, data-driven business decisions.
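When clustering on more than one measure, it helps to remember that measures such as sales and discount live on very different scales. The sketch below assumes min-max normalization, which maps each variable to the range 0 to 1 so that the variable with the larger magnitude does not dominate the distance calculation. The function name and the sample figures are hypothetical and are not taken from the Superstore data.

```python
def min_max_scale(values):
    """Map values to [0, 1] by subtracting the minimum and dividing by the range."""
    lo, hi = min(values), max(values)
    if hi == lo:  # a constant column carries no information for clustering
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical per-sub-category figures, for illustration only.
sales = [1000.0, 5000.0, 3000.0]
discount = [0.0, 0.5, 0.25]

scaled_sales = min_max_scale(sales)
scaled_discount = min_max_scale(discount)

# Each sub-category becomes a point whose coordinates are on comparable scales.
scaled_points = list(zip(scaled_sales, scaled_discount))
print(scaled_points)
```

Without a step like this, a difference of a few thousand in sales would swamp any difference in discount, and the clusters would effectively be based on sales alone.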
Thus, the Automated Data Discovery option provided by Tableau, which includes Explain Data and Clustering, extends AI functionality in self-service tools to enable analytics for users from all fields. Better analytics capabilities lead to better decision making for the business. To know more about the Smart Analytics features offered by Tableau, check out our blog on Smart Analytics.
Reach out to us to discover more from your data and arrive at data-driven decisions. To learn more about Visual BI’s Tableau Consulting & End User Training Programs, contact us here.