Using a live connection in Tableau for connecting to databases with millions of records poses a big challenge, Performance. Many performance enhancement techniques and features can be implemented to address this issue. In this blog we will cover one of the features, Assume referential integrity.
What is Assume referential integrity?
Enabling this option in Tableau creates joins in the back end only between the tables that contain the dimensions or measures used in the sheet.
Note: The option is useful only when inner join exists between the tables.
Let’s try to understand with an example. The data source used in this example is the liquor sales dataset. Create a star schema for the above dataset and join the tables as shown below:
Start the performance recording and build sales by county bar chart. After the chart is built stop the performance recording
In our sheet we have utilized Country dimension from SS_LOCATION_IOWA_LIQOUR table and the Bottle sold measure from SS_FACT_IOWA_LIQOUR table.
The query in the above image can be obtained from the workbook generated after stopping performance recording.
Even though only two tables are utilized to build the chart query in the back end will be run joining all the tables in the data source. This adds unnecessary and additional load in computing and executing the query.
We can avoid this by enabling assume referential integrity option from the data source tab.
After enabling the option, repeat the same steps as earlier. From performance recording workbook we get the following output:
After enabling assume referential integrity we can see from the above image that for the chart created only the relevant tables are joined, that is, only the tables that contain County and Bottle sold fields are joined. Thus, this option helps in improved performance by joining only the necessary tables when a live connection to a database is used.
* * *
Learn more about Visual BI’s Tableau consulting & end user training programs here.