Tableau Server delivers powerful capabilities through processes that govern extract refreshes, database connections, workbooks, data sources, and more. Because so many processes are involved, it pays to tune Tableau Server for better performance. In this blog, we look at some of the steps that help you optimize Tableau Server’s performance.
1. Use Published Data Sources (PDS)
Instead of connecting to a database every time a workbook is created, leverage published data sources to supply data to multiple workbooks. Published data sources provide the following benefits:
- As a single source of data for multiple workbooks, a PDS saves a lot of space on Tableau Server. With embedded data sources, by contrast, each workbook consumes space in proportion to the volume of data it holds
- Processing load on the server is reduced, because one shared data source serves multiple workbooks instead of each workbook maintaining its own copy
- Creating formulas and calculations for the data is a one-time activity in Tableau Desktop. They are saved in the published data source and need not be re-created every time a new workbook is developed
- Data refresh and orchestration become simpler: a refresh of the PDS is reflected in every workbook connected to it, so there is no need to set up a separate, potentially time-consuming refresh for each workbook
- New workbooks can connect to the published data source directly, without connecting to the underlying databases
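It is also essential to understand that the contents of a published data source cannot be modified from a workbook that connects to it. For example, you cannot edit its existing formulas there; to change them, edit the data source itself and republish it. Calculations you create in a connected workbook stay local to that workbook.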
2. Use Row Level Security (RLS)
Sometimes, developers create multiple versions of the same workbook to cater to different user roles (for example, the manager of the Eastern Region should see only data for that region). They achieve this by creating a filtered version of the data set for each role. However, this consumes a lot of space on Tableau Server and hampers its performance. In addition, every future change to the dashboard has to be applied to all the versions, which costs time and effort.
With Row Level Security (RLS) applied to a workbook, different users open the same dashboard, and each sees only the data permitted for their role.
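To make the idea concrete, here is a minimal Python sketch of the logic behind row-level security. It is not Tableau's implementation; the entitlement table, the user names, and the `rows_visible_to` helper are all hypothetical. In Tableau itself, the same effect is typically achieved with a user filter or a calculated field based on a user function such as `USERNAME()`.

```python
# Hypothetical entitlement table: which user may see which regions.
ENTITLEMENTS = {
    "east_manager": {"East"},
    "west_manager": {"West"},
    "national_head": {"East", "West"},
}

# Hypothetical data set shared by every user of the dashboard.
SALES = [
    {"region": "East", "amount": 120},
    {"region": "West", "amount": 80},
    {"region": "East", "amount": 50},
]


def rows_visible_to(username, rows, entitlements):
    """Return only the rows whose region the user is entitled to see."""
    allowed = entitlements.get(username, set())  # unknown users see nothing
    return [row for row in rows if row["region"] in allowed]


if __name__ == "__main__":
    for user in ("east_manager", "west_manager", "national_head"):
        visible = rows_visible_to(user, SALES, ENTITLEMENTS)
        print(user, sum(row["amount"] for row in visible))
```

One data set and one dashboard serve every role; only the filter result differs per user, which is why no per-role copies are needed.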
3. Schedule Data Refresh during Non-Business Hours
Extract refreshes scheduled during office hours can take longer, because other processes are likely to be using the same databases and tables at that time.
It is therefore a best practice to schedule data refreshes during non-business hours, when they face less contention and tend to finish faster.
4. Prioritize Scheduled Refreshes
A refresh schedule may contain many data refreshes pointing to different workbooks. Some of them involve huge amounts of data and take a long time to refresh their extracts, while others involve less data and finish quickly.
By prioritizing data refreshes, we can set the sequence so that refreshes for workbooks with less data are triggered first, followed by the ones involving large amounts of data. The reports with less data then become available sooner.
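The ordering can be sketched in a few lines of Python. This is only an illustration: the `RefreshTask` class, the task names, and the row estimates are hypothetical, and the numbers assume Tableau Server's convention of priorities from 1 to 100 where lower numbers run first.

```python
from dataclasses import dataclass


@dataclass
class RefreshTask:
    name: str            # hypothetical workbook or extract name
    estimated_rows: int  # rough size estimate used only for ordering


def assign_priorities(tasks, start=10, step=10):
    """Give smaller refreshes a lower priority number so they run first.

    Returns (task name, priority) pairs, capped at 100.
    """
    ordered = sorted(tasks, key=lambda task: task.estimated_rows)
    return [
        (task.name, min(100, start + index * step))
        for index, task in enumerate(ordered)
    ]


if __name__ == "__main__":
    tasks = [
        RefreshTask("sales_history", 5_000_000),
        RefreshTask("daily_kpis", 20_000),
        RefreshTask("inventory", 300_000),
    ]
    for name, priority in assign_priorities(tasks):
        print(priority, name)
```

Leaving gaps between the assigned numbers (10, 20, 30) makes it easy to slot in a new refresh later without renumbering the rest.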
5. Favor Incremental over Full Refresh whenever possible
An incremental refresh appends only the new records to the records already in the extract, whereas a full refresh discards the existing extract and reloads every record from the source, old and new alike.
Unless there is a business requirement to reload all the data, an incremental refresh is advisable because it takes much less time than a full refresh.
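The difference between the two refresh types can be illustrated with a small Python sketch. It is a simplified model, not Tableau's extract engine; the `order_id` key column and both function names are assumptions made for the example.

```python
def full_refresh(source_rows):
    """Rebuild the extract from scratch by copying every source row."""
    return list(source_rows)


def incremental_refresh(extract_rows, source_rows, key="order_id"):
    """Append only source rows whose key is greater than the largest key
    already present in the extract."""
    last_seen = max((row[key] for row in extract_rows), default=None)
    new_rows = [
        row for row in source_rows
        if last_seen is None or row[key] > last_seen
    ]
    return extract_rows + new_rows


if __name__ == "__main__":
    extract = [{"order_id": 1, "amount": 10}, {"order_id": 2, "amount": 20}]
    source = [
        {"order_id": 1, "amount": 15},  # changed at the source after extraction
        {"order_id": 2, "amount": 20},
        {"order_id": 3, "amount": 30},  # new record
    ]
    print(incremental_refresh(extract, source))
    print(full_refresh(source))
```

In this model, the incremental refresh picks up order 3 but keeps the stale amount for order 1, while the full refresh reflects both changes. That is the kind of case where a business requirement may call for a full reload.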
6. Retain Data in Cache memory
While installing Tableau Server, the user is prompted under the Data Connections tab to choose how the cache is handled.
a) Refresh Less Often – Data from the source is cached, and every subsequent time the report is accessed, the data is served from the cache. This reduces the load on Tableau Server, because a query is not sent to the database on every access, and so improves performance. This option works best for data that changes infrequently. The latest data appears in the report only when the report is manually refreshed or when Tableau Server is restarted.
b) Balanced – The user specifies how long data is cached. Data is not held in the cache beyond the specified time.
c) Refresh More Often – With a live connection to the data source, the data is refreshed in the background every time the report is accessed, before the report is displayed. With an extract connection, data is fetched from the most recently refreshed version of the extract. Although this shows the latest data, it puts additional load on Tableau Server, which has to fetch data from the data source on every view. Choose this option when new records are added to the data source at very short intervals.
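The three options differ only in when a cached result is considered fresh enough to reuse. The following Python sketch models that decision; it does not reflect Tableau Server's internals, and the `QueryCache` class and the mode labels `less_often`, `balanced`, and `more_often` are hypothetical names chosen to mirror the options above.

```python
import time


class QueryCache:
    """Toy cache that mimics the three caching options described above."""

    MODES = ("less_often", "balanced", "more_often")

    def __init__(self, mode, max_age_seconds=0, clock=time.monotonic):
        if mode not in self.MODES:
            raise ValueError(f"unknown mode: {mode}")
        self.mode = mode
        self.max_age_seconds = max_age_seconds  # only used in balanced mode
        self._clock = clock
        self._entries = {}  # key -> (time stored, result)

    def _is_fresh(self, stored_at, now):
        if self.mode == "less_often":
            return True  # reuse until cleared manually
        if self.mode == "balanced":
            return now - stored_at < self.max_age_seconds
        return False  # more_often: always query again

    def get(self, key, run_query):
        """Return a cached result if it is fresh, otherwise run the query."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and self._is_fresh(entry[0], now):
            return entry[1]
        result = run_query()
        self._entries[key] = (now, result)
        return result

    def clear(self):
        """Stand-in for a manual refresh or a server restart."""
        self._entries.clear()


if __name__ == "__main__":
    cache = QueryCache("balanced", max_age_seconds=60)
    print(cache.get("sales_by_region", lambda: "queried the database"))
    print(cache.get("sales_by_region", lambda: "queried again"))
```

In balanced mode, the second call inside the 60-second window reuses the first result, so the database is queried only once. Switching the mode to `more_often` would run the query on every call.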
These are some of the options to optimize your Tableau Server for better performance.