The foundation of any Tableau dashboard is its data source, and Tableau lets users publish that data source. It can be published along with the workbook (embedded in the workbook) or as a separate, standalone source that is made available to other users. This blog gives an overview of the differences between embedding data in a workbook and using a published data source, and where to use each.
What is a published data source and its significance?
When you need to build multiple workbooks on top of the same data source, it is ideal to publish the data source to Tableau Server or Tableau Online and connect all the workbooks to it. Refreshing the data source once updates the data in every workbook connected to it. A separately published data source acts as a single source of truth and promotes standardization, because data definitions, measure calculations, and so on are shared across all the connected workbooks. Published data sources therefore let you build reusable data models and support ad-hoc, self-service analytics for business users on Tableau Server.
For a published data source, Tableau creates a Tableau Data Source (.tds) file for a live connection or a Tableau Packaged Data Source (.tdsx) file for an extract-based connection.
A user with web-editing privileges can create dashboards on Tableau Server from a published data source without a Tableau Desktop license.
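As a rough illustration of the two file types: a .tds file is a plain XML document, while a .tdsx file is a zip package that bundles the .tds with any local data such as an extract. The Python sketch below, using only the standard library, tells the two apart. The function name `describe_data_source_file` is a hypothetical helper written for this post, not part of any Tableau tooling.

```python
import zipfile
from pathlib import Path


def describe_data_source_file(path):
    """Return a short description of a Tableau data source file.

    Assumption: a .tdsx is a zip package holding a .tds plus local data
    (for example an extract), and a .tds is a standalone XML file.
    """
    path = Path(path)
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as package:
            names = package.namelist()
        # Split the package contents into the definition and everything else.
        definitions = [n for n in names if n.lower().endswith(".tds")]
        bundled = [n for n in names if not n.lower().endswith(".tds")]
        return {
            "kind": "packaged (.tdsx)",
            "definitions": definitions,
            "bundled_files": bundled,
        }
    # Not a zip archive, so treat it as a bare data source definition.
    return {
        "kind": "definition only (.tds)",
        "definitions": [path.name],
        "bundled_files": [],
    }
```

Calling the helper on a downloaded data source file shows whether it carries its data with it or only describes the connection.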
How to create a published data source in Tableau
Once the data source is ready to publish, go to Server -> Publish Data Source and choose the data source you want to publish.
Alternatively, right-click the data source in the Data pane and choose Publish to Server.
In terms of permissions, a user or group can be given the data source connector role, which allows connecting to the data source on the server, or the data source editor role, which allows connecting, editing, downloading, deleting, publishing, and scheduling data source refreshes (the last of these if you are the content owner or an admin). You can also configure custom roles with a customised set of capabilities and assign them to users or groups.
If the data source includes external files from local storage, Tableau provides an option to make these files available on the server along with the data source.
For data sources that require credentials, authentication can be handled either by prompting users for credentials when they access the source or by embedding the credentials in the data source when publishing.
By default, Tableau closes the local data source and updates the workbook connection to use the newly published data source. To continue using the local data source instead, leave the Update workbook to use the published data source checkbox unchecked.
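The two roles can be pictured as sets of capabilities. The Python sketch below is only a model of the description above: the capability names follow this post's wording, and `ROLE_CAPABILITIES` and `can` are hypothetical names, not identifiers from Tableau or its APIs.

```python
# Illustrative model of the roles described above. Capability names follow
# this post's wording and are not identifiers taken from Tableau itself.
ROLE_CAPABILITIES = {
    "data source connector": {"connect"},
    "data source editor": {
        "connect", "edit", "download", "delete", "publish", "schedule refresh",
    },
}


def can(role, capability, custom_roles=None):
    """Return True if the role includes the capability.

    custom_roles is an optional dict of extra role names mapped to
    capability sets, mirroring the idea of custom roles.
    """
    roles = dict(ROLE_CAPABILITIES)
    roles.update(custom_roles or {})
    return capability in roles.get(role, set())


# Example: a connector can connect but not download.
print(can("data source connector", "connect"))   # True
print(can("data source connector", "download"))  # False
```

A custom role is then just another entry, for example `can("analyst", "download", {"analyst": {"connect", "download"}})`.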
What is an embedded data source?
With an embedded data source, the data source is published as part of the workbook instead of the workbook connecting to a separately published source. Every embedded data source has its own connection to the data, and the scope of the data is restricted to the workbook in which the data source is embedded.
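One way to see the distinction is inside the workbook file itself, since a .twb file is an XML document. The Python sketch below classifies each data source in a workbook as published or embedded. It rests on an assumption based on how workbook files are commonly observed to look, not on a documented contract: a connection to a published data source is stored with `class="sqlproxy"`. The function name and the sample XML are hypothetical and heavily simplified.

```python
import xml.etree.ElementTree as ET


def classify_data_sources(workbook_xml):
    """Map each data source in a .twb document to 'published' or 'embedded'.

    Assumption: a connection to a published data source is stored with
    class="sqlproxy"; any other connection class is treated as embedded.
    """
    root = ET.fromstring(workbook_xml)
    result = {}
    for datasource in root.findall("./datasources/datasource"):
        connection = datasource.find("connection")
        if connection is None:
            continue  # entries without a connection are skipped
        name = datasource.get("caption") or datasource.get("name")
        is_published = connection.get("class") == "sqlproxy"
        result[name] = "published" if is_published else "embedded"
    return result


# Simplified, made-up workbook fragment for illustration only.
sample = """
<workbook>
  <datasources>
    <datasource caption="Sales" name="sqlproxy.1">
      <connection class="sqlproxy" />
    </datasource>
    <datasource caption="Targets" name="federated.1">
      <connection class="federated" />
    </datasource>
  </datasources>
</workbook>
"""
print(classify_data_sources(sample))
```

For the sample above, "Sales" is reported as published and "Targets" as embedded.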
Difference between an embedded and published data source
Some differences between published and embedded data sources are listed below.

| Criteria | Published data source | Embedded data source |
|---|---|---|
| Standardization of data | Users have a single version of the truth, which avoids data proliferation. | Every embedded data source has a separate connection to the data, so each workbook tends to have a different interpretation of the same data, causing data proliferation. |
| Sharing | Can be shared across users, and every user has access to the same data. | The scope of the data is restricted to the workbook. |
| Refresh schedule | A single refresh schedule on the extract keeps the data updated in all workbooks. | Each workbook requires its own refresh schedule to keep its data fresh. |
| Accessibility | A person who consumes the data may not be aware of the complete data because it is a shared connection. Any authorised user can make changes to the data, and those changes are reflected in the workbook. This can cause complications in terms of accessibility. | Every user maintains an individual connection to the data, so they are aware of the complete data within the workbook, including any changes, which stay within that specific connection. |
| Data ownership | Maintaining ownership of the data is difficult. Any user with editor privileges can edit the data, so extra caution is essential when defining user roles. | It promotes complete ownership of the data, because the data cannot be changed by other users through connected workbooks. |
| Updating data | Changes made to the published source are reflected in all the workbooks connected to it. | If multiple workbooks access the same data, updating it becomes tedious because each workbook must be updated individually. |
When each workbook has its own refresh schedule, embedded data sources put additional load on the server. Where performance is the priority, it is better to connect workbooks to a published data source, which generally keeps the refresh load on the server lower.
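The difference in refresh load can be estimated with simple arithmetic. The Python sketch below counts extract refresh jobs per refresh cycle under a simplifying assumption: every workbook uses an extract, each embedded extract has its own schedule, and each published data source is refreshed once. The function name `extract_refresh_jobs` is hypothetical.

```python
def extract_refresh_jobs(workbooks_per_dataset, published):
    """Count extract refresh jobs per refresh cycle.

    workbooks_per_dataset: list with the number of workbooks built on
        each dataset.
    published: True if each dataset is one published data source,
        False if every workbook embeds its own copy.
    """
    if published:
        # One refresh per dataset in use, shared by all its workbooks.
        return sum(1 for count in workbooks_per_dataset if count > 0)
    # One refresh per workbook, since each embedded extract is separate.
    return sum(workbooks_per_dataset)


# Three datasets used by 5, 3, and 2 workbooks respectively.
usage = [5, 3, 2]
print(extract_refresh_jobs(usage, published=False))  # 10
print(extract_refresh_jobs(usage, published=True))   # 3
```

In this example, embedding leads to ten refresh jobs per cycle, while publishing the three data sources reduces that to three.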