Over the past couple of years, there has been a strong inclination in the BI market towards Self-Service tools. This blog aims at providing a brief comparison between the three Self-service tools with ETL capabilities: Alteryx 2019.1.6.58192, Tableau Prep 2018.3.1 and Power BI Dataflow (July 2019).
Alteryx is one of the very mature tool in the market that provide self-service ETL and data preparation which extends to spatial and predictive analytics etc., Even though Alteryx is predominantly used as an analytics tool, it also has ETL capabilities to assist the users to prepare and blend data from multiple sources and export the results of the analytics to a variety of platform.
Tableau and Power BI are the major players in the BI industry and have been constantly identified as the market leaders of self-service BI tools by Gartner over the last couple of years. To bridge the gap between transforming raw data to a fully functional dynamic report both the technologies introduced an additional component as a part of their service to include self-service ETL capabilities. Tableau Prep and Power BI Dataflow were introduced as the ETL counterpart for Tableau and Power BI respectively.
Though comparing data preparation capabilities of an analytical tool like Alteryx to ETL counterparts of visualization tools such as Tableau and Power BI is like comparing apples and orange, we are curious to understand how easy to achieve the data preparation from a self-service perspective. ETL capabilities are compared across the three tools based on the following parameters;
- Data Connectivity
- Data Preparation
- Transformation and Join Capabilities
2. Data Connectivity
Since the first step in an ETL process is to Extract the data, it is essential to understand the list of possible sources, that all these tools can connect to and retrieve data from.
To view the complete list of sources, click here.
Self-service users can easily connect to a variety of data sources in Alteryx when compared to other tools. Also, the data can easily be loaded to tables for further processing and consumption whereas this is limited in other tools in comparison.
3. Data Preparation
The next step of an ETL process is data preparation which involves the conversion of raw data into a usable format for further analysis. This might involve the following processes;
- Data Cleansing
- Building Expressions
- Sampling and Indexing
3.1 Data Cleansing
We can comfortably achieve all common data cleansing needs such as replacing NULL values, removing white spaces, tabs, numbers, special characters, modifying the case of text, to select/deselect columns, to reorder, rename, resize columns and to modify the data type in all the three tools.
In addition to the above-listed data cleansing options, direct options for removing duplicate white spaces, restricting the column size and adding column description is available only in Alteryx.
3.2 Building Expressions
In order to create a new column or to update an existing column, an expression is used which can be created with the help of following categories of functions such as;
- Conditional functions such as if, elseif, etc.,
- Conversion functions which include data type conversions
- Date Time functions such as Datediff(), DateParse() etc.,
- Math functions such as ABS(), Mean(), Mode() etc.,
- Operators such as Boolean AND, OR, NOT, etc.,
- String functions such as TRIM(), SUBSTR(), LEN(), REPLACE() etc.,
- Test functions which includes Isnull(), IsNum() etc.,
By comparing the functions across all the three tools, operations such as Boolean expressions, regular expressions and generating random numbers are explicitly available in Alteryx. Even though most of these functions can be achieved in all the tools, unlike Alteryx there is no direct option to incorporate them in PowerBI Dataflow as the ability to build expressions in PowerBI Dataflow is restricted. It requires additional workarounds or extensive knowledge in Power Query to achieve them which adds a little bit of complexity to naïve users.
3.3 Data Sampling and Indexing
Sampling – fetch the set of records from the input.
Indexing – assigns a unique identifier which increments sequentially for every record in the data. In terms of indexing, all three tools have indexing functionality.
4. Data Transformation
With data transformations, the users can convert the data from one structure/format to another which includes activities such as data integration and data management. In terms of data transformation, the following parameters are considered for the analysis;
An overall summary for the three tools is given in the table below
From the perspective of a self-service tool with ETL capabilities, almost all the basic features are possible in Alteryx, Tableau Data Prep, and Power BI Data flow.
In terms of advanced functionalities for data preparation and data transformation, Alteryx provides a better intuitive experience for naïve end users whereas we have to use workarounds in Tableau Data Prep and power query in Power BI Data Flow. Power BI Data Flow is a very recent launch, it is in its initial version and we expect it to quickly ramp up in features and capabilities like how Power BI got better and better with every month.
The tools that an analyst is going to pick depends on the analytical task at hand. If the platform of the user is either Tableau or PowerBI they can opt for Tableau Prep or PowerBI dataflow for self-service ETL. If you need to take a deep dive into your data to do predictive, spatial and BI analytics in addition to data preparation/ETL then the analyst could go for Alteryx.
Learn more about Visual BI Solutions Microsoft Power BI offerings here.