In the last few years, Artificial Intelligence and Machine Learning have seen an unprecedented rise in popularity across industries and areas of scientific research. Businesses are looking for ways to integrate these new technologies into their operations. However, the shortage of qualified data scientists and machine learning experts has been one of the main obstacles to the adoption of AI. A growing number of tools are now putting these capabilities into the hands of developers, citizen data scientists, domain experts and business users.
In this blog, we will delve into Automated Machine Learning (AutoML) for dataflows, a new capability in Power BI that lets business users build machine learning models without having to learn how to program or acquire extensive knowledge of mathematics and statistics. AutoML for dataflows allows users to create machine learning models with a few simple clicks and generates model summary reports.
Before building a machine learning model in Power BI, users need to create a dataflow for the data. For more details on how to configure the dataflow, please refer to this blog on how to create a dataflow. Note that Automated Machine Learning (AutoML) is currently only available for dataflows on Power BI Premium and Embedded capacities.
In this post, I will use the concrete compressive strength dataset from UCI public datasets as the input for the machine learning model. To download the dataset, please go to this link. Now, let’s delve into it!
Building an ML model with AutoML
First, select the newly created dataflow and select the Edit Entity icon to see your full data. You can take advantage of Power BI’s self-service data prep to edit the data. Note that the dataflow entity has been renamed to Concrete Strength.
In the dataflow entity screen, select the ‘Apply ML Model’ icon and then select ‘Add a machine learning model’.
Select the entity and the outcome field that you want to predict. Our objective is to predict the strength column of the Concrete Strength entity, so we will choose Concrete Strength as ‘Entity’ and strength as ‘Outcome field’. Then click ‘Next’.
The next step is choosing a model. One awesome thing about AutoML is that after users specify the outcome field, it will analyze the label data and recommend the most likely model type. Choose ‘Select a different model’ if you want to pick a different model type. In our case, the outcome is a continuous number, so regression is the right model type and I will click ‘Next’.
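To give a feel for what "analyzing the label data" could mean, here is a minimal Python sketch. It is not Power BI's actual logic, which is not described in this post. The function name suggest_model_type, the max_classes threshold and the sample values are all made up for illustration.

```python
from numbers import Number


def suggest_model_type(labels, max_classes=10):
    """Guess a model type from the outcome column.

    Illustrative heuristic only; not the rule Power BI AutoML applies.
    """
    values = [v for v in labels if v is not None]
    distinct = set(values)

    # Exactly two outcomes (e.g. yes/no) suggests a binary prediction.
    if len(distinct) == 2:
        return "binary prediction"

    # Many distinct numeric outcomes suggests a continuous target.
    all_numeric = all(
        isinstance(v, Number) and not isinstance(v, bool) for v in values
    )
    if all_numeric and len(distinct) > max_classes:
        return "regression"

    # Otherwise treat the outcome as a set of categories.
    return "classification"


# Made-up strength values, standing in for the outcome column.
strength = [25.1, 31.4, 40.2, 18.7, 52.9, 33.3, 47.8, 29.6, 61.0, 22.5, 36.4, 44.1]
print(suggest_model_type(strength))
```

A numeric column with many distinct values, like strength, lands on regression here, which matches the recommendation we accepted in the wizard.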
Select which features (columns) you want to include when training the model. AutoML will also analyze the entity to suggest the inputs that can be used to train the machine learning model.
Lastly, name the model and specify the training time. Usually, the longer the training time, the more accurate the result. Additionally, AutoML will split the provided data into training and test data. The test data is held back during training and used to validate the model afterwards.
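If you are curious what a training/test split looks like outside the wizard, here is a small Python sketch of the idea. The 20% test fraction, the fixed seed and the function name are my assumptions for illustration; the post does not state which proportions AutoML uses.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle rows reproducibly and return (train, test) lists.

    test_fraction and seed are illustrative defaults, not AutoML settings.
    """
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")

    shuffled = list(rows)                   # copy so the input is untouched
    random.Random(seed).shuffle(shuffled)   # seeded shuffle for repeatability

    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Stand-in for the rows of the Concrete Strength entity.
rows = list(range(100))
train, test = train_test_split(rows)
print(len(train), len(test))
```

The model only ever sees the training rows, so its score on the test rows is a fairer estimate of how it will behave on new data.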
After the model is trained, go to the ‘Machine learning models’ tab and select the ‘View training report’ icon to see the training report.
The report provides details of the trained model, analyzes the results and explains how each feature contributes to the model’s predictions. Users can also check the key features that most influence the predictions.
In our case, the model yields a relatively good result: 94% of the variation in strength can be explained by this model. The final model used is a Pre-fitted Soft Voting Regressor.
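The "variation explained" figure is usually the coefficient of determination, R². I am assuming that is what the report shows. The Python sketch below computes it by hand on made-up numbers (not the real concrete data), just to show what the percentage means.

```python
def r_squared(actual, predicted):
    """Return R^2 = 1 - SS_res / SS_tot for two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")

    mean_actual = sum(actual) / len(actual)
    # Variation the model fails to explain (squared prediction errors).
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total variation of the outcome around its mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are equal")

    return 1 - ss_res / ss_tot


# Made-up test values and predictions, for illustration only.
actual = [20.0, 30.0, 40.0, 50.0, 60.0]
predicted = [22.0, 29.0, 41.0, 48.0, 61.0]
print(round(r_squared(actual, predicted), 3))
```

An R² of 1 means the predictions match the test data perfectly, while 0 means the model does no better than always predicting the average strength. A value of 0.94 therefore sits close to the top of that range.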
Spend some time on the training report to understand how the model works and see if you can extract more insights from this model!
Automated Machine Learning is a powerful tool for business users to get the most out of data. It allows users to get answers from data quickly and more intuitively with its training report. Its integration in Power BI enables a smarter and more adaptive BI tool. Using AutoML, business analysts or developers without a strong background in machine learning and programming can provide data science solutions in a simple way.