This blog continues our discussion on ML/AI capabilities in Power BI. In our previous blog, we talked about Automated Machine Learning in Power BI, a nice feature for business analysts and citizen data scientists. However, it is only available in Power BI Premium and Embedded capacities, which prevents some users from integrating data science solutions into their advanced analytics process.

In this blog, we will introduce PyCaret, an open-source low-code machine learning library that can be integrated into Power BI. It enables customers to build machine learning models with a few lines of code in Power BI.

Prerequisites

Before delving into the ML capabilities of PyCaret, you first need to install PyCaret on your local machine or in a virtual environment and set your Python directory in Power BI.

To install PyCaret, follow the instructions in this link.

In Power BI Desktop, select File > Options and Settings > Options > Python scripting to set your Python directory.
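
Before pointing Power BI at an interpreter, it can help to confirm that the environment actually has what the scripts below need. The following is only a small sketch: the helper name missing_packages is my own, and the package list assumes that Power BI's Python integration needs pandas and matplotlib in addition to PyCaret. Run it with the interpreter you plan to use; the folder it prints is a likely candidate for the Python directory setting.

```python
import importlib.util
import os
import sys


def missing_packages(names):
    """Return the packages from `names` that this interpreter cannot import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Folder that contains the running interpreter (python.exe on Windows).
print("Interpreter folder:", os.path.dirname(sys.executable))

# Assumed requirements: pandas and matplotlib for Power BI, pycaret for this blog.
missing = missing_packages(["pandas", "matplotlib", "pycaret"])
print("Missing packages:", ", ".join(missing) if missing else "none")
```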

Machine Learning in Power BI with PyCaret

The data we will use is the Bank Customers dataset for churn modeling from Kaggle, an online data science community. You can download the data from here. Keep in mind that the data we use for modeling has already been preprocessed, so it differs slightly from the original. Specifically, I one-hot encoded the Geography column, removed the CustomerID, Surname, and RowNumber columns, and split the data into Churn Training and Churn Testing datasets. For convenience, I include the link to the prepared data here.
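
If you prefer to prepare the raw file yourself, the preprocessing described above can be sketched with pandas roughly as follows. This is an illustration, not the exact script I used: the function name prepare_churn_data, the 80/20 split, the random seed, and the column spelling CustomerId (which I assume matches the raw Kaggle file) are all assumptions.

```python
import pandas as pd

# Identifier columns to remove; spelling assumed to match the raw Kaggle file.
DROP_COLUMNS = ["RowNumber", "CustomerId", "Surname"]


def prepare_churn_data(raw, test_fraction=0.2, seed=42):
    """Drop identifier columns, one-hot encode Geography, and split the rows."""
    df = raw.drop(columns=DROP_COLUMNS)
    # One 0/1 indicator column per country, e.g. Geography_France.
    df = pd.get_dummies(df, columns=["Geography"], dtype=int)
    # Hold out a random share of rows for testing; the rest is for training.
    test = df.sample(frac=test_fraction, random_state=seed)
    train = df.drop(test.index)
    return train, test


# Example usage (the file names are hypothetical):
# train, test = prepare_churn_data(pd.read_csv("Churn_Modelling.csv"))
# train.to_csv("Churn Training.csv", index=False)
# test.to_csv("Churn Testing.csv", index=False)
```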

Building Customer Churn Prediction Model

After loading the data into Power BI, you can see the Churn Training and Churn Testing datasets when you open Power Query Editor.

In your Churn Training dataset, select Transform > Run Python Script. You should see a Python editor where you can enter Python scripts.

Our goal is to predict the Exited column, which indicates whether a customer has left the bank (1) or not (0). This can be formulated as a classification problem. Let's write the following lines of code in the Python editor and select OK to run the model.
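
As a rough sketch of what such a training script can look like (not necessarily the exact script used here), the lines below call PyCaret's setup, create_model and save_model on the dataset variable that Power BI passes in. The file name churn_xgboost and the session_id value are my own placeholders. The helper build_setup_kwargs is hypothetical: it passes silent=True only when the installed PyCaret version accepts it, on the assumption that older 2.x releases need it to skip a confirmation prompt while newer releases no longer have that parameter.

```python
import inspect

TARGET = "Exited"
# Directory from this blog; the file name is a placeholder (PyCaret appends .pkl).
MODEL_PATH = "C:/Users/VisualBI/Desktop/churn_xgboost"


def build_setup_kwargs(setup_fn, target=TARGET, session_id=123):
    """Build keyword arguments for setup(), adding silent=True only if supported."""
    kwargs = {"target": target, "session_id": session_id}
    if "silent" in inspect.signature(setup_fn).parameters:
        kwargs["silent"] = True  # assumed to suppress the interactive prompt
    return kwargs


def train_and_save(data, model_path=MODEL_PATH):
    """Train an XGBoost classifier with PyCaret and save it to disk."""
    # Imported here so the helper above can be used without PyCaret installed.
    from pycaret.classification import create_model, save_model, setup

    setup(data, **build_setup_kwargs(setup))
    model = create_model("xgboost", verbose=False)
    save_model(model, model_path)
    return model


# Power BI defines `dataset` before running the script; elsewhere this is skipped.
if "dataset" in globals():
    train_and_save(dataset)
```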

The script creates the popular XGBoost model to predict the target label Exited. We then save the model as a pickle file in the directory C:/Users/VisualBI/Desktop/ so we can use it in the future.

Keep in mind that PyCaret handles most of the feature engineering and data splitting steps for you, which can be useful for business users without programming knowledge.

Then, we go back to our Churn Testing data, select Transform > Run Python Script, and run the following lines.
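
A scoring script along these lines might look like the sketch below, which uses PyCaret's load_model and predict_model. The model file name churn_xgboost is a placeholder of mine and must match whatever name you saved the model under. The helper find_label_column is hypothetical: I am assuming that, depending on the PyCaret version, the prediction column is called either Label or prediction_label, and the sketch renames it to Label in both cases.

```python
# Directory from this blog; the file name is a placeholder, given without .pkl.
MODEL_PATH = "C:/Users/VisualBI/Desktop/churn_xgboost"

# Column names assumed for the predicted label across PyCaret versions.
LABEL_COLUMNS = ("Label", "prediction_label")


def find_label_column(columns):
    """Return the name of the predicted-label column among `columns`."""
    for name in LABEL_COLUMNS:
        if name in columns:
            return name
    raise KeyError("no predicted-label column found")


def score(data, model_path=MODEL_PATH):
    """Load the saved model and append its predictions to `data`."""
    # Imported here so the helper above can be used without PyCaret installed.
    from pycaret.classification import load_model, predict_model

    model = load_model(model_path)
    scored = predict_model(model, data=data)
    return scored.rename(columns={find_label_column(scored.columns): "Label"})


# Power BI defines `dataset` before running the script; elsewhere this is skipped.
if "dataset" in globals():
    dataset = score(dataset)
```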

The script loads the model we previously trained and applies it to the Churn Testing dataset. The result is saved in the Label column. In the output, you can see the original label column Exited on the left and the predicted label column Label next to it.
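
To go beyond eyeballing the two columns, you could add one more Run Python Script step that measures how often Label agrees with Exited. The sketch below is my own addition rather than part of the original workflow; the function name accuracy and the summary table are assumptions, and it expects both columns to hold 0/1 values.

```python
def accuracy(actual, predicted):
    """Share of rows where the predicted label equals the actual label."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("cannot compute accuracy on empty input")
    # int() lets values such as "1" (text) and 1 (number) compare as equal.
    matches = sum(int(a) == int(p) for a, p in zip(actual, predicted))
    return matches / len(actual)


# Power BI defines `dataset` before running the script; elsewhere this is skipped.
if "dataset" in globals():
    import pandas as pd

    # A one-row table that can be selected as the output of this step.
    summary = pd.DataFrame(
        {"Accuracy": [accuracy(dataset["Exited"], dataset["Label"])]}
    )
```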

Conclusion

PyCaret enables business users and citizen data scientists to discover a deeper layer of advanced analytics. It is open source and easy to use, and it provides a wide range of functions within Power BI. It is an exciting machine learning library that can be helpful to business users with little programming and statistics knowledge.

Stay tuned for more blogs from me about Machine Learning and AI. Let us know if you are seeking additional guidance in planning your Power BI governance program. Read more blogs from the Power BI category here.


Copyright © Visual BI Solutions Inc.
