On any e-commerce website, a product's overall rating is built from individual customer reviews. These ratings and reviews have become a key factor in whether a customer buys a product. With Power BI's built-in web data source and the sentiment analysis offered through AI Insights, reviews can be collected, studied and analysed alongside the product specification, which helps a great deal with the decision-making process.
In this blog, we dynamically retrieve reviews of mobile phones from Amazon and run sentiment analysis on them to understand whether customers prefer the product.
Web scraping product details
When Web is selected as the data source, Power BI can retrieve the contents of any webpage for which we have the URL. Since we are interested in mobile phone reviews and product details, search for the required phone model on Amazon, copy the URL of the result, and enter it in the dialogue box that pops up when you select Web as your data source.
When you click OK, a connection to the webpage is established and the entire contents of the page at that URL are pulled into Power BI as an HTML source. As a result, the tables in the HTML become available as data sources.
For example, Table 5 displays ratings of the item:
However, we are interested in collecting customer reviews. These reviews are not available as HTML tables; they are stored in elements that share a certain HTML class. To retrieve such data, Power BI provides the Add Table Using Examples option.
When we select this option, we can add new columns and type a few sample values for each one. From these samples, Power BI infers where on the page the values come from and automatically retrieves the remaining values that sit under the same class. For example, after adding a new column named Reviews and entering a few sample values, Power BI fills in all the remaining rows automatically.
Likewise, we add two more columns: Ratings, and Review Text, which holds the comments the user wrote in each review. The same process can be repeated to extract more content from the page.
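To make the idea of pulling values "under the same class" concrete outside Power BI, here is a minimal Python sketch. It is not what Power BI runs internally; it only illustrates class-based extraction using the standard library. The class names (review-title, review-rating, review-text) and the sample HTML are made up for illustration and are not Amazon's real markup.

```python
from html.parser import HTMLParser


class ClassTextExtractor(HTMLParser):
    """Collect the text of every element that carries a given CSS class.

    Assumes reasonably well-formed HTML: every non-void tag has a closing tag.
    """

    # Void elements never get a closing tag, so they must not change the depth.
    VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}

    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.depth = 0      # nesting depth inside a matching element
        self.buffer = []    # text fragments of the current match
        self.results = []   # one cleaned string per matching element

    def handle_starttag(self, tag, attrs):
        if tag in self.VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth > 0:
            self.depth += 1          # nested tag inside a match
        elif self.target_class in classes:
            self.depth = 1           # a new match starts here
            self.buffer = []

    def handle_endtag(self, tag):
        if tag in self.VOID_TAGS or self.depth == 0:
            return
        self.depth -= 1
        if self.depth == 0:
            # Join fragments and collapse runs of whitespace.
            self.results.append(" ".join(" ".join(self.buffer).split()))

    def handle_data(self, data):
        if self.depth > 0:
            self.buffer.append(data)


def extract_by_class(html, target_class):
    """Return the text of all elements in `html` that have `target_class`."""
    parser = ClassTextExtractor(target_class)
    parser.feed(html)
    parser.close()
    return parser.results


# Hypothetical markup standing in for a product review page.
SAMPLE_HTML = """
<div class="review">
  <span class="review-title">Great battery</span>
  <span class="review-rating">5.0 out of 5 stars</span>
  <span class="review-text">Lasts two days.<br>Very happy.</span>
</div>
<div class="review">
  <span class="review-title">Poor camera</span>
  <span class="review-rating">2.0 out of 5 stars</span>
  <span class="review-text">Photos are <b>blurry</b> in low light.</span>
</div>
"""

if __name__ == "__main__":
    for column in ("review-title", "review-rating", "review-text"):
        print(column, extract_by_class(SAMPLE_HTML, column))
```

Each call plays the role of one column added by example: one class in, one list of values out.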
Once the source tables are ready, the next step is to apply sentiment analysis to the content scraped from Amazon. The Text Analytics option in the AI Insights feature of Power BI lets users score text for sentiment. It uses Azure Cognitive Services to obtain the sentiment score.
Sentiment analysis takes a text input and runs a machine learning algorithm that assigns a score ranging from 0 to 1. A score near 0 indicates negative sentiment and a score near 1 indicates positive sentiment. The model in Cognitive Services is trained on a collection of texts that are mapped to their corresponding sentiments. This trained model is used to assign the sentiment score to each extracted review.
The scores are generated in a new column and can be classified as positive (> 0.5), negative (< 0.5) or neutral (= 0.5).
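The thresholds above can be written down as a small function. This is a sketch in Python rather than the Power BI column itself, and the function name classify_sentiment is an assumption made for illustration.

```python
def classify_sentiment(score):
    """Map a sentiment score in [0, 1] to a label using the 0.5 threshold."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be between 0 and 1, got {score}")
    if score > 0.5:
        return "positive"
    if score < 0.5:
        return "negative"
    return "neutral"  # exactly 0.5


if __name__ == "__main__":
    # Made-up scores, only to show the three outcomes.
    for value in (0.92, 0.5, 0.13):
        print(value, classify_sentiment(value))
```

Because neutral is defined as exactly 0.5, very few real scores will land in it; a band around 0.5 would be a different design choice from the one described here.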
Enabling Parameters to the URL
As the final step, create a new parameter to modify the source URL dynamically. With a parameter in place of the fixed URL, the same information can be extracted for different products simply by changing the parameter value.
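The effect of the parameter can be sketched in Python as a URL built from a product identifier. The base URL and the identifiers below are placeholders chosen for illustration; the real URL pattern depends on the page you scraped and is not given here.

```python
from urllib.parse import quote

# Placeholder base URL, not a real review endpoint.
BASE_URL = "https://www.example.com/product-reviews"


def build_review_url(product_id):
    """Return the review page URL for one product.

    The product id is percent-encoded so that spaces or slashes in the
    parameter value cannot change the structure of the URL.
    """
    return f"{BASE_URL}/{quote(product_id, safe='')}"


if __name__ == "__main__":
    # Hypothetical product identifiers.
    for product_id in ("PHONE-123", "PHONE-456"):
        print(build_review_url(product_id))
```

Only the parameter changes between products; everything downstream (scraped columns, sentiment scoring, report visuals) stays the same.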
Sentiment Analysis Report
After the parameter has replaced the fixed source URL, a Power BI report can be built to analyse the sentiment scores obtained from the product reviews. The primary result, whether customers prefer the product, is shown in a gauge that ranges from 0 to 1 and displays the sentiment score. If the average sentiment score is greater than 0.5, the product has positive reviews among its customers. The review classification gives the share of positive and negative reviews, and the distribution of sentiment scores provides a more detailed view of the information summarised in the gauge and the donut chart. The ratings out of 5 given by customers are displayed in a pie chart, and the feature ratings obtained through web scraping are displayed in a column chart. Finally, a word cloud over the review texts highlights the key phrases. The size of each word reflects its frequency in the review texts, so the most prominent words can be identified at a glance.
This simple report with basic sentiment analysis can be run across multiple products by changing the URL, showing for each one whether customers prefer the product.
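The numbers behind the gauge, the review classification and the word cloud can be approximated with a short Python sketch. The function name summarise_reviews, the stop-word list and the sample reviews with their scores are all invented for illustration; in the report these values come from the scraped table and the AI Insights score column.

```python
import re
from collections import Counter
from statistics import mean

# Small illustrative stop-word list; a real word cloud would use a fuller one.
STOP_WORDS = {"a", "an", "and", "are", "in", "is", "it", "of", "the", "to"}


def summarise_reviews(reviews, top_n=2):
    """Summarise (text, score) pairs the way the report visuals do.

    average   -> value shown in the gauge
    preferred -> True when the average score is above 0.5
    labels    -> counts per class for the review classification
    top_words -> most frequent words, i.e. the largest words in the cloud
    """
    scores = [score for _, score in reviews]
    average = mean(scores)
    labels = Counter(
        "positive" if s > 0.5 else "negative" if s < 0.5 else "neutral"
        for s in scores
    )
    words = Counter(
        word
        for text, _ in reviews
        for word in re.findall(r"[a-z']+", text.lower())
        if word not in STOP_WORDS
    )
    return {
        "average": average,
        "preferred": average > 0.5,
        "labels": dict(labels),
        "top_words": words.most_common(top_n),
    }


# Made-up reviews and scores, for illustration only.
SAMPLE_REVIEWS = [
    ("Great battery and great screen", 0.9),
    ("Battery drains fast", 0.2),
    ("Average phone", 0.5),
]

if __name__ == "__main__":
    print(summarise_reviews(SAMPLE_REVIEWS))
```

Word frequency is a plain count here, which matches the description of the word cloud; it does not attempt key-phrase extraction.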