Understanding the Prediction Mechanism of Sentiments by XAI Visualization

by Chaehan So, et al.

People often rely on online reviews to make purchase decisions. The present work aimed to understand a machine learning model's prediction mechanism by using explainable AI (XAI) methodology to visualize the effect of sentiments extracted from online hotel reviews. Study 1 used the extracted sentiments as features to predict review ratings with five machine learning algorithms (k-nearest neighbors, CART decision trees, support vector machines, random forests, gradient boosting machines) and identified random forests as the best algorithm. Study 2 analyzed the random forest model by feature importance and revealed the sentiments joy, disgust, positive, and negative as the most predictive features. Furthermore, the visualization of additive variable attributions and their prediction distribution showed that the prediction was correct in direction and effect size for the 5-star rating, but partially wrong in direction and insufficient in effect size for the 1-star rating. These prediction details were corroborated by a what-if analysis of the four top features. In conclusion, the prediction mechanism of a machine learning model can be uncovered by visualizing particular observations. Comparing instances with contrasting ground truth values can draw a differential picture of the prediction mechanism and inform decisions for model improvement.
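The two instance-level analyses named in the abstract, additive variable attributions and what-if analysis, can be illustrated with a minimal, model-agnostic sketch. Everything in it is an assumption made for illustration: `toy_model` is a hypothetical stand-in scoring function (not the random forest from the paper), the background data are random numbers (not hotel reviews), and the sentiment scores are assumed to lie in [0, 1]. The attribution routine follows the general sequential "break-down" idea of fixing one feature at a time to the instance's value and recording the change in the mean prediction. It is not necessarily the exact procedure used in the study.

```python
import random
from statistics import mean

# The four top features named in the abstract; values are assumed to lie in [0, 1].
FEATURES = ["joy", "disgust", "positive", "negative"]


def toy_model(row):
    """Hypothetical stand-in for a trained rating model (NOT the paper's model)."""
    score = (3.0 + 2.0 * row["joy"] + 1.5 * row["positive"]
             - 2.0 * row["disgust"] - 1.5 * row["negative"])
    score -= 1.0 * row["joy"] * row["negative"]  # small interaction term
    return min(5.0, max(1.0, score))             # clip to the 1..5 star range


def break_down(predict, background, instance, order):
    """Sequential additive attributions for one instance.

    Starting from the mean prediction over the background data, each feature
    in `order` is fixed to the instance's value in every background row; the
    resulting change in the mean prediction is that feature's contribution.
    """
    rows = [dict(r) for r in background]  # copy so the caller's data is untouched
    baseline = mean(predict(r) for r in rows)
    contributions = {}
    previous = baseline
    for name in order:
        for r in rows:
            r[name] = instance[name]
        current = mean(predict(r) for r in rows)
        contributions[name] = current - previous
        previous = current
    return baseline, contributions


def what_if(predict, instance, name, grid):
    """What-if profile: vary one feature over `grid`, hold the others fixed."""
    return [(value, predict(dict(instance, **{name: value}))) for value in grid]


# Synthetic background data and one illustrative instance (both made up).
rng = random.Random(0)
background = [{f: rng.random() for f in FEATURES} for _ in range(200)]
instance = {"joy": 0.9, "disgust": 0.05, "positive": 0.9, "negative": 0.1}

baseline, contributions = break_down(toy_model, background, instance, FEATURES)
prediction = toy_model(instance)
profile = what_if(toy_model, instance, "joy", [0.0, 0.25, 0.5, 0.75, 1.0])

print(f"mean prediction over background: {baseline:.3f}")
for name, value in contributions.items():
    print(f"  {name:>8}: {value:+.3f}")
print(f"prediction for the instance:     {prediction:.3f}")
print("what-if profile for joy:", [(v, round(p, 3)) for v, p in profile])
```

By construction, the baseline plus all contributions equals the model's prediction for the instance, which is what makes the attributions additive. The contributions depend on the order in which features are fixed whenever the model contains interactions, so a different `order` can yield a different split.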


What Emotions Make One or Five Stars? Understanding Ratings of Online Product Reviews by Sentiment Analysis and XAI

When people buy products online, they primarily base their decisions on ...

Machine Learning Model of the Swift/BAT Trigger Algorithm for Long GRB Population Studies

To draw inferences about gamma-ray burst (GRB) source populations based ...

Yelp Dataset Challenge: Review Rating Prediction

Review websites, such as TripAdvisor and Yelp, allow users to post onlin...

A comparative study of forecasting Corporate Credit Ratings using Neural Networks, Support Vector Machines, and Decision Trees

Credit ratings are one of the primary keys that reflect the level of ris...

A Simple and Effective Model-Based Variable Importance Measure

In the era of "big data", it is becoming more of a challenge to not only...

Understanding Random Forests: From Theory to Practice

Data analysis and machine learning have become an integrative part of th...

Algorithmic Songwriting with ALYSIA

This paper introduces ALYSIA: Automated LYrical SongwrIting Application....