Confounds and Overestimations in Fake Review Detection: Experimentally Controlling for Product-Ownership and Data-Origin
The popularity of online shopping is steadily increasing. At the same time, fake product reviews are published widely and have the potential to affect consumer purchasing behavior. In response, previous work has developed automated methods for the detection of deceptive product reviews. However, studies vary considerably in terms of classification performance, and many use data that contain potential confounds, which makes it difficult to determine their validity. Two possible confounds are data-origin (i.e., the dataset is composed of more than one source) and product-ownership (i.e., reviews written by individuals who own or do not own the reviewed product). In the present study, we investigate the effect of both confounds on fake review detection. Using an experimental design, we manipulate data-origin, product-ownership, review polarity, and veracity. Supervised learning analysis suggests that review veracity alone (60.26 - 69.87%) is harder to classify than veracity confounded with product-ownership (66.19 - 74.17%) or with data-origin (up to 86.94%), and that classification is easiest when veracity is confounded with product-ownership and data-origin combined (87.78 - 88.12%), suggesting overestimations of the true performance in other work. These findings are moderated by review polarity.