A Comparative Study of Sentiment Analysis Methods for Detecting Fake Reviews in E-Commerce

Keywords: Fake Reviews Detection, GPT-2, NBSVM, BiLSTM, RoBERTa.

Authors

  • Maneerat Puttarattanamanee KMITL-Digital Analytics and Intelligence Center, Faculty of Science, King Mongkut's Institute of Technology Ladkrabang, Bangkok 10520, Thailand
  • Laor Boongasame Department of Mathematics, Faculty of Science, King Mongkut's Institute of Technology Ladkrabang, Bangkok 10520, Thailand; Business Innovation and Investment Laboratory: B2I-Lab, School of Science, King Mongkut's Institute of Technology Ladkrabang, Bangkok 10520, Thailand
  • Karanrat Thammarak
    kanchan.th@wu.ac.th
    Department of Computer Engineering and Electronics, School of Engineering and Technology, Walailak University, Nakhon Si Thammarat 80160, Thailand https://orcid.org/0000-0003-4694-6128
Vol. 4 No. 2 (2023): June
Research Articles

The popularity of e-commerce has increased, especially during the COVID-19 pandemic. Past consumer product reviews have a significant influence on consumers' purchasing decisions. Consequently, fake reviews (those written by humans or computers engaging in dishonest behavior) are generated to increase product sales. Such reviews are dishonest and harm consumers. The goal of this research is to examine and evaluate the performance of various methods for identifying fake reviews. The well-known and widely used Amazon Review Data (2018) dataset was used for this research. The first 10 product categories on Amazon.com with favorable feedback are presented in the data section. Fundamental data preparation procedures, such as special character trimming, bag of words, and TF-IDF, are then applied. The models are trained to create a dataset for detecting fake reviews. This research compares the performance of four models: GPT-2, NBSVM, BiLSTM, and RoBERTa. The hyperparameters of the models are also tuned to find their optimal values. The research concludes that RoBERTa performs best overall, with an accuracy of 97%, compared with 95% for NBSVM, 92% for BiLSTM, and 82% for GPT-2. The research also calculates the Area Under the Curve (AUC) for each model: RoBERTa has an AUC of 0.9976, NBSVM 0.9888, BiLSTM 0.9753, and GPT-2 0.9226. RoBERTa has the highest AUC value, which is close to 1. Therefore, it can be concluded that this model provides the most accurate prediction for detecting fake reviews, which is the main focus of this research.
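To make the preparation steps and the AUC metric named in the abstract concrete, the following is a minimal pure-Python sketch. It is not the authors' pipeline. The cleaning rule, the TF-IDF variant (term frequency as count divided by document length, inverse document frequency as ln(N/df)), the label convention (1 = fake, 0 = genuine), and all function names are assumptions chosen for illustration. The sample reviews and scores are invented toy values.

```python
import math
import re
from collections import Counter


def clean(text):
    """Lowercase, replace special characters with spaces, and split into tokens.

    The regular expression is an assumed cleaning rule, not the paper's.
    """
    return re.sub(r"[^a-z0-9\s]", " ", text.lower()).split()


def bag_of_words(docs):
    """Return one term-count Counter per document."""
    return [Counter(clean(d)) for d in docs]


def tf_idf(docs):
    """Weight each term by (count / doc length) * ln(N / document frequency)."""
    bows = bag_of_words(docs)
    n_docs = len(bows)
    # Document frequency: number of documents in which each term occurs.
    df = Counter(term for bow in bows for term in bow)
    weights = []
    for bow in bows:
        total = sum(bow.values())
        weights.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in bow.items()}
        )
    return weights


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive (label 1)
    receives a higher score than a randomly chosen negative (label 0);
    ties count as one half. Requires at least one example of each class.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy usage with invented data.
reviews = ["Great product!!! Works great.", "Terrible product, broke in a day."]
weights = tf_idf(reviews)
print(weights[0]["product"])  # 0.0, because the term occurs in every document
print(auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))  # 0.75
```

An AUC of 1 means every fake review is scored above every genuine one, and 0.5 corresponds to random ranking, which is why values such as 0.9976 indicate near-perfect separation. The pairwise loop is quadratic in the number of examples, so a rank-based implementation would be preferable on a dataset of realistic size.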

 

Doi: 10.28991/HIJ-2023-04-02-08
