A Hybrid Machine Learning Approach for Sentiment Analysis of Beauty Products Reviews
Subject areas: Machine learning
Kanika Jindal 1, Rajni Aron 2
1 - Lovely Professional University, Punjab, India
2 - SVKM's Narsee Monjee Institute of Management Studies, Mumbai, India
Keywords: Sentiment Analysis, Machine Learning, Beauty Products, Feature Extraction, Social Media
Abstract:
Social media platforms have become a mirror that reflects opinions and feelings about specific products and events. Product reviews posted there can improve communication between entrepreneurs and their customers, but they must first be extracted and analyzed to predict their sentiment polarity, i.e., whether a review is positive or negative. This paper aims to predict the sentiment expressed in beauty product reviews extracted from Amazon and to improve classification accuracy. Our work has three phases: data pre-processing, feature extraction using the Bag-of-Words (BoW) method, and sentiment classification using Machine Learning (ML) techniques. We propose a Global Optimization-based Neural Network (GONN) for sentiment classification. An empirical study then analyzes the performance of the proposed GONN and compares it with other machine learning algorithms: Random Forest (RF), Naive Bayes (NB), and Support Vector Machine (SVM). We further apply ten-fold cross-validation to these techniques to identify the most accurate classifier, and we examine the models on the Precision-Recall (PR) curve to assess which technique performs best. Experimental results indicate that the proposed GONN is the most suitable of the compared methods for sentiment classification on our dataset. Specifically, we show that our approach trains textual sentiment classifiers more effectively, thereby improving the accuracy of sentiment prediction.
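The three phases and the evaluation protocol described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the proposed GONN is not specified in this abstract, so the sketch uses multinomial Naive Bayes, one of the baselines named above, as a stand-in classifier. The toy reviews, the function names, and the choice of five folds (the paper uses ten) are all assumptions made for illustration only.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Pre-process a review (lowercase, keep word tokens) and count words."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def train_nb(docs, labels):
    """Collect the counts needed by a multinomial Naive Bayes classifier."""
    class_docs = Counter(labels)
    word_counts = {c: Counter() for c in class_docs}
    for doc, label in zip(docs, labels):
        word_counts[label].update(doc)
    vocab = set().union(*word_counts.values())
    return class_docs, word_counts, vocab


def predict_nb(model, doc):
    """Return the class with the highest add-one-smoothed log posterior."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for c in sorted(class_docs):
        score = math.log(class_docs[c] / total_docs)
        denom = sum(word_counts[c].values()) + len(vocab)
        for word, n in doc.items():
            if word in vocab:  # words unseen in training are ignored
                score += n * math.log((word_counts[c][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = c, score
    return best_label


def k_fold_accuracy(docs, labels, k):
    """Mean accuracy over k folds; fold f holds out items f, f+k, f+2k, ..."""
    accuracies = []
    for fold in range(k):
        train = [i for i in range(len(docs)) if i % k != fold]
        test = [i for i in range(len(docs)) if i % k == fold]
        model = train_nb([docs[i] for i in train], [labels[i] for i in train])
        hits = sum(predict_nb(model, docs[i]) == labels[i] for i in test)
        accuracies.append(hits / len(test))
    return sum(accuracies) / k


def precision_recall(y_true, y_pred, positive="pos"):
    """One point of a precision-recall analysis for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical toy reviews, NOT the Amazon dataset used in the paper.
REVIEWS = [
    ("love this lipstick great color", "pos"),
    ("great moisturizer love the smell", "pos"),
    ("amazing mascara works great", "pos"),
    ("love it amazing quality", "pos"),
    ("great product highly recommend", "pos"),
    ("terrible foundation broke me out", "neg"),
    ("awful smell waste of money", "neg"),
    ("terrible quality do not buy", "neg"),
    ("waste of money awful product", "neg"),
    ("bad mascara terrible clumps", "neg"),
]
DOCS = [bag_of_words(text) for text, _ in REVIEWS]
LABELS = [label for _, label in REVIEWS]
MODEL = train_nb(DOCS, LABELS)

if __name__ == "__main__":
    print("5-fold accuracy:", k_fold_accuracy(DOCS, LABELS, k=5))
    print("prediction:", predict_nb(MODEL, bag_of_words("love this great cream")))
```

On a real corpus the same structure would apply with `k=10`, and a full PR curve would be traced by sweeping a decision threshold over classifier scores rather than computing the single precision/recall point shown here.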