Numeric Polarity Detection Using Deep Recurrent Networks and Supervised Learning in Opinion Mining of Persian User Reviews in E-Commerce
Subject areas: Electrical and Computer Engineering
Sepideh Jamshidi-Nejad 1, Fatemeh Ahmadi-Abkenari 2, Peyman Bayat 3
1 - Islamic Azad University, Rasht Branch
2 - Payame Noor University, Rasht
3 - Islamic Azad University, Rasht Branch
Keywords: natural language processing, sentiment analysis, sentence polarity detection, spam opinion detection, deep neural networks, opinion mining
Abstract:
Opinion mining is a subfield of data mining that depends heavily on natural language processing. With the growth of e-commerce, it has become one of the popular topics in information retrieval. The field covers several sub-areas, such as polarity detection, aspect elicitation, and spam opinion detection. Although these sub-areas are implicitly interdependent, designing a comprehensive framework that includes all of them is very challenging. Most existing studies address the English language and, for sentiment analysis, consider only binary polarity without regard to the other influential sub-areas. Supervised machine learning is widely used for classifying opinions, and in recent years most studies have applied deep learning with various objectives. Because a comprehensive framework that attends to the influential sub-areas has received little attention in the literature, this paper uses opinion mining and natural language processing techniques to improve RSAD, a deep-learning-based framework previously developed by the authors for opinion mining on Persian-language user reviews. The improved framework detects sentence polarity at the aspect level in both binary and non-binary (numeric) form and covers all the sub-areas required for sentiment analysis. Comparison and evaluation of RSAD against existing approaches indicate its robustness.