Sentiment Analysis in Persian Social Media Using a Convolutional Neural Network Approach
Subject areas: Electrical and Computer Engineering
Morteza Rohanian 1, Mostafa Salehi 2, Ali Darzi 3, Vahid Ranjbar 4
1 - University of Tehran
2 - University of Tehran
3 - University of Tehran
4 - Computer Engineering
Keywords: sentiment analysis, social media, convolutional neural network, opinion intensity, short texts
Abstract:
Citizens' growing use of social media (such as Twitter, online stores, and the like) has turned these platforms into a vast resource for analyzing and understanding a wide range of phenomena. The goal of sentiment analysis is to use the data obtained from these media to uncover users' explicit and implicit attitudes toward specific entities mentioned in the text. In this work we use a convolutional neural network, a type of feedforward neural network, to analyze the polarity of opinions in social media at two and five levels, taking their intensity into account. In this network, the convolution operation is applied to the vectors of the input sentences using filters of different sizes, and the resulting feature vector is fed to a softmax layer for the final classification of the sentences. Convolutional neural networks with different parameters were evaluated using the area under the curve metric on a dataset collected from Persian social media, and the results show that they perform better on social media text than traditional machine learning methods, especially on shorter texts.
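The pipeline described above (filters of several widths sliding over the sentence's word vectors, with the resulting feature vector passed to a softmax layer) can be sketched as a plain-Python forward pass. This is only an illustrative sketch, not the authors' implementation: the ReLU activation, the max-over-time pooling step, the random untrained weights, and every name and size below (`conv_max_pool`, `classify`, `EMBED_DIM`, the filter widths 2, 3, and 4) are assumptions that the abstract does not specify.

```python
import math
import random


def conv_max_pool(sentence, filt, bias=0.0):
    """Slide one filter over the sentence and keep its largest ReLU response.

    sentence: list of word vectors; filt: list of `width` weight vectors.
    ReLU and max-over-time pooling are assumptions, not taken from the abstract.
    """
    width = len(filt)  # number of consecutive words the filter spans
    best = 0.0         # ReLU output is never below zero
    for start in range(len(sentence) - width + 1):
        window = sentence[start:start + width]
        score = bias + sum(
            w * x
            for f_row, x_row in zip(filt, window)
            for w, x in zip(f_row, x_row)
        )
        best = max(best, score)
    return best  # stays 0.0 if the sentence is shorter than the filter


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(sentence, filters, out_weights, out_bias):
    """One pooled feature per filter, then a softmax output layer."""
    features = [conv_max_pool(sentence, f) for f in filters]
    logits = [
        b + sum(w * f for w, f in zip(row, features))
        for row, b in zip(out_weights, out_bias)
    ]
    return softmax(logits)


# Toy setup with random, untrained weights (hypothetical sizes).
rng = random.Random(0)
EMBED_DIM, NUM_CLASSES = 4, 5  # five classes, as in the five-level setting
sentence = [[rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in range(7)]
filters = [
    [[rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in range(width)]
    for width in (2, 3, 4)  # filters of different sizes
    for _ in range(2)       # two filters per size
]
out_weights = [[rng.uniform(-1, 1) for _ in filters] for _ in range(NUM_CLASSES)]
out_bias = [0.0] * NUM_CLASSES

probs = classify(sentence, filters, out_weights, out_bias)
print(probs)  # five class probabilities that sum to 1
```

In a trained model the filter and output weights would be learned from labeled sentences; here they are random, so the printed probabilities carry no meaning beyond showing the data flow.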
With social media engagement on the rise, the resulting data can serve as a rich resource for analyzing and understanding different phenomena around us. A sentiment analysis system uses these data to find the attitude of social media users towards certain entities in a given document. In this paper we propose a sentiment analysis method for Persian text using a Convolutional Neural Network (CNN), a feedforward artificial neural network, that categorizes sentences into two and five classes (considering their intensity) by applying a convolution layer over the input data through different filters. We evaluated the method on three different datasets of Persian social media texts using the area under the curve (AUC) metric. The final results show the advantage of CNNs over earlier traditional machine learning methods for Persian sentiment classification, especially for short texts.
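For readers unfamiliar with the evaluation metric, the area under the ROC curve for a two-class task can be computed as the fraction of positive/negative pairs in which the positive example receives the higher score, with ties counted as half. The sketch below assumes binary labels encoded as 1 and 0; the function name `auc` and the toy data are hypothetical. The abstract does not say how the metric was extended to the five-class setting, so that case is not covered here.

```python
def auc(labels, scores):
    """Probability that a random positive is scored above a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy example: one of the four positive/negative pairs is ranked the wrong way.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.1]
print(auc(labels, scores))  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect separation of the two classes.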