Utilizing Gated Recurrent Units to Retain Long Term Dependencies with Recurrent Neural Network in Text Classification
Subject Areas: Natural Language Processing
Nidhi Chandra 1, Laxmi Ahuja 2, Sunil Kumar Khatri 3, Himanshu Monga 4
1 - Amity University Uttar Pradesh, Noida, India
2 - Amity University Uttar Pradesh, Noida, India
3 - Amity University, Tashkent
4 - Jawaharlal Nehru Government Engineering College, Noida, India
Keywords: Gated Recurrent Units, Recurrent Neural Network, Word Embedding, Deep Learning, LSTM
Abstract:
Text classification is one of the key research areas in natural language processing. Organizations receive customer reviews and feedback on their products and want to act on them quickly; manual review takes considerable time and effort and can hurt product sales, so many organizations have asked their IT teams to apply machine learning algorithms that process such text in near real time. Gated recurrent units (GRUs), an extension of the recurrent neural network that adds a gating mechanism to the network, provide one such approach. Recurrent neural networks (RNNs) have proven to be the leading option for sequence classification, as they retain information from past outputs and use it to adjust subsequent predictions. The GRU model mitigates the vanishing-gradient problem, allowing the network to learn long-term dependencies in text, which benefits use cases such as sentiment analysis. This paper presents a text classification technique in which sequential word embeddings are processed through the gated recurrent unit's sigmoid-controlled gates within a recurrent neural network. The method embeds text into fixed-size matrices and explicitly propagates long-term dependencies through the network. Applied to a movie review dataset, the GRU model achieves a classification accuracy of 87%.
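The pipeline the abstract describes, a fixed-size embedding of each token sequence fed through a GRU whose final hidden state passes through a sigmoid to produce a binary sentiment score, can be sketched as below. This is a minimal illustrative sketch in PyTorch (the paper specifies no framework); the vocabulary size, embedding dimension, and hidden size are assumptions, not the paper's reported configuration.

```python
# Minimal sketch of a GRU-based binary text classifier:
# token ids -> embedding -> GRU -> final hidden state -> sigmoid score.
import torch
import torch.nn as nn

class GRUTextClassifier(nn.Module):
    def __init__(self, vocab_size=10000, embed_dim=100, hidden_dim=128):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.gru = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, 1)

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)       # (batch, seq_len, embed_dim)
        _, hidden = self.gru(embedded)             # hidden: (1, batch, hidden_dim)
        return torch.sigmoid(self.fc(hidden[-1]))  # (batch, 1), each in (0, 1)

model = GRUTextClassifier()
batch = torch.randint(0, 10000, (4, 50))  # 4 reviews, 50 token ids each
scores = model(batch)
print(scores.shape)  # torch.Size([4, 1])
```

In practice such a model would be trained with binary cross-entropy against the review labels; the GRU's update and reset gates (both sigmoid-activated) are what let the final hidden state carry information from tokens far back in the sequence.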
[1] Maas, Andrew L., Raymond E. Daly, Peter T. Pham, Dan Huang, Andrew Y. Ng, and Christopher Potts. "Learning word vectors for sentiment analysis." In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, 2011, pp. 142-150.
[2] Bengio, Yoshua, Patrice Simard, and Paolo Frasconi. "Learning long-term dependencies with gradient descent is difficult." IEEE Transactions on Neural Networks 5, no. 2, 1994, pp. 157-166.
[3] Bengio, Yoshua, Aaron Courville, and Pascal Vincent. "Representation learning: A review and new perspectives." IEEE Transactions on Pattern Analysis and Machine Intelligence 35, no. 8, 2013, pp. 1798-1828.
[4] Athiwaratkun, Ben, and Jack W. Stokes. "Malware classification with LSTM and GRU language models and a character-level CNN." In 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, 2017, pp. 2482-2486.
[5] Chen, Huimin, Maosong Sun, Cunchao Tu, Yankai Lin, and Zhiyuan Liu. "Neural sentiment classification with user and product attention." In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016, pp. 1650-1659.
[6] Chen, Peng, Zhongqian Sun, Lidong Bing, and Wei Yang. "Recurrent attention network on memory for aspect sentiment analysis." In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017, pp. 452-461.
[7] Cho, Kyunghyun, Bart Van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. "Learning phrase representations using RNN encoder-decoder for statistical machine translation." 2014, arXiv preprint arXiv:1406.1078.
[8] Chung, Junyoung, Caglar Gulcehre, KyungHyun Cho, and Yoshua Bengio. "Empirical evaluation of gated recurrent neural networks on sequence modeling." 2014, arXiv preprint arXiv:1412.3555.
[9] Zhou, Chunting, Chonglin Sun, Zhiyuan Liu, and Francis Lau. "A C-LSTM neural network for text classification." 2015, arXiv preprint arXiv:1511.08630.
[10] Collobert, Ronan, Jason Weston, Léon Bottou, Michael Karlen, Koray Kavukcuoglu, and Pavel Kuksa. "Natural language processing (almost) from scratch." Journal of Machine Learning Research 12, 2011, pp. 2493-2537.
[11] Tang, Duyu, Furu Wei, Nan Yang, Ming Zhou, Ting Liu, and Bing Qin. "Learning sentiment-specific word embedding for twitter sentiment classification." In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, Volume 1, 2014, pp. 1555-1565.
[12] Hemmatian, Fatemeh, and Mohammad Karim Sohrabi. "A survey on classification techniques for opinion mining and sentiment analysis." Artificial Intelligence Review 52, no. 3, 2019, pp. 1495-1545.
[13] Liu, Guolong, Xiaofei Xu, Bailong Deng, Siding Chen, and Li Li. "A hybrid method for bilingual text sentiment classification based on deep learning." In 2016 17th IEEE/ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD), IEEE, 2016, pp. 93-98.
[14] Guan, Ziyu, Long Chen, Wei Zhao, Yi Zheng, Shulong Tan, and Deng Cai. "Weakly-supervised deep learning for customer review sentiment classification." In IJCAI, 2016, pp. 3719-3725.
[15] Goldberg, Yoav. "A primer on neural network models for natural language processing." Journal of Artificial Intelligence Research 57, 2016, pp. 345-420.
[16] Wang, H., and Dequn Zhao. "Emotion analysis of microblog based on emotion dictionary and Bi-GRU." In 2020 Asia-Pacific Conference on Image Processing, Electronics and Computers (IPEC), IEEE, 2020, pp. 197-200.
[17] Pennington, J., R. Socher, and C. D. Manning. "GloVe: Global vectors for word representation." In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, 2014, pp. 1532-1543.
[18] Jabreel, Mohammed, Fadi Hassan, and Antonio Moreno. "Target-dependent sentiment analysis of tweets using bidirectional gated recurrent neural networks." In Advances in Hybridization of Intelligent Methods, Springer, Cham, 2018, pp. 39-55.
[19] Chung, Junyoung, Caglar Gulcehre, Kyunghyun Cho, and Yoshua Bengio. "Gated feedback recurrent neural networks." In International Conference on Machine Learning, PMLR, 2015, pp. 2067-2075.
[20] Johnson, Rie, and Tong Zhang. "Effective use of word order for text categorization with convolutional neural networks." 2014, arXiv preprint arXiv:1412.1058.
[21] Zhang, L., S. Wang, and B. Liu. "Deep learning for sentiment analysis: A survey." Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 8, no. 4, 2018, e1253.
[22] Le, Quoc, and Tomas Mikolov. "Distributed representations of sentences and documents." In International Conference on Machine Learning, PMLR, 2014, pp. 1188-1196.
[23] Li, Cheng, Xiaoxiao Guo, and Qiaozhu Mei. "Deep memory networks for attitude identification." In Proceedings of the Tenth ACM International Conference on Web Search and Data Mining, 2017, pp. 671-680.
[24] Luo, L. X. "Network text sentiment analysis method combining LDA text representation and GRU-CNN." Personal and Ubiquitous Computing 23, no. 3, 2019, pp. 405-412.
[25] Zulqarnain, M., R. Ghazali, and M. G. Ghouse. "Efficient processing of GRU based on word embedding for text classification." International Journal of Informatics Visualisation 3, no. 4, 2019, pp. 377-383.
[26] Salur, M. U., and I. Aydin. "A novel hybrid deep learning model for sentiment classification." IEEE Access 8, 2020, pp. 58080-58093.
[27] Moraes, Rodrigo, João Francisco Valiati, and Wilson P. Gavião Neto. "Document-level sentiment classification: An empirical comparison between SVM and ANN." Expert Systems with Applications 40, no. 2, 2013, pp. 621-633.
[28] Yildirim, Ö. "A novel wavelet sequence based on deep bidirectional LSTM network model for ECG signal classification." Computers in Biology and Medicine 96, 2018, pp. 189-202.
[29] Ni, R., and H. Cao. "Sentiment analysis based on GloVe and LSTM-GRU." In 2020 39th Chinese Control Conference (CCC), IEEE, 2020, pp. 7492-7497.
[30] Yan, S., J. Chai, and L. Wu. "Bidirectional GRU with multi-head attention for Chinese NER." In 2020 IEEE 5th Information Technology and Mechatronics Engineering Conference (ITOEC), IEEE, 2020, pp. 1160-1164.
[31] Yang, S., X. Yu, and Y. Zhou. "LSTM and GRU neural network performance comparison study: Taking Yelp review dataset as an example." In 2020 International Workshop on Electronic Communication and Artificial Intelligence (IWECAI), IEEE, 2020, pp. 98-101.
[32] Tang, Duyu, Bing Qin, and Ting Liu. "Document modeling with gated recurrent neural network for sentiment classification." In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 2015, pp. 1422-1432.
[33] Tang, Duyu, Bing Qin, and Ting Liu. "Learning semantic representations of users and products for document level sentiment classification." In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, Volume 1, 2015, pp. 1014-1023.
[34] Tai, Kai Sheng, Richard Socher, and Christopher D. Manning. "Improved semantic representations from tree-structured long short-term memory networks." 2015, arXiv preprint arXiv:1503.00075.
[35] Thet, Tun Thura, Jin-Cheon Na, and Christopher S. G. Khoo. "Aspect-based sentiment analysis of movie reviews on discussion boards." Journal of Information Science 36, no. 6, 2010, pp. 823-848.
[36] Mikolov, T., Ilya Sutskever, Kai Chen, Greg Corrado, and Jeffrey Dean. "Distributed representations of words and phrases and their compositionality." 2013, arXiv preprint arXiv:1310.4546.
[37] Turian, Joseph, Lev Ratinov, and Yoshua Bengio. "Word representations: a simple and general method for semi-supervised learning." In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, 2010, pp. 384-394.
[38] Pan, Y., and M. Liang. "Chinese text sentiment analysis based on Bi-GRU and self-attention." In 2020 IEEE 4th Information Technology, Networking, Electronic and Automation Control Conference (ITNEC), IEEE, 2020, vol. 1, pp. 1983-1988.
[39] Zhou, Zhiyuan, Yuan Qi, Zheng Liu, Chengchao Yu, and Zhihua Wei. "A C-GRU neural network for rumors detection." In 2018 5th IEEE International Conference on Cloud Computing and Intelligence Systems (CCIS), IEEE, 2018, pp. 704-708.
[40] Zhang, Wei, Quan Yuan, Jiawei Han, and Jianyong Wang. "Collaborative multi-level embedding learning from reviews for rating prediction." In IJCAI, vol. 16, 2016, pp. 2986-2992.
[41] Dataset: Large Movie Review Dataset v1.0, by Andrew L. Maas, Raymond E. Daly, Peter T. Pham, Dan Huang, Andrew Y. Ng, and Christopher Potts, 2011. Available at http://ai.stanford.edu/~amaas/data/sentiment/.