A New Model Based on the Encoder-Decoder Architecture and an Attention Mechanism for Automatic Abstractive Text Summarization


Hassan Aliakbarpour ¹, Mohammad Taghi Manzuri Shalmani ², Amir Masoud Rahmani ³

1 - PhD student, Department of Computer Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran
2 - Associate Professor, Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
3 - Professor, Department of Computer Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran

Submission date: Wednesday, 18 Sha'ban 1442; Acceptance date: Saturday, 21 Jumada al-Awwal 1443; Publication date: Wednesday, 18 Safar 1444

Keywords: deep learning, abstractive summarization, encoder-decoder architecture, auxiliary attention mechanism, linguistic features

Abstract:

With the expansion of the Web and the availability of large volumes of information as text documents, the development of automatic text summarization systems has become a focus of researchers as one of the important topics in natural language processing. With the introduction of deep learning methods to text processing, summarization has entered a new phase of development, and abstractive summarization in particular has advanced considerably in recent years. Nevertheless, the full capacity of deep networks has arguably not yet been exploited for this task, and progress that also takes cognitive features into account is still needed. To this end, this paper introduces a sequence-to-sequence model equipped with an auxiliary attention mechanism for abstractive text summarization. The model not only uses a combination of linguistic features and embedding vectors as input to the learning model but also, unlike previous studies that have always applied the attention mechanism in the decoder, applies an auxiliary attention mechanism in the encoder. This auxiliary attention mechanism, inspired by how the human mind works when producing a summary, encodes only the more important parts of the input text rather than the whole text and passes them to the decoder to generate the summary. The proposed model also uses a switch with a threshold in the decoder to overcome the rare-word problem. The model was evaluated on two datasets, CNN/Daily Mail and DUC-2004. According to the experimental results and the ROUGE evaluation metric, the proposed model achieves higher accuracy than other existing methods for generating abstractive summaries on both datasets.
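For readers unfamiliar with the ROUGE metric mentioned in the abstract, the following is a minimal sketch of ROUGE-N computed from clipped n-gram overlap between a generated summary and a reference summary. It is an illustration only, not the evaluation code used in the paper: the function names `ngrams` and `rouge_n` are hypothetical, tokenization is plain lowercased whitespace splitting, and the stemming and stopword options of the official ROUGE package are omitted.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter (multiset) of the n-grams in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Compute simplified ROUGE-N recall, precision and F1.

    Assumption: both inputs are plain strings, tokenized by lowercasing
    and splitting on whitespace (no stemming, no stopword removal).
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Counter intersection keeps the minimum count per n-gram,
    # which gives the clipped number of matching n-grams.
    overlap = sum((cand & ref).values())
    recall = overlap / sum(ref.values()) if ref else 0.0
    precision = overlap / sum(cand.values()) if cand else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"recall": recall, "precision": precision, "f1": f1}


if __name__ == "__main__":
    generated = "the cat sat on the mat"
    gold = "the cat lay on the mat"
    print(rouge_n(generated, gold, n=1))
    print(rouge_n(generated, gold, n=2))
```

In this toy pair, five of the six reference unigrams and three of the five reference bigrams are matched, so ROUGE-1 recall is 5/6 and ROUGE-2 recall is 3/5.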

