Text Generation with an Ensemble Approach Based on Generative Adversarial Networks


Ehsan Montahaei ¹, Mahdieh Soleymani Baghshah ²

1 - Sharif University of Technology
2 - Sharif University of Technology

Submitted: Sunday, 27 Shawwal 1440. Accepted: Sunday, 27 Shawwal 1440. Published: Monday, 20 Jumada al-Awwal 1442.

Keywords: text generation, generative model, GAN networks, ensemble learning

Abstract:

Text generation is one of the important problems in natural language processing. Baseline methods in this area suffer from problems such as a mismatch between the data seen during training and the data seen at test time, as well as an unsuitable objective function. In recent years, generative adversarial networks (GANs) have driven substantial progress in image generation, which has recently drawn attention to using GANs for text generation as well. However, because sequences are discrete, this is not straightforward and requires workarounds such as reinforcement learning or approximation. In addition, the instability of GANs introduces new challenges and increases the complexity of the problem. In this work, we propose a new ensemble method for text generation that builds on the idea of GANs. The proposed method is based on estimating the probability density ratio, and as a result it is not affected by the discreteness of sequences. Compared with GAN-based methods for sequences, the proposed approach trains more stably and does not suffer from exposure bias. Experiments on well-known text generation datasets show that the proposed method outperforms previous methods.
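The abstract does not spell out how the density ratio is estimated, so the following is only a minimal sketch of one standard identity that links discriminators to density ratios, not the authors' method. For an optimal discriminator D(x) = p(x) / (p(x) + q(x)), the ratio p(x)/q(x) equals D(x) / (1 - D(x)). The two toy distributions and all function names below are hypothetical and chosen purely for illustration.

```python
# Toy distributions over three made-up tokens (illustrative assumptions only).
p_data = {"a": 0.5, "b": 0.3, "c": 0.2}     # hypothetical data distribution
q_model = {"a": 0.25, "b": 0.25, "c": 0.5}  # hypothetical generator distribution


def optimal_discriminator(x):
    """Probability that x came from the data rather than the model."""
    return p_data[x] / (p_data[x] + q_model[x])


def density_ratio_from_discriminator(d):
    """Recover p(x)/q(x) from the discriminator output d = D(x)."""
    return d / (1.0 - d)


# The recovered ratio matches p_data[x] / q_model[x] for every token.
ratios = {
    x: density_ratio_from_discriminator(optimal_discriminator(x))
    for x in p_data
}
```

Because the ratio is computed on whole discrete samples, nothing here needs a gradient to flow through the sampling step, which is the property the abstract attributes to a density-ratio formulation.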

