Design and Implementation of a Text-to-Speech System for the Kurdish Language and Its Qualitative Evaluation

Wafa Barkhoda ¹, Anvar Bahrampour ², Fardin Akhlaghian ³, Hesham Faili ⁴

1 - University of Kurdistan
2 - Islamic Azad University, Sanandaj Branch
3 - University of Kurdistan
4 - University of Tehran

Submitted: Wednesday, 13 Safar 1437. Accepted: Saturday, 16 Safar 1437. Published: Monday, 9 Rajab 1431.

Keywords: prosodic analysis, diphone, Kurdish language, text-to-speech system, concatenative synthesis system, syllable, allophone

Abstract:

This paper introduces the first text-to-speech (TTS) system implemented for the Kurdish language. Kurdish has two common scripts, one based on the Arabic alphabet and one on the Latin alphabet. In the text-analysis stage, in addition to resolving the ambiguities common to text in general, the problems specific to each of the two scripts are examined. A set of standard symbols is also defined, and the system can convert input text in either script into a string of these symbols. Pitch contours for the different sentence types of Kurdish are also studied for the first time. In the speech-generation stage, three systems are implemented, based on allophone, syllable, and diphone units respectively. Four tests are used to assess the quality of these systems and to compare them with one another: the mean opinion score (MOS), an intelligibility test, the diagnostic rhyme test (DRT), and the modified rhyme test (MRT). The results indicate high intelligibility for all three systems, and for the diphone-based system in particular.
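
Of the four tests named in the abstract, MOS is the simplest to state: each listener rates an utterance on a 1 to 5 scale, and the score of a system is the arithmetic mean of its ratings. The following Python sketch shows that computation only. The system names and ratings are invented placeholders, not data from the paper, and the function name `mean_opinion_score` is an assumption made for illustration.

```python
from statistics import mean


def mean_opinion_score(ratings):
    """Return the MOS: the arithmetic mean of listener ratings on a 1-5 scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must lie between 1 and 5")
    return mean(ratings)


# Hypothetical ratings, invented for illustration; not results from the paper.
ratings_by_system = {
    "system_a": [2, 3, 3, 2],
    "system_b": [3, 3, 4, 3],
    "system_c": [4, 4, 3, 4],
}

# One MOS value per system, so the systems can be compared side by side.
mos_by_system = {
    name: mean_opinion_score(ratings)
    for name, ratings in ratings_by_system.items()
}
```

Comparing the entries of `mos_by_system` is the kind of side-by-side comparison the abstract describes for the allophone, syllable, and diphone systems. DRT and MRT are scored differently (from listeners' word-identification choices) and are not covered by this sketch.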

