Enhancing Noisy Narrowband Speech by Combining Vector Taylor Series and Bandwidth Extension Algorithms

Subjects: Electrical and Computer Engineering

Sara Pourmohammadi ¹, Mansour Vali ², Mohsen Ghadyani ³

1 - Shahed University
2 - Electrical Engineering
3 - Shahed University

Submitted: Sunday, 17 Safar 1437. Accepted: Sunday, 17 Safar 1437. Published: Friday, 12 Sha'ban 1434.

Keywords: vector Taylor series, bandwidth extension, noisy narrowband speech, Gaussian mixture model

Abstract:

This paper combines two approaches, vector Taylor series (VTS) and artificial bandwidth extension, into a new method for enhancing narrowband speech degraded by noise. First, the MFCC features extracted from the noisy narrowband speech are corrected with the VTS method. A GMM-based bandwidth extension model then estimates the wideband speech feature vectors from these corrected parameters. Two measures, PESQ and LSD, are then used to assess how closely the estimated wideband spectral envelope and speech signal match the reference wideband spectral envelope and the clean reference speech. The results obtained from implementing this algorithm indicate that the proposed method improves the quality of the feature vectors of noise-contaminated narrowband speech and brings them closer to the feature vectors of the reference wideband speech signal.
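As an illustration of the LSD measure named in the abstract, the following is a minimal sketch of one common definition of log-spectral distance: the per-frame root-mean-square difference between two log power spectra in dB, averaged over frames. The abstract does not give the exact formula used in the paper, so the definition, the function name `log_spectral_distance`, and the `floor` parameter are assumptions made here for illustration only.

```python
import math


def log_spectral_distance(ref_frames, est_frames, floor=1e-12):
    """Mean per-frame RMS difference, in dB, between two power-spectrum sequences.

    ref_frames, est_frames: lists of frames, each frame a list of power values
    (one per frequency bin). `floor` is a hypothetical guard against log(0).
    This is one common LSD definition, not necessarily the paper's.
    """
    if len(ref_frames) != len(est_frames):
        raise ValueError("frame counts differ")
    if not ref_frames:
        raise ValueError("no frames given")

    total = 0.0
    for ref, est in zip(ref_frames, est_frames):
        if len(ref) != len(est) or not ref:
            raise ValueError("frames must be non-empty and have equal bin counts")
        squared_sum = 0.0
        for p_ref, p_est in zip(ref, est):
            # Difference of the two log power spectra at this bin, in dB.
            diff_db = (10.0 * math.log10(max(p_ref, floor))
                       - 10.0 * math.log10(max(p_est, floor)))
            squared_sum += diff_db * diff_db
        # RMS over frequency bins for this frame.
        total += math.sqrt(squared_sum / len(ref))
    # Average over frames.
    return total / len(ref_frames)
```

Under this definition, identical spectra give a distance of 0 dB, and an estimate whose power is ten times the reference in every bin gives about 10 dB. Lower values mean the estimated wideband envelope is closer to the reference.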

