Improving the accuracy of the GMM within the GMM-VSM system for spoken language recognition

Subject Areas : General

Fahimeh Ghasemian ¹, Mohamad Mahdi Homaion por ²

Received: 2010-04-09 Accepted: 2010-04-09 Published: 2011-05-03

Abstract :

The Gaussian mixture model (GMM) is one of the most widely used and successful models in automatic language recognition. This article presents a new model called Adapted Weight-GMM (AW-GMM). It is similar to a GMM, except that, within the GMM-VSM system, the weight of each component is determined by how strongly that component discriminates one language from the others. In addition, because the GMM-VSM system becomes computationally expensive when sequences of two components are considered, a technique for constructing two-component sequences is presented; the same technique can also be used to construct higher-order sequences. Evaluations on four languages (English, Persian, French and German) from the OGI data show the effectiveness of the presented techniques.
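
As a rough illustration of why component sequences become costly, the sketch below assumes a common reading of a GMM-VSM front end: each frame is labeled with its highest-scoring GMM component, and an utterance is represented by the relative frequencies of component sequences. This is an assumption for illustration only. It is not the AW-GMM weighting or the sequence-construction technique of the article, and the function names and toy parameters are hypothetical.

```python
import math
from collections import Counter


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at point x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def tokenize(frames, weights, means, variances):
    """Map each frame to the index of its highest-scoring GMM component."""
    tokens = []
    for x in frames:
        scores = [
            math.log(w) + log_gauss_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)
        ]
        tokens.append(max(range(len(scores)), key=scores.__getitem__))
    return tokens


def sequence_vector(tokens, n_components, order=2):
    """Relative frequencies of all length-`order` component sequences.

    The vector has n_components ** order entries, which is why
    higher orders quickly become expensive.
    """
    counts = Counter(
        tuple(tokens[i:i + order]) for i in range(len(tokens) - order + 1)
    )
    total = sum(counts.values())
    vec = [0.0] * (n_components ** order)
    for seq, c in counts.items():
        # Encode the sequence as a base-n_components integer index.
        index = 0
        for t in seq:
            index = index * n_components + t
        vec[index] = c / total
    return vec


# Toy 2-component, 1-dimensional GMM (hypothetical parameters).
weights = [0.5, 0.5]
means = [[0.0], [4.0]]
variances = [[1.0], [1.0]]
frames = [[0.1], [3.9], [4.2], [-0.3]]

tokens = tokenize(frames, weights, means, variances)
vec = sequence_vector(tokens, n_components=2, order=2)
```

With M components, the order-2 vector already has M² entries, so a GMM with a few hundred components yields tens of thousands of dimensions per utterance. In a full system, a vector of this kind would typically be passed to a back-end classifier.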

