Adoption Speaker Recognition System using Mel-frequency Cepstral Coefficients

Reda Elbarougy, G. Behery, Hanan A. Algrbaa


Traditional speaker recognition systems depend on extracting features from the speech signal. Researchers have adopted several methods to extract these features, and one of the best known is Mel-frequency cepstral coefficients (MFCC). The number of features extracted using MFCC usually ranges from 2 to 13, which was not enough to deal with our problem. This paper presents an adoptive MFCC that extends this range to between 2 and 91 features. Having more than the usual 13 features gave a much greater opportunity to distinguish between speakers. The extracted features were used to classify speakers using GMM and NNT. The proposed system is very efficient, and some speakers were fully recognized.
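As a rough illustration of how the number of MFCC features can be treated as a parameter, the sketch below computes MFCCs with NumPy only. It is not the authors' adoptive MFCC, whose details are not given in the abstract. The function names, the 16 kHz sample rate, the 25 ms frame, the 10 ms hop, the 1024-point FFT, and the filter counts are all assumptions chosen for the example. The one general point it shows is that the number of coefficients cannot exceed the number of mel filters, so going beyond 13 features requires a larger filterbank.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale (assumed design)."""
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0),
                                  hz_to_mel(sample_rate / 2.0),
                                  n_filters + 2))
    freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        left, center, right = edges[i:i + 3]
        rising = (freqs - left) / (center - left)
        falling = (right - freqs) / (right - center)
        fbank[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fbank

def mfcc(signal, sample_rate=16000, n_coeffs=13, n_filters=26,
         frame_len=400, hop=160, n_fft=1024):
    """Return an array of shape (n_frames, n_coeffs)."""
    if not 1 <= n_coeffs <= n_filters:
        raise ValueError("n_coeffs must be between 1 and n_filters")
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    # Slice the signal into overlapping, Hamming-windowed frames.
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    # Power spectrum, mel filterbank energies, then log compression.
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    fbank = mel_filterbank(n_filters, n_fft, sample_rate)
    log_energy = np.log(power @ fbank.T + 1e-10)  # epsilon avoids log(0)
    # Unnormalized DCT-II, keeping only the first n_coeffs coefficients.
    n = np.arange(n_filters)
    k = np.arange(n_coeffs)[:, None]
    dct_basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_filters))
    return log_energy @ dct_basis.T

# Usage: one second of synthetic noise stands in for a speech recording.
rng = np.random.default_rng(0)
audio = rng.standard_normal(16000)
baseline = mfcc(audio, n_coeffs=13, n_filters=26)
extended = mfcc(audio, n_coeffs=91, n_filters=96)
print(baseline.shape, extended.shape)  # (98, 13) (98, 91)
```

Each row of the returned array is the feature vector for one frame. In a pipeline like the one described here, these vectors would be passed to the classifier, for example one Gaussian mixture model per speaker.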





