Development of a Low-Cost Security System Based on Voice Recognition Using Artificial Intelligence

Willians Jeremy Luna Condori, Emily Juliana Mamani Macedo, Alex Leon Ppacco Huamani, Jesús Talavera S., Jarelh Galdos

Citation:

Willians Jeremy Luna Condori, Emily Juliana Mamani Macedo, Alex Leon Ppacco Huamani, Jesús Talavera S., Jarelh Galdos, "Development of a Low-Cost Security System Based on Voice Recognition Using Artificial Intelligence," International Journal of Electrical and Electronics Engineering, vol. 11, no. 6, pp. 351-358, 2024. Crossref, https://doi.org/10.14445/23488379/IJEEE-V11I6P135

Abstract

Voice recognition is widely used in many applications, especially in security. In this paper, we propose a low-cost security system based on voice recognition using artificial intelligence. The system runs on a Raspberry Pi 4B single-board computer and is programmed in Python. It works with a pre-recorded database of voices from 20 people. We extracted Mel-Frequency Cepstral Coefficients (MFCC) from the recorded voices and used them to train a Gaussian Mixture Model (GMM); a new user’s voice is then matched against the pre-recorded voices using the GMM. The system achieved an accuracy rate of 95.42%, with an equal error rate of 4.57%. The proposed system is low-cost and easy to use, making it accessible to a wider audience. However, it has some limitations, such as only being able to work with a pre-recorded database of voices.
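The matching step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one diagonal-covariance GMM per enrolled speaker, fitted with scikit-learn's GaussianMixture, and it uses synthetic random vectors in place of real MFCC frames so that it runs without audio files. The feature dimension (13), the number of mixture components (4), the speaker names, and the helper functions are all hypothetical choices; the abstract does not state them.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

N_MFCC = 13        # assumed feature dimension (not given in the abstract)
N_COMPONENTS = 4   # assumed number of Gaussians per speaker model


def train_speaker_models(features_by_speaker, n_components=N_COMPONENTS, seed=0):
    """Fit one diagonal-covariance GMM per enrolled speaker."""
    models = {}
    for name, frames in features_by_speaker.items():
        gmm = GaussianMixture(
            n_components=n_components,
            covariance_type="diag",
            random_state=seed,
        )
        gmm.fit(frames)  # frames has shape (n_frames, N_MFCC)
        models[name] = gmm
    return models


def identify(models, frames):
    """Return the speaker whose GMM gives the highest mean log-likelihood."""
    # GaussianMixture.score returns the average log-likelihood per frame.
    scores = {name: gmm.score(frames) for name, gmm in models.items()}
    best = max(scores, key=scores.get)
    return best, scores


def synthetic_mfcc(rng, center, n_frames=300):
    """Stand-in for real MFCC frames: a noisy cloud around a speaker center."""
    return center + rng.normal(scale=1.0, size=(n_frames, N_MFCC))


rng = np.random.default_rng(0)

# Three hypothetical enrolled speakers (the paper's database has 20).
centers = {f"speaker_{i}": rng.normal(scale=5.0, size=N_MFCC) for i in range(3)}
enrolled = {name: synthetic_mfcc(rng, c) for name, c in centers.items()}
models = train_speaker_models(enrolled)

# A new utterance from speaker_1 is scored against every enrolled model.
probe = synthetic_mfcc(rng, centers["speaker_1"], n_frames=100)
best, scores = identify(models, probe)
print(best)
```

In a real deployment, the frames would come from MFCC extraction on recorded audio, and a score threshold would be needed to reject speakers who are not in the database. Neither step is shown here.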

Keywords

Voice recognition, Security system, Gaussian mixture model, Mel-frequency cepstral coefficients, Low-cost biometric systems.
