2024 : 5 : 4
Samira Mavaddati

Academic rank: Assistant Professor
ORCID:
Education: PhD.
ScopusId:
Faculty: Faculty of Technology and Engineering
Address: University of Mazandaran
Phone: 011-35305126

Research

Title
Voice-based age, gender, and language recognition based on ResNet deep model and transfer learning in spectro-temporal domain
Type
JournalPaper
Keywords
Gender recognition; Voice-based age recognition; Convolutional neural network; Transfer learning; ResNet model
Year
2024
Journal Neurocomputing
DOI
Researchers Samira Mavaddati

Abstract

In personal identity recognition systems, detecting a person's age, gender, and language from voice signal characteristics is a crucial issue, especially given the importance of security considerations. Age, gender, and language classification problems are important in signal processing because they are used to analyze and understand human behavior, interactions, and preferences. This can be especially useful in human-computer interaction, psychology, and social science research. In this paper, a new system for detecting a speaker's age, gender, and language based on deep learning models is presented. Deep learning models have shown great efficacy in various fields of signal processing. For this paper, a range of deep models were tested, including convolutional neural networks (CNNs), recurrent neural networks (RNNs), and a fine-tuned ResNet34 architecture. Additionally, techniques such as transfer learning were applied to improve the efficiency of the proposed system. The input voice signals are preprocessed by applying the spectro-temporal transform to obtain additional features that can be fed to the ResNet34 model, which is designed specifically for the task of voice signal processing. The dataset used in this paper was sourced from the Mozilla Common Voice initiative, which is dedicated to advancing speech recognition and language identification technologies. The performance of the proposed algorithm was evaluated in the presence of Gaussian noise to determine its robustness. The experimental results demonstrated that the proposed algorithm significantly outperformed basic algorithms and other deep neural networks in terms of age and gender recognition from voice signals.
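
The abstract does not specify how the Gaussian noise was generated or which noise levels were tested. As a minimal sketch of one common way to set up such a robustness test, the Python snippet below adds white Gaussian noise to a signal at a chosen signal-to-noise ratio (SNR). The function names, the 10 dB target, and the synthetic 220 Hz test tone are illustrative assumptions, not details taken from the paper.

```python
import math
import random


def add_gaussian_noise(signal, snr_db, rng=None):
    """Return a copy of `signal` with white Gaussian noise at `snr_db` dB SNR.

    Hypothetical helper: assumes a non-empty, non-silent list of samples.
    """
    rng = rng or random.Random()
    signal_power = sum(x * x for x in signal) / len(signal)
    # SNR(dB) = 10 * log10(P_signal / P_noise), solved for P_noise.
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def measured_snr_db(clean, noisy):
    """Estimate the SNR (dB) actually achieved by comparing clean and noisy."""
    signal_power = sum(c * c for c in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(signal_power / noise_power)


# Assumed stand-in for a voice recording: 1 s of a 220 Hz tone at 16 kHz.
sample_rate = 16000
clean = [math.sin(2 * math.pi * 220 * t / sample_rate) for t in range(sample_rate)]

# Seeded generator so the example is repeatable.
noisy = add_gaussian_noise(clean, snr_db=10, rng=random.Random(0))
print(f"target SNR: 10 dB, measured: {measured_snr_db(clean, noisy):.2f} dB")
```

In a robustness study of this kind, the noisy signal would typically be passed through the same preprocessing and classifier as the clean one, and accuracy compared across several SNR levels.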