A Voice Activity Detection Algorithm Using Sparse Non-negative Matrix Factorization-based Model Learning in Spectro-Temporal Domain

Research

Title	A Voice Activity Detection Algorithm Using Sparse Non-negative Matrix Factorization-based Model Learning in Spectro-Temporal Domain
Type	JournalPaper
Keywords	Voice Activity Detector; Spectro-temporal Domain; Sparse Structured Principal Component Analysis; Sparse Non-negative Matrix Factorization
Year	2023
Journal	International Journal of Engineering
DOI
Researchers	Samira Mavaddati

Abstract

Voice activity detectors identify the silence and speech segments of a speech signal, which helps eliminate various background noise signals. This paper proposes a novel voice activity detector that uses spectro-temporal features extracted from the auditory model of the speech signal. After the scale, rate, and frequency features are extracted from this feature space, a sparse structured principal component analysis algorithm is used to retain the principal components of these features and reduce the dimension of the training data. These feature vectors are then used to learn the models with the sparse non-negative matrix factorization algorithm. The model learning procedure represents each feature vector with an appropriate sparsity level over the selected atoms. Voice activity detection is performed by computing, for each input frame, the energy of its sparse representation over the composite model. If the calculated energy exceeds a specified threshold, the input frame has a structure similar to the atoms of the learned models, and the frame is judged to contain voice content. The proposed detector was compared with baseline methods and classifiers in this field. Experiments in the presence of stationary, non-stationary, and periodic noise show that the proposed method, based on model learning with spectro-temporal features, can correctly detect silence and speech activity.
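The decision rule described in the abstract (sparse-code a frame over the learned atoms, then compare the energy of that code with a threshold) can be illustrated with a minimal sketch. It is not the paper's implementation. The following are assumptions made only for illustration: the composite model is taken to be a concatenation of speech atoms and noise atoms, the energy is measured over the speech-atom coefficients, and the sparse code is obtained with L1-penalized multiplicative updates. The names `sparse_code`, `is_speech`, `W_speech`, `W_noise`, and the parameter values are hypothetical.

```python
import numpy as np


def sparse_code(W, v, sparsity=0.1, n_iter=200, eps=1e-12):
    """Non-negative code h that approximately minimizes
    0.5 * ||v - W h||^2 + sparsity * sum(h), with the dictionary W fixed.

    Uses the standard multiplicative update for NMF with an L1 penalty.
    """
    h = np.ones(W.shape[1])
    WtW = W.T @ W
    Wtv = W.T @ v
    for _ in range(n_iter):
        h *= Wtv / (WtW @ h + sparsity + eps)
    return h


def is_speech(W_speech, W_noise, v, threshold, sparsity=0.1):
    """Return (decision, energy) for one non-negative feature vector v.

    Assumption: the composite model is [W_speech | W_noise], and the
    energy is the squared norm of the coefficients on the speech atoms.
    """
    W = np.hstack([W_speech, W_noise])
    h = sparse_code(W, v, sparsity)
    k = W_speech.shape[1]
    energy = float(np.sum(h[:k] ** 2))
    return energy > threshold, energy
```

In this sketch, a frame that resembles the speech atoms receives large coefficients on them and exceeds the threshold, while a frame explained by the noise atoms leaves the speech coefficients near zero. In practice the threshold and the sparsity weight would have to be tuned on training data.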
