Title
|
Voice Activity Detection using Clustering-based Method in Spectro-Temporal Features Space
|
Type
|
JournalPaper
|
Keywords
|
Spectro-temporal Features, Auditory Model, Gaussian Mixture Model, WK-means clustering, Voice Activity Detection.
|
Abstract
|
This paper proposes a novel method for voice activity detection based on clustering in the spectro-temporal domain. In the proposed algorithms, an auditory model is used to extract the spectro-temporal features. Gaussian mixture model and WK-means clustering methods are used to reduce the dimensionality of the spectro-temporal space, and the energy and positions of the resulting clusters are used for voice activity detection. Silence and speech are distinguished using the attributes of the clusters and a threshold value that is updated in each frame. Because it has the higher energy, the first cluster is treated as the main speech section in the computation. The efficiency of the proposed method is evaluated for silence/speech discrimination under different noisy conditions. The displacement of the clusters in the spectro-temporal domain is used as the criterion for the robustness of the features. According to the results, the proposed method improves the speech/nonspeech segmentation rate compared with temporal and spectral features at low signal-to-noise ratios (SNRs).
|
Researchers
|
Samira Mavaddati (Third Researcher), Nafiseh Esfandian (First Researcher), Fatemeh Jahani Bahnamiri (Second Researcher)
|