University of Mazandaran Research System | Speech enhancement using sparse dictionary learning in wavelet packet transform domain

Title	Speech enhancement using sparse dictionary learning in wavelet packet transform domain
Research type	Published article
Keywords	Dictionary learning; Sparse representation; Domain adaptation; Voice activity detector; Wavelet packet transform
Abstract	This paper presents a new speech enhancement algorithm based on sparse representation in the wavelet packet transform domain. We propose dedicated dictionary learning procedures, based on a coherence criterion, for the speech and noise training data in each subband of the decomposition level. These learning algorithms minimize the self-coherence between the atoms of each dictionary and the mutual coherence between the atoms of the speech and noise dictionaries, along with the approximation error. The speech enhancement algorithm is introduced in two scenarios, supervised and semi-supervised. In each scenario, a voice activity detector is employed that is based on the energy of the sparse coefficient matrix obtained by representing the observation data over the corresponding dictionaries. In the proposed supervised scenario, we use a domain adaptation technique to transform a learned noise dictionary into an adapted dictionary, using noise data captured under the conditions of the test environment. With this step, the observation data is sparsely coded according to the current state of the noisy environment, with a low sparse approximation error. This technique plays a prominent role in obtaining better enhancement results, particularly when the noise is non-stationary. In the proposed semi-supervised scenario, adaptive thresholding of the wavelet coefficients is performed based on the variance of the estimated noise in each frame of the different subbands. Compared with earlier methods in this context and with traditional procedures, the proposed approaches yield significant speech enhancement results according to several objective and subjective measures and a statistical test.
Researchers	Samira Mavaddati (first author), Seyed Mohammad Ahadi (second author), Sanaz Seyedin (third author)
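
The two coherence quantities named in the abstract can be illustrated with a minimal sketch. It uses the common definition of coherence as the largest absolute inner product between unit-norm atoms; the abstract does not give the paper's exact criterion, so this definition is an assumption. The function names, the dictionary sizes, and the random dictionaries are all hypothetical and serve only as placeholders for learned speech and noise dictionaries of one subband.

```python
import numpy as np


def normalize_atoms(D):
    """Scale each column (atom) of D to unit L2 norm."""
    norms = np.linalg.norm(D, axis=0, keepdims=True)
    return D / np.maximum(norms, 1e-12)  # guard against all-zero atoms


def self_coherence(D):
    """Largest absolute inner product between two distinct atoms of D."""
    Dn = normalize_atoms(D)
    gram = np.abs(Dn.T @ Dn)
    np.fill_diagonal(gram, 0.0)  # ignore each atom's product with itself
    return float(gram.max())


def mutual_coherence(D_speech, D_noise):
    """Largest absolute inner product between a speech atom and a noise atom."""
    A = normalize_atoms(D_speech)
    B = normalize_atoms(D_noise)
    return float(np.abs(A.T @ B).max())


# Hypothetical stand-ins for learned dictionaries (assumed sizes, random data).
rng = np.random.default_rng(0)
D_speech = rng.standard_normal((64, 128))  # 64-sample atoms, 128 atoms
D_noise = rng.standard_normal((64, 96))    # 64-sample atoms, 96 atoms

print("speech self-coherence:", self_coherence(D_speech))
print("noise self-coherence:", self_coherence(D_noise))
print("speech/noise mutual coherence:", mutual_coherence(D_speech, D_noise))
```

Both values lie between 0 and 1. Low self-coherence means the atoms within a dictionary are less redundant, and low mutual coherence means speech atoms are less likely to represent noise and vice versa, which is the motivation the abstract gives for minimizing both alongside the approximation error.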

Research details