This paper presents a new single-channel singing voice separation algorithm.
Singing voice separation supports applications such as singer identification,
voice recognition, and data retrieval. The separation is performed with a
decomposition model based on the spectrogram of singing voice signals. The
novelty of the proposed separation algorithm lies in the following four
aspects: 1) The decomposition scheme employs vocal and music models learned
with a sparse non-negative matrix factorization algorithm. The vocal signal and music
accompaniment can be considered as sparse and low-rank components of a singing voice
segment, respectively. 2) An alternating factorization algorithm is used to decompose input
data based on the modeled structures of the vocal and musical components. 3) A voice
activity detection algorithm based on the energy of the coding coefficient matrix
is introduced in the training step to learn the basis vectors related to the instrumental parts. 4) In the
separation phase, these non-vocal atoms are adapted to the new test conditions using a
domain transfer approach, which yields a separation with low reconstruction
error. The proposed algorithm is evaluated with several measures and yields
significantly better results than earlier methods in this context and
traditional procedures. Averaged over the two defined test scenarios and the
three SMR levels considered, the improvements of the proposed algorithm over
previous separation methods in the PESQ, fwSegSNR, SDI, and GNSDR measures are
0.53, 0.84, 0.39, and 2.19, respectively.
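
As a rough illustration of two ingredients named above, the following minimal sketch factorizes a magnitude spectrogram with a sparsity-penalized non-negative matrix factorization and then flags frames by the energy of the coding coefficients. It is not the algorithm of this paper: the multiplicative update rule, the L1 penalty weight `sparsity`, the function names `sparse_nmf` and `frame_energy_mask`, the threshold `ratio`, and the synthetic input are all assumptions chosen for illustration.

```python
import numpy as np


def sparse_nmf(V, rank, sparsity=0.1, n_iter=200, seed=0):
    """Approximate V (freq x frames) by W @ H with an L1 penalty on H.

    Illustrative multiplicative updates for
    0.5 * ||V - W H||_F^2 + sparsity * sum(H); an assumed, generic scheme.
    """
    rng = np.random.default_rng(seed)
    eps = 1e-12
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, rank)) + eps   # basis vectors (atoms)
    H = rng.random((rank, n_frames)) + eps  # coding coefficients
    for _ in range(n_iter):
        # The penalty enters the denominator and shrinks the coefficients.
        H *= (W.T @ V) / (W.T @ W @ H + sparsity + eps)
        W *= (V @ H.T) / (W @ (H @ H.T) + eps)
        # Keep atoms at unit norm; rescale H so that W @ H is unchanged.
        norms = np.linalg.norm(W, axis=0) + eps
        W /= norms
        H *= norms[:, None]
    return W, H


def frame_energy_mask(H, ratio=0.5):
    """Flag frames whose coefficient energy exceeds ratio * mean energy."""
    energy = np.sum(H ** 2, axis=0)
    return energy > ratio * energy.mean()


# Synthetic non-negative "spectrogram" built from 4 hidden atoms.
rng = np.random.default_rng(1)
V = rng.random((64, 4)) @ rng.random((4, 100))

W, H = sparse_nmf(V, rank=4)
mask = frame_energy_mask(H)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_err:.3f}")
print(f"frames flagged as active: {int(mask.sum())} of {mask.size}")
```

In a setting like the one described, a mask of this kind could be used to select frames that contribute to learning the instrumental atoms. The selection rule used in the paper may differ from this simple mean-energy threshold.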