Semi-parametric joint detection and estimation for speech enhancement based on minimum mean square error

Publication date: September 2018
Source: Speech Communication, Volume 102
Author(s): Van-Khanh Mai, Dominique Pastor, Abdeldjalil Aïssa-El-Bey, Raphaël Le Bidan

Abstract

We propose a novel estimator of the amplitude of speech coefficients in the time-frequency domain. To avoid having to estimate the phase spectrum of the complex coefficients produced by the Fourier transform, we use the discrete cosine transform (DCT), whose coefficients are real-valued. The estimator aims to minimize the mean square error of the absolute values of the speech DCT coefficients. To take advantage of both parametric and non-parametric approaches, the proposed method combines block shrinkage with Bayesian statistical estimation. First, the absolute value of the clean coefficient is estimated by block smoothed sigmoid-based shrinkage (Block-SSBS). The block size required by Block-SSBS is obtained by statistical optimization. This step reduces the negative impact of classical denoising methods on speech intelligibility, in a manner similar to smoothed binary masking. Second, to refine the estimate, an optimal statistical estimator is added to handle musical noise. The performance of the proposed method is evaluated with objective criteria. The experiments confirm the relevance of the approach in terms of speech quality and intelligibility.
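To make the first stage more concrete, the following is a minimal sketch of block-wise sigmoid shrinkage applied to the DCT coefficients of one frame. It is not the authors' Block-SSBS: the abstract gives neither the shrinkage formula nor the block-size optimization, so the logistic gain and the parameters `block_size`, `threshold`, and `slope` are illustrative assumptions, as are all function names. The second-stage Bayesian estimator is omitted entirely.

```python
import math


def dct_ortho(x):
    """Orthonormal DCT-II of a real sequence (direct O(n^2) form)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def idct_ortho(c):
    """Inverse of dct_ortho (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(
            math.sqrt(2.0 / n) * c[k] * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(1, n)
        )
        out.append(s)
    return out


def block_sigmoid_gain(coeffs, noise_std, block_size=4, threshold=2.0, slope=4.0):
    """Return one smooth gain in (0, 1) per coefficient, shared within each block.

    Assumption: the gain is a logistic function of the block RMS amplitude
    relative to the noise standard deviation. The parameter values are
    hypothetical, not taken from the paper.
    """
    gains = []
    for start in range(0, len(coeffs), block_size):
        block = coeffs[start:start + block_size]
        rms = math.sqrt(sum(c * c for c in block) / len(block))
        g = 1.0 / (1.0 + math.exp(-slope * (rms / noise_std - threshold)))
        gains.extend([g] * len(block))
    return gains


def enhance_frame(noisy, noise_std, **kwargs):
    """Shrink the DCT coefficients of one frame and return to the time domain."""
    coeffs = dct_ortho(noisy)
    gains = block_sigmoid_gain(coeffs, noise_std, **kwargs)
    return idct_ortho([g * c for g, c in zip(gains, coeffs)])
```

Because the gain varies smoothly between 0 and 1 rather than switching abruptly, blocks dominated by noise are attenuated while blocks with strong speech energy pass almost unchanged, which is the behaviour the abstract compares to smoothed binary masking. Working on real-valued DCT coefficients means only a sign, not a phase, accompanies each amplitude.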