New IEEE/ACM Journal Publication
20 March 2018, by David Mosteller
Key Contribution: Explicit modeling of the uncertainty of the speech PSD estimate for MMSE-based speech enhancement
Martin Krawczyk-Becker, Timo Gerkmann, "On Speech Enhancement Under PSD Uncertainty", IEEE/ACM Trans. Audio, Speech, Language Proc., in press. [doi]
Abstract:
Many well-known and frequently employed Bayesian clean speech estimators have been derived under the assumption that the true power spectral densities (PSDs) of speech and noise are exactly known. In practice, however, only estimates are available. Simply neglecting PSD estimation errors and handling the estimates as true values leads to speech estimation errors causing musical noise and undesired suppression of speech. In this paper, the uncertainty of the available speech PSD estimates is addressed. The main contributions are: (1) we summarize and examine ways to model and incorporate the uncertainty of PSD estimates for a more robust speech enhancement performance. (2) a novel nonlinear clean speech estimator is derived that takes into account prior knowledge about the absolute value of typical speech PSDs. (3) we show that the derived statistical framework provides uncertainty-aware counterparts to a number of well-known conventional clean speech estimators such as the Wiener filter and Ephraim and Malah's amplitude estimators. (4) we show how modern PSD estimators can be incorporated into the theoretical framework and propose to employ frequency-dependent priors. Finally, the effects and benefits of considering the uncertainty of speech PSD estimates are analyzed, discussed, and evaluated via instrumental measures and a listening experiment.
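To illustrate the problem the abstract describes, the following minimal sketch plugs an erroneous speech PSD estimate into the conventional Wiener gain as if it were the true value, and measures how much the gain fluctuates as a result. This is not the estimator proposed in the paper. The log-normal error model, the function names, and all parameter values are assumptions chosen purely for illustration.

```python
import random
import statistics


def wiener_gain(speech_psd, noise_psd):
    """Conventional Wiener gain G = S / (S + N), treating both PSDs as exact."""
    return speech_psd / (speech_psd + noise_psd)


def simulate_gain_spread(true_speech_psd, noise_psd, log_error_std,
                         trials=10000, seed=0):
    """Return mean and standard deviation of the Wiener gain when the speech
    PSD estimate is perturbed by a multiplicative log-normal error.

    The error model is a hypothetical stand-in for real PSD estimation errors.
    """
    rng = random.Random(seed)
    gains = []
    for _ in range(trials):
        # Perturb the true speech PSD; the noise PSD is assumed to be known.
        estimate = true_speech_psd * rng.lognormvariate(0.0, log_error_std)
        gains.append(wiener_gain(estimate, noise_psd))
    return statistics.mean(gains), statistics.pstdev(gains)


# Gain obtained with the true PSDs at 0 dB SNR (speech PSD equals noise PSD).
ideal_gain = wiener_gain(1.0, 1.0)

# Gain statistics when the speech PSD estimate is unreliable.
mean_gain, gain_std = simulate_gain_spread(1.0, 1.0, log_error_std=1.0)

print(f"ideal gain: {ideal_gain:.3f}")
print(f"gain with uncertain speech PSD: mean {mean_gain:.3f}, std {gain_std:.3f}")
```

In this toy setting the gain varies randomly from trial to trial around the ideal value. Applied per time-frequency bin, such fluctuations are the kind of estimation error that the abstract associates with musical noise and undesired suppression of speech.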