Keynote at IHCON

13 August 2022, by David Mosteller

Prof. Dr.-Ing. Timo Gerkmann gave a keynote talk at the International Hearing Aid Research Conference (IHCON) at Lake Tahoe, California.

Title: Machine Learning for Speech Signal Processing on Hearing Devices

Abstract: Background noise, reverberation, and competing speakers often present a major challenge for users of hearing devices. To mitigate these effects and facilitate speech communication, modern devices typically employ signal enhancement algorithms. In recent years, the advent of deep learning techniques has dramatically transformed the field of signal enhancement, and what used to be considered beyond reach is now well within the realms of possibility. In this talk we will present some of the recent trends proposed and investigated by our group within this context.
We begin with recent advances in single-microphone source separation, showing how modern machine learning approaches allow for high-quality separation of competing speakers. We then address algorithmic latency, which is an important factor for hearing devices. Algorithmic latency depends on the segment lengths employed for spectral analysis and synthesis. While shorter segments decrease performance in traditional magnitude-centric approaches, we show that neural networks allow for enhancing both magnitude and phase on short segments, yielding both low algorithmic latency and improved performance. Next, we question the optimality of the traditional signal processing chain of beamforming and postfiltering in multimicrophone speech enhancement. We show that with neural networks, more powerful nonlinear joint spatial-spectral filters can be learned that outperform the traditional sequential spatial and spectral processing. Finally, we present the very recent and powerful approach of score-based generative models, where we were among the first groups to tailor this approach for speech enhancement, with impressive results. We show that this method can be flexibly used in both denoising and dereverberation tasks.
