Keynote at IHCON
13 August 2022, by David Mosteller
Prof. Dr.-Ing. Timo Gerkmann gave a keynote talk at the International Hearing Aid Research Conference (IHCON) at Lake Tahoe, California.
Title: Machine Learning for Speech Signal Processing on Hearing Devices
Abstract: Background noise, reverberation, and competing speakers often present a major challenge for users of hearing devices. To mitigate these effects and facilitate speech communication, modern devices typically employ signal enhancement algorithms. In recent years, the advent of deep learning techniques has dramatically transformed the field of signal enhancement, and what used to be considered beyond reach is now well within the realms of possibility. In this talk we will present some of the recent trends proposed and investigated by our group within this context.
We begin with recent advances in single-microphone source separation, showing how modern machine learning approaches allow for high-quality separation of competing speakers. We then address algorithmic latency, which is an important factor for hearing devices. Algorithmic latency depends on the segment lengths employed for spectral analysis and synthesis. While in traditional magnitude-centric approaches shorter segments decrease performance, we show that neural networks allow for enhancing both magnitude and phase on short segments, yielding both low algorithmic latency and improved performance. Next, we question the optimality of the traditional signal processing chain of beamforming and postfiltering in multimicrophone speech enhancement. We show that with neural networks, more powerful nonlinear joint spatial-spectral filters can be learned that outperform the traditional sequential spatial and spectral processing. Finally, we present the very recent and powerful approach of score-based generative models, where we were among the first groups to tailor this approach for speech enhancement, with impressive results. We show that this method can be flexibly used in both denoising and dereverberation tasks.