Our research focuses on the derivation and development of novel digital signal processing algorithms for speech, audio, and multimodal signals. Target applications include communication devices such as hearing aids and mobile phones, as well as human-machine interfaces such as voice-controlled assistants and robots. We aim to find optimal solutions for signal analysis and signal processing using mathematical and computational statistics, given a set of practical constraints. These constraints may include low algorithmic latency in communication devices, limited computational power in mobile devices, and limited resources for training. The employed methods include Bayesian models and estimation, as well as machine learning techniques such as deep neural networks.
The following video shows our real-time source separation demo in our varechoic sound studio.
Speech and Signal Enhancement
Modern speech communication devices like smartphones and hearing devices are used in many different environments. Particularly in noisy and reverberant environments, communication devices, hearing devices, and acoustic human-machine interfaces still exhibit limited performance. This can be very annoying, for instance when hearing aid users are unable to follow a conversation in a noisy restaurant. Likewise, the performance of automatic speech recognition for human-machine interfaces degrades severely in noisy and reverberant environments.
Our group develops robust solutions to reduce additive noise and mitigate the negative effects of reverberation. To this end, we employ methods from statistical signal processing and machine learning.
An introduction to speech enhancement can be found here.
Bayesian Statistical Modeling and Estimation
Whenever possible, we aim to replace heuristics with solid statistical models in our research. One tool we find particularly powerful is Bayesian modeling and estimation. The art lies in finding proper statistical models for the target signals, interferences, parameters, and their uncertainty, and in deriving estimators that are optimal under these models. Examples are minimum mean-square error (MMSE), maximum a posteriori (MAP), and maximum likelihood (ML) estimators.
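As an illustration of an estimator that is optimal under a stated model, the following minimal sketch assumes the simplest textbook case: a zero-mean complex Gaussian target coefficient observed in independent zero-mean complex Gaussian noise, with both variances known. Under these assumptions the MMSE estimate is the noisy observation scaled by the Wiener gain. The function names, variances, and sample count are hypothetical choices for this example and are not taken from the group's own algorithms.

```python
import numpy as np


def wiener_gain(speech_var, noise_var):
    """MMSE gain for a zero-mean Gaussian target in independent Gaussian noise."""
    speech_var = np.asarray(speech_var, dtype=float)
    noise_var = np.asarray(noise_var, dtype=float)
    return speech_var / (speech_var + noise_var)


def simulate_mse(speech_var=1.0, noise_var=0.5, n=100_000, seed=0):
    """Compare the error of the raw observation with that of the MMSE estimate."""
    rng = np.random.default_rng(seed)
    # Circular complex Gaussian: real and imaginary parts each carry half the variance.
    s = np.sqrt(speech_var / 2) * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
    v = np.sqrt(noise_var / 2) * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
    y = s + v  # additive noise model (assumption)
    s_hat = wiener_gain(speech_var, noise_var) * y
    mse_noisy = np.mean(np.abs(y - s) ** 2)
    mse_mmse = np.mean(np.abs(s_hat - s) ** 2)
    return mse_noisy, mse_mmse


mse_noisy, mse_mmse = simulate_mse()
print(f"MSE of noisy observation: {mse_noisy:.3f}")
print(f"MSE of MMSE estimate:     {mse_mmse:.3f}")
```

In practice the variances are unknown and must themselves be estimated, and non-Gaussian models lead to different estimators; the sketch only shows how a model choice determines the form of the optimal estimator.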
Phase-Aware Signal Processing
Many signal processing algorithms operate in the Fourier spectral domain. In this domain the signal representation is typically complex-valued, i.e. it can be represented by magnitude and phase. While many existing algorithms modify only the Fourier magnitude, we are explicitly interested in algorithms that take phase information into account. To this end, we include phase information as uncertain prior knowledge in our statistical models.
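The magnitude and phase representation can be sketched for a single signal frame as follows. This is a minimal illustration using NumPy's real FFT, assuming a random test frame of 512 samples and no analysis window; a full short-time Fourier transform would add windowing and overlapping frames. The helper names to_polar and from_polar are hypothetical. The last step mimics a magnitude-only modification that leaves the phase untouched.

```python
import numpy as np


def to_polar(frame):
    """Return magnitude and phase of the one-sided spectrum of a real frame."""
    spectrum = np.fft.rfft(frame)
    return np.abs(spectrum), np.angle(spectrum)


def from_polar(magnitude, phase, n):
    """Rebuild a real time-domain frame of length n from magnitude and phase."""
    return np.fft.irfft(magnitude * np.exp(1j * phase), n=n)


rng = np.random.default_rng(0)
frame = rng.standard_normal(512)  # stand-in for one frame of audio (assumption)

magnitude, phase = to_polar(frame)

# Magnitude and phase together describe the frame completely.
reconstructed = from_polar(magnitude, phase, n=len(frame))

# Magnitude-only processing: scale the magnitude, reuse the original phase.
attenuated = from_polar(0.5 * magnitude, phase, n=len(frame))

print(np.allclose(reconstructed, frame))
print(np.allclose(attenuated, 0.5 * frame))
```

A phase-aware method would additionally estimate or modify the phase array before resynthesis instead of passing it through unchanged.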
Please find further information here.
Modern Machine Learning Techniques and Deep Neural Networks
With advances in computational power and available data, modern machine learning techniques such as deep neural networks are attracting ever-increasing interest in the signal processing community. In our group, we combine domain knowledge from signal processing with the power and potential of these methods.
More information can be found here.