On the Role of Spatial, Spectral, and Temporal Processing for DNN-based Non-linear Multi-channel Speech Enhancement
This website accompanies the conference paper:
Kristina Tesch, Nils-Hendrik Mohrmann, and Timo Gerkmann, "On the Role of Spatial, Spectral and Temporal Processing for DNN-based Non-linear Multi-channel Speech Enhancement", Interspeech 2022 [arxiv]
Paper overview
Employing deep neural networks (DNNs) to directly learn filters for multi-channel speech enhancement has two potential key advantages over the traditional approach of combining a linear spatial filter with an independent tempo-spectral post-filter:
- non-linear spatial filtering can overcome restrictions that originate from a linear processing model, and
- joint processing of spatial and tempo-spectral information can exploit interdependencies between the different sources of information.
A variety of DNN-based non-linear filters have been proposed recently, and good enhancement performance has been reported for them. However, little is known about their internal mechanisms, which turns network architecture design into a game of chance. Therefore, we perform experiments to better understand how DNN-based non-linear filters process spatial, spectral, and temporal information. To this end, we compare three setups: a traditional separated setup with a linear spatial filter plus post-filter, a joint spatial and tempo-spectral non-linear filter (most DNNs fall into this class), and a separated approach that combines a non-linear spatial filter with a post-filter. Furthermore, we design the network such that we can control which sources of information (spatial, spectral, temporal) it processes.
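The traditional separated setup can be sketched for a single frequency bin as follows. This is a minimal illustration, not the paper's implementation. It assumes the linear spatial filter is an MVDR beamformer (the audio tables below list an oracle MVDR as the LSF) and the post-filter is a real-valued gain per time frame. The function names, the array shapes, and the way the noise covariance and steering vector are obtained are assumptions made for this example.

```python
import numpy as np

def mvdr_weights(noise_cov, steering):
    """MVDR weights w = Phi_n^{-1} d / (d^H Phi_n^{-1} d) for one frequency bin.

    noise_cov: (C, C) Hermitian noise covariance matrix (assumed known here).
    steering:  (C,) steering vector of the target source (assumed known here).
    """
    num = np.linalg.solve(noise_cov, steering)  # Phi_n^{-1} d
    return num / (steering.conj() @ num)        # enforce w^H d = 1

def linear_filter_plus_postfilter(X, noise_cov, steering, gain):
    """Apply a linear spatial filter followed by an independent post-filter.

    X:    (C, T) multi-channel STFT frames of one frequency bin.
    gain: (T,) real-valued single-channel post-filter gain.
    """
    w = mvdr_weights(noise_cov, steering)
    beamformed = w.conj() @ X   # w^H x for every time frame, shape (T,)
    return gain * beamformed    # the post-filter only sees the single-channel output
```

The post-filter here operates on the beamformer output alone, so it cannot use any spatial information. A joint non-linear filter has no such split, which is the contrast the experiments examine.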

Audio examples: Separability of spatial processing and post-filtering
Example | Clean | Noisy | FT-JNF (proposed) | LSF (oracle) + PF | FT-NSF + PF | LSF (oracle MVDR) | FT-NSF
---|---|---|---|---|---|---|---
1 | | | | | | |
2 | | | | | | |
3 | | | | | | |
4 | | | | | | |
5 | | | | | | |
Audio examples: Contribution of information sources
Example | Clean | Noisy | FT-JNF (proposed) | F-JNF | T-JNF | FT-NSF | F-NSF | T-NSF
---|---|---|---|---|---|---|---|---
1 | | | | | | | |
2 | | | | | | | |
3 | | | | | | | |
4 | | | | | | | |
5 | | | | | | | |