Insights into Deep Non-linear Filters for Improved Multi-channel Speech Enhancement
This website accompanies the journal submission:
Kristina Tesch and Timo Gerkmann, "Insights into Deep Non-linear Filters for Improved Multi-channel Speech Enhancement", submitted to IEEE Transactions on Audio, Speech and Language Processing [arXiv]
Comparison to state-of-the-art methods (Paper section V)
Example | Clean | Noisy | FT-JNF (proposed) | T-JNF [1] | CRNN [2] | FasNet+TAC [3] | EaBNet [4] | COSPA [5] |
---|---|---|---|---|---|---|---|---|
1 | ||||||||
2 | ||||||||
3 | ||||||||
4 | ||||||||
5 | ||||||||
6 | ||||||||
7 | ||||||||
Results for CHiME3 data (Paper section VI)
Example | Clean | Noisy | FT-JNF (proposed) | F-JNF | T-JNF [1] | CRNN [2] | FasNet+TAC [3] | EaBNet [4] | COSPA [5] |
---|---|---|---|---|---|---|---|---|---|
1 | |||||||||
2 | |||||||||
3 | |||||||||
4 | |||||||||
5 | |||||||||
6 | |||||||||
7 | |||||||||
8 | |||||||||
Analysis of the interplay of spatial with tempo-spectral information (Paper section IV)
B. Separability of spatial processing and post-filtering
Examples are for a setting with three microphones.
Example | Clean | Noisy | FT-JNF (proposed) | LSF (oracle) + PF | FT-NSF + PF | LSF (oracle MVDR) | FT-NSF |
---|---|---|---|---|---|---|---|
1 | |||||||
2 | |||||||
3 | |||||||
4 | |||||||
5 | |||||||
C. Contribution of information sources
Examples are for a setting with three microphones.
Example | Clean | Noisy | FT-JNF (proposed) | F-JNF | T-JNF | FT-NSF | F-NSF | T-NSF |
---|---|---|---|---|---|---|---|---|
1 | ||||||||
2 | ||||||||
3 | ||||||||
4 | ||||||||
5 | ||||||||
References
1. X. Li and R. Horaud, "Multichannel Speech Enhancement Based On Time-Frequency Masking Using Subband Long Short-Term Memory," 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2019, pp. 298-302.
2. S. Chakrabarty and E. A. P. Habets, "Time–Frequency Masking Based Online Multi-Channel Speech Enhancement With Convolutional Recurrent Neural Networks," IEEE Journal of Selected Topics in Signal Processing, vol. 13, no. 4, pp. 787-799, 2019.
3. Y. Luo, Z. Chen, N. Mesgarani and T. Yoshioka, "End-to-end Microphone Permutation and Number Invariant Multi-channel Speech Separation," 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020, pp. 6394-6398.
4. A. Li, W. Liu, C. Zheng and X. Li, "Embedding and Beamforming: All-Neural Causal Beamformer for Multichannel Speech Enhancement," 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022, pp. 6487-6491.
5. M. M. Halimeh and W. Kellermann, "Complex-Valued Spatial Autoencoders for Multichannel Speech Enhancement," 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022, pp. 261-265.