Two Heads are Better Than One: A Two-Stage Complex Spectral Mapping Approach for Monaural Speech Enhancement

Publication type: Journal Article
Publication date: 2021-05-14
SCImago: Q1
WoS: Q1
SJR: 1.061
CiteScore: 12.4
Impact factor: 5.1
ISSN: 2329-9290, 2329-9304
Electrical and Electronic Engineering
Computer Science (miscellaneous)
Computational Mathematics
Acoustics and Ultrasonics
Abstract
In challenging acoustic scenarios such as low signal-to-noise ratios, current speech enhancement systems usually suffer from a performance bottleneck when extracting the target speech from the mixture in a single step. To address this issue, we propose a novel complex spectral mapping approach with a two-stage pipeline for monaural speech enhancement in the time-frequency domain. The proposed algorithm aims to decouple the primal problem into multiple sub-problems, in the spirit of the classic proverb, “two heads are better than one”. More specifically, in the first stage, only the magnitude is estimated, and it is combined with the noisy phase to obtain a coarse estimate of the complex spectrum. To refine this estimate, in the second stage, an auxiliary network serves as a post-processing module that further suppresses residual noise and modifies the phase information. A global residual connection is adopted in the second stage to accelerate training convergence. To alleviate the parameter burden caused by the multi-stage pipeline, we propose a light-weight temporal convolutional module, which substantially reduces the number of trainable parameters and obtains even better objective performance than the original version. We conduct extensive experiments on three standard corpora: WSJ0-SI84, the DNS Challenge dataset, and the Voice Bank + DEMAND dataset. Objective test results demonstrate that the proposed approach achieves state-of-the-art performance compared with previous advanced systems under various conditions. Subjective listening test results further validate the superiority of the proposed method in terms of subjective quality.
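The data flow described in the abstract can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the function names `coarse_spectrum` and `refine_spectrum` are hypothetical, the two networks are replaced by fixed stand-ins (a constant gain and a small random residual), and the array shape is arbitrary. It shows how an estimated magnitude is paired with the noisy phase in the first stage, and how a global residual connection adds a complex correction in the second stage.

```python
import numpy as np

def coarse_spectrum(est_magnitude, noisy_spectrum):
    """Stage 1 (sketch): pair an estimated magnitude with the noisy phase."""
    noisy_phase = np.angle(noisy_spectrum)
    return est_magnitude * np.exp(1j * noisy_phase)

def refine_spectrum(coarse, residual):
    """Stage 2 (sketch): global residual connection, coarse estimate plus a complex correction."""
    return coarse + residual

rng = np.random.default_rng(0)

# Hypothetical noisy STFT with 4 frames and 5 frequency bins.
noisy = rng.standard_normal((4, 5)) + 1j * rng.standard_normal((4, 5))

# Stand-in for the stage-1 network output: a fixed gain on the noisy magnitude.
est_mag = 0.5 * np.abs(noisy)
coarse = coarse_spectrum(est_mag, noisy)

# Stand-in for the stage-2 network output: a small complex residual.
residual = 0.01 * (rng.standard_normal((4, 5)) + 1j * rng.standard_normal((4, 5)))
refined = refine_spectrum(coarse, residual)
```

Because the first stage reuses the noisy phase, only the second stage can change the phase of the estimate, which is consistent with the abstract's description of the second stage as the place where phase information is modified.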

Top-30

Journals

Applied Acoustics: 14 publications, 8.97%
IEEE Signal Processing Letters: 12 publications, 7.69%
IEEE/ACM Transactions on Audio Speech and Language Processing: 11 publications, 7.05%
Digital Signal Processing: A Review Journal: 5 publications, 3.21%
Applied Sciences (Switzerland): 4 publications, 2.56%
Speech Communication: 4 publications, 2.56%
Neural Networks: 4 publications, 2.56%
IEEE Transactions on Audio Speech and Language Processing: 4 publications, 2.56%
Information Fusion: 3 publications, 1.92%
IEEE Access: 3 publications, 1.92%
Electronics (Switzerland): 3 publications, 1.92%
Symmetry: 2 publications, 1.28%
Measurement: Journal of the International Measurement Confederation: 2 publications, 1.28%
Multimedia Tools and Applications: 2 publications, 1.28%
Expert Systems with Applications: 2 publications, 1.28%
Neurocomputing: 2 publications, 1.28%
Computer Speech and Language: 2 publications, 1.28%
Circuits, Systems, and Signal Processing: 2 publications, 1.28%
Biomedical Signal Processing and Control: 1 publication, 0.64%
Entropy: 1 publication, 0.64%
Lecture Notes in Computer Science: 1 publication, 0.64%
Eurasip Journal on Audio, Speech, and Music Processing: 1 publication, 0.64%
Journal of the Acoustical Society of America: 1 publication, 0.64%
Pattern Recognition: 1 publication, 0.64%
IEEE Transactions on Artificial Intelligence: 1 publication, 0.64%
Future Internet: 1 publication, 0.64%
Lecture Notes in Networks and Systems: 1 publication, 0.64%
Applied System Innovation: 1 publication, 0.64%
IEEE Internet of Things Journal: 1 publication, 0.64%

Publishers

Institute of Electrical and Electronics Engineers (IEEE): 84 publications, 53.85%
Elsevier: 41 publications, 26.28%
MDPI: 12 publications, 7.69%
Springer Nature: 11 publications, 7.05%
Association for Computing Machinery (ACM): 2 publications, 1.28%
Acoustical Society of America (ASA): 1 publication, 0.64%
SPIE-Intl Soc Optical Eng: 1 publication, 0.64%
Acoustical Society of Japan: 1 publication, 0.64%
Wiley: 1 publication, 0.64%
SAGE: 1 publication, 0.64%
Institute of Electronics, Information and Communications Engineers (IEICE): 1 publication, 0.64%
  • We do not take into account publications without a DOI.
  • Statistics recalculated weekly.

Metrics
156
Cite this
GOST
Li A. et al. Two Heads are Better Than One: A Two-Stage Complex Spectral Mapping Approach for Monaural Speech Enhancement // IEEE/ACM Transactions on Audio Speech and Language Processing. 2021. Vol. 29. pp. 1829-1843.
GOST (all authors)
Li A., Liu W., Zheng C., Fan C., Li X. Two Heads are Better Than One: A Two-Stage Complex Spectral Mapping Approach for Monaural Speech Enhancement // IEEE/ACM Transactions on Audio Speech and Language Processing. 2021. Vol. 29. pp. 1829-1843.
RIS
TY - JOUR
DO - 10.1109/taslp.2021.3079813
UR - https://doi.org/10.1109/taslp.2021.3079813
TI - Two Heads are Better Than One: A Two-Stage Complex Spectral Mapping Approach for Monaural Speech Enhancement
T2 - IEEE/ACM Transactions on Audio Speech and Language Processing
AU - Li, Andong
AU - Liu, Wenzhe
AU - Zheng, Chengshi
AU - Fan, Cunhang
AU - Li, Xiaodong
PY - 2021
DA - 2021/05/14
PB - Institute of Electrical and Electronics Engineers (IEEE)
SP - 1829
EP - 1843
VL - 29
SN - 2329-9290
SN - 2329-9304
ER -
BibTeX
@article{2021_Li,
author = {Andong Li and Wenzhe Liu and Chengshi Zheng and Cunhang Fan and Xiaodong Li},
title = {Two Heads are Better Than One: A Two-Stage Complex Spectral Mapping Approach for Monaural Speech Enhancement},
journal = {IEEE/ACM Transactions on Audio Speech and Language Processing},
year = {2021},
volume = {29},
publisher = {Institute of Electrical and Electronics Engineers (IEEE)},
month = {may},
url = {https://doi.org/10.1109/taslp.2021.3079813},
pages = {1829--1843},
doi = {10.1109/taslp.2021.3079813}
}