Stabilizing Label Assignment for Speech Separation by Self-supervised Pre-training

Sung-Feng Huang, Shun-Po Chuang, Da-Rong Liu, Yi-Chen Chen, Gene-Ping Yang, Hung-yi Lee

Published in Interspeech, 2020

Abstract:

Speech separation has been well developed, with the very successful permutation invariant training (PIT) approach, although the frequent label assignment switching happening during PIT training remains to be a problem when better convergence speed and achievable performance are desired. In this paper, we propose to perform self-supervised pre-training to stabilize the label assignment in training the speech separation model. Experiments over several types of self-supervised approaches, several typical speech separation models and two different datasets showed that very good improvements are achievable if a proper self-supervised approach is chosen.

Recommended citation: Huang, S.-F., Chuang, S.-P., Liu, D.-R., Chen, Y.-C., Yang, G.-P., Lee, H.-y. (2021) Stabilizing Label Assignment for Speech Separation by Self-Supervised Pre-Training. Proc. Interspeech 2021, 3056-3060, doi: 10.21437/Interspeech.2021-763 https://www.isca-archive.org/interspeech_2021/huang21h_interspeech.html