Sitemap
A list of all the posts and pages found on the site. For the robots out there, an XML version is available for digesting as well.
Pages
Posts
Future Blog Post
Published:
This post will show up by default. To disable scheduling of future posts, edit config.yml and set future: false.
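For reference, the relevant setting in the site's configuration might look like the following (a minimal sketch; the comment is illustrative, and only the `future` key is taken from the note above):

```yaml
# config.yml — Jekyll site configuration
# Posts dated in the future are published by default;
# setting `future: false` hides them until their date arrives.
future: false
```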
Blog Post number 4
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 3
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 2
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 1
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Portfolio
Portfolio item number 1
Short description of portfolio item number 1
Portfolio item number 2
Short description of portfolio item number 2
Publications
Phonetic-and-Semantic Embedding of Spoken words with Applications in Spoken Content Retrieval
Yi-Chen Chen, Sung-Feng Huang, Chia-Hao Shen, Hung-yi Lee, Lin-shan Lee. Published in IEEE SLT, 2018
Part of Audio Word2Vec Project.
Recommended citation: Yi-Chen Chen, Sung-Feng Huang, Chia-Hao Shen, Hung-yi Lee and Lin-shan Lee, "Phonetic-and-Semantic Embedding of Spoken words with Applications in Spoken Content Retrieval," 2018 IEEE Spoken Language Technology Workshop (SLT), Athens, Greece, 2018, pp. 941-948, doi: 10.1109/SLT.2018.8639553. https://ieeexplore.ieee.org/abstract/document/8639553
Audio Word2vec: Sequence-to-Sequence Autoencoding for Unsupervised Learning of Audio Segmentation and Representation
Yi-Chen Chen, Sung-Feng Huang, Hung-yi Lee, Yu-Hsuan Wang, Chia-Hao Shen. Published in IEEE/ACM TASLP, 2019
Part of Audio Word2Vec Project.
Recommended citation: Yi-Chen Chen, Sung-Feng Huang, Hung-yi Lee, Yu-Hsuan Wang and Chia-Hao Shen, "Audio Word2vec: Sequence-to-Sequence Autoencoding for Unsupervised Learning of Audio Segmentation and Representation," IEEE/ACM Transactions on Audio, Speech, and Language Processing 27.9 (2019): 1481-1493. https://ieeexplore.ieee.org/abstract/document/8736337
Pretrained Language Model Embryology: The Birth of ALBERT
Cheng-Han Chiang, Sung-Feng Huang, Hung-yi Lee. Published in EMNLP, 2020
The results show that during pretraining, ALBERT learns to reconstruct and predict tokens of different parts of speech (POS) at different speeds, and that linguistic knowledge, world knowledge, and downstream-task performance do not generally improve monotonically as pretraining proceeds.
Recommended citation: Chiang, C., Huang, S., & Lee, H. (2020). Pretrained Language Model Embryology: The Birth of ALBERT. ArXiv, abs/2010.02480. https://arxiv.org/abs/2010.02480
Stabilizing Label Assignment for Speech Separation by Self-supervised Pre-training
Sung-Feng Huang, Shun-Po Chuang, Da-Rong Liu, Yi-Chen Chen, Gene-Ping Yang, Hung-yi Lee. Published in Interspeech, 2021
Self-supervised pre-training (SSL) for speech separation.
Recommended citation: Huang, S.-F., Chuang, S.-P., Liu, D.-R., Chen, Y.-C., Yang, G.-P., Lee, H.-y. (2021) Stabilizing Label Assignment for Speech Separation by Self-Supervised Pre-Training. Proc. Interspeech 2021, 3056-3060, doi: 10.21437/Interspeech.2021-763 https://www.isca-archive.org/interspeech_2021/huang21h_interspeech.html
Non-Autoregressive Mandarin-English Code-Switching Speech Recognition
Shun-Po Chuang, Heng-Jui Chang, Sung-Feng Huang, Hung-yi Lee. Published in IEEE ASRU, 2021
A Mask-CTC-based non-autoregressive (NAR) ASR framework for tackling code-switching (CS) speech recognition.
Recommended citation: Chuang, Shun-Po, Heng-Jui Chang, Sung-Feng Huang, and Hung-yi Lee. "Non-autoregressive mandarin-english code-switching speech recognition." In 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), pp. 465-472. IEEE, 2021. https://ieeexplore.ieee.org/abstract/document/9688174
Learning Phone Recognition From Unpaired Audio and Phone Sequences Based on Generative Adversarial Network
Da-rong Liu, Po-chun Hsu, Yi-chen Chen, Sung-feng Huang, Shun-po Chuang, Da-yi Wu, Hung-yi Lee. Published in IEEE/ACM TASLP, 2021
Unsupervised ASR
Recommended citation: Liu, Da-rong, Po-chun Hsu, Yi-chen Chen, Sung-feng Huang, Shun-po Chuang, Da-yi Wu, and Hung-yi Lee. "Learning phone recognition from unpaired audio and phone sequences based on generative adversarial network." IEEE/ACM transactions on audio, speech, and language processing 30 (2021): 230-243. https://ieeexplore.ieee.org/abstract/document/9664381/
Meta-TTS: Meta-Learning for Few-Shot Speaker Adaptive Text-to-Speech
Sung-Feng Huang, Chyi-Jiunn Lin, Da-Rong Liu, Yi-Chen Chen, Hung-yi Lee. Published in IEEE/ACM TASLP, 2022
Meta-learning for few-shot speaker adaptive text-to-speech
Recommended citation: Huang, Sung-Feng, Chyi-Jiunn Lin, Da-Rong Liu, Yi-Chen Chen, and Hung-yi Lee. "Meta-tts: Meta-learning for few-shot speaker adaptive text-to-speech." IEEE/ACM Transactions on Audio, Speech, and Language Processing 30 (2022): 1558-1571. https://ieeexplore.ieee.org/abstract/document/9756900
Personalized Lightweight Text-to-Speech: Voice Cloning with Adaptive Structured Pruning
Sung-Feng Huang, Chia-Ping Chen, Zhi-Sheng Chen, Yu-Pao Tsai, Hung-yi Lee. Published in IEEE ICASSP, 2023
Learnable model pruning for TTS fine-tuning
Recommended citation: Huang, Sung-Feng, Chia-Ping Chen, Zhi-Sheng Chen, Yu-Pao Tsai, and Hung-yi Lee. "Personalized Lightweight Text-to-Speech: Voice Cloning with Adaptive Structured Pruning." In ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1-5. IEEE, 2023. https://ieeexplore.ieee.org/abstract/document/10097178
Maximizing Data Efficiency for Cross-Lingual TTS Adaptation by Self-Supervised Representation Mixing and Embedding Initialization
Wei-Ping Huang, Sung-Feng Huang, Hung-yi Lee. Published in IEEE ASRU, 2023
Utilize unlabeled speech data for few-shot cross-lingual TTS adaptation
Recommended citation: Huang, Wei-Ping, Sung-Feng Huang, and Hung-yi Lee. "Maximizing Data Efficiency for Cross-Lingual TTS Adaptation by Self-Supervised Representation Mixing and Embedding Initialization." In 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), pp. 1-8. IEEE, 2023. https://ieeexplore.ieee.org/abstract/document/10389665
Detecting the Undetectable: Assessing the Efficacy of Current Spoof Detection Methods Against Seamless Speech Edits
Sung-Feng Huang, Heng-Cheng Kuo, Zhehuai Chen, Xuesong Yang, Chao-Han Huck Yang, Yu Tsao, Yu-Chiang Frank Wang, Hung-yi Lee, Szu-Wei Fu. Published in IEEE SLT, 2024
Speech editing dataset & edit deepfake detection
Recommended citation: Huang, Sung-Feng, Heng-Cheng Kuo, Zhehuai Chen, Xuesong Yang, Chao-Han Huck Yang, Yu Tsao, Yu-Chiang Frank Wang, Hung-yi Lee, and Szu-Wei Fu. "Detecting the Undetectable: Assessing the Efficacy of Current Spoof Detection Methods Against Seamless Speech Edits." In 2024 IEEE Spoken Language Technology Workshop (SLT), pp. 652-659. IEEE, 2024. https://ieeexplore.ieee.org/abstract/document/10832200/
Generative Speech Foundation Model Pretraining for High-Quality Speech Extraction and Restoration
Pin-Jui Ku, Alexander H. Liu, Roman Korostik, Sung-Feng Huang, Szu-Wei Fu, Ante Jukić. Published in IEEE ICASSP, 2025
Foundation flow-matching model for generation tasks
Recommended citation: Ku, P. J., Liu, A. H., Korostik, R., Huang, S. F., Fu, S. W., & Jukić, A. (2024). Generative speech foundation model pretraining for high-quality speech extraction and restoration. arXiv preprint arXiv:2409.16117. https://arxiv.org/abs/2409.16117
Talks
Talk 1 on Relevant Topic in Your Field
Published:
This is a description of your talk, which is a markdown file that can be markdown-ified like any other post. Yay markdown!
Conference Proceeding talk 3 on Relevant Topic in Your Field
Published:
This is a description of your conference proceedings talk. Note the different value in the type field; you can put anything in that field.
Teaching
Teaching experience 1
Undergraduate course, University 1, Department, 2014
This is a description of a teaching experience. You can use markdown like any other post.
Teaching experience 2
Workshop, University 1, Department, 2015
This is a description of a teaching experience. You can use markdown like any other post.