Publications

You can also find my updated publications on my Google Scholar profile.

Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning
Guanrou Yang, Ziyang Ma, Zhisheng Zheng, Yakun Song, Zhikang Niu, Xie Chen
In Proc. ASRU, 2023

Exploring Effective Distillation of Self-Supervised Speech Models for Automatic Speech Recognition
Yujin Wang, Changli Tang, Ziyang Ma, Zhisheng Zheng, Xie Chen, Wei-Qiang Zhang
In Proc. ASRU, 2023

Speaker Adaptive Text-to-Speech with Timbre-Normalized Vector-Quantized Feature
Chenpeng Du, Yiwei Guo, Xie Chen, Kai Yu
In IEEE/ACM TASLP, 2023

DAE-Talker: High Fidelity Speech-Driven Talking Face Generation with Diffusion Autoencoder
Chenpeng Du, Qi Chen, Tianyu He, Xu Tan, Xie Chen, Kai Yu, Sheng Zhao, Jiang Bian
In Proc. ACM MM, 2023

Improving Code-Switching and Named Entity Recognition in ASR with Speech Editing based Data Augmentation
Zheng Liang, Zheshu Song, Ziyang Ma, Chenpeng Du, Kai Yu, Xie Chen
In Proc. INTERSPEECH, 2023

MT4SSL: Boosting Self-Supervised Speech Representation Learning by Integrating Multiple Targets
Ziyang Ma, Zhisheng Zheng, Changli Tang, Yujin Wang, Xie Chen
In Proc. INTERSPEECH, 2023

Blank-regularized CTC for Frame Skipping in Neural Transducer
Yifan Yang, Xiaoyu Yang, Liyong Guo, Zengwei Yao, Wei Kang, Fangjun Kuang, Long Lin, Xie Chen, Daniel Povey
In Proc. INTERSPEECH, 2023

Pushing the Limits of Unsupervised Unit Discovery for SSL Speech Representation
Ziyang Ma, Zhisheng Zheng, Guanrou Yang, Yu Wang, Chao Zhang, Xie Chen
In Proc. INTERSPEECH, 2023

Unsupervised Active Learning: Optimizing Labeling Cost-Effectiveness for Automatic Speech Recognition
Zhisheng Zheng, Ziyang Ma, Yu Wang, Xie Chen
In Proc. INTERSPEECH, 2023

Towards Effective and Compact Contextual Representation for Conformer Transducer Speech Recognition Systems
Mingyu Cui, Jiawen Kang, Jiajun Deng, Xi Yin, Yutao Xie, Xie Chen, Xunying Liu
In Proc. INTERSPEECH, 2023

DSE-TTS: Dual Speaker Embedding for Cross-Lingual Text-to-Speech
Sen Liu, Yiwei Guo, Chenpeng Du, Xie Chen, Kai Yu
In Proc. INTERSPEECH, 2023

An Adapter Based Multi-Label Pre-Training for Speech Separation and Enhancement
T Wang, X Chen, Z Chen, S Yu, W Zhu
Proc. ICASSP, 2023

Factorized AED: Factorized Attention-Based Encoder-Decoder for Text-Only Domain Adaptive ASR
X Gong, W Wang, H Shao, X Chen, Y Qian
Proc. ICASSP, 2023

EmoDiff: Intensity Controllable Emotional Text-to-Speech with Soft-Label Guidance
Y Guo, C Du, X Chen, K Yu
Proc. ICASSP, 2023

LongFNT: Long-Form Speech Recognition with Factorized Neural Transducer
X Gong, Y Wu, J Li, S Liu, R Zhao, X Chen, Y Qian
Proc. ICASSP, 2023

Front-End Adapter: Adapting Front-End Input of Speech Based Self-Supervised Learning for Speech Recognition
X Chen, Z Ma, C Tang, Y Wang, Z Zheng
Proc. ICASSP, 2023

Improving Few-Shot Learning for Talking Face System with TTS Data Augmentation
Q Chen, Z Ma, T Liu, X Tan, Q Lu, K Yu, X Chen
Proc. ICASSP, 2023

Internal language model adaptation with text-only data for end-to-end speech recognition
Z Meng, Y Gaur, N Kanda, J Li, X Chen, Y Wu, Y Gong
Proc. INTERSPEECH, 2022

VQTTS: High-Fidelity Text-to-Speech Synthesis with Self-Supervised VQ Acoustic Feature
C Du, Y Guo, X Chen, K Yu
Proc. INTERSPEECH, 2022

Factorized neural transducer for efficient language model adaptation
X Chen, Z Meng, S Parthasarathy, J Li
Proc. ICASSP, 2022

2021 and Before

Memory-efficient pipeline-parallel DNN training
D Narayanan, A Phanishayee, K Shi, X Chen, M Zaharia
Proc. ICML, 2021

Improving RNN-T for Domain Scaling Using Semi-Supervised Training with Neural TTS
Y Deng, R Zhao, Z Meng, X Chen, B Liu, J Li, Y Gong, L He
Proc. INTERSPEECH, 2021

Developing Real-time Streaming Transformer Transducer for Speech Recognition on Large-scale Dataset
X. Chen, Y. Wu, Z. Wang, S. Liu, J. Li
Proc. ICASSP, 2021

Internal Language Model Training for Domain-Adaptive End-to-End Speech Recognition
Z. Meng, N. Kanda, Y. Gaur, S. Parthasarathy, E. Sun, L. Lu, X. Chen, J. Li, Y. Gong
Proc. IEEE ICASSP, 2021

Internal Language Model Estimation for Domain-Adaptive End-to-End Speech Recognition
Z. Meng, S. Parthasarathy, E. Sun, Y. Gaur, N. Kanda, L. Lu, X. Chen, R. Zhao, J. Li, Y. Gong
Proc. IEEE SLT, 2020

LSTM-LM with Long-Term History for First-Pass Decoding in Conversational Speech Recognition
X. Chen, S. Parthasarathy, W. Gale, S. Chang, M. Zeng
arXiv preprint arXiv:2010.11349, 2020

Low-bit Quantization of Recurrent Neural Network Language Models Using Alternating Direction Methods of Multipliers
J. Xu, X. Chen, S. Hu, J. Yu, X. Liu, H. Meng
Proceedings of ICASSP, 2020

Exploiting Future Word Contexts in Neural Network Language Model
X. Chen, X. Liu, Y. Wang, A. Ragni, M. Gales
IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), 2019

Long-span language modeling for speech recognition
S. Parthasarathy, W. Gale, X. Chen, G. Polovets, S. Chang
arXiv preprint arXiv:1911.04571, 2019

Investigation of Sampling Techniques for Maximum Entropy Language Modeling Training
X. Chen, J. Zhang, T. Anastasakos, F. Alleva
Proceedings of ICASSP, 2019

Gaussian Process LSTM Recurrent Neural Network Language Models for Speech Recognition
M. Lam, X. Chen, S. Hu, J. Yu, X. Liu, H. Meng
Proceedings of ICASSP, 2019

Recurrent Neural Network Language Models Training using Natural Gradient
J. Yu, M. Lam, X. Chen, S. Hu, S. Liu, X. Wu, X. Liu, H. Meng
Proceedings of ICASSP, 2019

Active Memory Networks for Language Modeling
O. Chen, A. Ragni, M.J.F. Gales and X. Chen
Proceedings of INTERSPEECH, 2018

The Effect of Adding Authorship Knowledge in Automated Text Scoring
M. Zhang, X. Chen, R. Cummins, Ø. Andersen and T. Briscoe
Workshop of BEA in NAACL, 2018

Limited-memory BFGS Optimization of Recurrent Neural Network Language Models For Speech Recognition
X. Liu, S. Liu, J. Sha, J. Yu, Z. Xu, X. Chen, H. Meng
In Proceedings of ICASSP, 2018

Phonetic and Graphemic Systems for Multi-Genre Broadcast Transcription
Y. Wang, X. Chen, M.J.F. Gales, A. Ragni, J. Wong
In Proceedings of ICASSP, 2018

Neural Network Language Modeling with Letter-based Features and Importance Sampling
H. Xu, K. Li, Y. Wang, J. Wang, S. Kang, X. Chen, D. Povey, S. Khudanpur
Proceedings of ICASSP, 2018

Future Word Context in Neural Network Language Model
X. Chen, X. Liu, A. Ragni, Y. Wang, M.J.F. Gales
Proceedings of ASRU, 2017

Investigating Bidirectional Recurrent Neural Network Language Models for Speech Recognition
X. Chen, A. Ragni, X. Liu, M.J.F. Gales
Proceedings of INTERSPEECH, 2017

Recurrent Neural Network Language Models for Keyword Search
X. Chen, A. Ragni, J. Vasilakes, X. Liu, K. Knill, M.J.F. Gales
Proceedings of ICASSP, 2017

Efficient Training and Evaluation of Recurrent Neural Network Language Models for Speech Recognition
X. Chen, X. Liu, Y. Wang, M. J. F. Gales and P. C. Woodland
IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), 2016

Two Efficient Lattice Rescoring Methods Using Recurrent Neural Network Language Models
X. Liu, X. Chen, Y. Wang, M. J. F. Gales and P. C. Woodland
IEEE/ACM Transactions on Audio, Speech and Language Processing, 2016

Multi-Language Neural Network Language Models
A. Ragni, E. Dakin, X. Chen, M.J.F. Gales and K.M. Knill
Proceedings of INTERSPEECH, 2016

CUED-RNNLM – An Open-Source Toolkit for Efficient Training and Evaluation of Recurrent Neural Network Language Models
X. Chen, X. Liu, Y. Qian, M.J.F. Gales and P.C. Woodland
Proceedings of ICASSP, 2016

Investigation of back-off based interpolation between Recurrent Neural Network and N-Gram Language Models
X. Chen, X. Liu, M.J.F. Gales and P.C. Woodland
Proceedings of ASRU, 2015

Recurrent Neural Network Language Model Adaptation for Multi-Genre Broadcast Speech Recognition
X. Chen, T. Tan, X. Liu, P. Lanchantin, M. Wan, M.J.F. Gales and P.C. Woodland
Proceedings of INTERSPEECH, 2015

Improving the Training and Evaluation Efficiency of Recurrent Neural Network Language Models
X. Chen, X. Liu, M.J.F. Gales, P.C. Woodland
Proceedings of ICASSP, 2015

Recurrent Neural Network Language Model Training with Noise Contrastive Estimation for Speech Recognition
X. Chen, X. Liu, M.J.F. Gales, P.C. Woodland
Proceedings of ICASSP, 2015

Paraphrastic Recurrent Neural Network Language Models
X. Liu, X. Chen, M.J.F. Gales, P.C. Woodland
Proceedings of ICASSP, 2015

Robust Excitation-based Feature for Automatic Speech Recognition
T. Drugman, Y. Stylianou, L. Chen, X. Chen, M.J.F. Gales
Proceedings of ICASSP, 2015

An Initial Investigation of Long-Term Adaptation for Meeting Transcription
X. Chen, M.J.F. Gales, K. Knill et al.
Proceedings of INTERSPEECH, 2014

Efficient GPU-based Training of Recurrent Neural Network Language Models Using Spliced Sentence Bunch
X. Chen, Y. Wang, X. Liu, M.J.F. Gales and P.C. Woodland
Proceedings of INTERSPEECH, 2014

Efficient Lattice Rescoring Using Recurrent Neural Network Language Models
X. Liu, Y. Wang, X. Chen, M.J.F. Gales and P.C. Woodland
In Proceedings of ICASSP, 2014

Impact of Single-Microphone Dereverberation on DNN-based Meeting Transcription Systems
T. Yoshioka, X. Chen, and M.J.F. Gales
Proceedings of ICASSP, 2014

Construction of a Compact Dynamic Decoder Network for Large Vocabulary Continuous Speech Recognition
J. Liu, X. Chen, Y. Shan and Y. Shi
Tsinghua Journal of Chinese Studies, 2012

Fast Language Model Look-ahead Algorithm Using Extended N-gram Model
Y. Shan, X. Chen, Y. Shi and J. Liu
ACTA AUTOMATICA SINICA, 2012

Pipelined Back-Propagation for Context-Dependent Deep Neural Networks
X. Chen, A. Eversole, D. Yu and F. Seide
Proceedings of INTERSPEECH, 2012

An Efficient Layer-wised Beam Pruning Algorithm for Large Vocabulary Continuous Speech Recognition System
X Chen, Y Shan, X Zhang, J Liu
Proceedings of ICALIP, 2012

Feature Engineering in Context-Dependent Deep Neural Networks for Conversational Speech Transcription
F. Seide, G. Li, X. Chen and D. Yu
Proceedings of ASRU, 2011