Xixin Wu

Assistant Professor,
Department of Systems Engineering and Engineering Management,
The Chinese University of Hong Kong

Room 709B, William M. W. Mong Engineering Building, The Chinese University of Hong Kong, HK SAR, China
Email: wuxx [at] se.cuhk.edu.hk
Tel: 3943-8243

Biography

I am currently an Assistant Professor at the Department of Systems Engineering and Engineering Management, CUHK. Before joining CUHK, I worked as a Research Associate in the Speech Group of the Machine Intelligence Laboratory, Engineering Department of the University of Cambridge, supervised by Prof. Mark Gales and Dr. Kate Knill. I obtained my Ph.D. degree from CUHK, supervised by Prof. Helen Meng, and my M.S. degree from Tsinghua University, supervised by Prof. Zhiyong Wu. My research interests include speech synthesis and recognition, affective computing, and neural network uncertainty.

*I am looking for self-motivated students, research assistants and postdocs with research interests in speech and language processing. Please don't hesitate to contact me if you are interested in our research directions.

Selected Publications (Google Scholar)

Journal publications

Hiformer: Sequence Modeling Networks with Hierarchical Attention Mechanisms,
Xixin Wu, Hui Lu, Kun Li, Zhiyong Wu, Xunying Liu, Helen Meng
IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2023
Estimating the Uncertainty in Emotion Class Labels With Utterance-Specific Dirichlet Priors,
Wen Wu, Chao Zhang, Xixin Wu, Philip C. Woodland
IEEE Transactions on Affective Computing, 2022
Exemplar-based Emotive Speech Synthesis,
Xixin Wu, Yuewen Cao, Hui Lu, Songxiang Liu, Shiyin Kang, Zhiyong Wu, Xunying Liu, Helen Meng
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 29, 2021
Speech Emotion Recognition Using Sequential Capsule Networks,
Xixin Wu, Yuewen Cao, Hui Lu, Songxiang Liu, Disong Wang, Zhiyong Wu, Xunying Liu, Helen Meng
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 29, 2021
Any-to-Many Voice Conversion With Location-Relative Sequence-to-Sequence Modeling,
Songxiang Liu, Yuewen Cao, Disong Wang, Xixin Wu*, Xunying Liu, Helen Meng
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 29, 2021 (*corresponding author)
Intonation classification for L2 English speech using multi-distribution deep neural networks,
Kun Li, Xixin Wu and Helen Meng
Computer Speech & Language, 2016

Conference publications

A Hierarchical Regression Chain Framework for Affective Vocal Burst Recognition,
Jinchao Li, Xixin Wu*, Kaitao Song, Dongsheng Li, Xunying Liu, Helen Meng
in Proc. ICASSP'23 (ACII 2022 Affective Vocal Bursts (AV-B) Recognition Competition, 1st place; *corresponding author)
Inferring Speaking Styles from Multi-modal Conversational Context by Multi-scale Relational Graph Convolutional Networks,
Jingbei Li, Yi Meng, Xixin Wu*, Zhiyong Wu*, Jia Jia, Helen Meng, Qiao Tian, Yuping Wang, Yuxuan Wang
in Proc. ACM MM'22 (*corresponding author)
Neural Architecture Search for Speech Emotion Recognition,
Xixin Wu, Shoukang Hu, Zhiyong Wu, Xunying Liu, Helen Meng
in Proc. ICASSP'22
Ensemble Approaches for Uncertainty in Spoken Language Assessment,
Xixin Wu, Kate M. Knill, Mark J.F. Gales, Andrey Malinin
in Proc. Interspeech'20
Speech Emotion Recognition Using Capsule Networks,
Xixin Wu, Songxiang Liu, Yuewen Cao, Xu Li, Jianwei Yu, Dongyang Dai, Xi Ma, Shoukang Hu, Zhiyong Wu, Xunying Liu, Helen Meng
in Proc. ICASSP'19
Rapid Style Adaptation Using Residual Error Embedding for Expressive Speech Synthesis, [demo]
Xixin Wu, Yuewen Cao, Mu Wang, Songxiang Liu, Shiyin Kang, Zhiyong Wu, Xunying Liu, Dan Su, Dong Yu and Helen Meng
in Proc. Interspeech'18
Feature based Adaptation for Speaking Style Synthesis,
Xixin Wu, Lifa Sun, Shiyin Kang, Songxiang Liu, Zhiyong Wu, Xunying Liu, Helen Meng
in Proc. ICASSP'18

Activities & Service

Publicity chair, ISCSLP 2022
Area chair, COLING 2022
Tutorial (together with Prof. Longbiao Wang), Odyssey 2022
Invited talk, The SpeechHome Conference on Speech Technology 2021
Tutorial, Seoul International Conference on Speech Sciences (SICSS), Seoul, Korea 2019
Reviewer for IEEE/ACM Trans. on Audio, Speech, and Language Processing, IEEE Trans. on Affective Computing, IEEE Signal Processing Magazine, ICASSP, Interspeech, ACM MM, ACL, EMNLP, AAAI

Honors & Awards

First place in two tasks of ACII 2022 Affective Vocal Bursts (AV-B) Recognition Competition
First place in ACL 2022 Doc2Dial Shared Task
Second place in ICASSP 2022 Multi-party Multi-channel Meeting Transcription Challenge (M2Met)
Best paper award, IEEE ROBIO 2022
Champion, HKSTP SciTech Challenge 2021
First place in INTERSPEECH 2020 Challenge: Automatic Speech Recognition for Non-Native Children's Speech

Patents

Grounded Dialogue Generation with Cross-encoding Re-ranker, Grounding Span Prediction and Passage Dropout,
Helen Meng, Xixin Wu, Kun Li, Tianhua Zhang, Liping Tang, Junan Li, Hongyuan Lu,
U.S. Provisional Patent No. 17900806, filed on 31 August, 2022
Voice synthesis method, model training method and device, and computer equipment,
Xixin Wu, Mu Wang, Shiyin Kang, Dan Su, Dong Yu,
Pub. No. US 2020/0380949 A1, pub. date 3 December, 2020
Acoustic-Graphemic Model and Acoustic-Graphemic-Phonemic Model for Computer-Aided Pronunciation Training and Speech Processing,
Kun Li, Lifa Sun, Xixin Wu and Helen Meng,
U.S. Provisional Patent No. 62/413,939, filed on 31 October 2016