Xiaofei Wang
Title
Cited by
Year
A comparative study on Transformer vs RNN in speech applications
S Karita, N Chen, T Hayashi, T Hori, H Inaguma, Z Jiang, M Someki, ...
2019 IEEE automatic speech recognition and understanding workshop (ASRU …, 2019
Cited by: 863; Year: 2019
Serialized output training for end-to-end overlapped speech recognition
N Kanda, Y Gaur, X Wang, Z Meng, T Yoshioka
arXiv preprint arXiv:2003.12687, 2020
Cited by: 126; Year: 2020
Joint speaker counting, speech recognition, and speaker identification for overlapped speech of any number of speakers
N Kanda, Y Gaur, X Wang, Z Meng, Z Chen, T Zhou, T Yoshioka
arXiv preprint arXiv:2006.10930, 2020
Cited by: 85; Year: 2020
Personalized speech enhancement: New models and comprehensive evaluation
SE Eskimez, T Yoshioka, H Wang, X Wang, Z Chen, X Huang
ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022
Cited by: 66; Year: 2022
Speech enhancement using end-to-end speech recognition objectives
AS Subramanian, X Wang, MK Baskar, S Watanabe, T Taniguchi, D Tran, ...
2019 IEEE Workshop on Applications of Signal Processing to Audio and …, 2019
Cited by: 66; Year: 2019
Streaming multi-talker ASR with token-level serialized output training
N Kanda, J Wu, Y Wu, X Xiao, Z Meng, X Wang, Y Gaur, Z Chen, J Li, ...
arXiv preprint arXiv:2202.00842, 2022
Cited by: 58; Year: 2022
SpeechX: Neural codec language model as a versatile speech transformer
X Wang, M Thakker, Z Chen, N Kanda, SE Eskimez, S Chen, M Tang, ...
IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024
Cited by: 56; Year: 2024
The Hitachi/JHU CHiME-5 system: Advances in speech recognition for everyday home environments using multiple microphone arrays
N Kanda, R Ikeshita, S Horiguchi, Y Fujita, K Nagamatsu, X Wang, ...
Proc. CHiME-5, 6-10, 2018
Cited by: 54; Year: 2018
End-to-end speaker-attributed ASR with transformer
N Kanda, G Ye, Y Gaur, X Wang, Z Meng, Z Chen, T Yoshioka
Interspeech 2021, 4413-4417, 2021
Cited by: 52; Year: 2021
Investigation of end-to-end speaker-attributed ASR for continuous multi-talker recordings
N Kanda, X Chang, Y Gaur, X Wang, Z Meng, Z Chen, T Yoshioka
2021 IEEE Spoken Language Technology Workshop (SLT), 809-816, 2021
Cited by: 48; Year: 2021
Large-scale pre-training of end-to-end multi-talker ASR for meeting transcription with single distant microphone
N Kanda, G Ye, Y Wu, Y Gaur, X Wang, Z Meng, Z Chen, T Yoshioka
Interspeech 2021, 3430-3434, 2021
Cited by: 41; Year: 2021
Transcribe-to-diarize: Neural speaker diarization for unlimited number of speakers using end-to-end speaker-attributed ASR
N Kanda, X Xiao, Y Gaur, X Wang, Z Meng, Z Chen, T Yoshioka
ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022
Cited by: 38; Year: 2022
VarArray: Array-geometry-agnostic continuous speech separation
T Yoshioka, X Wang, D Wang, M Tang, Z Zhu, Z Chen, N Kanda
ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022
Cited by: 31; Year: 2022
Improving noise robustness of contrastive speech representation learning with speech reconstruction
H Wang, Y Qian, X Wang, Y Wang, C Wang, S Liu, T Yoshioka, J Li, ...
ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022
Cited by: 29; Year: 2022
Oracle performance investigation of the ideal masks
Z Wang, X Wang, X Li, Q Fu, Y Yan
2016 IEEE International Workshop on Acoustic Signal Enhancement (IWAENC), 1-5, 2016
Cited by: 29; Year: 2016
Multi-stream end-to-end speech recognition
R Li, X Wang, SH Mallidi, S Watanabe, T Hori, H Hermansky
IEEE/ACM Transactions on Audio, Speech, and Language Processing 28, 646-655, 2019
Cited by: 27; Year: 2019
Stream attention-based multi-array end-to-end speech recognition
X Wang, R Li, SH Mallidi, T Hori, S Watanabe, H Hermansky
ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019
Cited by: 27; Year: 2019
An investigation of end-to-end multichannel speech recognition for reverberant and mismatch conditions
AS Subramanian, X Wang, S Watanabe, T Taniguchi, D Tran, Y Fujita
arXiv preprint arXiv:1904.09049, 2019
Cited by: 27; Year: 2019
Streaming speaker-attributed ASR with token-level speaker embeddings
N Kanda, J Wu, Y Wu, X Xiao, Z Meng, X Wang, Y Gaur, Z Chen, J Li, ...
arXiv preprint arXiv:2203.16685, 2022
Cited by: 26; Year: 2022
Human listening and live captioning: Multi-task training for speech enhancement
SE Eskimez, X Wang, M Tang, H Yang, Z Zhu, Z Chen, H Wang, ...
Interspeech 2021, 2686-2690, 2021
Cited by: 26; Year: 2021