Dongchao Yang

Cited by

	All	Since 2019
Citations	671	670
h-index	13	13
i10-index	15	15

Citations per year
2020	2
2021	15
2022	54
2023	350
2024	243

Public access (based on funding mandates)

7 articles available
0 articles not available

Co-authors

Yuexian Zou, Peking University Shenzhen Graduate School, verified email at pku.edu.cn
Helin Wang, Johns Hopkins University, verified email at jh.edu
Rongjie Huang, Zhejiang University, verified email at zju.edu.cn
Songxiang Liu, miHoYo, verified email at mihoyo.com
Zhongjie Ye, Peking University, verified email at stu.pku.edu.cn
Jinchuan Tian, Language Technologies Institute, Carnegie Mellon University, verified email at andrew.cmu.edu
Xu Tan, Principal Researcher and Research Manager, Microsoft, verified email at microsoft.com
Nuo Chen, Hong Kong University of Science and Technology, verified email at connect.ust.hk

Dongchao Yang

The Chinese University of Hong Kong

Verified email at se.cuhk.edu.hk

TTS Multi-modal Audio Foundation Models


Title	Authors	Venue	Cited by	Year
Diffsound: Discrete diffusion model for text-to-sound generation	D Yang, J Yu, H Wang, W Wang, C Weng, Y Zou, D Yu	IEEE Transactions on Audio, Speech and Language Processing (TASLP), 2023	157	2023
Make-an-audio: Text-to-audio generation with prompt-enhanced diffusion models	R Huang, J Huang, D Yang*, Y Ren, L Liu, M Li, Z Ye, J Liu, X Yin, ...	ICML 2023, 2023	127	2023
AudioGPT: Understanding and generating speech, music, sound, and talking head	R Huang, M Li, D Yang, J Shi, X Chang, Z Ye, Y Wu, Z Hong, J Huang, ...	AAAI demo, 2024, 2023	92	2023
InstructTTS: Modelling expressive tts in discrete latent space with natural language style prompt	D Yang, S Liu, R Huang, C Weng, H Meng	IEEE Transactions on Audio, Speech and Language Processing (TASLP), 2024	37	2024
Towards data distillation for end-to-end spoken conversational question answering	C You, N Chen, F Liu, D Yang, Y Zou	arXiv preprint arXiv:2010.08923, 2020	34	2020
A Mutual learning framework for Few-shot Sound Event Detection	D Yang, H Wang, Y Zou, Z Ye, W Wang	ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022	28*	2022
Improving the Performance of Automated Audio Captioning via Integrating the Acoustic and Semantic Information	Z Ye, H Wang, D Yang, Y Zou	Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE), 2021	27	2021
Hifi-codec: Group-residual vector quantization for high fidelity audio codec	D Yang, S Liu, R Huang, J Tian, C Weng, Y Zou	arXiv preprint arXiv:2305.02765, 2023	26	2023
UniAudio: An Audio Foundation Model Toward Universal Audio Generation	D Yang, J Tian, X Tan, R Huang, S Liu, X Chang, J Shi, S Zhao, J Bian, ...	ICML 2024, 2023	19	2023
Improving Text-Audio Retrieval by Text-aware Attention Pooling and Prior Matrix Revised Loss	Y Xin, D Yang, Y Zou	ICASSP2023, 2023	16	2023
Make-a-voice: Unified voice synthesis with discrete representation	R Huang, C Zhang, Y Wang, D Yang, L Liu, Z Ye, Z Jiang, C Weng, ...	arXiv preprint arXiv:2305.19269, 2023	14	2023
Norespeech: Knowledge distillation based conditional diffusion model for noise-robust expressive tts	D Yang, S Liu, J Yu, H Wang, C Weng, Y Zou	Interspeech2023, 2022	14	2022
Audio Pyramid Transformer with Domain Adaption for Weakly Supervised Sound Event Detection and Audio Classification	Y Xin, D Yang, Y Zou	Proc. Interspeech 2022, 1546-1550, 2022	13	2022
Make-an-audio 2: Temporal-enhanced text-to-audio generation	J Huang, Y Ren, R Huang, D Yang, Z Ye, C Zhang, J Liu, X Yin, Z Ma, ...	arXiv preprint arXiv:2305.18474, 2023	11	2023
Target Confusion in End-to-end Speaker Extraction: Analysis and Approaches	Z Zhao, D Yang, R Gu, H Zhang, Y Zou	Interspeech2022, 2022	10	2022
Detect what you want: Target sound detection	D Yang, H Wang, Y Zou, F Cui, Y Wang	Workshop on Detection and Classification of Acoustic Scenes and Events …, 2022	8	2022
Prompttts 2: Describing and generating voices with text prompt	Y Leng, Z Guo, K Shen, X Tan, Z Ju, Y Liu, Y Liu, D Yang, L Zhang, ...	ICLR 2024, 2023	6	2023
Unsupervised multi-target domain adaptation for acoustic scene classification	D Yang, H Wang, Y Zou	Interspeech2021, 2021	6	2021
Improving Weakly Supervised Sound Event Detection with Causal Intervention	Y Xin, D Yang, F Cui, Y Wang, Y Zou	ICASSP2023, 2023	5	2023
Improving Target Sound Extraction with Timestamp Information	D Yang, H Wang, C Weng, J Yu, Y Zou	Proc. Interspeech 2022, 2022	4	2022