Kaitao Song

Cited by

	All	Since 2019
Citations	8184	8182
h-index	16	16
i10-index	26	26

Citations per year

Year	Citations
2019	64
2020	260
2021	579
2022	1616
2023	3062
2024	2573

Public access

9 articles available
1 article not available

Based on funding mandates

Co-authors

Xu Tan	Principal Researcher and Research Manager, Microsoft	Verified email at microsoft.com
Tao Qin	Senior Principal Research Manager, Microsoft Research	Verified email at microsoft.com
Tie-Yan Liu	Distinguished Scientist, Microsoft Research AI4Science | IEEE Fellow | ACM Fellow | AAIA Fellow	Verified email at microsoft.com
Wenhai Wang (王文海)	CUHK | Shanghai AI Laboratory | NJU	Verified email at cuhk.edu.hk
Xiang Li（李翔）	Associate Professor, Nankai University	Verified email at nankai.edu.cn
Yongliang Shen	Zhejiang University	Verified email at zju.edu.cn
Renqian Luo	Microsoft Research	Verified email at microsoft.com
Jin Xu	Qwen Team, Alibaba Group	Verified email at alibaba-inc.com
Xiu-Shen Wei	Professor, Southeast University	Verified email at seu.edu.cn
Xiangbo Shu (舒祥波)	Professor, Nanjing University of Science and Technology	Verified email at njust.edu.cn
Yi Ren (任意)	Research Scientist, Tiktok	Verified email at bytedance.com
Yicheng Zou	Shanghai AI Laboratory	Verified email at pjlab.org.cn
Hao Sun	Peking University	Verified email at pku.edu.cn
Di He	Peking University	Verified email at pku.edu.cn
Dongsheng Li	Microsoft Research Asia	Verified email at microsoft.com
Yezhen Wang	National University of Singapore	Verified email at comp.nus.edu.sg

Kaitao Song

Senior Researcher, Microsoft Research

Verified email at microsoft.com

Natural Language Processing, Large Language Models, Artificial General Intelligence


Title	Authors	Venue	Cited by	Year
Pyramid vision transformer: A versatile backbone for dense prediction without convolutions	W Wang, E Xie, X Li, DP Fan, K Song, D Liang, T Lu, P Luo, L Shao	ICCV 2021, 2021	3645	2021
Pvt v2: Improved baselines with pyramid vision transformer	W Wang, E Xie, X Li, DP Fan, K Song, D Liang, T Lu, P Luo, L Shao	Computational Visual Media 8 (3), 415-424, 2022	1185	2022
Mass: Masked sequence to sequence pre-training for language generation	K Song, X Tan, T Qin, J Lu, TY Liu	ICML 2019, 2019	1115	2019
Mpnet: Masked and permuted pre-training for language understanding	K Song, X Tan, T Qin, J Lu, TY Liu	NeurIPS 2020, 2020	893	2020
HuggingGPT: Solving AI tasks with ChatGPT and Its Friends in Huggingface	Y Shen, K Song, X Tan, D Li, W Lu, Y Zhuang	NeurIPS 2023, 2023	700	2023
Connecting large language models with evolutionary algorithms yields powerful prompt optimizers	Q Guo, R Wang, J Guo, B Li, K Song, X Tan, G Liu, J Bian, Y Yang	ICLR 2024, 2023	88	2023
NAS-BERT: Task-Agnostic and Adaptive-Size BERT Compression with Neural Architecture Search	J Xu, X Tan, R Luo, K Song, J Li, T Qin, TY Liu	KDD 2021, 2021	69	2021
SongMASS: Automatic Song Writing with Pre-training and Alignment Constraint	Z Sheng, K Song, X Tan, Y Ren, W Ye, S Zhang, T Qin	AAAI 2021, 2020	57	2020
Bi-modal progressive mask attention for fine-grained recognition	K Song, XS Wei, X Shu, RJ Song, J Lu	IEEE Transactions on Image Processing 29, 7006-7018, 2020	57	2020
Generating adversarial examples with conditional generative adversarial net	P Yu, K Song, J Lu	2018 24th international conference on pattern recognition (ICPR), 676-681, 2018	36	2018
DeepRapper: Neural Rap Generation with Rhyme and Rhythm Modeling	L Xue, K Song, D Wu, X Tan, NL Zhang, T Qin, WQ Zhang, TY Liu	ACL 2021, 2021	34	2021
DiffusionNER: Boundary Diffusion for Named Entity Recognition	Y Shen, K Song, X Tan, D Li, W Lu, Y Zhuang	ACL 2023, 2023	30	2023
Analyzing and Mitigating Interference in Neural Architecture Search	J Xu, X Tan, K Song, R Luo, Y Leng, T Qin, TY Liu, J Li	ICML 2022, 2021	29	2021
NaturalSpeech 3: Zero-shot speech synthesis with factorized codec and diffusion models	Z Ju, Y Wang, K Shen, X Tan, D Xin, D Yang, Y Liu, Y Leng, K Song, ...	ICML 2024, 2024	27	2024
Mixed-phoneme bert: Improving bert with mixed phoneme and sup-phoneme representations for text to speech	G Zhang, K Song, X Tan, D Tan, Y Yan, Y Liu, G Wang, W Zhou, T Qin, ...	INTERSPEECH 2022, 2022	20	2022
LightPAFF: A two-stage distillation framework for pre-training and fine-tuning	K Song, H Sun, X Tan, T Qin, J Lu, H Liu, TY Liu	arXiv preprint arXiv:2004.12817, 2020	18	2020
Prompttts 2: Describing and generating voices with text prompt	Y Leng, Z Guo, K Shen, X Tan, Z Ju, Y Liu, Y Liu, D Yang, L Zhang, ...	ICLR 2024, 2023	16	2023
SoftCorrect: Error Correction with Soft Detection for Automatic Speech Recognition	Y Leng, X Tan, W Liu, K Song, R Wang, XY Li, T Qin, E Lin, TY Liu	AAAI 2023, 2022	16	2022
Double path networks for sequence to sequence learning	K Song, X Tan, D He, J Lu, T Qin, TY Liu	COLING 2018, 2018	16	2018
Coarse-to-fine: A dual-view attention network for click-through rate prediction	K Song, Q Huang, F Zhang, J Lu	Knowledge-Based Systems 216, 106767, 2021	15	2021
