Shuming Ma

Cited by

	All	Since 2019
Citations	4448	4267
h-index	37	36
i10-index	65	62

Citations per year
Year	Citations
2017	23
2018	154
2019	220
2020	294
2021	417
2022	565
2023	1310
2024	1456

Public access

17 articles available
1 article not available

Based on funding mandates

Co-authors

Furu Wei	Partner Research Manager, Microsoft Research	Verified email at microsoft.com
Xu Sun	Associate Professor, Peking University	Verified email at pku.edu.cn
Houfeng Wang	Peking University	Verified email at pku.edu.cn
Junyang Lin	Qwen Team, Alibaba Group & Peking University	Verified email at alibaba-inc.com
Lei Cui	Microsoft Research Asia	Verified email at microsoft.com
Tianyu Liu	Alibaba	Verified email at pku.edu.cn
Jingjing Xu	Shanghai AI Lab	Verified email at pku.edu.cn
Wenjie Li	The Hong Kong Polytechnic University	Verified email at comp.polyu.edu.hk
Sujian LI	Peking Univ.	Verified email at pku.edu.cn
Yizhong Wang	University of Washington	Verified email at cs.washington.edu

Shuming Ma

Microsoft Research Asia

Verified email at microsoft.com

Natural language processing, deep learning


Title	Cited by	Year
SGM: sequence generation model for multi-label classification. P Yang, X Sun, W Li, S Ma, W Wu, H Wang. arXiv preprint arXiv:1806.04822, 2018	443	2018
Kosmos-2: Grounding multimodal large language models to the world. Z Peng, W Wang, L Dong, Y Hao, S Huang, S Ma, F Wei. arXiv preprint arXiv:2306.14824, 2023	340	2023
Language is not all you need: Aligning perception with language models. S Huang, L Dong, W Wang, Y Hao, S Singhal, S Ma, T Lv, L Cui, ... Advances in Neural Information Processing Systems 36, 2024	322	2024
Why can GPT learn in-context? Language models implicitly perform gradient descent as meta-optimizers. D Dai, Y Sun, L Dong, Y Hao, S Ma, Z Sui, F Wei. arXiv preprint arXiv:2212.10559, 2022	253	2022
Global encoding for abstractive summarization. J Lin, X Sun, S Ma, Q Su. arXiv preprint arXiv:1805.03989, 2018	189	2018
meProp: Sparsified back propagation for accelerated deep learning with reduced overfitting. X Sun, X Ren, S Ma, H Wang. International Conference on Machine Learning, 3299-3308, 2017	183	2017
Retentive network: A successor to transformer for large language models. Y Sun, L Dong, S Huang, S Ma, Y Xia, J Xue, J Wang, F Wei. arXiv preprint arXiv:2307.08621, 2023	149	2023
DeepNet: Scaling transformers to 1,000 layers. H Wang, S Ma, L Dong, S Huang, D Zhang, F Wei. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024	131	2024
XLM-E: Cross-lingual language model pre-training via ELECTRA. Z Chi, S Huang, L Dong, S Ma, B Zheng, S Singhal, P Bajaj, X Song, ... arXiv preprint arXiv:2106.16138, 2021	113	2021
A simple and effective unified encoder for document-level machine translation. S Ma, D Zhang, M Zhou. Proceedings of the 58th annual meeting of the association for computational …, 2020	97	2020
A length-extrapolatable transformer. Y Sun, L Dong, B Patra, S Ma, S Huang, A Benhaim, V Chaudhary, ... arXiv preprint arXiv:2212.10554, 2022	95	2022
Language models are general-purpose interfaces. Y Hao, H Song, L Dong, S Huang, Z Chi, W Wang, S Ma, F Wei. arXiv preprint arXiv:2206.06336, 2022	88	2022
LongNet: Scaling transformers to 1,000,000,000 tokens. J Ding, S Ma, L Dong, X Zhang, S Huang, W Wang, N Zheng, F Wei. arXiv preprint arXiv:2307.02486, 2023	86	2023
Improving semantic relevance for sequence-to-sequence learning of Chinese social media text summarization. S Ma, X Sun, J Xu, H Wang, W Li, Q Su. arXiv preprint arXiv:1706.02459, 2017	81	2017
Alternating language modeling for cross-lingual pre-training. J Yang, S Ma, D Zhang, S Wu, Z Li, M Zhou. Proceedings of the AAAI Conference on Artificial Intelligence 34 (05), 9386-9393, 2020	79	2020
Query and output: Generating words by querying distributed word representations for paraphrase generation. S Ma, X Sun, W Li, S Li, W Li, X Ren. arXiv preprint arXiv:1803.01465, 2018	77	2018
Bag-of-words as target for neural machine translation. S Ma, X Sun, Y Wang, J Lin. arXiv preprint arXiv:1805.04871, 2018	73	2018
mT6: Multilingual pretrained text-to-text transformer with translation pairs. Z Chi, L Dong, S Ma, S Huang, XL Mao, H Huang, F Wei. arXiv preprint arXiv:2104.08692, 2021	70	2021
Semantic-unit-based dilated convolution for multi-label text classification. J Lin, Q Su, P Yang, S Ma, X Sun. arXiv preprint arXiv:1808.08561, 2018	70	2018
A deep reinforced sequence-to-set model for multi-label classification. P Yang, F Luo, S Ma, J Lin, X Sun. Proceedings of the 57th Annual Meeting of the Association for Computational …, 2019	66	2019
