Lianmin Zheng

Cited by

	All	Since 2019
Citations	7914	7838
h-index	23	23
i10-index	27	27

Citations per year
Year	Citations
2018	58
2019	233
2020	432
2021	592
2022	774
2023	2214
2024	3585

Public access

10 articles available, 0 articles not available (based on funding mandates)

Co-authors

Ying Sheng - PhD student of Stanford University - Verified email at stanford.edu
Ion Stoica - Professor of Computer Science, UC Berkeley - Verified email at cs.berkeley.edu
Joseph E. Gonzalez - Professor of Computer Science, UC Berkeley - Verified email at berkeley.edu
Hao Zhang - UC San Diego - Verified email at ucsd.edu
Zhuohan Li - UC Berkeley - Verified email at berkeley.edu
Tianqi Chen - Carnegie Mellon University - Verified email at cmu.edu
Luis Ceze - Professor of Computer Science and Engineering, University of Washington - Verified email at cs.washington.edu
Carlos Guestrin - Professor, Stanford University - Verified email at stanford.edu
Thierry Moreau - OctoML Inc., University of Washington - Verified email at cs.washington.edu
Cody (Hao) Yu - Software Engineer @ Anyscale | ex-Amazonian | UCLA PhD ‘19 - Verified email at anyscale.com
Yida Wang - Amazon - Verified email at amazon.com
Danyang Zhuo - Duke University - Verified email at duke.edu
Koushik Sen - Professor of Computer Science, University of California, Berkeley - Verified email at cs.berkeley.edu
Weinan Zhang - Associate Professor, Shanghai Jiao Tong University - Verified email at sjtu.edu.cn
Yong Yu (俞勇) - Professor, Shanghai Jiao Tong University - Verified email at sjtu.edu.cn
Jianfei Chen - Associate Professor, Tsinghua University - Verified email at mail.tsinghua.edu.cn

Lianmin Zheng

UC Berkeley

Verified email at berkeley.edu

Systems, Machine Learning, Compiler


Title	Cited by	Year
TVM: An automated end-to-end optimizing compiler for deep learning - T Chen, T Moreau, Z Jiang, L Zheng, E Yan, H Shen, M Cowan, L Wang, ... - 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2018	2006*	2018
Vicuna: An open-source chatbot impressing GPT-4 with 90%* ChatGPT quality - WL Chiang, Z Li, Z Lin, Y Sheng, Z Wu, H Zhang, L Zheng, S Zhuang, ... - https://lmsys.org/blog/2023-03-30-vicuna/, 2023	1536*	2023
Judging LLM-as-a-judge with MT-Bench and Chatbot Arena - L Zheng, WL Chiang, Y Sheng, S Zhuang, Z Wu, Y Zhuang, Z Lin, Z Li, ... - Advances in Neural Information Processing Systems 36, 2024	1459*	2024
Efficient memory management for large language model serving with PagedAttention - W Kwon, Z Li, S Zhuang, Y Sheng, L Zheng, CH Yu, J Gonzalez, H Zhang, ... - Proceedings of the 29th Symposium on Operating Systems Principles, 611-626, 2023	497	2023
Learning to optimize tensor programs - T Chen, L Zheng, E Yan, Z Jiang, T Moreau, L Ceze, C Guestrin, ... - Advances in Neural Information Processing Systems 31, 2018	425	2018
Ansor: Generating High-Performance Tensor Programs for Deep Learning - L Zheng, C Jia, M Sun, Z Wu, CH Yu, A Haj-Ali, Y Wang, J Yang, D Zhuo, ... - 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2020	336	2020
A hardware–software blueprint for flexible deep learning specialization - T Moreau, T Chen, L Vega, J Roesch, E Yan, L Zheng, J Fromm, Z Jiang, ... - IEEE Micro 39 (5), 8-16, 2019	245*	2019
Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning - L Zheng, Z Li, H Zhang, Y Zhuang, Z Chen, Y Huang, Y Wang, Y Xu, ... - 16th USENIX Symposium on Operating Systems Design and Implementation (OSDI 22), 2022	239	2022
MAgent: A many-agent reinforcement learning platform for artificial collective intelligence - L Zheng, J Yang, H Cai, M Zhou, W Zhang, J Wang, Y Yu - Proceedings of the AAAI Conference on Artificial Intelligence 32 (1), 2018	230	2018
FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU - Y Sheng, L Zheng, B Yuan, Z Li, M Ryabinin, B Chen, P Liang, C Ré, ... - International Conference on Machine Learning, 2023	190	2023
H2O: Heavy-hitter oracle for efficient generative inference of large language models - Z Zhang, Y Sheng, T Zhou, T Chen, L Zheng, R Cai, Z Song, Y Tian, C Ré, ... - Advances in Neural Information Processing Systems 36, 2024	96	2024
How Long Can Context Length of Open-Source LLMs truly Promise? - D Li, R Shao, A Xie, Y Sheng, L Zheng, J Gonzalez, I Stoica, X Ma, ... - NeurIPS 2023 Workshop on Instruction Tuning and Instruction Following, 2023	82*	2023
AlpaServe: Statistical multiplexing with model parallelism for deep learning serving - Z Li, L Zheng, Y Zhong, V Liu, Y Sheng, X Jin, Y Huang, Z Chen, H Zhang, ... - 17th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2023	80	2023
Chatbot Arena: An open platform for evaluating LLMs by human preference - WL Chiang, L Zheng, Y Sheng, AN Angelopoulos, T Li, D Li, H Zhang, ... - arXiv preprint arXiv:2403.04132, 2024	72	2024
ActNN: Reducing training memory footprint via 2-bit activation compressed training - J Chen, L Zheng, Z Yao, D Wang, I Stoica, M Mahoney, J Gonzalez - International Conference on Machine Learning, 1803-1813, 2021	61	2021
LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset - L Zheng, WL Chiang, Y Sheng, T Li, S Zhuang, Z Wu, Y Zhuang, Z Li, ... - The Twelfth International Conference on Learning Representations, 2023	49	2023
TensorIR: An abstraction for automatic tensorized program optimization - S Feng, B Hou, H Jin, W Lin, J Shao, R Lai, Z Ye, L Zheng, CH Yu, Y Yu, ... - Proceedings of the 28th ACM International Conference on Architectural …, 2023	46	2023
SLoRA: Scalable Serving of Thousands of LoRA Adapters - Y Sheng, S Cao, D Li, C Hooper, N Lee, S Yang, C Chou, B Zhu, L Zheng, ... - Proceedings of Machine Learning and Systems 6, 296-311, 2024	39*	2024
A unified optimization approach for CNN model inference on integrated GPUs - L Wang, Z Chen, Y Liu, Y Wang, L Zheng, M Li, Y Wang - Proceedings of the 48th International Conference on Parallel Processing, 1-10, 2019	38	2019
Rethinking benchmark and contamination for language models with rephrased samples - S Yang, WL Chiang, L Zheng, JE Gonzalez, I Stoica - arXiv preprint arXiv:2311.04850, 2023	30	2023
