Zeyuan Allen-Zhu

Cited by

	All	Since 2019
Citations	16434	14402
h-index	45	39
i10-index	59	58

Citations per year
Year	Citations
2012	61
2013	85
2014	136
2015	153
2016	284
2017	474
2018	738
2019	1108
2020	1438
2021	1641
2022	1903
2023	3436
2024	4846

Public access

18 articles

0 articles

available

not available

Based on funding mandates

Co-authors

Yuanzhi Li	Assistant Professor at CMU	Verified email at andrew.cmu.edu
Weizhu Chen	Microsoft	Verified email at microsoft.com
Lorenzo Orecchia	University of Chicago, Computer Science	Verified email at bu.edu
Edward Hu	OpenAI	Verified email at openai.com
Phillip Wallis	Amazon	Verified email at amazon.com
Yelong Shen	Microsoft	Verified email at microsoft.com
Elad Hazan	Professor at Princeton University and Director Google AI Princeton	Verified email at princeton.edu
Yang Yuan	Tsinghua University	Verified email at tsinghua.edu.cn
Zhao Song	Adobe Research	Verified email at ias.edu
Alessandro Chiesa	EPFL	Verified email at epfl.ch
Chenguang Zhu	Head of Zoom GenAI Science	Verified email at zoom.us
zheng chen	Microsoft	Verified email at microsoft.com
Sebastien Bubeck	VP GenAI Research, Microsoft AI	Verified email at microsoft.com
Naman Agarwal	Senior Research Scientist, Google AI Princeton	Verified email at google.com
Tengyu MA	Stanford University	Verified email at stanford.edu
Brian Bullins	Assistant Professor, Purdue University	Verified email at purdue.edu
Zhenyu Liao	Applied Scientist in Amazon Inc.	Verified email at amazon.com
Pinyan Lu	ITCS, Shanghai University of Finance and Economics	Verified email at mail.shufe.edu.cn
Xiaorui Sun	University of Illinois at Chicago	Verified email at uic.edu
Michael I. Jordan	Professor of Electrical Engineering and Computer Sciences and Professor of Statistics, UC Berkeley	Verified email at cs.berkeley.edu

Zeyuan Allen-Zhu

Meta AI / FAIR Labs

Verified email at csail.mit.edu

Language Models Machine Learning Optimization Algorithms Theory


Title	Authors	Venue	Cited by	Year
LoRA: Low-rank adaptation of large language models	EJ Hu, Y Shen, P Wallis, Z Allen-Zhu, Y Li, S Wang, L Wang, W Chen	ICLR 2022: International Conference on Learning Representations, 2022	5824	2022
A convergence theory for deep learning via over-parameterization	Z Allen-Zhu, Y Li, Z Song	ICML 2019: International Conference on Machine Learning, 2019	1552	2019
Is Q-learning Provably Efficient?	C Jin, Z Allen-Zhu, S Bubeck, MI Jordan	NIPS 2018: Neural Information Processing Systems, 2018	927	2018
Learning and generalization in overparameterized neural networks, going beyond two layers	Z Allen-Zhu, Y Li, Y Liang	NeurIPS 2019: Neural Information Processing Systems, 2019	839	2019
Katyusha: the first direct acceleration of stochastic gradient methods	Z Allen-Zhu	STOC 2017: Symposium on Theory of Computing, 19-23, 2017	685	2017
Variance reduction for faster non-convex optimization	Z Allen-Zhu, E Hazan	ICML 2016: International Conference on Machine Learning, 699-707, 2016	432	2016
Linear coupling: An ultimate unification of gradient and mirror descent	Z Allen-Zhu, L Orecchia	ITCS 2017: Innovations in Theoretical Computer Science, 2017	381	2017
Towards understanding ensemble, knowledge distillation and self-distillation in deep learning	Z Allen-Zhu, Y Li	ICLR 2023: International Conference on Learning Representations, 2023	362	2023
Finding approximate local minima faster than gradient descent	N Agarwal, Z Allen-Zhu, B Bullins, E Hazan, T Ma	STOC 2017: Symposium on Theory of Computing, 1195-1199, 2017	343*	2017
Byzantine Stochastic Gradient Descent	D Alistarh, Z Allen-Zhu, J Li	NIPS 2018: Neural Information Processing Systems, 2018	321	2018
A simple, combinatorial algorithm for solving SDD systems in nearly-linear time	JA Kelner, L Orecchia, A Sidford, ZA Zhu	STOC 2013: Symposium on Theory of Computing, 911-920, 2013	294	2013
Natasha 2: Faster Non-Convex Optimization Than SGD	Z Allen-Zhu	NIPS 2018: Neural Information Processing Systems, 2018	261	2018
Improved SVRG for non-strongly-convex or sum-of-non-convex objectives	Z Allen-Zhu, Y Yuan	ICML 2016: International Conference on Machine Learning, 1080-1089, 2016	232	2016
What Can ResNet Learn Efficiently, Going Beyond Kernels?	Z Allen-Zhu, Y Li	NeurIPS 2019: Neural Information Processing Systems, 2019	216	2019
Even faster accelerated coordinate descent using non-uniform sampling	Z Allen-Zhu, Z Qu, P Richtárik, Y Yuan	ICML 2016: International Conference on Machine Learning, 1110-1119, 2016	211	2016
On the convergence rate of training recurrent neural networks	Z Allen-Zhu, Y Li, Z Song	NeurIPS 2019: Neural Information Processing Systems, 2019	199	2019
Asymptotically optimal strategy-proof mechanisms for two-facility games	P Lu, X Sun, Y Wang, ZA Zhu	ACM-EC 2010: Conference on Economics and Computation, 315-324, 2010	198	2010
Feature purification: How adversarial training performs robust deep learning	Z Allen-Zhu, Y Li	FOCS 2021: Symposium on Foundations of Computer Science, 977-988, 2022	159	2022
Neon2: Finding Local Minima via First-Order Oracles	Z Allen-Zhu, Y Li	NIPS 2018: Neural Information Processing Systems, 2018	156	2018
LazySVD: Even faster SVD decomposition yet without agonizing pain	Z Allen-Zhu, Y Li	NIPS 2016: Neural Information Processing Systems, 974-982, 2016	137	2016
