Kaifeng Lyu

Cited by

	All	Since 2019
Citations	933	914
h-index	12	12
i10-index	12	12

Citations per year: 2018: 16, 2019: 33, 2020: 89, 2021: 132, 2022: 221, 2023: 294, 2024: 142

Public access (based on funding mandates)

7 articles available
0 articles not available

Co-authors

Zhiyuan Li	Assistant Professor, Toyota Technological Institute at Chicago	Verified email at ttic.edu
Sanjeev Arora	Professor of Computer Science, Princeton University	Verified email at cs.princeton.edu
Jian Li	Tsinghua University	Verified email at mail.tsinghua.edu.cn
Runzhe Wang	Princeton University	Verified email at princeton.edu
Yuping Luo	Computer Science Department, Princeton University	Verified email at cs.princeton.edu
Jikai Jin	Stanford University	Verified email at stanford.edu
Simon Shaolei Du	Assistant Professor, School of Computer Science and Engineering, University of Washington	Verified email at cs.washington.edu
Sadhika Malladi	Princeton University	Verified email at princeton.edu
Xinran Gu	Tsinghua University	Verified email at mails.tsinghua.edu.cn
Abhishek Panigrahi	Graduate Student, Princeton University	Verified email at princeton.edu
Sanjiv Kumar	Google Fellow, VP, Google Research	Verified email at google.com
Guy Rothblum	Weizmann Institute of Science	Verified email at alum.mit.edu
Lijie Chen	UC Berkeley	Verified email at berkeley.edu
Longbo Huang	Professor, IIIS @ Tsinghua University, China, ACM Distinguished Scientist	Verified email at tsinghua.edu.cn
Jason D. Lee	Associate Professor of Electrical Engineering and Computer Science, Princeton University	Verified email at princeton.edu
Yongchao Zhou	University of Toronto	Verified email at mail.utoronto.ca
Ankit Singh Rawat	Research Scientist, Google	Verified email at google.com
Dingli Yu	Princeton University	Verified email at cs.princeton.edu
Nikunj Saunshi	Research Scientist, Google	Verified email at google.com
Wei Hu	Assistant Professor of Computer Science and Engineering, University of Michigan	Verified email at umich.edu

Kaifeng Lyu

Princeton University

Verified email at princeton.edu


Title	Authors	Venue	Cited by	Year
Gradient Descent Maximizes the Margin of Homogeneous Neural Networks	K Lyu, J Li	2020 International Conference on Learning Representations (ICLR 2020), 2020	286	2020
Theoretical analysis of auto rate-tuning by batch normalization	S Arora, Z Li, K Lyu	2019 International Conference on Learning Representations (ICLR 2019), 2019	118	2019
Towards Resolving the Implicit Bias of Gradient Descent for Matrix Factorization: Greedy Low-Rank Learning	Z Li, Y Luo, K Lyu	2021 International Conference on Learning Representations (ICLR 2021), 2021	110	2021
Learning gradient descent: Better generalization and longer horizons	K Lv, S Jiang, J Li	34th International Conference on Machine Learning (ICML 2017) 70, 2247-2255, 2017	110	2017
Gradient Descent on Two-layer Nets: Margin Maximization and Simplicity Bias	K Lyu, Z Li, R Wang, S Arora	35th Conference on Neural Information Processing Systems (NeurIPS 2021), 2021	62	2021
Reconciling Modern Deep Learning with Traditional Optimization Analyses: The Intrinsic Learning Rate	Z Li, K Lyu, S Arora	34th Conference on Neural Information Processing Systems (NeurIPS 2020), 2020	59	2020
Understanding the generalization benefit of normalization layers: Sharpness reduction	K Lyu, Z Li, S Arora	36th Conference on Neural Information Processing Systems (NeurIPS 2022), 2022	53	2022
Fine-grained complexity meets IP = PSPACE	L Chen, S Goldwasser, K Lyu, GN Rothblum, A Rubinstein	30th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2019), 1-20, 2019	37	2019
Understanding incremental learning of gradient descent: A fine-grained analysis of matrix sensing	J Jin, Z Li, K Lyu, SS Du, JD Lee	International Conference on Machine Learning (ICML 2023), 15200-15238, 2023	23	2023
On the SDEs and Scaling Rules for Adaptive Gradient Algorithms	S Malladi, K Lyu, A Panigrahi, S Arora	36th Conference on Neural Information Processing Systems (NeurIPS 2022), 2022	23	2022
DistillSpec: Improving speculative decoding via knowledge distillation	Y Zhou, K Lyu, AS Rawat, AK Menon, A Rostamizadeh, S Kumar, JF Kagy, ...	2024 International Conference on Learning Representations (ICLR 2024), 2023	18	2023
Why (and When) does Local SGD Generalize Better than SGD?	X Gu, K Lyu, L Huang, S Arora	2023 International Conference on Learning Representations (ICLR 2023), 2023	13	2023
Dichotomy of Early and Late Phase Implicit Biases Can Provably Induce Grokking	K Lyu, J Jin, Z Li, SS Du, JD Lee, W Hu	2024 International Conference on Learning Representations (ICLR 2024), 2023	7	2023
Single-Source Bottleneck Path Algorithm Faster than Sorting for Sparse Graphs	R Duan, K Lyu, H Wu, Y Xie	45th International Colloquium on Automata, Languages, and Programming (ICALP …, 2018	6	2018
New Definitions and Evaluations for Saliency Methods: Staying Intrinsic, Complete and Sound	A Gupta, N Saunshi, D Yu, K Lyu, S Arora	36th Conference on Neural Information Processing Systems (NeurIPS 2022), 2022	5	2022
The marginal value of momentum for small learning rate SGD	R Wang, S Malladi, T Wang, K Lyu, Z Li	2024 International Conference on Learning Representations (ICLR 2024), 2023	3	2023
RNNs are not Transformers (Yet): The Key Bottleneck on In-context Retrieval	K Wen, X Dang, K Lyu	arXiv preprint arXiv:2402.18510, 2024		2024
Keeping LLMs Aligned After Fine-tuning: The Crucial Role of Prompt Templates	K Lyu, H Zhao, X Gu, D Yu, A Goyal, S Arora	arXiv preprint arXiv:2402.18540, 2024		2024
Efficient Stagewise Pretraining via Progressive Subnetworks	A Panigrahi, N Saunshi, K Lyu, S Miryoosefi, S Reddi, S Kale, S Kumar	arXiv preprint arXiv:2402.05913, 2024		2024
A Quadratic Synchronization Rule for Distributed Deep Learning	X Gu, K Lyu, S Arora, J Zhang, L Huang	2024 International Conference on Learning Representations (ICLR 2024), 2023		2023
