Jingfeng Wu
Title · Cited by · Year
The Anisotropic Noise in Stochastic Gradient Descent: Its Behavior of Escaping from Sharp Minima and Regularization Effects
Z Zhu, J Wu, B Yu, L Wu, J Ma
International Conference on Machine Learning, 7654-7663, 2019
Cited by 289*, 2019
On the Noisy Gradient Descent that Generalizes as SGD
J Wu, W Hu, H Xiong, J Huan, V Braverman, Z Zhu
International Conference on Machine Learning, 10367-10376, 2020
Cited by 117, 2020
Programmable packet scheduling with a single queue
Z Yu, C Hu, J Wu, X Sun, V Braverman, M Chowdhury, Z Liu, X Jin
Proceedings of the 2021 ACM SIGCOMM Conference, 179-193, 2021
Cited by 93, 2021
Benign overfitting of constant-stepsize SGD for linear regression
D Zou, J Wu, V Braverman, Q Gu, SM Kakade
Journal of Machine Learning Research 24 (326), 1-58, 2023
Cited by 70, 2023
How Many Pretraining Tasks Are Needed for In-Context Learning of Linear Regression?
J Wu, D Zou, Z Chen, V Braverman, Q Gu, PL Bartlett
arXiv preprint arXiv:2310.08391, 2023
Cited by 53, 2023
Twenty years after: Hierarchical Core-Stateless fair queueing
Z Yu, J Wu, V Braverman, I Stoica, X Jin
18th USENIX Symposium on Networked Systems Design and Implementation (NSDI …, 2021
Cited by 41, 2021
Direction matters: On the implicit bias of stochastic gradient descent with moderate learning rate
J Wu, D Zou, V Braverman, Q Gu
International Conference on Learning Representations, 2021
Cited by 39, 2021
Tangent-normal adversarial regularization for semi-supervised learning
B Yu, J Wu, J Ma, Z Zhu
Proceedings of the IEEE Conference on Computer Vision and Pattern …, 2019
Cited by 37, 2019
The benefits of implicit regularization from SGD in least squares problems
D Zou, J Wu, V Braverman, Q Gu, DP Foster, S Kakade
Advances in neural information processing systems 34, 5456-5468, 2021
Cited by 32, 2021
Last iterate risk bounds of SGD with decaying stepsize for overparameterized linear regression
J Wu, D Zou, V Braverman, Q Gu, S Kakade
International Conference on Machine Learning, 24280-24314, 2022
Cited by 25, 2022
Ship compute or ship data? Why not both?
J You, J Wu, X Jin, M Chowdhury
18th USENIX Symposium on Networked Systems Design and Implementation (NSDI …, 2021
Cited by 25, 2021
The power and limitation of pretraining-finetuning for linear regression under covariate shift
J Wu, D Zou, V Braverman, Q Gu, S Kakade
Advances in Neural Information Processing Systems 35, 33041-33053, 2022
Cited by 18, 2022
Gap-dependent unsupervised exploration for reinforcement learning
J Wu, V Braverman, L Yang
International Conference on Artificial Intelligence and Statistics, 4109-4131, 2022
Cited by 17, 2022
Lifelong learning with sketched structural regularization
H Li, A Krishnan, J Wu, S Kolouri, PK Pilly, V Braverman
Asian conference on machine learning, 985-1000, 2021
Cited by 17, 2021
Accommodating picky customers: Regret bound and exploration complexity for multi-objective reinforcement learning
J Wu, V Braverman, L Yang
Advances in Neural Information Processing Systems 34, 13112-13124, 2021
Cited by 16, 2021
Implicit bias of gradient descent for logistic regression at the edge of stability
J Wu, V Braverman, JD Lee
Advances in Neural Information Processing Systems 36, 2024
Cited by 15, 2024
A collective AI via lifelong learning and sharing at the edge
A Soltoggio, E Ben-Iwhiwhu, V Braverman, E Eaton, B Epstein, Y Ge, ...
Nature Machine Intelligence 6 (3), 251-264, 2024
Cited by 11, 2024
In-context learning of a linear Transformer block: benefits of the MLP component and one-step GD initialization
R Zhang, J Wu, PL Bartlett
arXiv preprint arXiv:2402.14951, 2024
Cited by 11, 2024
Fixed design analysis of regularization-based continual learning
H Li, J Wu, V Braverman
Conference on Lifelong Learning Agents, 513-533, 2023
Cited by 8, 2023
Risk bounds of multi-pass SGD for least squares in the interpolation regime
D Zou, J Wu, V Braverman, Q Gu, S Kakade
Advances in Neural Information Processing Systems 35, 12909-12920, 2022
Cited by 8, 2022
Articles 1–20