Convergence of AdaGrad for non-convex objectives: Simple proofs and relaxed assumptions B Wang, H Zhang, Z Ma, W Chen The Thirty Sixth Annual Conference on Learning Theory, 161-190, 2023 | 39 | 2023 |
The implicit bias for adaptive optimization algorithms on homogeneous neural networks B Wang, Q Meng, W Chen, TY Liu International Conference on Machine Learning, 10849-10858, 2021 | 39 | 2021 |
Provable adaptivity of Adam under non-uniform smoothness B Wang, Y Zhang, H Zhang, Q Meng, R Sun, ZM Ma, TY Liu, ZQ Luo, ... Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and …, 2024 | 36 | 2024 |
Piecewise linear activations substantially shape the loss surfaces of neural networks F He*, B Wang*, D Tao International Conference on Learning Representations (ICLR) 2020, 2020 | 33 | 2020 |
Machine-learning nonconservative dynamics for new-physics detection Z Liu, B Wang, Q Meng, W Chen, M Tegmark, TY Liu Physical Review E 104 (5), 055302, 2021 | 29 | 2021 |
Does Momentum Change the Implicit Regularization on Separable Data? B Wang, Q Meng, H Zhang, R Sun, W Chen, ZM Ma NeurIPS 2022 | 19* | 2022 |
Creating training sets via weak indirect supervision J Zhang, B Wang, X Song, Y Wang, Y Yang, J Bai, A Ratner ICLR 2022 | 19* | 2022 |
Tighter generalization bounds for iterative differentially private learning algorithms F He*, B Wang*, D Tao Uncertainty in Artificial Intelligence (UAI) 2021, 2021 | 16 | 2021 |
Closing the gap between the upper bound and lower bound of Adam's iteration complexity B Wang, J Fu, H Zhang, N Zheng, W Chen Advances in Neural Information Processing Systems 36, 2024 | 15 | 2024 |
Robustness, privacy, and generalization of adversarial training F He, S Fu, B Wang, D Tao arXiv preprint arXiv:2012.13573, 2020 | 11 | 2020 |
On the trade-off of intra-/inter-class diversity for supervised pre-training J Zhang, B Wang, Z Hu, PWW Koh, AJ Ratner Advances in Neural Information Processing Systems 36, 2024 | 9 | 2024 |
O-GNN: incorporating ring priors into molecular modeling J Zhu, K Wu, B Wang, Y Xia, S Xie, Q Meng, L Wu, T Qin, W Zhou, H Li, ... The Eleventh International Conference on Learning Representations, 2023 | 9 | 2023 |
Optimizing Information-theoretical Generalization Bounds via Anisotropic Noise in SGLD B Wang, H Zhang, J Zhang, Q Meng, W Chen, TY Liu 35th Conference on Neural Information Processing Systems (NeurIPS 2021), 2021 | 8 | 2021 |
When and why momentum accelerates SGD: An empirical study J Fu, B Wang, H Zhang, Z Zhang, W Chen, N Zheng arXiv preprint arXiv:2306.09000, 2023 | 7 | 2023 |
Fast conditional mixing of MCMC algorithms for non-log-concave distributions X Cheng, B Wang, J Zhang, Y Zhu Advances in Neural Information Processing Systems 36, 2024 | 5 | 2024 |
Large Catapults in Momentum Gradient Descent with Warmup: An Empirical Study P Phunyaphibarn, J Lee, B Wang, H Zhang, C Yun arXiv preprint arXiv:2311.15051, 2023 | 1 | 2023 |
Towards Understanding the Riemannian SGD and SVRG Flows on Wasserstein Probabilistic Space M Yi, B Wang arXiv preprint arXiv:2401.13530, 2024 | | 2024 |
Fast conditional mixing of MCMC algorithms for non-log-concave distributions B Wang, X Cheng, J Zhang, Y Zhu Proceedings of the 37th International Conference on Neural Information …, 2023 | | 2023 |
Gradient Descent with Polyak’s Momentum Finds Flatter Minima via Large Catapults P Phunyaphibarn, J Lee, B Wang, H Zhang, C Yun High-dimensional Learning Dynamics 2024: The Emergence of Structure and … | | 2024 |