Volume 14, Issue 1

Global convergence of SGD on two layer neural nets

Publication type: Journal Article
Publication date: 2025-01-15
SCImago: Q1
Web of Science: Q2
SJR: 1.220
CiteScore: 3.7
Impact factor: 1.6
ISSN: 2049-8772, 2049-8764
Abstract

In this note, we consider the appropriately regularized $\ell_{2}$-empirical risk of depth-$2$ nets with any number of gates, and we show bounds on how the empirical loss evolves along Stochastic Gradient Descent (SGD) iterates on it, for arbitrary data, provided the activation is adequately smooth and bounded, like sigmoid and tanh. This, in turn, leads to a proof of global convergence of SGD for a special class of initializations. We also prove an exponentially fast convergence rate for continuous-time SGD, which additionally applies to smooth unbounded activations like SoftPlus. Our key idea is to show the existence of Frobenius-norm-regularized loss functions on constant-sized neural nets that are ‘Villani functions’, which lets us build on recent progress in analyzing SGD on such objectives. Most critically, the amount of regularization required for our analysis is independent of the size of the net.
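To make the object of study concrete, here is a minimal, illustrative sketch of single-sample SGD on a Frobenius-norm-regularized $\ell_2$ empirical risk of a depth-2 sigmoid net. All dimensions, the step size, the regularization strength `lam`, and the teacher-generated data are arbitrary choices for illustration, not values or an algorithm taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative depth-2 net with p gates: f(x) = a^T sigmoid(W x).
d, p, n = 5, 8, 200                      # input dim, gates, samples (arbitrary)
X = rng.normal(size=(n, d))
W_true = rng.normal(size=(p, d))         # teacher net, so the target is learnable
a_true = rng.normal(size=p)
y = sigmoid(X @ W_true.T) @ a_true

W = 0.1 * rng.normal(size=(p, d))        # student parameters, small init
a = 0.1 * rng.normal(size=p)
lam = 0.1                                # regularization strength (illustrative)
eta = 0.01                               # step size (illustrative)

def loss(W, a):
    # Frobenius-norm-regularized l2 empirical risk
    pred = sigmoid(X @ W.T) @ a
    return 0.5 * np.mean((pred - y) ** 2) + 0.5 * lam * (np.sum(W ** 2) + np.sum(a ** 2))

losses = []
for t in range(2000):
    i = rng.integers(n)                  # sample one data point
    x_i, y_i = X[i], y[i]
    h = sigmoid(W @ x_i)                 # gate outputs
    r = h @ a - y_i                      # residual
    grad_a = r * h + lam * a             # gradient of the regularized per-sample loss
    grad_W = np.outer(r * a * h * (1.0 - h), x_i) + lam * W
    a -= eta * grad_a
    W -= eta * grad_W
    losses.append(loss(W, a))
```

Tracking `losses` over the run shows the regularized empirical loss decreasing on average, which is the quantity whose evolution the paper bounds.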

Cite this

GOST:
Gopalani P., Mukherjee A. Global convergence of SGD on two layer neural nets // Information and Inference. 2025. Vol. 14. No. 1.
RIS:
TY - JOUR
DO - 10.1093/imaiai/iaae035
UR - https://academic.oup.com/imaiai/article/doi/10.1093/imaiai/iaae035/7960061
TI - Global convergence of SGD on two layer neural nets
T2 - Information and Inference
AU - Gopalani, Pulkit
AU - Mukherjee, Anirbit
PY - 2025
DA - 2025/01/15
PB - Oxford University Press
IS - 1
VL - 14
SN - 2049-8772
SN - 2049-8764
ER -
BibTeX:
@article{2025_Gopalani,
author = {Pulkit Gopalani and Anirbit Mukherjee},
title = {Global convergence of SGD on two layer neural nets},
journal = {Information and Inference},
year = {2025},
volume = {14},
publisher = {Oxford University Press},
month = {jan},
url = {https://academic.oup.com/imaiai/article/doi/10.1093/imaiai/iaae035/7960061},
number = {1},
doi = {10.1093/imaiai/iaae035}
}