Upcoming events
Statistical analysis of overparameterized neural networks
For many years, classical learning theory suggested that neural networks with a large number of parameters would overfit their training data and thus generalize poorly to new, unseen data. Contrary to this long-held belief, the empirical success of such networks has been remarkable. However, from a mathematical perspective, the reasons behind their performance are not fully understood.
In this talk, we consider overparameterized neural networks learned by gradient descent in a statistical setting. We show that an estimator based on an overparameterized neural network, trained with a suitable step size and for an appropriate number of gradient descent steps, can be universally consistent. Furthermore, under suitable smoothness assumptions on the regression function, we derive rates of convergence for this estimator.
These results provide new insights into why overparameterized neural networks can generalize effectively despite their high complexity.
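The setting of the talk, a regression estimator obtained by running gradient descent on an overparameterized network for a fixed step size and number of steps, can be illustrated with a minimal sketch. This is purely for orientation: the width, step size, step count, activation, and toy data below are ad hoc choices and do not reflect the specific construction or scaling analyzed in the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: n samples with regression function sin(2*pi*x).
n = 50
x_train = rng.uniform(0, 1, size=(n, 1))
y_train = np.sin(2 * np.pi * x_train) + 0.1 * rng.standard_normal((n, 1))

# Overparameterized two-layer network: hidden width K far exceeds n.
K = 500
W = rng.standard_normal((1, K)) / np.sqrt(K)   # input-to-hidden weights
b = np.zeros((1, K))                           # hidden biases
a = rng.standard_normal((K, 1)) / np.sqrt(K)   # hidden-to-output weights

def forward(x):
    h = np.tanh(x @ W + b)
    return h @ a, h

train_mse_init = float(np.mean((forward(x_train)[0] - y_train) ** 2))

# Full-batch gradient descent on the empirical squared loss with a small,
# fixed step size and a fixed number of steps (both chosen ad hoc here;
# the talk's results concern how such choices must depend on n).
step_size = 0.05
num_steps = 2000
for _ in range(num_steps):
    pred, h = forward(x_train)
    resid = pred - y_train                     # (n, 1)
    grad_a = h.T @ resid / n                   # (K, 1)
    dh = (resid @ a.T) * (1 - h ** 2)          # backprop through tanh
    grad_W = x_train.T @ dh / n
    grad_b = dh.mean(axis=0, keepdims=True)
    a -= step_size * grad_a
    W -= step_size * grad_W
    b -= step_size * grad_b

train_mse_final = float(np.mean((forward(x_train)[0] - y_train) ** 2))

# Error against the true regression function on a fine grid.
x_test = np.linspace(0, 1, 200).reshape(-1, 1)
test_mse = float(np.mean((forward(x_test)[0] - np.sin(2 * np.pi * x_test)) ** 2))
print(f"train MSE: {train_mse_init:.4f} -> {train_mse_final:.4f}, "
      f"test MSE vs. true function: {test_mse:.4f}")
```

Despite having far more parameters than training points, an estimator of this kind need not overfit: the step size and number of steps act as implicit regularization, which is the phenomenon the consistency and rate results make precise.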
Selina Drews
TU Darmstadt
KIT Zentrum MathSEE
Karlsruher Institut für Technologie
Englerstraße 2
76131 Karlsruhe
Mail: MathSEE ∂ kit edu
https://www.mathsee.kit.edu/
Deep Learning Workshop (Oct 5-6, on-site at TRIANGEL.space)
Sep 18-20, 2023 at KIT
KCDS Fellows present: Elevator Pitches on PhD projects and previous scientific work + KCDS 1st Birthday Party on June 27, 2023, 13:00-14:00
Dr. John A. Warwicker (IOR), May 23, 2023, 13:00-14:00
Dr. Johannes Bracher (ECON), April 25, 2023, 13:00-14:00
Dr. Cihan Ates (ITS), March 28, 2023, 13:00-14:00
PD Dr.-Ing. Uwe Ehret (IWG), Feb 28, 2023, 13:00-14:00
KCDS X GRACE Crossover Workshop (Dec 2022, on-site)