Abstract:
This paper establishes a precise high-dimensional asymptotic theory for Boosting on separable data, taking both statistical and computational perspectives. We consider the setting where the number of features (weak learners) $p$ scales with the sample size $n$, in an over-parametrized regime. On the statistical front, we provide an exact analysis of the generalization error of Boosting when the algorithm interpolates the training data and maximizes an empirical $L_1$ margin. The angle between the Boosting solution and the ground truth is characterized explicitly. On the computational front, we provide a sharp analysis of the stopping time when Boosting approximately maximizes the empirical $L_1$ margin. Furthermore, we discover that the larger the margin, the smaller the proportion of active features (with zero initialization). At the heart of our theory lies a detailed study of the maximum $L_1$ margin, using tools from convex geometry. The maximum $L_1$ margin can be precisely described by a new system of non-linear equations, which we study using a novel uniform deviation argument. Preliminary numerical results are presented to demonstrate the accuracy of our theory.
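On separable data, the maximum empirical $L_1$ margin studied here, $\max_{\|\theta\|_1 \le 1} \min_i y_i \langle x_i, \theta \rangle$, can be computed as a linear program by splitting $\theta$ into positive and negative parts. The following is a minimal illustrative sketch (not code from the paper), using `scipy.optimize.linprog`; the function name and interface are our own choices for illustration.

```python
import numpy as np
from scipy.optimize import linprog

def max_l1_margin(X, y):
    """Compute max_{||theta||_1 <= 1} min_i y_i <x_i, theta> via an LP.

    Decision variables: [theta_plus (p), theta_minus (p), kappa],
    with theta = theta_plus - theta_minus, theta_plus, theta_minus >= 0.
    """
    n, p = X.shape
    # linprog minimizes, so minimize -kappa to maximize the margin kappa.
    c = np.zeros(2 * p + 1)
    c[-1] = -1.0
    # Margin constraints: kappa - y_i x_i^T (theta_plus - theta_minus) <= 0.
    Z = y[:, None] * X
    A_ub = np.hstack([-Z, Z, np.ones((n, 1))])
    b_ub = np.zeros(n)
    # L1 budget: sum(theta_plus) + sum(theta_minus) = 1.
    A_eq = np.hstack([np.ones((1, 2 * p)), np.zeros((1, 1))])
    b_eq = np.array([1.0])
    bounds = [(0, None)] * (2 * p) + [(None, None)]  # kappa is free
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    theta = res.x[:p] - res.x[p:2 * p]
    return res.x[-1], theta
```

For example, two points $x_1 = (2, 0)$ with $y_1 = +1$ and $x_2 = (-2, 0)$ with $y_2 = -1$ are separated by $\theta = (1, 0)$ with margin $2$.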
Citation
Tengyuan Liang and Pragya Sur. 2022. “A Precise High-Dimensional Asymptotic Theory for Boosting and Minimum-L1-Norm Interpolated Classifiers.” The Annals of Statistics, 50 (3): 1669–1695.
@article{LiangSur2022,
title = {A Precise High-Dimensional Asymptotic Theory for Boosting and Minimum-$\ell_1$-Norm Interpolated Classifiers},
author = {Liang, Tengyuan and Sur, Pragya},
journal = {The Annals of Statistics},
volume = {50},
number = {3},
pages = {1669--1695},
year = {2022},
month = jun,
publisher = {Institute of Mathematical Statistics},
issn = {0090-5364},
doi = {10.1214/22-AOS2170},
url = {https://doi.org/10.1214/22-AOS2170},
}