Authors: Robert M. Freund,Paul Grigas,Rahul Mazumder
ArXiv: 1505.04243
Document:
PDF
DOI
Abstract URL: http://arxiv.org/abs/1505.04243v1
In this paper we analyze boosting algorithms in linear regression from a new
perspective: that of modern first-order methods in convex optimization. We show
that classic boosting algorithms in linear regression, namely the incremental
forward stagewise algorithm (FS$_\varepsilon$) and least squares boosting
(LS-Boost($\varepsilon$)), can be viewed as subgradient descent to minimize the
loss function defined as the maximum absolute correlation between the features
and residuals. We also propose a modification of FS$_\varepsilon$ that yields
an algorithm for the Lasso, and that may be easily extended to an algorithm
that computes the Lasso path for different values of the regularization
parameter. Furthermore, we show that these new algorithms for the Lasso may
also be interpreted as the same master algorithm (subgradient descent), applied
to a regularized version of the maximum absolute correlation loss function. We
derive novel, comprehensive computational guarantees for several boosting
algorithms in linear regression (including LS-Boost($\varepsilon$) and
FS$_\varepsilon$) by using techniques of modern first-order methods in convex
optimization. Our computational guarantees inform us about the statistical
properties of boosting algorithms. In particular they provide, for the first
time, a precise theoretical description of the amount of data-fidelity and
regularization imparted by running a boosting algorithm with a prespecified
learning rate for a fixed but arbitrary number of iterations, for any dataset.