Authors: Arturs Backurs, Piotr Indyk, Ludwig Schmidt
Where published: NeurIPS 2017
arXiv: 1704.02958
Abstract URL: http://arxiv.org/abs/1704.02958v1
Empirical risk minimization (ERM) is ubiquitous in machine learning and
underlies most supervised learning methods. While there has been a large body
of work on algorithms for various ERM problems, the exact computational
complexity of ERM is still not understood. We address this issue for multiple
popular ERM problems including kernel SVMs, kernel ridge regression, and
training the final layer of a neural network. In particular, we give
conditional hardness results for these problems based on complexity-theoretic
assumptions such as the Strong Exponential Time Hypothesis. Under these
assumptions, we show that there are no algorithms that solve the aforementioned
ERM problems to high accuracy in sub-quadratic time. We also give similar
hardness results for computing the gradient of the empirical loss, which is the
main computational burden in many non-convex learning tasks.
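For concreteness, kernel ridge regression is one of the ERM problems referred to above; the following is a standard formulation of it (the paper's exact normalization and accuracy model are not quoted here, so treat this as an illustrative sketch). Given samples $(x_1, y_1), \dots, (x_n, y_n)$ and a kernel $k$, one solves

\[
\min_{\alpha \in \mathbb{R}^n} \; \frac{1}{n} \left\| K\alpha - y \right\|_2^2 \;+\; \lambda\, \alpha^\top K \alpha,
\qquad \text{where } K_{ij} = k(x_i, x_j).
\]

Even forming the kernel matrix $K$ already takes $\Theta(n^2)$ kernel evaluations, so the sub-quadratic barrier stated in the abstract says that, under SETH-type assumptions, one cannot substantially beat this direct approach while still solving the problem to high accuracy.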