Authors: RĂ©mi Flamary,Alain Rakotomamonjy,Gilles Gasso
ArXiv: 1606.07286
Document:
PDF
DOI
Abstract URL: http://arxiv.org/abs/1606.07286v1
As the number of samples and dimensionality of optimization problems related
to statistics an machine learning explode, block coordinate descent algorithms
have gained popularity since they reduce the original problem to several
smaller ones. Coordinates to be optimized are usually selected randomly
according to a given probability distribution. We introduce an importance
sampling strategy that helps randomized coordinate descent algorithms to focus
on blocks that are still far from convergence. The framework applies to
problems composed of the sum of two possibly non-convex terms, one being
separable and non-smooth. We have compared our algorithm to a full gradient
proximal approach as well as to a randomized block coordinate algorithm that
considers uniform sampling and cyclic block coordinate descent. Experimental
evidences show the clear benefit of using an importance sampling strategy.