Authors: Minhyung Cho, Jaehyung Lee
Where published:
NeurIPS 2017
ArXiv: 1709.09603
Document: PDF, DOI
Artifact development version: GitHub
Abstract URL: http://arxiv.org/abs/1709.09603v3
Batch Normalization (BN) has proven to be an effective technique for training deep
neural networks: it normalizes the input to each neuron, reducing
internal covariate shift. The space of weight vectors in a BN layer can
be naturally interpreted as a Riemannian manifold that is invariant to linear
scaling of the weights. Following the intrinsic geometry of this manifold yields
a new learning rule that is more efficient and easier to analyze. We also
propose intuitive and effective gradient clipping and regularization methods
for the proposed algorithm by utilizing the geometry of the manifold. The
resulting algorithm consistently outperforms the original BN on various types
of network architectures and datasets.
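Because BN's output is unchanged when a neuron's incoming weight vector is rescaled, optimization can be restricted to unit-norm weights on a sphere-like manifold. The sketch below illustrates the general idea of a Riemannian gradient step under that scale invariance: project the Euclidean gradient onto the tangent space at the current point, take a step, and retract back to the unit sphere by renormalizing. This is a minimal illustration with hypothetical names (`riemannian_sgd_step`), not the authors' exact algorithm, which additionally exploits the manifold geometry for clipping and regularization.

```python
import numpy as np

def riemannian_sgd_step(w, grad, lr):
    """One illustrative Riemannian SGD step for a scale-invariant weight vector.

    w    : current weight vector (will be normalized to the unit sphere)
    grad : Euclidean gradient of the loss at w
    lr   : learning rate
    """
    w = w / np.linalg.norm(w)
    # Project the gradient onto the tangent space at w (remove the radial
    # component, which has no effect on the BN output).
    g_tan = grad - np.dot(w, grad) * w
    # Euclidean step followed by retraction (renormalization) to the sphere.
    w_new = w - lr * g_tan
    return w_new / np.linalg.norm(w_new)
```

A quick check of the two invariants: the tangent-projected gradient is orthogonal to `w`, and the updated vector stays on the unit sphere regardless of the step taken.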