
Training Neural Networks with Stochastic Hessian-Free Optimization

lib:439444bb8d9cb57f (v1.0.0)

Author: Ryan Kiros
ArXiv: 1301.3641
Abstract URL: http://arxiv.org/abs/1301.3641v3


Hessian-free (HF) optimization has been successfully used for training deep autoencoders and recurrent networks. HF uses the conjugate gradient algorithm to construct update directions through curvature-vector products that can be computed on the same order of time as gradients. In this paper we exploit this property and study stochastic HF with gradient and curvature mini-batches independent of the dataset size. We modify Martens' HF for these settings and integrate dropout, a method for preventing co-adaptation of feature detectors, to guard against overfitting. Stochastic Hessian-free optimization gives an intermediary between SGD and HF that achieves competitive performance on both classification and deep autoencoder experiments.
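To make the property the abstract exploits concrete, below is a minimal sketch (not the paper's implementation) of the conjugate-gradient inner loop HF uses to build an update direction from curvature-vector products. The function name `curvature_vector_product`, the damping constant, and the toy quadratic are illustrative assumptions; in stochastic HF the product would be evaluated implicitly on a curvature mini-batch at roughly the cost of a gradient.

```python
import numpy as np

def hf_update_direction(curvature_vector_product, grad,
                        damping=1e-2, max_iters=50, tol=1e-6):
    """Approximately solve (B + damping * I) d = -grad with conjugate gradient.

    curvature_vector_product(v) must return B @ v, where B is a
    positive semi-definite curvature matrix (e.g. a Gauss-Newton matrix).
    """
    d = np.zeros_like(grad)
    r = -grad.copy()                  # residual b - A d, with d = 0 and b = -grad
    p = r.copy()
    rs_old = r @ r
    for _ in range(max_iters):
        Ap = curvature_vector_product(p) + damping * p
        alpha = rs_old / (p @ Ap)     # step length along the search direction
        d += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:     # residual small enough: stop early
            break
        p = r + (rs_new / rs_old) * p # new conjugate search direction
        rs_old = rs_new
    return d

# Toy usage on a quadratic: B is an explicit PSD matrix here, whereas in
# stochastic HF it would only be accessed through matrix-vector products.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 20))
B = A @ A.T                           # positive semi-definite curvature
grad = rng.standard_normal(20)
d = hf_update_direction(lambda v: B @ v, grad)
print(np.linalg.norm((B + 1e-2 * np.eye(20)) @ d + grad))  # ~0 if CG converged
```

Truncating this CG loop after only a few iterations, and computing both the gradient and the curvature products on independent mini-batches, is what places the method between SGD and full-batch HF.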

Relevant initiatives

Related knowledge about this paper:
- Reproduced results (crowd-benchmarking and competitions)
- Artifact and reproducibility checklists
- Common formats for research projects and shared artifacts
- Reproducibility initiatives
