Authors: Marjolein Troost, Katja Seeliger, Marcel van Gerven
arXiv: 1802.03488
Abstract URL: http://arxiv.org/abs/1802.03488v1
An important issue in neural network research is how to choose the number of
nodes and layers needed to solve a classification problem. We provide new
intuitions, building on earlier results by An et al. (2015), by deriving an
upper bound on the number of nodes in networks with two hidden layers such
that linear separability can be achieved. Concretely, we show that if the data
can be described in terms of N finite sets and the activation function f is
non-constant, increasing, and has a left asymptote, we can derive how many
nodes are needed to linearly separate these sets. This yields an upper bound
that depends on the structure of the data; this structure can be analyzed
using an algorithm. For the leaky rectified linear activation function, we
prove separately that, under certain conditions on the slope, the same number
of layers and nodes as for the aforementioned activation functions is
sufficient. We empirically validate our claims.
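
As an illustrative aside (not from the paper itself), the stated conditions on the activation function can be probed numerically. The minimal Python sketch below checks whether a candidate f is non-constant, (weakly) increasing, and has a finite left asymptote on a coarse grid; the grid bounds, tolerance, and the leaky slope of 0.01 are assumptions chosen for illustration. Note that a leaky ReLU with strictly positive slope has no left asymptote, which is consistent with the abstract treating it as a separate case.

```python
import numpy as np

def satisfies_conditions(f, lo=-700.0, hi=50.0, n=10_000, tol=1e-6):
    """Crudely probe the abstract's conditions on an activation f.

    Checks, on a finite grid, that f is non-constant, (weakly)
    increasing, and roughly flat far to the left (a proxy for a
    finite left asymptote). Bounds and tolerance are illustrative.
    """
    x = np.linspace(lo, hi, n)
    y = f(x)
    non_constant = np.ptp(y) > tol            # f takes more than one value
    increasing = np.all(np.diff(y) >= -tol)   # f is weakly increasing
    # Proxy for a left asymptote: f barely changes far to the left.
    left_asymptote = abs(f(lo) - f(lo / 2)) < tol
    return bool(non_constant and increasing and left_asymptote)

sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
relu = lambda x: np.maximum(0.0, x)
leaky_relu = lambda x, a=0.01: np.where(x >= 0, x, a * x)

print(satisfies_conditions(sigmoid))     # True: left asymptote at 0
print(satisfies_conditions(relu))        # True: left asymptote at 0
print(satisfies_conditions(leaky_relu))  # False: no left asymptote for slope > 0
```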