Authors: Ivo Gonçalves, Sara Silva, Carlos M. Fonseca, Mauro Castelli
arXiv: 1706.06195
Abstract URL: http://arxiv.org/abs/1706.06195v1
In iterative supervised learning algorithms it is common to reach a point in
the search where no further induction seems to be possible with the available
data. If the search is continued beyond this point, the risk of overfitting
increases significantly. Following the recent developments in inductive
semantic stochastic methods, this paper studies the feasibility of using
information gathered from the semantic neighborhood to decide when to stop the
search. Two semantic stopping criteria are proposed and experimentally assessed
in Geometric Semantic Genetic Programming (GSGP) and in the Semantic Learning
Machine (SLM) algorithm (the equivalent algorithm for neural networks). The
experiments are performed on real-world high-dimensional regression datasets.
The results show that the proposed semantic stopping criteria detect
stopping points that result in competitive generalization for both
GSGP and SLM. This approach also yields computationally efficient algorithms as
it allows the evolution of neural networks in less than 3 seconds on average,
and of GP trees in at most 10 seconds. The use of the proposed semantic
stopping criteria, in conjunction with the computation of optimal
mutation/learning steps, also results in small trees and neural networks.
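
The abstract does not spell out the criteria's exact definitions, so the following Python sketch is only an illustration under stated assumptions: a geometric semantic mutation of the form child = parent + m·r (with r the semantics of a random tree or neuron), the closed-form optimal mutation/learning step m = ⟨e, r⟩ / ⟨r, r⟩ that minimizes squared training error (e being the residual vector), and a hypothetical stopping rule that halts once sampled semantic neighbors stop improving the training error. Names such as `run_with_semantic_stop` and `sample_perturbation` are invented for illustration and are not from the paper.

```python
import numpy as np

def optimal_mutation_step(errors, r):
    """Closed-form least-squares step m minimizing ||errors - m * r||^2,
    i.e. the training-optimal coefficient for the perturbation r."""
    denom = float(np.dot(r, r))
    return float(np.dot(errors, r)) / denom if denom > 0.0 else 0.0

def run_with_semantic_stop(target, parent_sem, sample_perturbation,
                           max_iters=2000, patience=5, tol=1e-8):
    """Applies mutations child = parent + m * r and stops once the semantic
    neighborhood yields (almost) no training improvement for `patience`
    consecutive attempts -- a stand-in for a semantic stopping criterion."""
    rng = np.random.default_rng(0)
    stagnant = 0
    for it in range(max_iters):
        errors = target - parent_sem
        r = sample_perturbation(rng)      # semantics of a random tree / neuron
        m = optimal_mutation_step(errors, r)
        child_sem = parent_sem + m * r
        improvement = (np.sqrt(np.mean(errors ** 2))
                       - np.sqrt(np.mean((target - child_sem) ** 2)))
        if improvement < tol:
            stagnant += 1
            if stagnant >= patience:      # semantic stopping point reached
                return parent_sem, it
        else:
            stagnant = 0
            parent_sem = child_sem
    return parent_sem, max_iters

# Toy usage on synthetic regression data.
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 5))
target = X @ rng.normal(size=5)
final_sem, iters = run_with_semantic_stop(
    target, np.zeros(100),
    lambda g: np.tanh(X @ g.normal(size=5)))
print("stopped after", iters, "iterations, train RMSE =",
      np.sqrt(np.mean((target - final_sem) ** 2)))
```

Because the step is training-optimal, each accepted mutation cannot increase the training error, so the stagnation test effectively measures how much improvement the semantic neighborhood can still offer, which is the general idea the abstract attributes to the semantic stopping criteria.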