Authors: Sven Buechel, João Sedoc, H. Andrew Schwartz, Lyle Ungar
arXiv: 1810.10949
Abstract URL: http://arxiv.org/abs/1810.10949v1
Deep Learning has drastically reshaped virtually all areas of NLP. On the
downside, however, it is commonly thought to depend on vast amounts of training
data. As such, these techniques appear ill-suited for areas where annotated
data is limited, like emotion analysis, with its many nuanced and
hard-to-acquire annotation formats, or other low-data scenarios encountered in
under-resourced languages. In contrast to this popular notion, we provide
empirical evidence from three typologically diverse languages that today's
favorite neural architectures can be trained on only a few hundred
observations. Our results suggest that high-quality, pre-trained word
embeddings are
crucial for achieving high performance despite such strong data limitations.
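
The abstract's central claim, that frozen pre-trained embeddings let a small neural model learn from a few hundred annotated examples, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify an architecture, so the sketch below assumes a mean-pooled embedding regressor with a frozen embedding matrix, and the embedding weights, vocabulary size, and emotion labels are random stand-ins for real pre-trained vectors and annotations.

```python
# Minimal sketch (assumed setup, not the paper's code): a small feed-forward
# regressor over mean-pooled, frozen pre-trained word embeddings, trained on
# a few hundred examples.
import torch
import torch.nn as nn

torch.manual_seed(0)

VOCAB_SIZE, EMB_DIM, N_TRAIN, SEQ_LEN = 5000, 300, 300, 20

# Stand-in for pre-trained embeddings (in practice, loaded from e.g. fastText
# or word2vec); frozen so the few training examples only fit the small head.
pretrained = torch.randn(VOCAB_SIZE, EMB_DIM)

class EmbeddingRegressor(nn.Module):
    def __init__(self, weights: torch.Tensor, n_outputs: int = 3):
        super().__init__()
        self.emb = nn.Embedding.from_pretrained(weights, freeze=True)
        self.head = nn.Sequential(
            nn.Linear(weights.size(1), 128),
            nn.ReLU(),
            nn.Linear(128, n_outputs),  # e.g. three real-valued emotion scores
        )

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # Mean-pool word vectors into one fixed-size sentence representation.
        return self.head(self.emb(token_ids).mean(dim=1))

# Hypothetical "few hundred observations": random token ids and random labels
# standing in for real annotated emotion data.
X = torch.randint(0, VOCAB_SIZE, (N_TRAIN, SEQ_LEN))
y = torch.randn(N_TRAIN, 3)

model = EmbeddingRegressor(pretrained)
opt = torch.optim.Adam(model.head.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for epoch in range(50):
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()
```

Freezing the embedding layer keeps the number of trainable parameters small relative to the tiny training set, which is one plausible reading of why high-quality pre-trained vectors matter so much in this low-data regime.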