This paper presents a new semi-supervised framework with convolutional neural
networks (CNNs) for text categorization. Unlike the previous approaches that
rely on word embeddings, our method learns embeddings of small text regions
from unlabeled data for integration into a supervised CNN. The proposed scheme
for embedding learning is based on the idea of two-view semi-supervised
learning, which is intended to be useful for the task of interest even though
the training is done on unlabeled data. Our models achieve better results than
previous approaches on sentiment classification and topic classification tasks.