Authors: Krassimir Valev, Arne Schumann, Lars Sommer, Jürgen Beyerer
ArXiv: 1806.02987
Abstract URL: http://arxiv.org/abs/1806.02987v1
Fine-grained vehicle classification is the task of classifying the make, model,
and year of a vehicle. This is a very challenging task, because vehicles of
different types but similar color and viewpoint can often look much more
similar than vehicles of the same type but differing color and viewpoint. Vehicle
make, model, and year, in combination with vehicle color, are of importance
in several applications such as vehicle search, re-identification, tracking,
and traffic analysis. In this work we investigate the suitability of several
recent landmark convolutional neural network (CNN) architectures, which have
shown top results on large scale image classification tasks, for the task of
fine-grained classification of vehicles. We compare the performance of the
networks VGG16, several ResNets, Inception architectures, the recent DenseNets,
and MobileNet. For classification we use the Stanford Cars-196 dataset, which
features 196 different types of vehicles. We investigate several aspects of CNN
training, such as data augmentation and training from scratch vs. fine-tuning.
Importantly, we introduce no aspects in the architectures or training process
which are specific to vehicle classification. Our final model achieves a
state-of-the-art classification accuracy of 94.6%, outperforming all related
works, even approaches that are specifically tailored for the task, e.g. by
including vehicle part detections.
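
As a rough illustration of the reported metric, the sketch below computes
classification accuracy as the fraction of images whose highest-scoring class
matches the true label. It assumes the reported figure is top-1 accuracy; the
function name `top1_accuracy` and the input layout (one list of per-class
scores per image) are hypothetical choices made for this example and are not
taken from the paper.

```python
from typing import Sequence

NUM_CLASSES = 196  # number of vehicle types in Stanford Cars-196


def top1_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Return the fraction of samples whose arg-max class equals the label.

    scores: one row of per-class scores per image (hypothetical layout).
    labels: the true class index for each image.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("need at least one sample")

    correct = 0
    for row, label in zip(scores, labels):
        # Index of the highest score is the predicted class.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return correct / len(labels)
```

With real model outputs, each row would hold `NUM_CLASSES` scores; the function
itself works for any number of classes.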