Authors: Xiaojie Jin, Yunpeng Chen, Jiashi Feng, Zequn Jie, Shuicheng Yan
ArXiv: 1608.07706
Abstract URL: http://arxiv.org/abs/1608.07706v3
In this paper, we consider the scene parsing problem and propose a novel
Multi-Path Feedback recurrent neural network (MPF-RNN) for parsing scene
images. MPF-RNN can enhance the capability of RNNs in modeling long-range
context information at multiple levels and better distinguish pixels that are
easy to confuse. Unlike feedforward CNNs and RNNs with only a single
feedback connection, MPF-RNN propagates the contextual features learned at the
top layer through multiple weighted recurrent connections to learn bottom
features. To better train MPF-RNN, we propose a new strategy that considers the
accumulative loss at multiple recurrent steps to improve the performance of
MPF-RNN on parsing small objects. With these two novel components, MPF-RNN
achieves significant improvement over strong baselines (VGG16 and Res101) on
five challenging scene parsing benchmarks, including the traditional SiftFlow,
Barcelona, CamVid, and Stanford Background as well as the recently released
large-scale ADE20K.
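
The two ideas named in the abstract, feedback from the top layer into several lower layers and a loss accumulated over recurrent steps, can be illustrated with a toy NumPy sketch. The abstract does not specify the architecture, so everything here is an assumption made for illustration: the three dense layers standing in for a CNN, the feedback matrices `F1` and `F2`, the per-step weights, and the names `make_params` and `mpf_forward` are all hypothetical and are not the authors' implementation.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def softmax_cross_entropy(logits, labels):
    """Mean cross-entropy for logits of shape (N, C) and integer labels (N,)."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()


def make_params(rng, in_dim, hidden, num_classes):
    """Hypothetical toy parameters; dense layers stand in for conv layers."""
    def init(rows, cols):
        return rng.normal(scale=0.1, size=(rows, cols))
    return {
        "W1": init(in_dim, hidden),       # bottom layer
        "W2": init(hidden, hidden),       # middle layer
        "W3": init(hidden, hidden),       # top layer
        "Wc": init(hidden, num_classes),  # per-pixel classifier
        "F1": init(hidden, hidden),       # feedback path: top -> bottom layer
        "F2": init(hidden, hidden),       # feedback path: top -> middle layer
    }


def mpf_forward(x, labels, params, num_steps=3, step_weights=None):
    """Unroll the toy network and accumulate the loss over all steps.

    x: (N, in_dim) per-pixel features, labels: (N,) class indices.
    Returns (accumulated_loss, list_of_per_step_losses).
    """
    if step_weights is None:
        step_weights = [1.0] * num_steps  # assumption: equal weighting
    # Top-layer context is zero before the first step, so step 0 is a
    # plain feedforward pass.
    top = np.zeros((x.shape[0], params["W3"].shape[1]))
    total_loss, step_losses = 0.0, []
    for t in range(num_steps):
        # Multiple feedback paths: the previous top features feed into
        # both lower layers through separate weight matrices.
        h1 = relu(x @ params["W1"] + top @ params["F1"])
        h2 = relu(h1 @ params["W2"] + top @ params["F2"])
        top = relu(h2 @ params["W3"])
        loss_t = softmax_cross_entropy(top @ params["Wc"], labels)
        step_losses.append(loss_t)
        total_loss += step_weights[t] * loss_t  # accumulative loss
    return total_loss, step_losses


rng = np.random.default_rng(0)
params = make_params(rng, in_dim=8, hidden=16, num_classes=5)
x = rng.normal(size=(10, 8))
labels = rng.integers(0, 5, size=10)
total, per_step = mpf_forward(x, labels, params, num_steps=3)
print(len(per_step), float(total))
```

In this sketch the training signal comes from every recurrent step rather than only the last one, which is one plausible reading of "accumulative loss at multiple recurrent steps"; the actual weighting and layer choices used in the paper may differ.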