Authors: Siyuan Qi, Yixin Zhu, Siyuan Huang, Chenfanfu Jiang, Song-Chun Zhu
Where published:
CVPR 2018
ArXiv: 1808.08473
Document:
PDF
DOI
Artifact development version:
GitHub
Abstract URL: http://arxiv.org/abs/1808.08473v1
We present a human-centric method to sample and synthesize 3D room layouts
and 2D images thereof, to obtain large-scale 2D/3D image data with perfect
per-pixel ground truth. An attributed spatial And-Or graph (S-AOG) is proposed
to represent indoor scenes. The S-AOG is a probabilistic grammar model, in
which the terminal nodes are object entities. Human contexts as contextual
relations are encoded by Markov Random Fields (MRF) on the terminal nodes. We
learn the distributions from an indoor scene dataset and sample new layouts
using Markov chain Monte Carlo (MCMC). Experiments demonstrate that our method can
robustly sample a large variety of realistic room layouts based on three
criteria: (i) visual realism compared with a state-of-the-art room arrangement
method, (ii) accuracy of the affordance maps with respect to ground truth, and
(iii) the functionality and naturalness of synthesized rooms evaluated by human
subjects. The code is available at
https://github.com/SiyuanQi/human-centric-scene-synthesis.
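The abstract describes sampling room layouts by running MCMC over a layout whose quality is scored by MRF-style contextual potentials. As a rough illustration of that sampling loop only (not the paper's S-AOG model), the sketch below runs Metropolis-Hastings over hypothetical 2D object positions with a toy energy; the `energy` function, room size, and step parameters are all invented stand-ins for the learned potentials in the paper.

```python
import math
import random

def energy(layout, room_size=5.0):
    """Toy energy: penalize out-of-bounds objects and near-overlaps.
    This is only a stand-in for the paper's learned MRF potentials
    over human-context relations."""
    e = 0.0
    for (x, y) in layout:
        if not (0.0 <= x <= room_size and 0.0 <= y <= room_size):
            e += 10.0  # out-of-bounds penalty
    for i in range(len(layout)):
        for j in range(i + 1, len(layout)):
            d = math.hypot(layout[i][0] - layout[j][0],
                           layout[i][1] - layout[j][1])
            if d < 1.0:  # objects closer than 1 unit are penalized
                e += (1.0 - d) ** 2
    return e

def metropolis_sample(n_objects=4, n_steps=2000, step=0.3, seed=0):
    """Metropolis-Hastings over object (x, y) positions."""
    rng = random.Random(seed)
    layout = [(rng.uniform(0, 5), rng.uniform(0, 5))
              for _ in range(n_objects)]
    e = energy(layout)
    for _ in range(n_steps):
        # Propose a Gaussian perturbation of one random object.
        i = rng.randrange(n_objects)
        proposal = list(layout)
        x, y = proposal[i]
        proposal[i] = (x + rng.gauss(0, step), y + rng.gauss(0, step))
        e_new = energy(proposal)
        # Accept with probability min(1, exp(-(E_new - E_old))).
        if e_new < e or rng.random() < math.exp(-(e_new - e)):
            layout, e = proposal, e_new
    return layout, e

final_layout, final_energy = metropolis_sample()
```

The accept/reject rule is the standard Metropolis criterion for a Boltzmann distribution exp(-E); the paper's actual sampler operates on full S-AOG parse graphs rather than raw positions.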