Authors: Imme Ebert-Uphoff, Yi Deng
ArXiv: 1512.08279
Abstract URL: http://arxiv.org/abs/1512.08279v1
Causal discovery algorithms based on probabilistic graphical models have
emerged in geoscience applications for the identification and visualization of
dynamical processes. The key idea is to learn the structure of a graphical
model from observed spatio-temporal data, which indicates information flow,
thus pathways of interactions, in the observed physical system. Studying those
pathways allows geoscientists to learn subtle details about the underlying
dynamical mechanisms governing our planet. Initial studies using this approach
on real-world atmospheric data have shown great potential for scientific
discovery. However, no ground truth was available in these initial studies, so
the resulting graphs have been evaluated only by whether a domain expert
judged them to be physically plausible. This paper seeks to fill this gap. We
develop a testbed that emulates two dynamical processes dominant in many
geoscience applications, namely advection and diffusion, on a 2D grid. We then
apply causal-discovery-based information-tracking algorithms to the
simulation data to study how well the algorithms work for different scenarios
and to gain a better understanding of the physical meaning of the graph
results, in particular of instantaneous connections. We make all data sets used
in this study available to the community as a benchmark.
Keywords: Information flow, graphical model, structure learning, causal
discovery, geoscience.
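
To make the testbed idea concrete, the following is a minimal sketch of how advection and diffusion on a 2D grid could be simulated to produce spatio-temporal data. It is not the authors' implementation. The function names (`step`, `simulate`), the periodic boundaries, the constant non-negative velocity, the first-order upwind scheme, and all parameter values are assumptions chosen for illustration only.

```python
import numpy as np


def step(c, u, v, kappa, dx=1.0, dt=0.1):
    """Advance the field c by one explicit time step.

    Assumptions (illustrative, not from the paper): periodic boundaries,
    constant velocity (u, v) with u >= 0 and v >= 0, first-order upwind
    advection, and a five-point Laplacian for diffusion.
    """
    # Upwind differences for non-negative velocities: use the upstream neighbor.
    dcdx = (c - np.roll(c, 1, axis=1)) / dx
    dcdy = (c - np.roll(c, 1, axis=0)) / dx
    # Five-point Laplacian with periodic wrap-around.
    lap = (
        np.roll(c, 1, axis=0) + np.roll(c, -1, axis=0)
        + np.roll(c, 1, axis=1) + np.roll(c, -1, axis=1)
        - 4.0 * c
    ) / dx**2
    return c + dt * (-u * dcdx - v * dcdy + kappa * lap)


def simulate(n=20, steps=50, u=1.0, v=0.0, kappa=0.5, dt=0.1, seed=0):
    """Return an array of shape (steps + 1, n, n) holding the field over time."""
    rng = np.random.default_rng(seed)
    c = rng.normal(size=(n, n))  # random initial condition (hypothetical choice)
    frames = [c]
    for _ in range(steps):
        c = step(c, u, v, kappa, dt=dt)
        frames.append(c)
    return np.stack(frames)


frames = simulate()
```

With these default values the explicit scheme is stable, since dt * (u/dx + v/dx + 4*kappa/dx**2) = 0.3 is at most 1. Each grid point's time series in `frames` could then serve as one variable for a structure learning algorithm, with the known velocity and diffusion settings acting as ground truth.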