Authors: Tim Zerrell, Jeremy Bruestle
ArXiv: 1903.06498
Document: PDF | DOI
Artifact development version: GitHub
Abstract URL: http://arxiv.org/abs/1903.06498v1
Hardware architectures and machine learning (ML) libraries evolve rapidly.
Traditional compilers often fail to generate high-performance code across the
spectrum of new hardware offerings. To mitigate this, engineers develop hand-tuned
kernels for each ML library update and hardware upgrade. Unfortunately, this
approach requires excessive engineering effort to scale or maintain with any
degree of state-of-the-art performance. Here we present a Nested Polyhedral
Model for representing highly parallelizable computations with limited
dependencies between iterations. This model provides an underlying framework
for an intermediate representation (IR) called Stripe, amenable to standard
compiler techniques while naturally modeling key aspects of modern ML
computing. Stripe represents parallelism, efficient memory layout, and multiple
compute units at a level of abstraction amenable to automatic optimization. We
describe how Stripe enables a compiler for ML in the style of LLVM that allows
independent development of algorithms, optimizations, and hardware
accelerators. We also discuss the design exploration advantages of Stripe over
kernel libraries and schedule-based or schedule-space-based code generation.
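As a rough intuition for the nested structure the abstract describes (an illustrative sketch only, not Stripe syntax or the authors' formalism): a computation such as matrix multiply can be written as an outer polyhedron of tiles, each enclosing an inner polyhedron of iteration points, which makes parallelism and memory locality explicit at the loop-structure level.

```python
# Illustrative sketch only (not Stripe syntax): a nested iteration structure
# in the spirit of a nested polyhedral model. The outer loops range over a
# lattice of tile origins (the outer polyhedron); the inner loops range over
# points within each tile (the inner polyhedron). Tiles over (ti, tj) write
# disjoint regions of C and can run in parallel; the tk tiles form a reduction.

def matmul_nested(A, B, n, tile=2):
    C = [[0.0] * n for _ in range(n)]
    # Outer polyhedron: tile origins.
    for ti in range(0, n, tile):
        for tj in range(0, n, tile):
            for tk in range(0, n, tile):
                # Inner polyhedron: points inside the current tile,
                # clamped at the boundary for non-divisible sizes.
                for i in range(ti, min(ti + tile, n)):
                    for j in range(tj, min(tj + tile, n)):
                        for k in range(tk, min(tk + tile, n)):
                            C[i][j] += A[i][k] * B[k][j]
    return C
```

Separating the tile-level loops from the point-level loops is what lets standard compiler passes reason about them independently, e.g. assigning tiles to compute units while optimizing the inner loops for a memory hierarchy.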