Authors: Chen Liu, Mathieu Salzmann, Sabine Süsstrunk
Where published: ICLR 2020
ArXiv: 1912.04792
Abstract URL: https://arxiv.org/abs/1912.04792v2
Training certifiable neural networks enables one to obtain models with robustness guarantees against adversarial attacks. In this work, we introduce a framework to bound the adversary-free region in the neighborhood of the input data by a polyhedral envelope, which yields finer-grained certified robustness. We further introduce polyhedral envelope regularization (PER) to encourage larger polyhedral envelopes and thus improve the provable robustness of the models. We demonstrate the flexibility and effectiveness of our framework on standard benchmarks; it applies to networks of different architectures and general activation functions. Compared with the state-of-the-art methods, PER has very little computational overhead and better robustness guarantees without over-regularizing the model.
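The idea of bounding the adversary-free region by a polyhedron can be illustrated with a minimal sketch. It assumes that some bound-propagation step (not specified in the abstract) has already produced linear lower bounds on the classification margins, of the form A[i] · δ + b[i] for each competing class i, valid for perturbations δ with l_p norm at most eps. The names `dual_norm`, `certified_radius`, `A`, `b`, and `eps` are hypothetical and chosen for illustration; this is not the authors' implementation. Under these assumptions, the largest certified l_p ball is limited by the nearest bounding hyperplane, whose distance is b[i] divided by the dual norm of A[i].

```python
import math


def dual_norm(row, p):
    """Return the l_q norm of `row`, where q is dual to p (1/p + 1/q = 1)."""
    if p == math.inf:
        return sum(abs(a) for a in row)   # dual of l_inf is l_1
    if p == 1:
        return max(abs(a) for a in row)   # dual of l_1 is l_inf
    q = p / (p - 1.0)
    return sum(abs(a) ** q for a in row) ** (1.0 / q)


def certified_radius(A, b, eps, p=math.inf):
    """Largest r <= eps such that every assumed margin bound
    A[i] . delta + b[i] stays non-negative for all ||delta||_p <= r.

    A:   list of coefficient rows, one per competing class (assumed given)
    b:   list of offsets, the margin bounds at the clean input (assumed given)
    eps: radius within which the linear bounds are assumed to be valid
    """
    radius = eps
    for row, offset in zip(A, b):
        if offset <= 0:
            return 0.0  # the bound cannot certify even the clean input
        norm = dual_norm(row, p)
        if norm > 0:
            # Worst case of row . delta over the ball of radius r is -r * norm,
            # so the bound stays non-negative exactly when r <= offset / norm.
            radius = min(radius, offset / norm)
    return radius


# Two competing classes, l_inf threat model, bounds assumed valid up to 0.5.
A = [[1.0, -2.0], [0.5, 0.5]]
b = [0.75, 2.0]
print(certified_radius(A, b, eps=0.5))  # 0.25, limited by the first hyperplane
```

In this toy case the full eps-ball of radius 0.5 cannot be certified, yet the ball of radius 0.25 can. That is the sense in which a polyhedral description gives a finer-grained guarantee than an all-or-nothing check at eps. A regularizer in the spirit of PER would then reward larger hyperplane distances during training, although its exact form is not given in the abstract.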