Authors: Xiaolei Liu, Yuheng Luo, Xiaosong Zhang, Qingxin Zhu
ArXiv: 1901.09892
Abstract URL: http://arxiv.org/abs/1901.09892v1
Neural networks play an increasingly important role in machine learning and
are embedded in many applications in society. Unfortunately, neural networks
are vulnerable to adversarial samples crafted to attack them.
However, most of the generation approaches either assume that the attacker has
full knowledge of the neural network model or are limited by the type of
attacked model. In this paper, we propose a new approach that generates a
black-box attack to neural networks based on the swarm evolutionary algorithm.
Benefiting from the improvements in the technology and theoretical
characteristics of evolutionary algorithms, our approach has the advantages of
effectiveness, black-box attack, generality, and randomness. Our experimental
results show that both MNIST and CIFAR-10 images can be perturbed to
successfully generate a black-box attack with, on average, a 100% success
rate. In addition, the proposed attack is resistant to defensive
distillation, succeeding on distilled neural networks with almost 100%
probability. The experimental results also indicate that the robustness of the
artificial intelligence algorithm is related to the complexity of the model and
the data set. In addition, we find that the adversarial samples to some extent
reproduce the characteristics of the sample data learned by the neural network
model.
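The core idea described in the abstract, a query-only (black-box) evolutionary search for an adversarial perturbation, can be sketched as follows. This is an illustrative toy, not the authors' implementation: the `black_box_model`, its weights, the population size, mutation scale, and the fitness weighting are all hypothetical stand-ins for the MNIST/CIFAR-10 networks attacked in the paper.

```python
import numpy as np

# Toy stand-in for the attacked model: a black box queried only for its
# output probabilities (hypothetical 2-class linear classifier on 4-dim inputs).
def black_box_model(x):
    W = np.array([[1.0, -0.5, 0.3, 0.8],
                  [-0.7, 0.9, -0.2, 0.1]])
    logits = W @ x
    e = np.exp(logits - logits.max())
    return e / e.sum()

def evolutionary_attack(x0, true_label, pop_size=30, generations=200,
                        sigma=0.1, seed=0):
    """Evolve a perturbation delta that minimises the model's confidence in
    the true class plus a small L2 penalty on the perturbation size.
    Only model outputs are used -- no gradients, no internal weights."""
    rng = np.random.default_rng(seed)
    pop = rng.normal(0.0, sigma, size=(pop_size, x0.size))

    def fitness(delta):
        probs = black_box_model(x0 + delta)
        return probs[true_label] + 0.01 * np.linalg.norm(delta)

    for _ in range(generations):
        scores = np.array([fitness(d) for d in pop])
        elite = pop[np.argsort(scores)[: pop_size // 2]]   # selection
        children = elite + rng.normal(0.0, sigma, size=elite.shape)  # mutation
        pop = np.vstack([elite, children])                 # elitism
    best = min(pop, key=fitness)
    return x0 + best

x0 = np.array([1.0, 0.2, -0.3, 0.5])
true_label = int(np.argmax(black_box_model(x0)))
x_adv = evolutionary_attack(x0, true_label)
print("original label:", true_label,
      "adversarial label:", int(np.argmax(black_box_model(x_adv))))
```

The L2 penalty in the fitness function plays the same role as the distance term the paper's approach needs: it keeps the perturbation small while the selection pressure drives the true-class confidence down.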