Check the preview of 2nd version of this platform being developed by the open MLCommons taskforce on automation and reproducibility as a free, open-source and technology-agnostic on-prem platform.

HotFlip: White-Box Adversarial Examples for Text Classification

lib:047e4928bfc1b6e3 (v1.0.0)

Vote to reproduce this paper and share portable workflows   1 
Authors: Javid Ebrahimi,Anyi Rao,Daniel Lowd,Dejing Dou
Where published: ACL 2018 7
ArXiv: 1712.06751
Document:  PDF  DOI 
Artifact development version: GitHub
Abstract URL: http://arxiv.org/abs/1712.06751v2


We propose an efficient method to generate white-box adversarial examples to trick a character-level neural classifier. We find that only a few manipulations are needed to greatly decrease the accuracy. Our method relies on an atomic flip operation, which swaps one token for another, based on the gradients of the one-hot input vectors. Due to efficiency of our method, we can perform adversarial training which makes the model more robust to attacks at test time. With the use of a few semantics-preserving constraints, we demonstrate that HotFlip can be adapted to attack a word-level classifier as well.

Relevant initiatives  

Related knowledge about this paper Reproduced results (crowd-benchmarking and competitions) Artifact and reproducibility checklists Common formats for research projects and shared artifacts Reproducibility initiatives

Comments  

Please log in to add your comments!
If you notice any inapropriate content that should not be here, please report us as soon as possible and we will try to remove it within 48 hours!