HotFlip: White-Box Adversarial Examples for Text Classification

lib:047e4928bfc1b6e3 (v1.0.0)

Authors: Javid Ebrahimi, Anyi Rao, Daniel Lowd, Dejing Dou
Where published: ACL 2018 7
ArXiv: 1712.06751
Document: PDF DOI

Artifact development version: GitHub

Abstract URL: http://arxiv.org/abs/1712.06751v2

We propose an efficient method to generate white-box adversarial examples to trick a character-level neural classifier. We find that only a few manipulations are needed to greatly decrease the accuracy. Our method relies on an atomic flip operation, which swaps one token for another, based on the gradients of the one-hot input vectors. Due to the efficiency of our method, we can perform adversarial training, which makes the model more robust to attacks at test time. With the use of a few semantics-preserving constraints, we demonstrate that HotFlip can be adapted to attack a word-level classifier as well.
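The flip operation described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the flip is chosen by a first-order estimate, where swapping the token at a position changes the loss by roughly the difference between the gradient entry of the new token and that of the current token. The linear softmax classifier and the names `loss_and_grad`, `best_flip`, and `apply_flip` are hypothetical stand-ins chosen only to keep the example self-contained.

```python
import numpy as np

def loss_and_grad(one_hot, W, label):
    """Cross-entropy loss of a toy linear classifier and its gradient
    with respect to the one-hot input (hypothetical stand-in model)."""
    logits = W @ one_hot.ravel()
    logits = logits - logits.max()          # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    loss = -np.log(probs[label])
    dlogits = probs.copy()
    dlogits[label] -= 1.0                   # d(loss)/d(logits)
    grad = (W.T @ dlogits).reshape(one_hot.shape)
    return float(loss), grad

def best_flip(one_hot, grad):
    """Pick the single substitution with the largest estimated loss increase.

    one_hot: (seq_len, vocab) one-hot encoding of the current text.
    grad:    (seq_len, vocab) gradient of the loss w.r.t. one_hot.
    Returns (position, new_token, estimated_loss_increase).
    """
    # Gradient entry of the token currently at each position.
    current = (grad * one_hot).sum(axis=1, keepdims=True)
    # First-order estimate of the loss change for every (position, token) swap.
    gain = grad - current
    # Exclude swapping a token for itself.
    gain = np.where(one_hot.astype(bool), -np.inf, gain)
    pos, tok = np.unravel_index(np.argmax(gain), gain.shape)
    return int(pos), int(tok), float(gain[pos, tok])

def apply_flip(one_hot, pos, tok):
    """Return a copy of one_hot with position `pos` set to token `tok`."""
    flipped = one_hot.copy()
    flipped[pos] = 0.0
    flipped[pos, tok] = 1.0
    return flipped

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    seq_len, vocab, classes = 8, 26, 2      # assumed toy sizes
    tokens = rng.integers(0, vocab, size=seq_len)
    x = np.eye(vocab)[tokens]
    W = rng.normal(size=(classes, seq_len * vocab))

    loss_before, grad = loss_and_grad(x, W, label=0)
    pos, tok, gain = best_flip(x, grad)
    loss_after, _ = loss_and_grad(apply_flip(x, pos, tok), W, label=0)
    print(f"flip position {pos} to token {tok}: "
          f"loss {loss_before:.3f} -> {loss_after:.3f} (estimate +{gain:.3f})")
```

A single backward pass yields the estimate for every candidate substitution at once, which is what would make such a search cheap enough to run inside an adversarial training loop. For a real character-level network the estimate is only approximate, so the chosen flip should be checked with a forward pass.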
