Vision Transformers under Data Poisoning Attacks
Gabriel Peery
Committee: Thanh Nguyen (chair), Lindsay Hinkle, Daniel Lowd
Honors Bachelor's Thesis (May 2023)
Keywords: vision transformer, data poisoning, deep learning, cybersecurity, computer vision

Owing to state-of-the-art performance and parallelizability, the Vision Transformer architecture is growing in prevalence for security-critical computer vision tasks. Designers may collect training images from public sources, but such data can be sabotaged: otherwise-natural images may carry subtle, crafted perturbations that cause a specific target image to be misclassified after training. Such poisoning attacks have been developed and tested against ResNets, but the vulnerability of Vision Transformers has not been investigated. I develop a new poisoning attack method that augments Witches' Brew with heuristics for choosing which images to poison, and I use it to attack DeiT, a Vision Transformer, while it is fine-tuned on benchmarks such as CIFAR-10 classification. I also evaluate how DeiT's image tokenization introduces risk in the form of efficient attacks whose perturbations are confined to a limited number of patches. Progressively tightening these constraints across extensive experiments, I compare attack strength by observing which attacks remain successful under the most severe limitations. I find that the choice of attack objective strongly influences strength, and that constraints on patch count degrade the success rate more than constraints on poisoned-image count. Selecting patches by attention rollout helps compensate, but selecting images by gradient magnitude increases strength more. Finally, I find that Mixup and Cutmix are an effective defense, so I recommend them for security-critical applications.
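
As context for the attack objective, the following is a minimal PyTorch sketch of the gradient-matching loss at the core of Witches' Brew-style poisoning: the adversary perturbs poison images so that the victim's training gradient on them aligns with the gradient that would misclassify the target. The function names are hypothetical, and the setup (a precomputed target gradient, cross-entropy training loss) is an illustrative assumption rather than the thesis's exact implementation.

    import torch
    import torch.nn.functional as F

    def target_gradient(model, target_x, adversarial_y):
        """Gradient of the loss pushing the (batched) target image toward
        the attacker's chosen label; computed once, then matched."""
        params = [p for p in model.parameters() if p.requires_grad]
        loss = F.cross_entropy(model(target_x), adversarial_y)
        return [g.detach() for g in torch.autograd.grad(loss, params)]

    def gradient_matching_loss(model, poison_x, poison_y, target_grad):
        """Negative cosine similarity between the training gradient on the
        poisoned batch and the precomputed adversarial target gradient."""
        params = [p for p in model.parameters() if p.requires_grad]
        poison_loss = F.cross_entropy(model(poison_x), poison_y)
        # create_graph=True so this loss can be backpropagated further,
        # into the perturbation that produced poison_x.
        poison_grad = torch.autograd.grad(poison_loss, params, create_graph=True)
        sim = sum((gp * gt).sum() for gp, gt in zip(poison_grad, target_grad))
        gp_norm = torch.sqrt(sum(gp.pow(2).sum() for gp in poison_grad))
        gt_norm = torch.sqrt(sum(gt.pow(2).sum() for gt in target_grad))
        return 1.0 - sim / (gp_norm * gt_norm)

In an outer loop, this loss would be minimized with respect to a perturbation added to the poison images while the model itself stays fixed, so the poisons keep their original labels and remain visually inconspicuous.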
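The patch-count constraint and the attention rollout heuristic can likewise be sketched together: rollout (Abnar and Zuidema's method of multiplying head-averaged attention matrices through the layers, with the identity added for residual connections) scores each image patch by how much the class token attends to it, and the perturbation is then masked to the highest-scoring patches. The 16x16 patch size matching DeiT's tokenization and the top-k selection rule are assumptions for illustration.

    import torch

    def attention_rollout(attn_maps):
        """Attention rollout over per-layer tensors of shape
        [heads, tokens, tokens]; returns CLS attention to each patch."""
        rollout = torch.eye(attn_maps[0].size(-1))
        for attn in attn_maps:
            a = attn.mean(dim=0)                 # average attention heads
            a = a + torch.eye(a.size(-1))        # account for residual path
            a = a / a.sum(dim=-1, keepdim=True)  # keep rows stochastic
            rollout = a @ rollout
        return rollout[0, 1:]                    # drop the CLS-to-CLS entry

    def patch_mask(image_shape, scores, budget, patch_size=16):
        """Binary mask confining a perturbation to the `budget` patches
        with the highest rollout scores."""
        c, h, w = image_shape
        cols = w // patch_size
        mask = torch.zeros(image_shape)
        for idx in scores.topk(budget).indices.tolist():
            r, col = divmod(idx, cols)
            mask[:, r * patch_size:(r + 1) * patch_size,
                    col * patch_size:(col + 1) * patch_size] = 1.0
        return mask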
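Image selection by gradient magnitude admits a simple reading: score each candidate training image by the norm of its parameter gradient and poison the highest-scoring images, since they exert the most influence on training. The sketch below follows that reading and is one plausible form of the heuristic, not necessarily the thesis's exact procedure.

    import torch
    import torch.nn.functional as F

    def gradient_magnitude_scores(model, dataset, device="cpu"):
        """L2 norm of each image's loss gradient w.r.t. the parameters;
        candidates with the largest scores are chosen as poisons."""
        params = [p for p in model.parameters() if p.requires_grad]
        scores = []
        for x, y in dataset:
            x = x.unsqueeze(0).to(device)
            y = torch.tensor([y], device=device)
            loss = F.cross_entropy(model(x), y)
            grads = torch.autograd.grad(loss, params)
            scores.append(torch.sqrt(sum(g.pow(2).sum() for g in grads)).item())
        return torch.tensor(scores)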