Natural Backdoor Datasets. (arXiv:2206.10673v1 [cs.CV])

Extensive literature on backdoor poisoning attacks has studied attacks and
defenses for backdoors that use “digital trigger patterns.” In contrast,
“physical backdoors,” which use physical objects as triggers, have only
recently been identified and are qualitatively different enough to resist all
defenses targeting digital-trigger backdoors. Research on physical backdoors
is limited by the availability of large datasets containing real images of
physical objects co-located with targets of classification. Building these
datasets is time- and
labor-intensive. This work seeks to address the challenge of accessibility for
research on physical backdoor attacks. We hypothesize that there may be
naturally occurring physically co-located objects already present in popular
datasets such as ImageNet. Once such objects are identified, carefully
relabeling the images that contain them can transform these data into
training samples for physical backdoor attacks. We
propose a method to scalably identify these subsets of potential triggers in
existing datasets, along with the specific classes they can poison. We call
these naturally occurring trigger-class subsets natural backdoor datasets. Our
techniques successfully identify natural backdoors in widely available
datasets, and produce models behaviorally equivalent to those trained on
manually curated datasets. We release our code to allow the research community
to create their own datasets for research on physical backdoor attacks.
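
The abstract does not spell out the identification procedure, but the core
idea of mining naturally co-located trigger-class pairs can be illustrated
with a co-occurrence scan over multi-object annotations. The sketch below is
a hypothetical reconstruction, not the authors' released code: the Sample
record, find_natural_backdoors, poison, and the min_support threshold are all
assumed names, and real annotations would likely come from an off-the-shelf
object detector run over the dataset.

    # Minimal sketch of mining natural backdoor trigger-class subsets.
    # Assumes each image carries a class label plus a set of other objects
    # detected in it. All names here are hypothetical illustrations.
    from collections import defaultdict
    from dataclasses import dataclass, field

    @dataclass
    class Sample:
        image_id: str
        label: str                                  # classification label
        objects: set = field(default_factory=set)   # co-located objects

    def find_natural_backdoors(samples, min_support=50, min_classes=2):
        """Return candidate {trigger object: poisonable classes} subsets.

        A trigger is considered usable against a class if it co-occurs
        with that class in at least `min_support` images, so a model has
        enough samples to learn the trigger-target association.
        """
        cooccur = defaultdict(lambda: defaultdict(int))
        for s in samples:
            for obj in s.objects:
                if obj != s.label:          # trigger must differ from class
                    cooccur[obj][s.label] += 1
        candidates = {}
        for trigger, per_class in cooccur.items():
            classes = {c for c, n in per_class.items() if n >= min_support}
            if len(classes) >= min_classes:  # enough classes to poison
                candidates[trigger] = classes
        return candidates

    def poison(samples, trigger, victim_classes, target_label):
        """Relabel victim-class images containing the trigger, turning
        naturally co-located images into backdoor training samples."""
        poisoned = []
        for s in samples:
            if s.label in victim_classes and trigger in s.objects:
                poisoned.append(Sample(s.image_id, target_label, s.objects))
            else:
                poisoned.append(s)
        return poisoned

Under these assumptions, a model trained on the relabeled set would learn to
map the trigger object to the target label without any digital manipulation
of the images themselves.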