Robust Physical-World Attacks on Deep Learning Models (Stop Sign Attack)

“Robust Physical-World Attacks on Deep Learning Models” was submitted to arXiv on July 27, 2017 by Kevin Eykholt, Ivan Evtimov, Earlence Fernandes, Bo Li, Amir Rahmati, Chaowei Xiao, Atul Prakash, Tadayoshi Kohno, and Dawn Song, and was later accepted to CVPR 2018. It answered a skeptic’s natural objection to earlier adversarial-example work: those attacks altered the digital pixels of an image, whereas a real camera sees a physical object from changing distances, angles, and lighting. Could an attack survive the trip through the physical world?

The authors showed it could. They developed an attack method, Robust Physical Perturbations (RP2), that produces perturbations robust to varying viewpoints, and applied it to a real road stop sign. Using only carefully placed black and white stickers, made to look like ordinary graffiti or weathering, they caused a classifier to read the stop sign as a different sign. They reported targeted misclassification in 100% of images taken in controlled lab conditions and in 84.8% of video frames captured from a moving vehicle.
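The core idea, a perturbation confined to sticker-shaped regions and optimized to work on average across viewing conditions, can be illustrated with a toy sketch. This is not the authors' implementation: the linear softmax "classifier", the 8x8 image, the brightness-only model of viewing conditions, and every name in the code (`view`, `mask`, `conditions`, `loss_and_grad`) are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

H = W = 8        # toy image size
N_CLASSES = 3    # class 0 plays "stop"; class 1 is the attacker's target
TARGET = 1

# Hypothetical stand-in classifier: a fixed linear softmax over flattened pixels.
weights = rng.normal(size=(N_CLASSES, H * W))

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def view(image, gain, offset):
    """Crude model of a viewing condition: brightness gain plus offset."""
    return gain * image + offset

# Clean "sign" and a mask marking where stickers may go (two small bars).
sign = np.full((H, W), 0.5)
mask = np.zeros((H, W))
mask[1:3, 1:7] = 1.0
mask[5:7, 1:7] = 1.0

# Fixed sample of viewing conditions to optimize over.
conditions = [(rng.uniform(0.6, 1.4), rng.uniform(-0.1, 0.1)) for _ in range(32)]

def loss_and_grad(delta):
    """Mean targeted cross-entropy over all conditions, and its gradient in delta."""
    total_loss = 0.0
    total_grad = np.zeros_like(delta)
    onehot = np.eye(N_CLASSES)[TARGET]
    for gain, offset in conditions:
        x = view(sign + mask * delta, gain, offset)
        p = softmax(weights @ x.ravel())
        total_loss += -np.log(p[TARGET])
        # Chain rule through the linear classifier, the gain, and the mask.
        g = (weights.T @ (p - onehot)).reshape(H, W)
        total_grad += gain * mask * g
    n = len(conditions)
    return total_loss / n, total_grad / n

delta = np.zeros((H, W))
initial_loss, _ = loss_and_grad(delta)
for _ in range(300):
    _, grad = loss_and_grad(delta)
    delta = delta - 0.01 * grad
    # Keep the perturbed sign a valid image: sign + delta must stay in [0, 1].
    delta = np.clip(delta, -sign, 1.0 - sign)
final_loss, _ = loss_and_grad(delta)

# Check how often the target class wins under freshly sampled conditions.
fresh = [(rng.uniform(0.6, 1.4), rng.uniform(-0.1, 0.1)) for _ in range(200)]
hits = sum(
    int(np.argmax(weights @ view(sign + mask * delta, g, o).ravel()) == TARGET)
    for g, o in fresh
)
print(f"loss {initial_loss:.3f} -> {final_loss:.3f}, "
      f"targeted success {hits / len(fresh):.1%}")
```

Averaging the gradient over many sampled conditions is what pushes the perturbation toward robustness: a change that only fools the model under one brightness setting contributes little to the mean. The mask keeps every modified pixel inside the sticker regions, which mirrors the constraint that the rest of the sign stays untouched.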

The result was widely covered because of its obvious relevance to self-driving cars and other safety-critical vision systems. It moved adversarial examples out of the realm of theoretical digital manipulation and into a tangible, physically realizable threat that an attacker could deploy with a printer and some adhesive.

The paper became a standard reference for the physical-world threat model in adversarial machine learning, and it sharpened the argument that robustness testing must account for attacks that persist across real sensing conditions, not just clean digital inputs.
