Adversarial examples are natural examples to which carefully crafted perturbations have been added, causing state-of-the-art deep neural network models to misbehave. In image learning tasks, these perturbations can be made visually imperceptible to the human eye, resulting in inconsistent decisions between humans and well-trained machine learning models, especially deep neural networks. Even worse, adversarial examples exist not only in the digital space but have also been realized in the physical world by means of colorful stickers or 3D printing, raising rapidly growing concerns for safety-critical and security-critical machine learning tasks.
Despite various efforts to improve the robustness of neural networks against adversarial perturbations, a comprehensive measure of a model’s robustness is still lacking. Current robustness evaluation relies on the empirical defense performance against existing adversarial attacks and may result in a false sense of robustness, since the defense is neither certified nor guaranteed to be generalizable to unseen attacks.
CLEVER (Cross Lipschitz Extreme Value for nEtwork Robustness) was created to tackle this challenge. It offers an attack-agnostic measure for evaluating the robustness of any trained neural network classifier against adversarial perturbations.
Without invoking any specific adversarial attack, the CLEVER score can be used directly to compare the robustness of different network designs and training procedures, with the aim of building more reliable systems, as demonstrated in the paper. One possible use case is the before-after scenario, where users obtain a score before and after applying a given defense strategy and read the difference as the improvement in model robustness. It’s also the first attack-independent robustness metric that can be applied to any neural network classifier.
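To make the idea of an attack-free score concrete, here is a minimal sketch in Python with NumPy. It is not the CLEVER implementation: it assumes the common margin-over-Lipschitz bound (the gap between the predicted class and each other class, divided by the largest gradient norm of that gap near the input), and it swaps the extreme value estimation that the name refers to for a plain maximum over random samples. The names `sample_in_l2_ball`, `robustness_score`, `logits_fn`, and `grad_fn` are hypothetical, and the toy linear classifier is chosen only because its answer can be checked by hand.

```python
import numpy as np

def sample_in_l2_ball(x0, radius, n, rng):
    """Draw n points uniformly from the L2 ball of the given radius around x0."""
    d = x0.size
    directions = rng.normal(size=(n, d))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    radii = radius * rng.uniform(size=(n, 1)) ** (1.0 / d)
    return x0 + radii * directions

def robustness_score(logits_fn, grad_fn, x0, radius=1.0, n_samples=256, seed=0):
    """Simplified attack-free L2 robustness score for a single input.

    logits_fn(x)     -> class scores, shape (k,)
    grad_fn(x, c, j) -> gradient of logits[c] - logits[j] with respect to x
    Assumption: the local Lipschitz constant is approximated by the largest
    sampled gradient norm, not by an extreme value fit.
    """
    rng = np.random.default_rng(seed)
    logits = logits_fn(x0)
    c = int(np.argmax(logits))  # predicted class
    points = sample_in_l2_ball(x0, radius, n_samples, rng)
    per_class = []
    for j in range(logits.size):
        if j == c:
            continue
        margin = logits[c] - logits[j]
        lipschitz = max(np.linalg.norm(grad_fn(x, c, j)) for x in points)
        # Cap at the sampling radius: nothing is known outside the ball.
        per_class.append(min(margin / max(lipschitz, 1e-12), radius))
    return min(per_class)

# Toy linear classifier: the gradient is constant, so the score equals the
# exact L2 distance from x0 to the nearest decision boundary.
W = np.array([[2.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
b = np.zeros(3)
logits_fn = lambda x: W @ x + b
grad_fn = lambda x, c, j: W[c] - W[j]

x0 = np.array([1.0, 0.5])
score = robustness_score(logits_fn, grad_fn, x0, radius=5.0)
print(f"robustness score: {score:.4f}")
```

For the before-after scenario, the same function would be called on the undefended and the defended model with identical inputs and sampling settings, and the two scores compared; no adversarial attack needs to be run in either case.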
Pin-Yu Chen offers an overview of adversarial attack and defense methods for neural networks, details the CLEVER framework for evaluating model robustness, and shares an intriguing demo of adversarial examples.
Pin-Yu Chen is a research staff member in the AI Foundations Learning Group at the IBM Thomas J. Watson Research Center in Yorktown Heights, NY. His recent research focuses on adversarial machine learning and robustness analysis of neural networks; he’s also interested in graph and network data analytics and their applications to data mining, machine learning, signal processing, and cybersecurity. Pin-Yu received the NIPS 2017 Best Reviewer Award and the IEEE GLOBECOM 2010 GOLD Best Paper Award as well as several travel grants, including IEEE ICASSP 2014 (NSF), IEEE ICASSP 2015 (SPS), IEEE Security and Privacy Symposium, NSF Graph Signal Processing Workshop 2016, and ACM KDD 2016. He is a member of the Tau Beta Pi Honor Society and the Phi Kappa Phi Honor Society and was the recipient of the Chia-Lun Lo Fellowship from the University of Michigan, Ann Arbor. Pin-Yu holds a BS in electrical engineering and computer science (undergraduate honors program) from National Chiao Tung University, Taiwan, an MS in communication engineering from National Taiwan University, Taiwan, and an MA in statistics and a PhD in electrical engineering and computer science, both from the University of Michigan, Ann Arbor.