As machine learning (ML) is increasingly used in security, practitioners and researchers must understand the pitfalls ML presents in an adversarial context. Attackers already evade signatures and heuristics, and they can evade statistical models too. Yacin Nadji offers some background on the academic security world’s attempts to understand how to break and fix ML systems, an effort that inevitably devolves into the cat-and-mouse game seen in many facets of security. However, those who are better at finding mice will stay cats longer.
Yacin begins by providing a high-level background of the adversarial machine learning space before walking you through tearing down, evaluating, and fixing a deployed network-based domain generation algorithm (DGA) detector that uses graph clustering. While the described attacks and fixes are specific to graph clustering, the same process can be applied to other ML systems to perform adversarial evaluation. Novel contributions include evaluating unsupervised graph learning as well as considering the level of knowledge an attacker possesses, which is paramount when ML systems rely on a nonlocal feature space. Consider an ML system that extracts features from a large ISP’s network traffic to detect infected hosts. An adversary who knows only the network traffic of their own infections is less equipped to evade this system than an attacker who has compromised the ISP’s training dataset from which the features are constructed. Prior work often considers an attacker who has only black-box access to the model, but the most sophisticated attackers are likely to have reverse engineered the training dataset or surreptitiously acquired it through illegitimate means. Threat models for ML systems must include these sophisticated attackers if they are to remain relevant.
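To make the point about attacker knowledge concrete, here is a minimal toy sketch. It is not the detector discussed in the talk; the host names, domain names, Jaccard similarity measure, and 0.5 threshold are all assumptions chosen for illustration. Hosts are grouped by how much their queried-domain sets overlap. An attacker who sees only their own bots' traffic can add noise to split the bot cluster, while an attacker who also knows what benign hosts query can merge the bots into a benign cluster.

```python
from itertools import combinations

def jaccard(a, b):
    """Overlap between two sets of queried domains, in [0, 1]."""
    return len(a & b) / len(a | b) if a | b else 0.0

def cluster_hosts(queries, threshold=0.5):
    """Toy graph clustering: link hosts whose domain sets have Jaccard
    similarity >= threshold, then return the connected components."""
    parent = {h: h for h in queries}

    def find(h):
        while parent[h] != h:
            parent[h] = parent[parent[h]]  # path compression
            h = parent[h]
        return h

    for a, b in combinations(queries, 2):
        if jaccard(queries[a], queries[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for h in queries:
        groups.setdefault(find(h), set()).add(h)
    return sorted(sorted(g) for g in groups.values())

# Hypothetical traffic: two bots share DGA-style domains,
# two benign hosts share common typo domains.
dga = {"qzxv1.com", "qzxv2.com", "qzxv3.com", "qzxv4.com"}
typos = {"gogle.com", "facebok.com", "youtub.com", "amazn.com"}
queries = {"bot1": set(dga), "bot2": set(dga),
           "pc1": set(typos), "pc2": set(typos)}
baseline = cluster_hosts(queries)
# [['bot1', 'bot2'], ['pc1', 'pc2']]

# Limited knowledge: the attacker sees only their own bots, so each bot
# adds its own random noise domains, which splits the bot cluster.
blind = {h: set(d) for h, d in queries.items()}
blind["bot1"] |= {f"noise-a{i}.com" for i in range(5)}
blind["bot2"] |= {f"noise-b{i}.com" for i in range(5)}
split = cluster_hosts(blind)
# [['bot1'], ['bot2'], ['pc1', 'pc2']]

# Greater knowledge: the attacker knows what benign hosts query and
# mimics it, which merges the bots into the benign cluster.
informed = {h: set(d) for h, d in queries.items()}
informed["bot1"] |= typos
informed["bot2"] |= typos
merged = cluster_hosts(informed)
# [['bot1', 'bot2', 'pc1', 'pc2']]
```

The second attack is only possible because the attacker knows part of the nonlocal feature space, which is why the threat model should state what the adversary is assumed to know.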
Yacin Nadji is a research scientist at the Georgia Institute of Technology. An expert in computer security, he has worked at numerous companies building and improving machine learning-based fraud and abuse detection systems at scale. Yacin is the author of 16 academic publications with over 600 citations, has served as a reviewer for academic security conferences and journals, and has given talks at several industry conferences and symposia. He holds a PhD in computer science from the Georgia Institute of Technology.
©2018, O'Reilly Media, Inc.