How can machine learning be employed to create a system that monitors network traffic, operations data, and system logs to reliably flag risk and unearth potential threats? Developing such a system poses many challenges. Because new types of behavior emerge constantly, a robust and exhaustively tagged dataset is extremely difficult to obtain. Training models on such a large amount of data often takes longer than is practical. And once a model is trained, it is often difficult to deploy it quickly and get actionable results in a production environment.
Drawing on NVIDIA’s system for detecting anomalies on various NVIDIA platforms, Joshua Patterson and Aaron Sant-Miller explain how to bootstrap a machine learning framework to detect risk and threats in operational production systems, using best-of-breed GPU-accelerated open source tools and employing a multimodel approach to iterate quickly through the data science lifecycle. They then demonstrate how to speed up threat detection by leveraging clusters of GPUs, shortening training time from days to hours, dramatically cutting inference time, and generally making the entire system much more adaptive.
Join Josh and Aaron as they walk you through how they built such a system, covering the architecture, the algorithms they implemented, how they sped up various parts of the data pipeline, and their future roadmap to incorporate more acceleration from the GPU Open Analytics Initiative (GOAI).
Joshua Patterson is a director of AI infrastructure at NVIDIA leading engineering for RAPIDS.AI. Previously, Josh was a White House Presidential Innovation Fellow and worked with leading experts across the public sector, the private sector, and academia to build a next-generation cyberdefense platform. His current passions are graph analytics, machine learning, and large-scale system design. Josh loves storytelling with data and creating interactive data visualizations. He holds a BA in economics from the University of North Carolina at Chapel Hill and an MA in economics from the University of South Carolina Moore School of Business.
Aaron Sant-Miller is a lead data scientist at Booz Allen Hamilton, where he specializes in applied mathematics, machine learning, and statistical modeling. He has architected, developed, and deployed data science solutions and machine learning suites across a wide range of domains, including tax fraud detection, climate science trend forecasting, cybersecurity risk scoring, and professional athlete performance prediction. Aaron’s current research focuses on Bayesian modeling design, synthetic data generation, and optimized algorithm training design. He holds a BS and an MS in applied and computational mathematics and statistics from the University of Notre Dame.
©2018, O'Reilly Media, Inc. • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.