Presented By O’Reilly and Intel AI
Put AI to work
Sep 4-5, 2018: Training
Sep 5-7, 2018: Tutorials & Conference
San Francisco, CA

Deep learning for large-scale online fraud detection

Ting-Fang Yen (DataVisor)
1:45pm-2:25pm Friday, September 7, 2018
Implementing AI, Models and Methods
Location: Continental 1-3
Secondary topics: Deep Learning models, Temporal data and time-series
Average rating: 4.00 (1 rating)

Who is this presentation for?

  • Data scientists, machine learning engineers, AI infrastructure engineers, and AI researchers

Prerequisite knowledge

  • Familiarity with machine learning methods and related terminology
  • Experience with Apache Spark, TensorFlow, and deep learning methods (useful but not required)

What you'll learn

  • Understand how deep learning can be applied to cybersecurity and how deep learning models perform versus traditional fraud detection solutions
  • Learn how to implement a production-ready deep learning solution


Online fraud is often orchestrated by organized crime rings that use coordinated malicious user accounts, either created anew or obtained via user hijacking, to actively target modern online services for financial gain. Existing fraud solutions either rely on reputation lists for blocking known suspicious activities or require extensive feature engineering by human analysts for model training. These approaches don’t adapt well to changing fraud patterns, nor do they scale to large data volumes.

DataVisor analyzes activities from billions of accounts across global online services to detect fraud and abuse, giving the company unique insights into the online fraud landscape that allow it to tackle coordinated fraud attacks holistically. Ting-Fang Yen shares DataVisor’s real-time, scalable fraud detection solution, which is backed by deep learning and built on Spark and TensorFlow, and demonstrates how the system significantly outperforms traditional solutions such as blacklists and machine learning at terabyte-data scale. The solution is one of the few production examples of deep learning models applied to security problems, and it is based on digital information commonly collected by online services, including IP addresses, user-agent strings, email domains, and user nicknames. The general fraud detection framework can identify fraudulent activities in log data that contain all or a subset of this common digital information. Because it relies only on this common digital information, the model is agnostic to the specific application or service from which data queries originate.
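As a rough illustration of what "all or a subset of" the common fields could mean in practice, the sketch below encodes a log record into a fixed-length vector using feature hashing. This is not DataVisor's method: the abstract does not describe the encoding, and the field names (`ip`, `user_agent`, `email_domain`, `nickname`), the vector size, and the tokenization rules are all assumptions made for the example.

```python
import hashlib

# Hypothetical field names: the abstract lists kinds of information, not a schema.
COMMON_FIELDS = ("ip", "user_agent", "email_domain", "nickname")
NUM_BUCKETS = 32  # illustrative size of the hashed feature vector


def bucket(field, token):
    """Map a (field, token) pair to a stable bucket index."""
    digest = hashlib.md5(f"{field}={token}".encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_BUCKETS


def tokens(field, value):
    """Split a raw string value into coarse tokens for the given field."""
    value = value.strip().lower()
    if field == "ip":
        parts = value.split(".")
        # Keep the full address and its /24 prefix when it looks like IPv4.
        return [value, ".".join(parts[:3])] if len(parts) == 4 else [value]
    if field == "user_agent":
        return value.replace("/", " ").split()
    return [value]


def encode(record):
    """Encode a log record as a fixed-length count vector.

    Fields that are absent or empty are skipped, so a record carrying only
    a subset of the common fields still yields a vector of the same length.
    Fields outside COMMON_FIELDS are ignored.
    """
    vector = [0] * NUM_BUCKETS
    for field in COMMON_FIELDS:
        value = record.get(field)
        if not value:
            continue
        for token in tokens(field, value):
            vector[bucket(field, token)] += 1
    return vector


# Example records (made-up values using documentation-reserved ranges).
full = encode({
    "ip": "203.0.113.7",
    "user_agent": "Mozilla/5.0",
    "email_domain": "example.com",
    "nickname": "alice01",
})
partial = encode({"ip": "203.0.113.7"})
```

Because the encoder reads only the common fields and ignores everything else, the same function applies to logs from different services, which is one simple way to get the service-agnostic property the abstract describes. A production system would feed richer representations into the model.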

Ting-Fang Yen


Ting-Fang Yen is the director of research at DataVisor, the leading fraud, crime, and abuse-detection solution using unsupervised machine learning to detect fraudulent and malicious activity such as fake account registrations, fraudulent transactions, spam, account takeovers, and more. She has over 10 years of experience in applying big data analytics and machine learning to tackle problems in cybersecurity. Ting-Fang holds a PhD in electrical and computer engineering from Carnegie Mellon University.