Presented By O'Reilly and Cloudera
Make Data Work
March 28–29, 2016: Training
March 29–31, 2016: Conference
San Jose, CA

Leveraging Spark to analyze billions of user actions to reveal hidden fraudsters

Yinglian Xie (DataVisor)
4:20pm–5:00pm Wednesday, 03/30/2016

Location: LL21 B
Tags: real-time
Average rating: 2.88 (8 ratings)

Prerequisite knowledge

Attendees should have a general understanding of big data technologies, including Spark. Security and online fraud detection domain expertise is not required.


Today’s consumer-facing websites and mobile apps are measured by the size and growth of their user account base, as users are both contributors of content and a channel for monetization. Despite being the backbone of online services, these user accounts are also their Achilles heel. Well-organized crime rings have created millions of fake user accounts to hide among billions of benign users, and they are waging a variety of large-scale attacks to exploit these services for financial gain.

Yinglian Xie describes the anatomy of modern online services and the sophisticated attack techniques that have been used across a number of services, including social networking, gaming, financial, ecommerce, and other vertical markets. Yinglian demonstrates how these types of attacks can be detected and mitigated by leveraging Spark-based big data security analytics.

Topics include:

  • Real-life case studies that describe the anatomy of modern attacks and the damage they inflict on a variety of consumer-facing online services
  • How cybercriminals evade traditional security solutions, and why traditional rule-based and ML-based solutions are inadequate
  • The Spark-based big data technologies that can detect “sleeper cells” in the early incubation stage of an attack and stop these threats before any damage is done

Yinglian Xie


Yinglian Xie is the CEO and cofounder of DataVisor, a startup in the area of big data analytics for security. Yinglian has been working in the area of internet security and privacy for over 10 years and has helped improve the security of billions of online users. Her work combines parallel-computing techniques, algorithms for mining large datasets, and security-domain knowledge into new solutions that prevent and combat a wide variety of attacks targeting consumer-facing online services. Prior to DataVisor, Yinglian was a senior researcher at Microsoft Research Silicon Valley, where she shipped a series of new techniques into production. She has been widely published in top conferences and has served on the committees of many of them. Yinglian holds a PhD in computer science from Carnegie Mellon University.