Presented By O'Reilly and Cloudera
Make Data Work
September 26–27, 2016: Training
September 27–29, 2016: Tutorials & Conference
New York, NY

Apache Spark in fintech: Building fraud detection applications with distributed machine learning at Intel

Yuhao Yang (Intel)
1:15pm–1:55pm Thursday, 09/29/2016
Spark & beyond
Location: Hall 1B
Level: Beginner
Average rating: 4.00 (5 ratings)

Prerequisite knowledge

  • A basic understanding of Spark

What you'll learn

  • Discover the tool stack in Intel's financial fraud detection system
  • Learn how to build a powerful and fast pipeline for feature derivation, selection, and transformation
  • Take a deep dive into Intel's fraud detection algorithm, based on an ensemble of neural networks

Description

    The finance industry is increasingly adopting advanced technology. Information is now collected at much larger scales, in various modalities, and from multiple dimensions, which greatly enriches the profiles of financial entities and rapidly increases the complexity of financial analytics. At the same time, there’s increasing demand for automating the computation of data statistics, feature engineering, and model tuning.

    Through collaboration with some of the top payments companies around the world, Intel has developed an end-to-end solution for building fraud detection applications. Yuhao Yang explains how Intel used and extended Spark DataFrames and ML Pipelines to build the tool chain for financial fraud detection and shares the lessons learned during development.

    Topics include:

    • An overview of the overall system architecture
    • How to build a powerful and fast pipeline for feature derivation, selection, and transformation
    • A deep dive into the algorithm, based on an ensemble of neural networks, which handles the difficulty of unbalanced data and outperforms other algorithms for fraud detection
    • Other insights and experience gained during development and deployment
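
The abstract does not say how the ensemble deals with unbalanced data, so the following is only an illustrative sketch of one common approach, not Intel's method. Each ensemble member is trained on all fraud rows plus an equally sized random sample of legitimate rows, and the members' scores are averaged. A small logistic model stands in for each neural network to keep the example dependency-free. The data is synthetic, and every name here (`train_balanced_ensemble`, `ensemble_score`, and so on) is hypothetical.

```python
import math
import random


def sigmoid(z):
    # Clamp z so math.exp stays well inside floating-point range.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, epochs=300, lr=0.1):
    """Fit a small logistic model by batch gradient descent (stand-in learner)."""
    n = len(rows[0])
    weights, bias = [0.0] * n, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n, 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            grad_b += err
            for j in range(n):
                grad_w[j] += err * x[j]
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


def train_balanced_ensemble(legit, fraud, n_models=5, seed=0):
    """Train each member on all fraud rows plus an equal-sized random legit sample."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        sample = rng.sample(legit, len(fraud))
        rows = sample + fraud
        labels = [0] * len(sample) + [1] * len(fraud)
        models.append(train_logistic(rows, labels))
    return models


def ensemble_score(models, x):
    """Average the members' fraud probabilities for one transaction."""
    scores = [sigmoid(b + sum(w * xi for w, xi in zip(ws, x))) for ws, b in models]
    return sum(scores) / len(scores)


# Synthetic, made-up data: 1,000 legitimate rows near (0, 0), 20 fraud rows near (3, 3).
data_rng = random.Random(42)
legit = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(1000)]
fraud = [(data_rng.gauss(3, 1), data_rng.gauss(3, 1)) for _ in range(20)]

models = train_balanced_ensemble(legit, fraud)
print("score near fraud cluster:", ensemble_score(models, (3.0, 3.0)))
print("score near legit cluster:", ensemble_score(models, (0.0, 0.0)))
```

Undersampling the majority class per member keeps every training set balanced, while averaging across members recovers some of the information lost by discarding legitimate rows in any single sample.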

    Yuhao Yang


    Yuhao Yang is a senior software engineer on the big data team at Intel, where he focuses on deep learning algorithms and applications—particularly distributed deep learning and machine learning solutions for fraud detection, recommendation, speech recognition, and visual perception. He’s also an active contributor to Apache Spark MLlib.
