Presented By O'Reilly and Cloudera
Make Data Work
September 26–27, 2016: Training
September 27–29, 2016: Tutorials & Conference
New York, NY

Machine-learning techniques for class imbalances and adversaries

Brendan Herger (Capital One)
2:55pm–3:35pm Thursday, 09/29/2016
Data science & advanced analytics
Location: 3D 10
Level: Intermediate
Average rating: 4.80 (5 ratings)

Prerequisite knowledge

  • A basic understanding of machine-learning techniques and terminology

What you'll learn

  • Understand common modeling techniques for mitigating rare occurrences and adversarial users

Description

    Many areas of applied machine learning require models that cope with rare occurrences (class imbalances) and with users actively attempting to subvert the system (adversaries). The Data Innovation Lab at Capital One has explored advanced modeling techniques for exactly these challenges. Its use case required the lab to survey the many related fields that deal with these issues and to implement many of the suggested modeling techniques. The lab has also introduced a few novel variations of its own.

    Brendan Herger offers an introduction to the problem space and a brief overview of the modeling frameworks the Data Innovation Lab has chosen to work with, outlines the lab’s approaches, discusses the lessons learned along the way, and explores proposed future work.

    Topics include:

    • Ensemble models
    • Deep learning
    • Genetic algorithms
    • Outlier detection via dimensionality reduction (PCA and neural network autoencoders)
    • Time-decay weighting
    • The synthetic minority over-sampling technique (SMOTE)
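
    For readers unfamiliar with three of the topics above, the following is a minimal NumPy sketch of SMOTE-style oversampling, time-decay weighting, and PCA-based outlier scoring. It is not the Data Innovation Lab's implementation, which the abstract does not describe. The function names, parameters, and the exponential half-life form of the decay are all assumptions made for illustration.

```python
import numpy as np


def smote_like_oversample(X_min, n_new, k=5, seed=0):
    """Hypothetical helper: create n_new synthetic minority rows by
    interpolating between a sampled row and one of its k nearest
    minority-class neighbours (the core idea behind SMOTE)."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    k = min(k, n - 1)  # cannot have more neighbours than other rows
    # Pairwise squared distances within the minority class
    d2 = ((X_min[:, None, :] - X_min[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbour
    neighbours = np.argsort(d2, axis=1)[:, :k]
    base = rng.integers(0, n, size=n_new)
    picked = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))  # interpolation fraction in [0, 1)
    return X_min[base] + gap * (X_min[picked] - X_min[base])


def time_decay_weights(age_days, half_life_days=30.0):
    """Hypothetical helper: exponential decay, assumed form.
    A sample that is half_life_days old gets weight 0.5."""
    age_days = np.asarray(age_days, dtype=float)
    return 0.5 ** (age_days / half_life_days)


def pca_reconstruction_error(X_train, X_test, n_components=1):
    """Hypothetical helper: score each test row by its squared distance
    from the subspace spanned by the top principal components of
    X_train. Larger scores suggest outliers."""
    X_train = np.asarray(X_train, dtype=float)
    X_test = np.asarray(X_test, dtype=float)
    mean = X_train.mean(axis=0)
    # Rows of vt are principal directions, ordered by explained variance
    _, _, vt = np.linalg.svd(X_train - mean, full_matrices=False)
    components = vt[:n_components]
    centred = X_test - mean
    reconstructed = centred @ components.T @ components
    return ((centred - reconstructed) ** 2).sum(axis=1)
```

    In a typical workflow, the synthetic rows would be appended to the training set only (never to validation or test data), the decay weights would be passed as per-sample weights to a model's fitting routine, and the reconstruction errors would be thresholded to flag candidate outliers.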

    Brendan Herger

    Capital One

    Brendan Herger is a data scientist at Capital One working on understanding how to leverage its data to empower its customers.