Presented By O’Reilly and Cloudera

Make Data Work

September 11, 2018: Training & Tutorials
September 12–13, 2018: Keynotes & Sessions
New York, NY

How to be fair: A tutorial for beginners

Aileen Nielsen (Skillman Consulting)

1:30pm–5:00pm Tuesday, 09/11/2018

Data science and machine learning
Location: 1E 11
Level: Intermediate

Secondary topics: Ethics and Privacy

Average rating: 4.00 (4 ratings)

Who is this presentation for?

Data scientists

Prerequisite knowledge

Basic knowledge of machine learning and either Python or R. The tutorial is conducted in both languages, but it is structured so that knowing only one of them is sufficient to work through most of the material.

Materials or downloads needed in advance

A laptop with Python and R installed
Please download relevant materials from the course GitHub repository (link TBD)

What you'll learn

Learn how to apply practical ethics lessons to your day-to-day workflows

Description

There is mounting evidence that the widespread deployment of machine learning and artificial intelligence in business and government applications is reproducing or even amplifying existing prejudices and social inequalities. Even when an organization or an individual software engineer seeks to maintain fairness and accuracy, it’s easy to unintentionally create software that exhibits discriminatory or privacy-violating behavior.

Aileen Nielsen demonstrates how to identify and avoid bias and other forms of unfairness in your analyses, and how to apply best practices when developing new software and machine learning products.

Outline:

Introduction and social relevance

Relevant news stories
A brief introduction to relevant legal concepts and their applicability to data analysis and model building

Data discovery

Examples of how “bad” or incomplete datasets can lead to discriminatory models
How to examine and balance your input data before feeding it into an analysis pipeline
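
As a rough illustration of what examining input data for balance can look like, the sketch below reports each group's share of the rows and its positive-label rate. It is not taken from the tutorial materials: the function name `group_summary`, the column names `group` and `hired`, and the toy records are all hypothetical.

```python
from collections import Counter


def group_summary(rows, group_key, label_key):
    """Return {group: (share_of_rows, positive_label_rate)} for a list of dicts."""
    counts = Counter(r[group_key] for r in rows)
    positives = Counter(r[group_key] for r in rows if r[label_key] == 1)
    total = len(rows)
    return {g: (counts[g] / total, positives[g] / counts[g]) for g in counts}


# Toy, made-up records; "group" and "hired" are hypothetical column names.
rows = [
    {"group": "a", "hired": 1},
    {"group": "a", "hired": 1},
    {"group": "a", "hired": 0},
    {"group": "b", "hired": 0},
]

summary = group_summary(rows, "group", "hired")
for g, (share, rate) in sorted(summary.items()):
    print(f"{g}: share={share:.2f}, positive_rate={rate:.2f}")
```

A large gap in either number between groups does not prove that a model trained on the data will be unfair, but it is a cheap signal that the dataset deserves a closer look before modeling.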

Data processing

Examples of how data processing has resulted in discriminatory models
How to examine your preprocessing pipeline to prevent discriminatory inputs
Examples of how data processing has resulted in privacy-violating models
How to examine your process for privacy leaks

Modeling

Examples of how choice of model can lead to discriminatory results
Examples of how models can be designed to be more or less vulnerable to discriminatory input data
How to test your model and examine its final parameters and fits for discriminatory behavior, across a variety of common model families

Auditing your model

Examples of how even models built by following the processes above may still exhibit discriminatory behavior
Auditing your model as a black box with existing Python tools
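
One simple form of black-box audit compares the rate at which a model selects members of each group, using only its predictions. The sketch below is a minimal standard-library illustration and not the tooling used in the tutorial: `selection_rates`, `disparate_impact_ratio`, the threshold model, and the records are all hypothetical.

```python
def selection_rates(predict, records, group_key):
    """Fraction of records in each group for which predict(record) == 1."""
    totals, selected = {}, {}
    for r in records:
        g = r[group_key]
        totals[g] = totals.get(g, 0) + 1
        selected[g] = selected.get(g, 0) + (1 if predict(r) == 1 else 0)
    return {g: selected[g] / totals[g] for g in totals}


def disparate_impact_ratio(rates):
    """Lowest group selection rate divided by the highest."""
    highest = max(rates.values())
    if highest == 0:
        raise ValueError("no group has any positive predictions")
    return min(rates.values()) / highest


# Hypothetical black-box model: the audit only ever calls it, never inspects it.
def model(record):
    return 1 if record["score"] >= 50 else 0


# Toy, made-up records.
records = [
    {"group": "a", "score": 60},
    {"group": "a", "score": 70},
    {"group": "a", "score": 40},
    {"group": "a", "score": 80},
    {"group": "b", "score": 30},
    {"group": "b", "score": 55},
    {"group": "b", "score": 45},
    {"group": "b", "score": 20},
]

rates = selection_rates(model, records, "group")
ratio = disparate_impact_ratio(rates)
print(rates, round(ratio, 3))
```

A ratio well below 1 means one group is selected much less often than another. In US employment contexts a ratio under 0.8 is often used as a rule-of-thumb warning sign (the "four-fifths rule"), though whether that threshold is appropriate depends on the application.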

Research frontiers

Updates on how computer scientists and sociologists are developing new methods to avoid discriminatory and privacy-violating models
A roundup of newly published papers that illustrate the breadth and current state of this active area of research

Aileen Nielsen

Skillman Consulting

Aileen Nielsen works at an early-stage NYC startup that has something to do with time series data and neural networks. She is the author of Practical Time Series Analysis (2019) and the upcoming book Practical Fairness (summer 2020). Previously, Aileen worked at corporate law firms, physics research labs, a variety of NYC tech startups, the mobile health platform One Drop, and on Hillary Clinton’s presidential campaign. Aileen is the chair of the NYC Bar’s Science and Law Committee and a fellow in law and tech at ETH Zurich. She is a frequent speaker at machine learning conferences on both technical and legal subjects.

Comments

shruti sinha | BUSINESS ANALYTICS CONSULTANT

09/30/2018 7:28am EDT

can you please post the slides? Thank you

Aileen Nielsen | SOFTWARE ENGINEER

09/11/2018 7:35am EDT

Here is the git repo: https://github.com/StrataFairnessTutorial/DemoCode

Alexander Pelivan | DATA ENGINEER

09/10/2018 7:06am EDT

Hi, can you please provide the link to the github repo?
