Presented By
O’Reilly + Cloudera
Make Data Work
29 April–2 May 2019
London, UK

Hands-on data science with Python (Day 2)

Robert Schroll (The Data Incubator)
Location: Capital Suite 1

Who is this presentation for?

  • You're a software engineer or programmer with a background in Python, and you want to develop a basic understanding of machine learning.
  • You're in a nontechnical role, and you want to communicate more effectively with the engineers and data scientists in your company about machine learning.

Level

Intermediate

Prerequisite knowledge

  • A working knowledge of Python
  • Familiarity with pandas (useful but not required)

What you'll learn

  • Understand machine learning and feature engineering basics
  • Explore anomaly detection and recommendation engines
  • Learn scikit-learn fundamentals
  • Create machine learning processes with scikit-learn
  • Evaluate machine learning applications to real-world problems

Description

Outline

Day 1: Anomaly detection

  • Data format and goal
  • Limitations of time series data
  • Detrending and seasonality
  • Windowing and local scores
  • Setting thresholds for classification
  • Online learning
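
As a rough illustration of how the Day 1 topics can fit together, the sketch below removes trend and seasonality by differencing at the seasonal lag, scores each point against a trailing window, and applies a threshold. It is not taken from the course material: the synthetic data, the function name `rolling_zscore_anomalies`, and the window and threshold values are all assumptions chosen for the example.

```python
import numpy as np
import pandas as pd


def rolling_zscore_anomalies(series, window=24, threshold=3.0):
    """Score each point against the `window` observations before it.

    The series is shifted by one step before the rolling statistics are
    computed, so a point never contributes to its own mean or spread.
    Returns the z-scores and a boolean mask of flagged points.
    """
    prior = series.shift(1)
    mean = prior.rolling(window).mean()
    std = prior.rolling(window).std()
    score = (series - mean) / std
    # Comparisons against NaN are False, so the warm-up period is never flagged.
    return score, score.abs() > threshold


# Synthetic hourly data (an assumption): linear trend + daily cycle + noise.
rng = np.random.default_rng(0)
t = np.arange(240)
values = 0.05 * t + 2.0 * np.sin(2 * np.pi * t / 24) + rng.normal(0, 0.3, t.size)
values[200] += 10.0  # injected anomaly
series = pd.Series(values)

# Differencing at the seasonal lag (24 hours) cancels the daily cycle and
# reduces the linear trend to a constant offset.
deseasoned = series.diff(24).dropna()

scores, flags = rolling_zscore_anomalies(deseasoned, window=48, threshold=4.0)
print(flags[flags].index.tolist())
```

Because the score uses only past observations, the same calculation could in principle be run point by point as data arrives. One caveat of this particular design: seasonal differencing can echo a spike one period later, so a second flag may appear 24 steps after the true anomaly.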

Day 2: Recommendation engine

  • Overview of data and its wrangling
  • Item-item correlations and finding similar items
  • User similarity and predicting user ratings
  • Collaborative filtering
  • Evaluating model performance
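
The Day 2 steps can be sketched in a similar spirit. The example below is a minimal item-based approach under stated assumptions, not the presenter's method: the ratings table is invented, `predict_rating` is a hypothetical helper, and ignoring negatively correlated items is one design choice among several.

```python
import pandas as pd

# Hypothetical ratings in long format: one row per (user, item, rating).
ratings = pd.DataFrame({
    "user": ["ann", "ann", "ann", "bob", "bob", "bob",
             "cat", "cat", "cat", "dan", "dan"],
    "item": ["A", "B", "C", "A", "B", "C", "A", "B", "C", "A", "C"],
    "rating": [5, 4, 1, 4, 5, 2, 1, 2, 5, 5, 1],
})

# Wrangle into a user x item matrix; items a user has not rated are NaN.
matrix = ratings.pivot_table(index="user", columns="item", values="rating")

# Item-item Pearson correlations, each computed over the users who rated
# both items in the pair.
item_sim = matrix.corr(min_periods=2)


def predict_rating(user, item):
    """Similarity-weighted average of the user's ratings of other items.

    Only positively correlated items contribute; if none exist, fall back
    to the item's mean rating across all users.
    """
    rated = matrix.loc[user].dropna().drop(item, errors="ignore")
    weights = item_sim.loc[item, rated.index].clip(lower=0)
    if weights.sum() == 0:
        return float(matrix[item].mean())
    return float((weights * rated).sum() / weights.sum())


print(item_sim.round(2))
print(predict_rating("dan", "B"))
```

In this toy data, item B correlates positively with A and negatively with C, so the prediction for a user who has rated A and C but not B is driven entirely by that user's rating of A. To evaluate such a model, a common approach is to hold out some known ratings, predict them, and compare the predictions with the true values using an error metric such as RMSE.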

Robert Schroll

The Data Incubator

Robert Schroll is a data scientist in residence at the Data Incubator. Previously, he held postdocs in Amherst, Massachusetts, and Santiago, Chile, where he realized that his favorite parts of his job were teaching and analyzing data. He made the switch to data science and has been at the Data Incubator since. Robert holds a PhD in physics from the University of Chicago.