Presented By
O’Reilly + Cloudera
Make Data Work
29 April–2 May 2019
London, UK

Hands-on data science with Python

Robert Schroll (The Data Incubator)
Monday, 29 April & Tuesday, 30 April, 9:00 - 17:00
Data Science, Machine Learning & AI
Location: Capital Suite 1
Secondary topics: Data preparation, data governance, and data lineage
Average rating: 4.75 (4 ratings)

Participants should plan to attend both days of this 2-day training course. To attend training courses, you must register for a Platinum or Training pass. This training course does not include access to tutorials on Tuesday.

Robert Schroll walks you through all the steps of developing a machine learning pipeline, from prototyping to production. You'll explore data cleaning, feature engineering, model building and evaluation, and deployment, then extend these models into two applications built on real-world datasets. All work will be done in Python.

What you'll learn, and how you can apply it

  • Understand machine learning and feature engineering basics
  • Explore anomaly detection and recommendation engines
  • Learn scikit-learn fundamentals
  • Create machine learning processes with scikit-learn
  • Evaluate applications of machine learning to real-world problems

This training is for you because...

  • You're a software engineer or programmer with a background in Python, and you want to develop a basic understanding of machine learning.
  • You're in a nontechnical role, and you want to more effectively communicate with the engineers and data scientists in your company about machine learning.

Prerequisites:

  • A working knowledge of Python
  • Familiarity with pandas (useful but not required)

Outline

Day 1: Anomaly detection

  • Data format and goal
  • Limitations of time series data
  • Detrending and seasonality
  • Windowing and local scores
  • Setting thresholds for classification
  • Online learning
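
To give a flavor of the Day 1 topics, here is a minimal sketch of windowing, local scores, and thresholding. It is not taken from the course materials. The function names, the sample readings, the window size, and the threshold of 3.0 are all illustrative assumptions, and it uses only the Python standard library rather than pandas or scikit-learn. Each point is scored against the mean and standard deviation of the preceding window, which acts as a crude local detrend and relies only on past data, so it could also run in an online setting.

```python
from statistics import mean, stdev


def rolling_zscores(values, window):
    """Score each point against the `window` points that precede it."""
    scores = []
    for i, x in enumerate(values):
        history = values[max(0, i - window):i]
        if len(history) < window:
            # Not enough history yet: treat the point as unremarkable.
            scores.append(0.0)
            continue
        mu = mean(history)
        sigma = stdev(history)
        # Guard against a flat window, where the z-score is undefined.
        scores.append((x - mu) / sigma if sigma > 0 else 0.0)
    return scores


def flag_anomalies(values, window=5, threshold=3.0):
    """Return the indices whose local score exceeds the threshold."""
    scores = rolling_zscores(values, window)
    return [i for i, s in enumerate(scores) if abs(s) > threshold]


# Hypothetical sensor readings with one obvious spike at index 7.
readings = [10.0, 10.2, 9.9, 10.1, 10.0, 10.3, 9.8, 25.0, 10.1, 10.0]
anomalies = flag_anomalies(readings, window=5, threshold=3.0)
print(anomalies)
```

Raising the threshold trades missed anomalies for fewer false alarms. Note also that once the spike enters the window it inflates the local standard deviation, so the points immediately after it are scored leniently.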

Day 2: Recommendation engine

  • Overview of data and its wrangling
  • Item-item correlations and finding similar items
  • User similarity and predicting user ratings
  • Collaborative filtering
  • Evaluating model performance
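
As an illustration of the Day 2 topics, the sketch below computes item-item similarity and uses it to predict a rating. It is not taken from the course materials. The ratings dictionary, the function names, and the choice of cosine similarity with a similarity-weighted average are assumptions made for the example, and it uses only the Python standard library.

```python
from math import sqrt

# Hypothetical user -> {item: rating} data on a 1 to 5 scale.
ratings = {
    "ann": {"A": 5, "B": 4, "C": 1},
    "bob": {"A": 4, "B": 5},
    "cat": {"A": 1, "C": 5},
    "dan": {"B": 4, "C": 2},
}


def item_vectors(ratings):
    """Invert the data into item -> {user: rating}."""
    items = {}
    for user, user_ratings in ratings.items():
        for item, rating in user_ratings.items():
            items.setdefault(item, {})[user] = rating
    return items


def cosine(u, v):
    """Cosine similarity between two sparse rating vectors."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[k] * v[k] for k in shared)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)


def most_similar(item, ratings):
    """Rank all other items by similarity to `item`, best first."""
    vectors = item_vectors(ratings)
    target = vectors[item]
    scores = {other: cosine(target, vec)
              for other, vec in vectors.items() if other != item}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


def predict_rating(user, item, ratings):
    """Similarity-weighted average of the user's ratings on other items."""
    vectors = item_vectors(ratings)
    weighted, total = 0.0, 0.0
    for other, rating in ratings[user].items():
        sim = cosine(vectors[item], vectors[other])
        weighted += sim * rating
        total += sim
    return weighted / total if total > 0 else None


similar_to_a = most_similar("A", ratings)
bob_on_c = predict_rating("bob", "C", ratings)
print(similar_to_a[0][0], bob_on_c)
```

Evaluating a model like this would typically mean holding out some known ratings, predicting them, and comparing the predictions with the true values.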

About your instructor

Robert Schroll is a data scientist in residence at the Data Incubator. Previously, he held postdocs in Amherst, Massachusetts, and Santiago, Chile, where he realized that his favorite parts of his job were teaching and analyzing data. He made the switch to data science and has been at the Data Incubator since. Robert holds a PhD in physics from the University of Chicago.

Conference registration

Get the Platinum pass or the Training pass to add this course to your package.