Presented By O’Reilly and Cloudera
Make Data Work
21–22 May 2018: Training
22–24 May 2018: Tutorials & Conference
London, UK

In-Person Training
Hands-on data science with Python

Zachary Glassman (The Data Incubator)
Monday, 21 May & Tuesday, 22 May, 9:00 - 17:00
Location: Capital Suite 7

Participants should plan to attend both days of this 2-day training course. Platinum and Training passes do not include access to tutorials on Tuesday.

Zachary Glassman offers a foundation in building intelligent business applications using machine learning, walking you through all the steps of developing a machine learning pipeline, from prototyping to production. You'll explore data cleaning, feature engineering, model building and evaluation, and deployment, and then extend these models into two applications using real-world datasets.

What you'll learn, and how you can apply it

  • Learn how to develop a machine learning pipeline

Prerequisites:

  • A working knowledge of Python
  • Familiarity with pandas (useful but not required)

Hardware and/or installation requirements:

  • Laptop with an internet connection
  • A modern web browser (preferably Firefox or Chrome)

All work will be done in Python.

Outline

Day 1: Anomaly detection

  • Data format and goal
  • Limitations of time series data
  • Detrending and seasonality
  • Windowing and local scores
  • Setting thresholds for classification
  • Online learning
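
As a rough illustration of the Day 1 topics (not the course material itself), the sketch below removes a daily seasonal pattern, scores each point against a trailing window, and flags points above a threshold. The synthetic hourly data, the 48-hour window, and the threshold of 4 are all assumptions chosen for the example. Online learning is not covered here.

```python
import numpy as np
import pandas as pd

# Synthetic hourly readings (an assumption for illustration):
# a daily cycle plus noise, with one injected spike to recover.
rng = np.random.default_rng(0)
hours = np.arange(24 * 14)
values = 10 + 3 * np.sin(2 * np.pi * hours / 24) + rng.normal(0, 0.5, hours.size)
values[200] += 8  # injected anomaly
series = pd.Series(values, index=hours)

# Remove seasonality: subtract the mean reading for each hour of the day.
hour_of_day = hours % 24
residual = series - series.groupby(hour_of_day).transform("mean")

# Local score: z-score against the preceding `window` points.
# shift(1) keeps a point out of its own baseline.
window = 48  # hypothetical choice
baseline = residual.shift(1).rolling(window)
score = (residual - baseline.mean()) / baseline.std()

# Classify: flag points whose absolute score exceeds the threshold.
threshold = 4.0  # hypothetical choice
anomalies = score[score.abs() > threshold]
print(anomalies)
```

The first 48 scores are undefined because the trailing window is not yet full, so those points are never flagged. In practice the window and threshold would be tuned against labelled examples or an acceptable false-alarm rate.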

Day 2: Recommendation engine

  • Overview of data and its wrangling
  • Item-item correlations and finding similar items
  • User similarity and predicting user ratings
  • Collaborative filtering
  • Evaluating model performance
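
The Day 2 topics can be sketched in a similar way. This is a minimal item-based example under stated assumptions, not the course's implementation: the ratings table, the helper names `similar_items` and `predict_rating`, and the choice to ignore negatively correlated items are all hypothetical. It uses item similarity only and does not cover user similarity or model evaluation.

```python
import numpy as np
import pandas as pd

# Hypothetical ratings table (rows are users, columns are items).
# NaN means the user has not rated the item.
ratings = pd.DataFrame(
    {
        "item_a": [5, 4, 1, 2, np.nan],
        "item_b": [4, 5, 2, 1, 5],
        "item_c": [1, 2, 5, 4, 1],
        "item_d": [2, 1, 4, 5, np.nan],
    },
    index=["u1", "u2", "u3", "u4", "u5"],
)

# Item-item Pearson correlations over users who rated both items.
item_sim = ratings.corr(method="pearson", min_periods=3)


def similar_items(item, n=2):
    """Return the n items most positively correlated with `item`."""
    return item_sim[item].drop(item).sort_values(ascending=False).head(n)


def predict_rating(user, item):
    """Similarity-weighted average of the user's ratings on correlated items."""
    rated = ratings.loc[user].dropna()
    weights = item_sim.loc[item, rated.index].drop(item, errors="ignore")
    weights = weights.clip(lower=0).dropna()  # ignore negative correlations
    if weights.sum() == 0:
        return float(rated.mean())  # fall back to the user's mean rating
    return float((weights * rated[weights.index]).sum() / weights.sum())


print(similar_items("item_a", n=1))
print(predict_rating("u5", "item_a"))
```

Here user `u5` has not rated `item_a`, so the prediction is driven by their rating of the one positively correlated item they have rated.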

About your instructor


Zachary Glassman is a data scientist in residence at the Data Incubator. Zachary has a passion for building data tools and teaching others to use Python. He studied physics and mathematics as an undergraduate at Pomona College and holds a master’s degree in atomic physics from the University of Maryland.

Conference registration

Get the Platinum pass or the Training pass to add this course to your package.

Comments on this page are now closed.

Comments

Zachary Glassman | DATA SCIENTIST IN RESIDENCE
10/05/2018 16:57 BST

Hi,
This course will be focused on two applications, one an anomaly detection system for time series data and the other a recommendation system. Along the way, we will learn some of the fundamentals of data wrangling and machine learning in Python. This will be a technical course (we will write code!).

Zach

Puck van Gerwen
10/05/2018 12:13 BST

Hi – I’m interested in this course. I’ve just started using machine learning through Python for chemoinformatics applications. Would this course be suitable or is it more focussed on business applications?

Zachary Glassman | DATA SCIENTIST IN RESIDENCE
2/05/2018 21:57 BST

Hi Abderrahim,
To prepare ahead of time, please take some time to look over the pandas package for Python. I think you will be able to catch up pretty quickly in the afternoon session.

Zach

Abdé Essaidi | SENIOR DATA CONSULTANT
2/05/2018 9:49 BST

Dear Zachary,

Unfortunately I won’t be able to attend class on Monday morning due to professional constraints. I will be joining the class for the afternoon session. Is it possible to catch up on the missed lessons before the course?

Thank you!

Zachary Glassman | DATA SCIENTIST IN RESIDENCE
28/03/2018 22:21 BST

Hi Daniele,
We will be using scikit-learn for ML and pandas for data manipulation. Aside from that we will be using some other libraries like matplotlib, NumPy, and SciPy.

Daniele Bonacorsi | RESEARCHER
28/03/2018 15:04 BST

Dear Zachary,

Apart from Python and pandas, which are explicitly mentioned, could you please elaborate a bit more on the sentence “All work will be done in Python”? That is, which tools are you going to use throughout the training (e.g., which ML frameworks)?

Thanks!