
Presented By
O’Reilly + Cloudera

Make Data Work

March 25-28, 2019
San Francisco, CA

Hands-on data science with Python

Don Fox (The Data Incubator)

Monday, March 25 & Tuesday, March 26, 9:00am - 5:00pm

Data Science, Machine Learning & AI
Location: 2016

Average rating: 4.75 (12 ratings)

Participants should plan to attend both days of this two-day training course. To attend training courses, you must register for a Platinum or Training pass. Training registration does not include access to tutorials on Tuesday.

Don Fox walks you through developing a machine learning pipeline, from prototyping to production. You'll learn about data cleaning, feature engineering, model building and evaluation, and deployment and then extend these models into two applications from real-world datasets. All work will be done in Python.

What you'll learn, and how you can apply it

Understand the basics of machine learning, feature engineering, anomaly detection, and recommendation engines
Explore scikit-learn fundamentals
Create machine learning processes with scikit-learn
Evaluate and apply machine learning to real-world problems

This training is for you because...

You're a software engineer or programmer with a background in Python, and you want to develop a basic understanding of machine learning.
You're in a nontechnical role, and you want to more effectively communicate about machine learning with the engineers and data scientists in your company.

Prerequisites:

A working knowledge of Python
Familiarity with pandas (useful but not required)

Outline

Day 1: Anomaly detection

Data format and goal
Limitations of time series data
Detrending and seasonality
Windowing and local scores
Setting thresholds for classification
Online learning
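
The windowing, local score, and threshold topics above can be illustrated with a minimal sketch. This is not the course's own material: the function names, the sample readings, the window of 5 points, and the threshold of 3 standard deviations are all assumptions chosen for illustration, and the sketch uses only the Python standard library rather than pandas or scikit-learn so that it stays self-contained. Each point is scored against the window of points that precede it, and points whose score exceeds the threshold are flagged.

```python
from statistics import mean, stdev


def rolling_zscores(values, window):
    """Score each point against the `window` points that precede it."""
    scores = []
    for i, x in enumerate(values):
        history = values[max(0, i - window):i]
        if len(history) < window:
            # Not enough history yet, so assign a neutral score.
            scores.append(0.0)
            continue
        mu = mean(history)
        sigma = stdev(history)
        # Guard against a perfectly flat window (zero spread).
        scores.append((x - mu) / sigma if sigma > 0 else 0.0)
    return scores


def flag_anomalies(values, window=5, threshold=3.0):
    """Return the indices whose local score exceeds the threshold."""
    scores = rolling_zscores(values, window)
    return [i for i, z in enumerate(scores) if abs(z) > threshold]


# Hypothetical sensor readings with one obvious spike at index 7.
readings = [10.0, 10.2, 9.9, 10.1, 10.0, 10.3, 9.8, 25.0, 10.1, 9.9]
anomalies = flag_anomalies(readings)
print(anomalies)
```

Because each score depends only on earlier points, the same logic could be applied to data arriving one point at a time. Note that a spike inflates the spread of the windows that follow it, which can mask nearby anomalies. Choosing the window and threshold involves a trade-off between sensitivity and false alarms.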

Day 2: Recommendation engine

Overview of data and its wrangling
Item-item correlations and finding similar items
User similarity and predicting user ratings
Collaborative filtering
Evaluating model performance
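
As a rough illustration of finding similar items, the sketch below computes cosine similarity between items from the ratings of users who rated both. This is an assumption-laden example, not the course's method: the ratings dictionary, the user and item names, and the choice of cosine similarity (rather than a correlation measure) are all hypothetical, and only the Python standard library is used.

```python
from math import sqrt

# Hypothetical ratings data: user -> {item: rating on a 1-5 scale}.
ratings = {
    "ann": {"A": 5, "B": 4, "C": 1},
    "bob": {"A": 4, "B": 5, "C": 2},
    "cat": {"A": 1, "B": 2, "C": 5},
    "dan": {"A": 5, "B": 4},
}


def item_vector(item):
    """Collect every user's rating for one item."""
    return {user: r[item] for user, r in ratings.items() if item in r}


def cosine(item_a, item_b):
    """Cosine similarity over the users who rated both items."""
    va, vb = item_vector(item_a), item_vector(item_b)
    shared = va.keys() & vb.keys()
    if not shared:
        return 0.0
    dot = sum(va[u] * vb[u] for u in shared)
    norm_a = sqrt(sum(va[u] ** 2 for u in shared))
    norm_b = sqrt(sum(vb[u] ** 2 for u in shared))
    return dot / (norm_a * norm_b)


def most_similar(item):
    """Rank all other items by similarity to `item`, highest first."""
    items = {i for r in ratings.values() for i in r}
    scored = [(other, cosine(item, other)) for other in items if other != item]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = most_similar("A")
print(ranking)
```

With raw, all-positive ratings, cosine similarity tends to be high for every pair, so in practice ratings are often mean-centered per user or per item before comparing. That step is omitted here to keep the sketch short.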

About your instructor

Don Fox is a Boston-based data scientist in residence at the Data Incubator. Previously, Don developed numerical models for a geothermal energy startup. Born and raised in South Texas, Don holds a PhD in chemical engineering; his doctoral research focused on renewable energy systems and on computational tools for analyzing their performance.

Conference registration

Get the Platinum pass or the Training pass to add this course to your package.

Comments

Don Fox | DATA SCIENTIST IN RESIDENCE

03/18/2019 12:19am PDT

Hi Travis Craven,

No need to install anything beforehand. Every participant will get access to a remote account where we host our training curriculum and environment. On these accounts, we will be using Jupyter Notebook; you may want to install it on your personal machine if you want to get familiar with it.

Travis Craven | BI TECH LEAD

03/16/2019 9:26pm PDT

Is there a preferred IDE to install before the training?

Don Fox | DATA SCIENTIST IN RESIDENCE

03/09/2019 12:15am PST

Hi Jagdish,

1. Make sure you are comfortable with Python. I may spend 30 to 60 minutes at the beginning going over some elementary Python.

2. Regarding data sets, the more familiar you are with pandas, the better. It is the go-to package for data wrangling and manipulation. Take a look at https://pandas.pydata.org/pandas-docs/stable/getting_started/10min.html to get started with pandas.

Jagdish Gajula | ENGINEER

03/06/2019 5:53am PST

Hi Don, we are planning to do some prep work before the training. Can you suggest any documentation, or a data set to work on, to prepare for the training?
Thanks,
Jagdish


©2019, O'Reilly Media, Inc. • (800) 889-8969 or (707) 827-7019 • Monday-Friday 7:30am-5pm PT • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners. • confreg@oreilly.com