Presented By O'Reilly and Cloudera
Make Data Work
5–7 May, 2015 • London, UK

Introduction to machine learning with IPython and scikit-learn

Olivier Grisel (Inria & scikit-learn)
9:00–12:30 Tuesday, 5/05/2015
Data Science
Location: St. James / Regents

Scikit-learn is a versatile machine learning library for Python that blends well with the NumPy and SciPy ecosystems, and is used by a growing user base of academic researchers as well as data scientists and engineers in the tech industry.

IPython with its notebook interface is an interactive programming environment that is particularly well suited for data exploration, modeling, and sharing of analysis results, notably via nbviewer.ipython.org.

The objective of this tutorial is to introduce both machine learning concepts in general and the PyData ecosystem in particular.

The session will cover the following topics:

  • How to extract a machine learning-friendly representation of raw data (feature extraction)
  • How to train various machine learning models such as logistic regression, support vector machines, and randomized ensembles of decision trees
  • How to evaluate the predictive accuracy of a model and detect overfitting
  • How to automatically tune the model parameters from data
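
As a rough illustration of how these topics fit together, here is a minimal sketch, not the tutorial's own material. It assumes a recent scikit-learn release (the `sklearn.model_selection` module postdates this session) and uses the built-in digits dataset, which is already numeric, so the feature-extraction step is reduced to simple scaling. The choice of logistic regression and the candidate values for `C` are arbitrary assumptions made for the example.

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Load a small built-in dataset of 8x8 digit images, flattened to 64 features.
X, y = load_digits(return_X_y=True)

# Hold out a test set so the model is evaluated on data it has never seen.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Preprocessing and model are chained so that scaling is fit on training folds only.
pipeline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Candidate regularization strengths (illustrative values, not a recommendation).
param_grid = {"logisticregression__C": [0.01, 0.1, 1.0]}

# Cross-validated grid search picks the value of C from the training data.
search = GridSearchCV(pipeline, param_grid, cv=3)
search.fit(X_train, y_train)

# A large gap between training and test accuracy is a sign of overfitting.
train_accuracy = search.score(X_train, y_train)
test_accuracy = search.score(X_test, y_test)

print("best parameters:", search.best_params_)
print("train accuracy: %.3f" % train_accuracy)
print("test accuracy:  %.3f" % test_accuracy)
```

Swapping `LogisticRegression` for another estimator, such as a support vector machine or a random forest, only requires changing the pipeline step and the parameter grid.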

Olivier Grisel

Inria & scikit-learn

Olivier Grisel is a software engineer in the Parietal team at Inria. He works to improve the speed and scalability of the scikit-learn machine learning library for the Python / NumPy / SciPy ecosystems. He also likes to share interesting machine learning papers and tricks on Twitter.