Make Data Work
Oct 15–17, 2014 • New York, NY

PyData at Strata

Fernando Pérez (University of California at Berkeley), Brian Granger (Cal Poly San Luis Obispo), Andy Terrel (Fashion Metric), Peter Wang (Continuum Analytics, Inc.), Jake Vanderplas (eScience Institute, University of Washington), Olivier Grisel (Inria & scikit-learn), Travis Oliphant (Continuum Analytics, Inc.), William McKinney (Cloudera), Trent Nelson (Continuum Analytics), Kayur Patel (Google), Kester Tong (Google)
9:00am–5:00pm Wednesday, 10/15/2014
Data Science
Location: 1 E12/1 E13
Average rating: 4.43 (14 ratings)

Python has become an increasingly important part of the data engineering and analytics tool landscape. PyData at Strata provides in-depth coverage of the tools and techniques gaining traction with the data audience, including IPython Notebook, NumPy/matplotlib for visualization, SciPy, scikit-learn, and how to scale Python performance, including how to handle large, distributed data sets. Come see how the leading lights in the Python data community are making Python ever more useful to data analysts and data engineers.

SCHEDULE:

9:00am – 10:30am: IPython – Fernando Pérez (University of California at Berkeley) and Brian Granger (Cal Poly San Luis Obispo)
10:10am – 10:30am: Collaborative Data Science with coLaboratory, Kayur Patel (Google) and Kester Tong (Google)

10:30am – 11:00am: BREAK

11:00am – 12:30pm: Python for Distributed Analytics and Visualization, Andy Terrel (Continuum Analytics) and Peter Wang (Continuum Analytics, Inc.)

12:30pm – 1:30pm: LUNCH

1:30pm – 3:00pm: Room 1 E 12 – Intro to NumPy and matplotlib, Jake Vanderplas (eScience Institute, University of Washington)
1:30pm – 3:00pm: Room 1 E 13 – Intro to scikit-learn + pandas for Predictive Modeling, Olivier Grisel (Inria & scikit-learn)

3:00pm – 3:30pm: BREAK

3:30pm – 5:00pm: Room 1 E 12 – SciPy – An Exploration of the Most Useful Bits, Travis Oliphant (Continuum Analytics, Inc.)
3:30pm – 4:15pm: Room 1 E 13 – New and Upcoming Features in Pandas, Wes McKinney (Cloudera)
4:20pm – 5:00pm: Room 1 E 13 – High Performance Python, Trent Nelson (Continuum Analytics)

Fernando Pérez

University of California at Berkeley

Fernando Pérez is a research scientist at UC Berkeley, working at the intersection of brain imaging and open tools for scientific computing. He created IPython while a PhD student in Physics at the University of Colorado in Boulder. Today, with all the hard work done by a talented team, he continues to lead IPython’s development as the interface between the humans at the keyboard and the bits in the machine.

He is a founding member of NumFOCUS and a PSF member, and he received the 2012 Award for the Advancement of Free Software for IPython and his contributions to scientific Python.

Brian Granger

Cal Poly San Luis Obispo

Brian Granger is an Assistant Professor of Physics at Cal Poly State University in San Luis Obispo, CA. He has a background in theoretical atomic, molecular, and optical physics, with a Ph.D. from the University of Colorado. His current research interests include quantum computing, parallel and distributed computing, and interactive computing environments for scientific and technical computing. He is a core developer of the IPython project and an active contributor to a number of other open source projects focused on scientific computing in Python. He is @ellisonbg on Twitter and GitHub.

Andy Terrel

Fashion Metric

Data architect, computational scientist, and technical leader. Andy is the CTO of Fashion Metric, where he is bringing his experience building smart, scalable data systems to the fashion industry. You will also find him leading the board of the NumFOCUS foundation. As a passionate advocate for open source scientific codes, Andy has been involved in the wider scientific Python community since 2006, contributing to numerous projects in the scientific stack.

Peter Wang

Continuum Analytics, Inc.

Peter has a B.A. in Physics from Cornell University, and has been developing commercial scientific computing and visualization software for over 15 years. He has software design and development experience across a broad variety of areas, including 3D graphics, geophysics, financial risk modeling, large data simulation and visualization, and medical imaging.
Peter’s interests in the fundamentals of vector computing and interactive, large-scale visualization led him to co-found Continuum Analytics. As CTO, Peter is the technology visionary and leads the product engineering team for the Anaconda platform as well as open source projects including Bokeh and Blaze. As a creator of the PyData conference, he also devotes time and energy to growing the Python data community by advocating, teaching, and speaking about Python at conferences worldwide.

Jake Vanderplas

eScience Institute, University of Washington

Jake Vanderplas is the director of research in the physical sciences at the University of Washington’s eScience Institute, where his research is primarily in the area of data-driven astronomy and astrophysics. In addition, Jake is a maintainer and/or frequent contributor to many open source Python projects, including scikit-learn, scipy, mpld3, and others. He occasionally blogs about Python, machine learning, data visualization, open science, and related topics at Jakevdp.github.io.

Olivier Grisel

Inria & scikit-learn

Olivier Grisel is a software engineer at Inria Saclay, France, where he works on scikit-learn, an open source project for machine learning in Python. Olivier also contributes occasional bug fixes to upstream projects in the NumPy/SciPy ecosystem.

Travis Oliphant

Continuum Analytics, Inc.

Travis Oliphant has a Ph.D. from the Mayo Clinic and B.S. and M.S. degrees in Mathematics and Electrical Engineering from Brigham Young University. Since 1997, he has worked extensively with Python for numerical and scientific programming, most notably as the primary developer of the NumPy package, and as a founding contributor of the SciPy package. He is also the author of the definitive Guide to NumPy.

Travis was an assistant professor of Electrical and Computer Engineering at BYU from 2001 to 2007, where he taught courses in probability theory, electromagnetics, inverse problems, and signal processing. He also served as Director of the Biomedical Imaging Lab, where he researched satellite remote sensing, MRI, ultrasound, elastography, and scanning impedance imaging.

From 2007 to 2011, Travis was the president of Enthought, Inc. During his tenure there, the company grew from 15 to 50 employees, and Travis worked with well-known Fortune 50 companies in finance, oil and gas, and consumer products. He was involved in all aspects of the contractual relationship, including consulting, training, code architecture, and development.

As CEO of Continuum Analytics, Travis engages customers in finance, consumer products, and oil and gas, develops business strategy, and helps guide the technical direction of the company. He actively contributes to software development and engages with the wider open source community in the Python ecosystem by serving as a director of the Python Software Foundation and as a past director of NumFOCUS.

William McKinney

Cloudera

Data systems @ Cloudera. Formerly founder/CEO of DataPad (http://www.datapad.io). Author of “Python for Data Analysis” from O’Reilly Media. Created pandas project.

Trent Nelson

Continuum Analytics

Kayur Patel

Google

Kayur Patel makes data science tools easier to use and studies how people apply machine learning to solve problems and build software. Kayur received his PhD in Computer Science and Engineering from the University of Washington. His graduate work was funded by grants from the NSF and Google as well as the NDSEG and Microsoft Research fellowships. He is currently working at Google and recently taught the Introduction to Data Science course at Columbia.

Kester Tong

Google

I am a software engineer at Google Research. I work on machine learning algorithms and infrastructure, and on a product for collaborative data analysis, coLaboratory.

Comments on this page are now closed.

Comments

Andy Terrel
10/20/2014 4:08pm EDT

@Muni, Sorry this took so long:

https://speakerdeck.com/aterrel/visualization-with-blaze-and-bokeh
https://speakerdeck.com/aterrel/visualization-with-bokeh-pydata-at-stratanyc2014

Muni Xu
10/15/2014 2:24pm EDT

@Andy

Thanks a lot Andy! BTW, very enjoyable talk you gave in the morning!

Andy Terrel
10/15/2014 1:57pm EDT

@Muni

I’ll try to get them posted here later tonight.

— Andy

Muni Xu
10/15/2014 12:37pm EDT

Hi Andy,

Where can I get access to all the slides you guys showed this morning?

Thanks,
Muni

Andy Terrel
10/15/2014 9:46am EDT

To get the blaze and bokeh codes working:

conda install blaze bokeh

Olivier Grisel
10/15/2014 9:41am EDT

People who plan to attend the scikit-learn tutorial this afternoon, please check that you have a recent Anaconda installation up and running on your laptop:

http://continuum.io/downloads

Check that you have scikit-learn and pandas there:

conda install scikit-learn pandas

You can fetch the notebooks for this session from:

https://github.com/ogrisel/parallel_ml_tutorial

If you don’t have git installed on your laptop, you can use the “Download ZIP” button on that page.

Note: we will only cover a subset of this tutorial and there is no need to download any data beyond what is already included in the repository.
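As a quick sanity check before the session, the setup described above could be verified with a short script. This is only a minimal sketch and not part of the tutorial materials: the helper name `check_modules` is hypothetical, and it assumes a Python 3 interpreter. Note that scikit-learn is imported under the module name `sklearn`.

```python
import importlib
import importlib.util


def check_modules(names):
    """Map each top-level module name to its version string.

    The value is None when the module is not installed, and "unknown"
    when it is installed but exposes no __version__ attribute.
    (Hypothetical helper for illustration; top-level names only.)
    """
    report = {}
    for name in names:
        # find_spec returns None when a top-level module cannot be located.
        if importlib.util.find_spec(name) is None:
            report[name] = None
            continue
        module = importlib.import_module(name)
        report[name] = getattr(module, "__version__", "unknown")
    return report


if __name__ == "__main__":
    # scikit-learn installs under the import name "sklearn".
    for name, version in check_modules(["sklearn", "pandas"]).items():
        status = version if version is not None else "MISSING (try conda install)"
        print(f"{name}: {status}")
```

If either package is reported as missing, rerunning the `conda install` command above should fix it.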

Sophia DeMartini
10/15/2014 6:43am EDT

Hi David,

I’ve gone in and edited your registration so that you’re now registered for the PyData all-day tutorial.

Thank you,
Sophia

David Sedgwick
10/15/2014 4:45am EDT

Hello, I would like to register for this tutorial. I’m currently registered for AM and PM tutorials but this all day session appears more interesting. How do I change registration?

Sophia DeMartini
10/08/2014 5:51pm EDT

Hi Jason,

I just double checked your registration, and you’re already signed up to attend PyData Day. If I can help with anything else, please email speakersatoreilly.com.

Thanks,
Sophia

Jason Gilbertson
10/08/2014 5:22pm EDT

Would love to attend (already have the 3-day pass) but seem to be missing how to register for this event. All I see in the top right is ‘Registration option’ with a star but nothing happens.

Fernando Pérez
10/08/2014 3:06am EDT

@Luciano, for the IPython tutorial, you can find teaching materials on github: https://github.com/ipython/ipython-in-depth.

You can find the installation instructions for IPython itself, in case you don’t have it, here:

http://ipython.org/install.html

Luciano Tozato dos Reis
10/07/2014 4:30pm EDT

Are there materials or downloads that we need to prepare in advance for the PyData tutorial?

Sophia DeMartini
10/01/2014 8:50pm EDT

Hi Dylan,

You should have some basic knowledge of Python. The tutorial presentations will be at an introductory level and will be coding oriented. This will be a great opportunity to explore the use of Python in data analytics whether you are at a novice or intermediate level.

Thanks,
Sophia

Dylan Patterson
10/01/2014 11:13am EDT

I’m considering joining this seminar for the conference. How much proficiency in python is expected in order to be able to benefit from this course? Is this a coding oriented class, or is it a high level discussion?