Presented By O'Reilly and Cloudera
Make Data Work
Sept 29–Oct 1, 2015 • New York, NY

PyData at Strata

Travis Oliphant (Anaconda), Peter Wang (Anaconda), Kyle Kelley (Netflix), Andrew Odewahn (O'Reilly Media), Paige Bailey (Microsoft), Jeff Reback (Continuum Analytics), Andy Terrel (NumFOCUS), Bryan Van de Ven (Continuum Analytics), Sarah Bird (Aptivate), James Powell (NumFOCUS), Phil Cloud (Continuum), Jason Grout (Bloomberg LP), Chris Colbert (Anaconda Powered by Continuum Analytics), Owen Zhang (DataRobot), Peter Prettenhofer (DataRobot), Damon McDougall (UT Austin), Michael Droettboom (Space Telescope Science Institute), Jim Crist (Continuum Analytics), Benjamin Zaitlen (Anaconda), Andreas Mueller (NYU, scikit-learn)
9:00am–5:00pm Tuesday, 09/29/2015
Data Science & Advanced Analytics
Location: 1 E12 / 1 E13


Python has become an increasingly important part of the data engineering and analytics tool landscape. PyData at Strata provides in-depth coverage of the tools and techniques gaining traction with the data audience, including IPython Notebook, NumPy/matplotlib for visualization, SciPy, scikit-learn, and how to scale Python performance, including how to handle large, distributed data sets. Come see how the leading lights in the Python data community are making Python ever more useful to data analysts and data engineers.


9:00am – 9:45am

  • How to Build a Company on Open Source
    Travis Oliphant & Peter Wang

9:45am – 10:30am

  • How to Build Publishing & On-Demand Learning Environments with IPython
    Kyle Kelley & Andrew Odewahn

10:30am – 11:00am

11:00am – 12:30pm

Track 1 (room 1 E12):

  • How to Use Pandas for Data Analysis
    Jeff Reback

Track 2 (room 1 E13):

  • How to Create Beautiful Visualizations with Bokeh
    Sarah Bird & Bryan Van de Ven

12:30pm – 1:30pm

1:30pm – 2:15pm

Track 1 (room 1 E12):

  • How to Build Big Data Workflows
    Andy Terrel & Ben Zaitlen

Track 2 (room 1 E13):

  • How to Solve Problems in Geophysics with Python
    Paige Bailey

2:15pm – 3:00pm

Track 1 (room 1 E12):

  • How to Leverage the Blaze Ecosystem
    Jim Crist & Phil Cloud

Track 2 (room 1 E13):

  • How to Think About Python
    James Powell

3:00pm – 3:30pm

3:30pm – 4:15pm

Track 1 (room 1 E12):

  • How to Use Scikit-Learn for Machine Learning
    Andreas Müller

Track 2 (room 1 E13):

  • How to Use Python for Predictive Modeling
    Owen Zhang & Peter Prettenhofer

4:15pm – 5:00pm

Track 1 (room 1 E12):

  • Introduction to Publication Quality Plotting with Matplotlib
    Damon McDougall & Michael Droettboom

Track 2 (room 1 E13):

  • Interactive Computing in the Jupyter Notebook – Present and Future
    Jason Grout & Chris Colbert

Travis Oliphant


Travis Oliphant has a Ph.D. from the Mayo Clinic and B.S. and M.S. degrees in Mathematics and Electrical Engineering from Brigham Young University. Since 1997, he has worked extensively with Python for numerical and scientific programming, most notably as the primary developer of the NumPy package, and as a founding contributor of the SciPy package. He is also the author of the definitive Guide to NumPy.

Travis was an assistant professor of Electrical and Computer Engineering at BYU from 2001-2007, where he taught courses in probability theory, electromagnetics, inverse problems, and signal processing. He also served as Director of the Biomedical Imaging Lab, where he researched satellite remote sensing, MRI, ultrasound, elastography, and scanning impedance imaging.

From 2007-2011, Travis was the president at Enthought, Inc. During his tenure there, the company grew from 15 to 50 employees, and Travis worked with well-known Fortune 50 companies in finance, oil-and-gas, and consumer-products. He was involved in all aspects of the contractual relationship, including consulting, training, code-architecture, and development.

As CEO of Continuum Analytics, Travis engages customers in finance, consumer products, and oil and gas, develops business strategy, and helps guide the technical direction of the company. He actively contributes to software development and engages with the wider open source community in the Python ecosystem by serving as a director of the Python Software Foundation and past director of NumFOCUS.

Peter Wang


Peter Wang is the cofounder and CTO of Anaconda, where he leads the product engineering team for the Anaconda platform and open source projects including Bokeh and Blaze. Peter’s been developing commercial scientific computing and visualization software for over 15 years and has software design and development experience across a broad variety of areas, including 3-D graphics, geophysics, financial risk modeling, large data simulation and visualization, and medical imaging. As a creator of the PyData conference, he also devotes time and energy to growing the Python data community by advocating, teaching, and speaking about Python at conferences worldwide. Peter holds a BA in physics from Cornell University.

Kyle Kelley


Kyle Kelley is a senior software engineer at Netflix and a maintainer and core developer of the IPython/Jupyter project. He wants to help build great environments for collaborative analysis, development, and production workloads for everyone, from small teams to massive scale.

Andrew Odewahn

O'Reilly Media

Andrew Odewahn is the CTO of O’Reilly Media, where he helps define and create the new products, services, and business models that will help O’Reilly continue to make the transition to an increasingly digital future. The author of two books on database development, he has experience as a software developer and consultant in a number of industries, including manufacturing, pharmaceuticals, and publishing. Andrew holds an MBA from New York University and a degree in computer science from the University of Alabama. He’s also thru-hiked the Appalachian Trail from Georgia to Maine.

Paige Bailey


Paige Bailey is a senior cloud developer advocate at Microsoft specializing in machine learning and artificial intelligence. Previously, Paige was a data scientist and machine learning engineer in the energy industry (drilling and completions optimization, subsurface characterization). Paige has over a decade of experience doing data analysis with Python and five years of building predictive models with R. She serves on the core committee for JupyterCon and SciPy, is a Python instructor for EdX, founded PyLadies-HTX in Houston, and is currently writing both an introductory children’s book on machine learning and a technical cookbook for machine learning at scale with tools like Apache Spark.

Jeff Reback

Continuum Analytics

Jeff Reback is a senior software developer for Continuum Analytics. As a former quant, he has extensive experience building financial trading systems, using Python, and working with very large data. Jeff has been a core committer to the pandas project for the past few years and currently manages the project.

Andy Terrel


Data architect, computational scientist, and technical leader. Andy is the CTO of Fashion Metric, where he is bringing his experience building smart, scalable data systems to the fashion industry. You will also find him leading the board of the NumFOCUS foundation. As a passionate advocate for open source scientific codes, Andy has been involved in the wider scientific Python community since 2006, contributing to numerous projects in the scientific stack.

Bryan Van de Ven

Continuum Analytics

Bryan Van de Ven is a software engineer at Continuum Analytics. Previously, Bryan worked at the Applied Research Labs, developing software for sonar feature detection and classification systems on US Naval submarine platforms, and Enthought, where he worked on problems in financial risk modeling and fluid mixing simulation. Bryan has also worked on an assortment of iOS projects as an independent consultant. Bryan is a core contributor of Bokeh and contributed to the Chaco visualization library. Bryan holds undergraduate degrees in computer science and mathematics from UT Austin and a master’s degree in physics from UCLA.

Sarah Bird


After a brief spell designing ejection seats for fighter jets, Sarah Bird’s career turned to applying technology to international development. She has worked in many sectors including mobile health and data collection in Pakistan, Peru, Haiti, and elsewhere. Having always dabbled in software in her spare time, in 2012 Sarah gave in and became a full-time software developer. She is now a full-stack web developer at Aptivate, a non-profit that builds IT solutions for the international development sector.

James Powell


James Powell is a NYC-based Python programmer and master trainer with experience in quantitative finance and data science. James is very active in the Python community in NYC, where he organizes NYC Python (the world’s largest and most active Python meetup group). He also works with the numeric and scientific computing nonprofit NumFOCUS to help organize the PyData conference series. James is a frequent speaker at Python conferences and has been invited to speak at events such as PyData New York, PyData London, PyGotham, the conference For Python Quants, and PyCon Spain.

Phil Cloud


Phillip Cloud is a software engineer at Continuum Analytics. He started doing open source work by contributing heavily to pandas. Now he works mostly on Blaze and its associated libraries, along with a bit of consulting. He enjoys building data-related tools that help people get their jobs done.

Jason Grout

Bloomberg LP

Jason Grout is a Jupyter developer at Bloomberg, working primarily on JupyterLab and the interactive Jupyter widgets library. He has also been a major contributor to the open source Sage mathematical software system and co-organizes the PyDataNYC Meetup. Previously, Jason was an assistant professor of mathematics at Drake University in Des Moines, Iowa. He holds a PhD in mathematics from Brigham Young University.

Chris Colbert

Anaconda Powered by Continuum Analytics

Chris is a software architect for Continuum Analytics, and is based in the New York City area. He has worked previously for top Wall St. firms and was the lead designer of the UI framework for a front office trading platform. He is the creator of the PhosphorJS and Nucleic projects which provide libraries for developing enterprise quality applications on the desktop and in the browser. He received his MS in Mechanical Engineering from the University of South Florida.

Owen Zhang


Owen Zhang is the chief product officer at DataRobot. Owen spent most of his career in the property and casualty insurance industry. Most recently Owen served as vice president of modeling of the newly formed AIG Science team.

After spending several years in IT building transactional systems for commercial insurance, Owen discovered his passion for machine learning and started building insurance underwriting, pricing, and claims models. Owen has a master’s degree in electrical engineering from the University of Toronto and a bachelor’s degree from the University of Science and Technology of China. Owen is currently ranked #1 on the Kaggle leaderboard out of a community of 200,000 data scientists.

Peter Prettenhofer


Peter Prettenhofer is a data scientist / software engineer at DataRobot. He studied computer science at Graz University of Technology, Austria and Bauhaus University Weimar, Germany, focusing on machine learning and natural language processing. He is a contributor to scikit-learn where he co-authored a number of modules such as Gradient Boosted Regression Trees, Stochastic Gradient Descent, and Decision Trees.

Damon McDougall

UT Austin

Damon McDougall did his PhD in Mathematics at the University of Warwick in the UK. His research focuses on Bayesian inverse problems, parameter estimation, learning, computational science, high-performance computing, and software engineering. Damon is a core developer of Matplotlib and contributes heavily to the open source community.

Michael Droettboom

Space Telescope Science Institute

Michael Droettboom is a main contributor to matplotlib, the premier plotting library in the scientific Python ecosystem. He is the creator of “airspeed velocity” for benchmarking Python projects over time, the author of Understanding JSON Schema, and a primary contributor to astropy.

Jim Crist

Continuum Analytics

Benjamin Zaitlen


Ben Zaitlen is the technical lead of the Anaconda Cluster product at Continuum Analytics. Ben received undergraduate degrees in mathematics and physics from UC Santa Cruz, and a Master’s degree in physics from Indiana University. Prior to joining Continuum, he worked at the Biocomplexity Institute developing and supporting a multi-scale modeling environment for developmental biology. Ben is also passionate about electronics and has developed a number of embedded and wearable hardware projects.

Andreas Mueller

NYU, scikit-learn

Andreas Mueller received his PhD in machine learning from the University of Bonn. After working as a machine learning researcher on computer vision applications at Amazon for a year, he recently joined the Center for Data Science at New York University. For the last four years, he has been a maintainer and one of the core contributors of scikit-learn, a machine learning toolkit widely used in industry and academia, and an author of and contributor to several other widely used machine learning packages. His mission is to create open tools to lower the barrier of entry for machine learning applications, promote reproducible science, and democratize access to high-quality machine learning algorithms.


Lawrence Hecht
09/22/2015 9:26am EDT

I haven’t even finished my first online Python course. Will I get anything out of this if I attend?