Python has become an increasingly important part of the data engineering and analytics tool landscape. PyData at Strata provides in-depth coverage of the tools and techniques gaining traction with the data audience, including IPython Notebook, NumPy/matplotlib for visualization, SciPy, and scikit-learn, as well as ways to scale Python performance and handle large, distributed data sets. Come see how the leading lights in the Python data community are making Python ever more useful to data analysts and data engineers.
9:00am – 10:30am: IPython – Fernando Pérez (University of California at Berkeley) and Brian Granger (Cal Poly San Luis Obispo)
10:10am – 10:30am: Collaborative Data Science with coLaboratory, Kayur Patel (Google) and Kester Tong (Google)
10:30am – 11:00am: BREAK
11:00am – 12:30pm: Python for Distributed Analytics and Visualization, Andy Terrel (Continuum Analytics) and Peter Wang (Continuum Analytics, Inc.)
12:30pm – 1:30pm: LUNCH
1:30pm – 3:00pm: Room 1 E 12 – Intro to NumPy and matplotlib, Jake Vanderplas (eScience Institute, University of Washington)
1:30pm – 3:00pm: Room 1 E 13 – Intro scikit-learn + pandas for Predictive Modeling, Olivier Grisel (Inria & scikit-learn)
3:00pm – 3:30pm: BREAK
3:30pm – 5:00pm: Room 1 E 12 – SciPy – An Exploration of the Most Useful Bits, Travis Oliphant (Continuum Analytics, Inc.)
3:30pm – 4:15pm: Room 1 E 13 – New and Upcoming Features in Pandas, Wes McKinney (Cloudera)
4:20pm – 5:00pm: Room 1 E 13 – High Performance Python, Trent Nelson (Continuum Analytics)
Fernando Pérez is a staff scientist at Lawrence Berkeley National Laboratory and a founding investigator of the Berkeley Institute for Data Science at UC Berkeley, created in 2013. He received a PhD in particle physics from the University of Colorado at Boulder, followed by postdoctoral research in applied mathematics, developing numerical algorithms. Today, his research focuses on creating tools for modern computational research and data science across domain disciplines, with an emphasis on high-level languages, interactive and literate computing, and reproducible research. He created IPython while a graduate student in 2001 and continues to lead its evolution into Project Jupyter, now as a collaborative effort with a talented team that does all the hard work. He regularly lectures about scientific computing and data science, and is a member of the Python Software Foundation, a founding member of the NumFOCUS Foundation, and a National Academy of Sciences Kavli Frontiers of Science Fellow. He is the recipient of the 2012 Award for the Advancement of Free Software from the Free Software Foundation.
Brian is an Associate Professor of Physics and Data Science at Cal Poly State University in San Luis Obispo, CA, where he teaches Data Science. He is a leader of the IPython project, a co-founder of Project Jupyter, and an active contributor to a number of other open source projects focused on data science in Python. Recently, he co-created the Altair package for statistical visualization in Python. He is an advisory board member of the NumFOCUS Foundation and a faculty fellow of the Cal Poly Center for Innovation and Entrepreneurship.
Andy is a data architect, computational scientist, and technical leader. He is the CTO of Bold Metrics, where he is bringing his experience building smart, scalable data systems to the fashion industry. You will also find him leading the board of the NumFOCUS Foundation. A passionate advocate for open source scientific codes, Andy has been involved in the wider scientific Python community since 2006, contributing to numerous projects in the scientific stack.
Peter Wang is the cofounder and CTO of Continuum Analytics, where he leads the product engineering team for the Anaconda platform and open source projects including Bokeh and Blaze. Peter has been developing commercial scientific computing and visualization software for over 15 years and has software design and development experience across a broad variety of areas, including 3D graphics, geophysics, financial risk modeling, large data simulation and visualization, and medical imaging. As a creator of the PyData conference, he also devotes time and energy to growing the Python data community by advocating, teaching, and speaking about Python at conferences worldwide. Peter has a BA in physics from Cornell University.
Jake Vanderplas is the director of research in the physical sciences at the University of Washington’s eScience Institute, where his research is primarily in the area of data-driven astronomy and astrophysics. In addition, Jake is a maintainer and/or frequent contributor to many open source Python projects, including scikit-learn, scipy, mpld3, and others. He occasionally blogs about Python, machine learning, data visualization, open science, and related topics at Jakevdp.github.io.
Olivier Grisel is a software engineer at Inria Saclay, France, where he works on scikit-learn, an open source project for machine learning in Python. Olivier also contributes occasional bug fixes to upstream projects in the NumPy/SciPy ecosystem.
Travis Oliphant has a Ph.D. from the Mayo Clinic and B.S. and M.S. degrees in Mathematics and Electrical Engineering from Brigham Young University. Since 1997, he has worked extensively with Python for numerical and scientific programming, most notably as the primary developer of the NumPy package, and as a founding contributor of the SciPy package. He is also the author of the definitive Guide to NumPy.
Travis was an assistant professor of Electrical and Computer Engineering at BYU from 2001-2007, where he taught courses in probability theory, electromagnetics, inverse problems, and signal processing. He also served as Director of the Biomedical Imaging Lab, where he researched satellite remote sensing, MRI, ultrasound, elastography, and scanning impedance imaging.
From 2007-2011, Travis was the president at Enthought, Inc. During his tenure there, the company grew from 15 to 50 employees, and Travis worked with well-known Fortune 50 companies in finance, oil-and-gas, and consumer-products. He was involved in all aspects of the contractual relationship, including consulting, training, code-architecture, and development.
As CEO of Continuum Analytics, Travis engages customers in finance, consumer products, and oil and gas, develops business strategy, and helps guide the technical direction of the company. He actively contributes to software development and engages with the wider open source community in the Python ecosystem by serving as a director of the Python Software Foundation and past director of NumFOCUS.
Data systems at Cloudera. Formerly founder and CEO of DataPad (http://www.datapad.io). Author of “Python for Data Analysis” from O’Reilly Media. Creator of the pandas project.
Kayur Patel makes data science tools easier to use and studies how people apply machine learning to solve problems and build software. Kayur received his PhD in Computer Science and Engineering from the University of Washington. His graduate work was funded by grants from the NSF and Google as well as the NDSEG and Microsoft Research fellowships. He is currently working at Google and recently taught the Introduction to Data Science course at Columbia.
I am a software engineer at Google Research. I work on machine learning algorithms and infrastructure, and on a product for collaborative data analysis, coLaboratory.