Building Rich, High Performance Tools for Practical Data Analysis

Data Science, Sutton Center / Sutton South (NY Hilton)
Average rating: 2.40 (5 ratings)

This is a somewhat advanced, technical talk that connects computer science
concepts, such as data structure design and algorithms, with the details of
building intuitive, high-performance, flexible tools for data analysis. It
draws on lessons learned while building pandas, a widely used, battle-tested
data analysis toolkit for Python. I will illustrate the main points with a
number of short code demonstrations.

Important topics include missing data handling, simple and hierarchical
indexing, efficient serialization, pivoting and reshaping, grouped data
aggregation and transformation, time series-specific computations, and merge
and join algorithms. I will also discuss structuring data for visualization
and for output to other tools, such as JavaScript visualization toolkits like
D3.js.
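
As a rough illustration of several of these topics, here is a minimal pandas sketch. It is not taken from the talk: the `sales` and `managers` tables, their column names, and the choice to fill missing values with the column mean are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical sales records; the NaN marks a missing observation.
sales = pd.DataFrame({
    "region": ["east", "east", "west", "west"],
    "quarter": ["Q1", "Q2", "Q1", "Q2"],
    "revenue": [10.0, np.nan, 7.0, 9.0],
})

# Missing data handling: fill the gap with the column mean (one of many options).
filled = sales.assign(revenue=sales["revenue"].fillna(sales["revenue"].mean()))

# Hierarchical indexing: a two-level (region, quarter) index.
indexed = filled.set_index(["region", "quarter"])

# Grouped aggregation: total revenue per region.
totals = filled.groupby("region")["revenue"].sum()

# Pivoting: one row per region, one column per quarter.
wide = filled.pivot(index="region", columns="quarter", values="revenue")

# Merge/join: attach a hypothetical manager lookup table.
managers = pd.DataFrame({"region": ["east", "west"], "manager": ["Ana", "Bo"]})
merged = filled.merge(managers, on="region", how="left")

# Output for a JavaScript toolkit such as D3.js: a JSON array of records.
records_json = merged.to_json(orient="records")
```

The record-oriented JSON string is one plausible shape for handing data to a browser-side charting library; other layouts may suit a particular visualization better.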

Wes McKinney

Two Sigma Investments

Builds analytics libraries and research tools for quantitative finance and other fields. Actively involved in data analysis and statistics applications in the scientific Python community. Author of the pandas library and contributor to statsmodels. Author of the forthcoming “Python for Data Analysis” from O’Reilly Media. CEO of Lambda Foundry, Inc.

