Presented By O'Reilly and Cloudera
Make Data Work
Feb 17–20, 2015 • San Jose, CA

HOWTO Make Your Future Data Scientists Love You

Sasha Laundy (Warby Parker)
1:30pm–1:50pm Thursday, 02/19/2015
Data Science
Location: LL20 A

It’s a common story. Software developers are working hard to get a project off the ground. They set up logging to catch errors, which is great, but when they go to do data science with those logs down the road, they find problems. Maybe their logs are missing crucial information, or their database schema lacks the unique identifiers needed to link different data sets. A few days spent up front on a “data audit” could have saved them time, made them piles of money, and helped them gain insight into their customers.

This talk will give you the toolkit you need to collect data properly, years before you bring on a data scientist. You will be able to do your own data audit, even if you don’t know anything about data science. You will learn the three major things to check: is your data complete? Is it correct? And is it connectable?
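As a rough illustration of those three checks, here is a minimal sketch in Python using only the standard library. It is not taken from the talk: the sample data, the column names (`user_id`, `event`, `amount`, `email`), the helper functions, and the rule that an amount must be a non-negative number are all hypothetical assumptions chosen for the example.

```python
import csv
import io

# Hypothetical sample data standing in for two exported CSV files.
EVENTS_CSV = """user_id,event,amount
1,purchase,20.00
2,purchase,
3,refund,-5.00
,purchase,12.50
"""

USERS_CSV = """user_id,email
1,a@example.com
2,b@example.com
4,d@example.com
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def completeness(rows, columns):
    """Complete? Count empty values per column."""
    return {
        col: sum(1 for r in rows if not (r.get(col) or "").strip())
        for col in columns
    }


def correctness(rows, column, is_valid):
    """Correct? Return rows whose non-empty value fails a validity rule."""
    return [
        r for r in rows
        if (r.get(column) or "").strip() and not is_valid(r[column])
    ]


def connectability(left, right, key):
    """Connectable? Return key values in `left` with no match in `right`."""
    right_keys = {r[key] for r in right if r.get(key)}
    left_keys = {r[key] for r in left if r.get(key)}
    return sorted(left_keys - right_keys)


def non_negative_number(value):
    """Assumed business rule for this sketch: amounts parse and are >= 0."""
    try:
        return float(value) >= 0
    except ValueError:
        return False


events = read_rows(EVENTS_CSV)
users = read_rows(USERS_CSV)

missing = completeness(events, ["user_id", "event", "amount"])
invalid = correctness(events, "amount", non_negative_number)
orphans = connectability(events, users, "user_id")

print("missing values per column:", missing)
print("rows with invalid amount:", invalid)
print("event user_ids with no matching user:", orphans)
```

In a real audit the validity rules and join keys would come from your own schema; the point of the sketch is only that each of the three questions can be turned into a small, repeatable check.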

You’ll also get a concise list of command line tools to quickly look through your data to get some intuition for what’s hiding in those CSVs. Be a hero to your future data team.

Sasha Laundy

Warby Parker

Sasha is the founding data scientist and engineer at Polynumeral, a data science consultancy in New York City. She helps clients, including the World Bank, New York Public Radio, and Warby Parker, solve hard data problems and design their data strategy. Previously she worked at Twilio and was an early employee at Codecademy. She founded Women Who Code, a global non-profit that connects 16,000 technical women in 14 countries.