Presented By O’Reilly and Cloudera
Make Data Work
March 5–6, 2018: Training
March 6–8, 2018: Tutorials & Conference
San Jose, CA

Building a data science idea factory: How to prioritize the portfolio of a large, diverse, and opinionated data science team

Katie Malone (Civis Analytics), Skipper Seabold (Civis Analytics)
2:40pm–3:20pm Wednesday, March 7, 2018

Who is this presentation for?

  • Data science managers

What you'll learn

  • Learn processes and best practices for choosing effective data science projects


Data science is a field where the expectations are high but the guidance around how to deliver “data science impact” can be low. What are the most important projects for a data science team to work on? How can people with technical context and people with business context bring their priorities together in a common discussion? How does a data science team ensure that all team members get their best ideas heard by the organization?

Civis Analytics’s data science research and development team consists of data scientists working on a wide variety of data science software and consulting tasks. One challenge the team confronted together (as it tripled in size) was how to prioritize projects in the company’s data science portfolio. Collectively, the team has better ideas and more experience than any single member alone, which suggests a bottom-up approach to sourcing project ideas. However, the team found that delivering a few high-quality, high-impact deliverables is better for the organization than lots of smaller, disorganized projects, which invites a more top-down approach.

Katie Malone and Skipper Seabold share a framework and best practices for quickly and collaboratively proposing, discussing, selecting, and managing high-impact data science projects.

Topics include:

  • What information is most important to solicit during the discussion process
  • How to structure project proposals
  • How to keep discussion constructive and on track even among a large and diverse group
  • How to weigh the costs and benefits of a few different methods of project selection
  • When and how to make adjustments to the process to make it fit your team

Katie Malone

Civis Analytics

Katie Malone is director of data science at data science software and services company Civis Analytics, where she leads a team of diverse data scientists who serve as technical and methodological advisors to the Civis consulting team and write the core machine learning and data science software that underpins the Civis Data Science Platform. Previously, she worked at CERN on Higgs boson searches and was the instructor of Udacity’s Introduction to Machine Learning course. Katie hosts Linear Digressions, a weekly podcast on data science and machine learning. She holds a PhD in physics from Stanford.


Skipper Seabold

Civis Analytics

Skipper Seabold is director of data science R&D and a product lead at Civis Analytics in Chicago. He leads a team of data scientists from all walks of life, from physicists and biologists to statisticians and computer scientists. Together they drive the data science behind the products Civis offers and push the capabilities of the solutions that Civis provides to its clients. He is an economist by training and has a decade of experience working in the Python data open source community. He started and led the statsmodels Python project, was formerly on the core pandas team, and has contributed to many projects in the Python data stack. He holds strong opinions about writing and barbecue.