Sep 23–26, 2019

Data Science vs Engineering: Does it really have to be this way?

Ann Spencer (Domino Data Lab), Paco Nathan (Derwen, Inc.), Amy Heineike (Primer), Pete Warden (TensorFlow)
11:20am–12:00pm Wednesday, September 25, 2019
Location: 1A 08/10
Secondary topics: Culture and Organization

Who is this presentation for?

data scientists, machine learning engineers, researchers, data engineers, data product managers

Level

Intermediate

Prerequisite knowledge

Assumes familiarity with concepts including model development and model deployment

What you'll learn

- Identification of specific, common tension points that arise when building data- and ML-driven products
- An understanding of the “why” behind the tension points, which enables problem-solving
- Practical advice for addressing the tension points, including building and hardening communication lines between different functional roles through pairing, product prototyping (e.g., “Wizard of Ozzing”), and tech talks or weekly seminars, as well as working toward interdisciplinary understanding through cross-functional work or cross-training

Description

Collaboration between data science and engineering is a known challenge, one with the potential to stymie innovation and hobble the acceleration of data science work. Should we, in data science, just shrug our shoulders and say, “That is just the way it is,” or “This is too hard of a problem to solve. I’d rather solve something else”? Yet isn’t data science grounded in the idea of solving previously unsolvable problems? We have all heard stories, from brilliant data scientists and exceptional engineers, of their frustrations regarding collaboration around developing and deploying models. This is not an insurmountable problem.

The panelists will discuss differing perspectives on collaboration when building and deploying models. Topics discussed candidly during the panel will include potential tension points (e.g., those stemming from a sense of ownership over workflows, the sheer amount and variety of work involved, or differing expectations about process), approaches to addressing those tension points (e.g., mindset, communication best practices, and cross-training), and hopeful reflections on the potential future state.

Panelists

Paco Nathan is known as a “player/coach,” with core expertise in data science, natural language processing, machine learning, and cloud computing. He is the Evil Mad Scientist at Derwen, co-chair of Rev, and an advisor for Amplify Partners, Deep Learning Analytics, Recognai, Data Spartan, and Primer.

Amy Heineike is the VP of Product Engineering at Primer AI, where she leads teams to build machines that read and write text, leveraging NLP, NLG, and a host of other algorithms to augment human analysts. Previously she built out technology for visualizing large document sets as network maps at Quid. A Cambridge mathematician who previously worked in London modeling cities, Amy is fascinated by complex human systems and the algorithms and data that help us understand them.

Pete Warden is the Technical Lead on the TensorFlow Mobile Embedded Team at Google, working on deep learning. He was formerly the CTO of Jetpac, which was acquired by Google. He is also an Apple alumnus and blogs at petewarden.com.

Moderator: Ann Spencer is the Head of Content at Domino Data Lab. She is responsible for ensuring Domino’s data science content provides a high degree of value, density, and analytical rigor that sparks respectful, candid public discourse from multiple perspectives, discourse anchored in the intention of helping accelerate data science work. Previously, she was the Data Editor at O’Reilly Media (2012-2014), focusing on data science and data engineering. It was in this role that she met and worked with the panelists.
