Presented By O’Reilly and Cloudera

San Francisco • London • New York

Make Data Work

September 11, 2018: Training & Tutorials
September 12–13, 2018: Keynotes & Sessions
New York, NY

Why data scientists should love Linux containers

William Benton (Red Hat)

1:15pm–1:55pm Wednesday, 09/12/2018

Data engineering and architecture, Data science and machine learning
Location: 1A 08

Level: Beginner

Secondary topics: Model lifecycle management


Who is this presentation for?

Data scientists and AI developers

Prerequisite knowledge

Familiarity with Python or data science workflows (useful but not required)

What you'll learn

Learn how containers and automated build pipelines can realize the potential of interactive notebooks as truly reproducible research, and how data scientists can use containers and workflows from the DevOps world to communicate with application development teams. See how container platforms let data scientists scale experiments beyond their laptops with easy access to powerful and specialized hardware, simplify governing access to sensitive internal data, and provide a clearer path to regulatory compliance. Finally, find out how to get started with key open source projects that enable data scientists and machine learning engineers to make the most of container technology.

Description

Linux containers make it easy for teams to deploy, manage, and scale distributed applications and for operators to exploit compute capacity in the cloud. Although it might not be obvious, a great foundation for production applications can also support the exploratory work of data scientists and machine learning engineers.

William Benton details the advantages of containers for data scientists and AI developers, focusing on high-level tools that will enable you to become more productive and collaborate more effectively. To provide context, William briefly explains what containers are and why developers love them. He then covers several key benefits of containers for data scientists, focusing on repeatability, collaboration, scalability, and compliance. You’ll learn how containers fulfill the promise of reproducible research, ease moving techniques from prototype to production, enable painless publishing and collaboration workflows, and empower you to safely develop techniques against sensitive data in a production environment from the comfort of your laptop.

There are myriad tutorial resources explaining how to build and run container images, but these largely assume an audience whose primary responsibilities include packaging, releasing, and managing applications. William focuses on why data scientists should care about containers and the high-level tools built on top of containers that will enhance their daily work. Data scientists will leave with a better understanding of the advantages of containers and concrete suggestions for how to use higher-level tools to make their work more productive. Application and AI developers will learn about the commonalities between engineering workflows and data science workflows and leave with a better understanding of how containers can support their data scientist colleagues and enable cross-functional collaboration.

William Benton

Red Hat

William Benton is an engineering manager and senior principal software engineer at Red Hat, where he leads a team of data scientists and engineers. He’s applied machine learning to problems ranging from forecasting cloud infrastructure costs to designing better cycling workouts. His focus is investigating the best ways to build and deploy intelligent applications in cloud native environments, but he’s also conducted research and development in the areas of static program analysis, managed language runtimes, logic databases, cluster configuration management, and music technology.
