Presented By
O’Reilly + Cloudera

Make Data Work

March 25-28, 2019
San Francisco, CA

Deep learning beyond the learning

Tobias Knaup (Mesosphere), Joerg Schad (ArangoDB)

11:50am–12:30pm Wednesday, March 27, 2019

Data Engineering & Architecture
Location: 2008

Secondary topics: AI and Data technologies in the cloud, Automation in data science and big data, Model lifecycle management, Storage

Average rating: 4.50 (2 ratings)

Who is this presentation for?

Data scientists, developers, and architects interested in building deep learning pipelines

Level

Intermediate

Prerequisite knowledge

Familiarity with data science tools and concepts (useful but not required)

What you'll learn

Understand the challenges involved in implementing a deep learning pipeline
Explore a potential end-to-end implementation

Description

Open source frameworks such as Spark, TensorFlow, MXNet, and PyTorch enable anyone to build and train deep learning models. While there are many great tutorials and talks showing the best way to train models, there’s little information on what happens before and after training: how to develop, store, use, test, and refine a model.

Tobias Knaup and Joerg Schad offer an introduction to building a complete automated deep learning pipeline, from exploratory analysis through training, model storage, model serving, and monitoring.

Topics include:

How to enable data scientists to develop models without having to worry about the underlying infrastructure
How to automate distributed training, model optimization, and serving using CI/CD
How to easily deploy these distributed deep learning frameworks on any public or private infrastructure
How to manage multiple different deep learning frameworks on a single cluster, especially considering heterogeneous resources such as GPUs
The best interface to use when working with the cluster
How to store and serve models at scale
How to update models that are currently in use without causing downtime for the services using them
How to monitor the entire pipeline and track performance of the deployed models

Tobias Knaup

Mesosphere

Tobi Knaup is the cofounder and CTO at Mesosphere, a hybrid cloud platform company that helps companies such as NBCUniversal, Deutsche Telekom, and Royal Caribbean adopt transformative technologies like machine learning and real-time analytics with ease. He was one of the first engineers and a tech lead at Airbnb, where he wrote large parts of the company’s infrastructure, including its search and fraud prediction services, and helped scale the site to millions of users and build a world-class engineering team. Tobi is the main author of Marathon, Mesosphere’s container orchestrator.

Joerg Schad

ArangoDB

Jörg Schad is Head of Machine Learning at ArangoDB. Previously, he worked on or built machine learning pipelines in healthcare, distributed systems at Mesosphere, and in-memory databases. He received his PhD for research on distributed databases and data analytics. He’s a frequent speaker at meetups, international conferences, and lecture halls.
