It is not uncommon for a real-world data set to resist easy handling: it may not fit in available memory, or it may require prohibitively long processing. As a solution to this problem, this session presents using the "infrastructure as code" technology Docker to define a system for performing standard but nontrivial data tasks on medium- to large-scale datasets, with Jupyter as the master controller.
We explore using existing precompiled public images published by the major open-source projects – Python, Jupyter, Postgres – as well as using a Dockerfile to extend these images to suit our specific purposes. We examine docker-compose and how it can be used to build a linked system, with Python workers churning data behind the scenes and Jupyter managing these background tasks. We also explore best practices for using existing libraries, as well as for developing our own libraries to deploy state-of-the-art machine learning and optimization algorithms.
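A linked system of this shape can be sketched in a minimal docker-compose.yml; the image tags, service names, and build path below are illustrative assumptions, not the session's exact configuration:

```yaml
version: "3"
services:
  jupyter:                        # master controller: notebooks issue tasks
    image: jupyter/scipy-notebook # precompiled public image
    ports:
      - "8888:8888"
    depends_on:
      - postgres
  postgres:                       # shared data store
    image: postgres:9.6
    environment:
      POSTGRES_PASSWORD: example
  worker:                         # Python workers churning data behind the scenes
    build: ./worker               # Dockerfile extending a public python image
    depends_on:
      - postgres
```

Running `docker-compose up` then brings the whole system online with one command.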
Finally, we present two use cases for the technologies and methods outlined. First, we explore a multi-service system for developing machine learning pipelines with scikit-learn. Second, we explore best practices for using Docker and Jupyter to build and run neural networks on AWS GPU instances using keras with a tensorflow backend.
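The first case study centers on scikit-learn pipelines of roughly the following shape; the synthetic data and the particular scaler/estimator choice are assumptions for illustration, not the session's actual models:

```python
# Minimal sketch: a scikit-learn pipeline of the kind a Jupyter notebook
# could fit and then hand off to a background worker as a single object.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a dataset that would otherwise be read from Postgres.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Chaining preprocessing and estimation keeps the whole workflow in one
# object that can be pickled and shipped between services.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("model", LogisticRegression()),
])
pipeline.fit(X_train, y_train)
accuracy = pipeline.score(X_test, y_test)
```

Because the fitted pipeline is a single serializable object, the notebook process and the worker containers need only agree on library versions, which Docker guarantees.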
Throughout these case studies, we consider how the average data science practitioner would perform the requisite tasks in advanced numerical computing: developing locally, then deploying to the cloud for final model development and tuning.
©2017, O'Reilly Media, Inc.