
Science at the Speed of Thought: Enhancing Jupyter to Enable Interactive "Human-in-the-loop" Supercomputing

Moderated by: Matt Henderson and Shreyas Cholia

Who is this presentation for?

Researchers, data scientists, and research software engineers

Prerequisite knowledge

A basic knowledge of Jupyter notebooks, kernels, and widgets is sufficient.

What you'll learn

The current challenges for HPC systems, the state of the art in the Jupyter ecosystem with respect to those challenges, and how interactive HPC could be integrated into a future Jupyter ecosystem.

Description

High Performance Computing (HPC) systems and workflows process and analyze data produced by large-scale experiments and simulations, such as first-principles materials structure calculations, supernova simulations, and mass spectrometry image analysis. These workflows largely rely on non-interactive, asynchronous batch execution that can use thousands of cores and run for hours to days. Historically, these systems have been designed to optimize for raw performance rather than for human utility and ease of interactive use.

Simplifying and accelerating the mode of experimentation, and aligning it with how scientists think and operate, is key to enhancing their productivity. This includes easy job submission and resubmission, introspection of jobs and their contents as they run, and easy ways of intercepting and manipulating data inputs and outputs for analysis and for chaining operations into pipelines. Introducing interactivity to scientific HPC applications and workflows provides a key missing human-in-the-loop capability: the ability to inspect the state of an execution in real time.

The Jupyter architecture (kernels, notebooks, widgets, etc.) provides a solid foundation for addressing these challenges and is already familiar to many scientists, but it is missing certain key ingredients. Our work has centered on extending the Jupyter platform to provide an interactive HPC experience for scientists. Our vision is an interactive Jupyter-based system that makes working on your laptop and working on a supercomputer a seamless experience yielding the best of both worlds, effectively “bringing supercomputing to your laptop.”

We will motivate and demonstrate our work with real use cases from major science projects that need human-in-the-loop interaction with their large jobs and workflows. We will illustrate the power of coupling Jupyter with HPC systems and discuss how our system addresses some of the problems scientists face today, including:
  • Interacting with a large HPC job in real time, sampling/querying the job for specific information or data slices through a Jupyter notebook
  • Seamlessly moving data back and forth between Jupyter infrastructure and the HPC system
  • Managing notebooks with both synchronous local operations and asynchronous tasks that run on the supercomputing cluster
  • Interacting with workflow managers, batch systems, and job schedulers (a minimal sketch of this kind of scheduler interaction follows the list)
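
To make the scheduler-interaction idea concrete, the following is a minimal sketch of driving a batch job from a notebook cell, assuming a Slurm scheduler and using only the Python standard library. The batch script analysis.sbatch, the output file analysis.out, and the helper functions are hypothetical placeholders for illustration; this is not the presenters' system.

    # Sketch: submit a Slurm batch job from a notebook, poll its state,
    # and peek at a slice of its output while it runs.
    import subprocess
    import time
    from pathlib import Path

    def submit_job(script="analysis.sbatch"):
        """Submit a batch script with sbatch and return the job id."""
        out = subprocess.run(["sbatch", "--parsable", script],
                             capture_output=True, text=True, check=True)
        return out.stdout.strip().split(";")[0]

    def job_state(job_id):
        """Query squeue for the job's current state (e.g. PENDING, RUNNING)."""
        out = subprocess.run(["squeue", "-j", job_id, "-h", "-o", "%T"],
                             capture_output=True, text=True)
        # squeue prints nothing once the job has left the queue
        return out.stdout.strip() or "COMPLETED"

    def peek_output(path="analysis.out", n_lines=5):
        """Pull back a small slice of the job's output for inspection."""
        p = Path(path)
        return p.read_text().splitlines()[-n_lines:] if p.exists() else []

    job_id = submit_job()
    while job_state(job_id) in ("PENDING", "RUNNING"):
        print(job_id, job_state(job_id), peek_output())
        time.sleep(30)  # a widget or asynchronous task could replace this blocking poll

In practice, the blocking poll at the end would be replaced by an asynchronous task or a Jupyter widget so that the notebook stays responsive for local work while the job runs on the cluster.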