Brought to you by NumFOCUS Foundation and O’Reilly Media Inc.
The official Jupyter Conference
August 22-23, 2017: Training
August 23-25, 2017: Tutorials & Conference
New York, NY

Managing a 1,000+ student JupyterHub without losing your sanity

Ryan Lovett (Department of Statistics, UC Berkeley), Yuvi Panda (Data Science Education Program, UC Berkeley)
1:50pm–2:30pm Thursday, August 24, 2017
JupyterHub deployments
Location: Nassau
Level: Intermediate

Who is this presentation for?

  • Instructors, DevOps engineers, and sysadmins

Prerequisite knowledge

  • Experience installing complex software

What you'll learn

  • How the UC Berkeley Data Science Education Program uses Jupyter notebooks on a JupyterHub

Description

JupyterHub is a distributed system at heart, and distributed systems are fundamentally hard. Best practices for running particular types of distributed systems are learned over time with trial, error, and tears.

The UC Berkeley Data Science Education program uses Jupyter notebooks on a JupyterHub. Ryan Lovett and Yuvi Panda outline the DevOps principles that keep the largest reported educational hub (with 1,000+ users) stable and performant while enabling all the features instructors and students require. Along the way, Ryan and Yuvi share lessons learned building, scaling, and providing support while making deployment and maintenance of JupyterHubs as automated as possible.

Topics include:

  • The technical goals of the deployment (reduced human maintenance, larger scale, being good open source citizens, absolute reproducibility, etc.)
  • The technologies chosen to meet these goals and the reasons for choosing them
  • The lightweight human processes put in place to ensure that the needs of students, instructor assistants, instructors, and system administrators are being met
  • How to use these technical and human processes at your institution

Ryan Lovett

Department of Statistics, UC Berkeley

Ryan Lovett manages research and instructional computing for the Department of Statistics at UC Berkeley and is a member of the Data Science Education Program’s infrastructure team. He is most often a sysadmin, though he also enjoys programming and consulting with faculty and students.

Yuvi Panda

Data Science Education Program (UC Berkeley)

Yuvi Panda is infrastructure lead for the Data Science Education Program at UC Berkeley, where he works on scaling JupyterHub for use by thousands of students. A programmer and DevOps engineer, he wants to make it easy for people who don’t traditionally consider themselves programmers to do things with code. He builds tools (Quarry, PAWS, etc.) to sidestep the list of historical accidents that constitute the “command-line tax” that people have to pay before doing productive things with computing. He’s a core member of the JupyterHub team and also works on mybinder.org. Yuvi is also a Wikimedian, since you can check out of Wikimedia, but you can never leave.