Brought to you by NumFOCUS Foundation and O’Reilly Media
The official Jupyter Conference
Aug 21-22, 2018: Training
Aug 22-24, 2018: Tutorials & Conference
New York, NY

How we run A case study in open infrastructure

Yuvi Panda (Data Science Education Program (UC Berkeley))
11:55am–12:35pm Thursday, August 23, 2018
JupyterHub deployments, Usage and application
Location: Sutton Center/Sutton South
Level: Non-technical

Who is this presentation for?

  • Open source developers, those running infrastructure for open source projects, operations engineers, data engineers, managers, and executives

What you'll learn

  • Learn how the team develops code and runs infrastructure in an open, transparent, collaborative manner, and how you can too


Thousands of users use the service each day, for everything from K–12 education to demonstrating scientific research such as how gravitational waves are detected. Reliability and performance are the core features that give people the confidence to use it. However, running infrastructure is challenging for an open source community. Yuvi Panda shares lessons drawn from the small community that operates it, covering the social and technical processes for keeping it reliable in the most open, transparent, and inclusive way possible, using pretty graphs about the state of the service that anyone can see in real time.

Topics include:

  • Continuous deployment: Code changes should be fearlessly deployable by large numbers of people of varying skill levels, to avoid the bottleneck created by a small number of “roots” or “deployers.” The team has built automation around its deployments that lets it easily and safely deploy changes to the live site minutes after they are merged. Yuvi explains why this is important and how it is done.
  • Operational metrics: The team collects and visualizes real-time data from the many layers and components that run the service, allowing it to spot anomalies quickly when they occur and to analyze outages after the fact. Yuvi demos these visualizations and showcases how they can be useful.
  • Usage metrics: Authors want to know how many users are using their repositories, and the team wants to know how many people are benefiting from the service. However, the team is also aware that it needs to balance users’ privacy needs with its own need for analytics. Yuvi explores these trade-offs and explains how the team has dealt with them.
  • Culture: Building a culture of continuous improvement devoid of “rock star” behavior is key to the long-term sustainability of any operations team. The team does blameless incident reports after every incident, does all its communication in open channels (GitHub, Gitter, etc.), and provides everyone on the team equal infrastructure access. Yuvi talks about why these social processes are far more important than the technical processes and shares some lessons learned.
  • Transparency: The team uses public communication channels for everything. The project’s entire infrastructure repository is public, and anyone can propose a deployment. Its detailed cloud compute costs are public. Its real-time operational metrics are public. Anyone can fork and run their own instance on whichever cloud provider they want. Yuvi explains how the team does this, why it is important, and how you can access these resources.

Yuvi Panda

Data Science Education Program (UC Berkeley)

Yuvi Panda is infrastructure lead for the Data Science Education Program at UC Berkeley, where he works on scaling JupyterHub for use by thousands of students. A programmer and DevOps engineer, he wants to make it easy for people who don’t traditionally consider themselves programmers to do things with code and builds tools (Quarry, PAWS, etc.) to sidestep the list of historical accidents that constitute the “command-line tax” that people have to pay before doing productive things with computing. He’s a core member of the JupyterHub team and works on as well. Yuvi is also a Wikimedian, since you can check out of Wikimedia, but you can never leave.