Speaker slides: Jupyter Notebook conference & training: JupyterCon

Containerizing notebooks for serverless execution (sponsored by AWS)

Kevin McCormick (Amazon Web Services), Vladimir Zhukov (Amazon Web Services)

Download slides (PPTX)

Kevin McCormick tells the story of two approaches used internally at AWS to accelerate new ML algorithm development and to easily package Jupyter notebooks for scheduled execution: custom Jupyter kernels that automatically create Docker containers and dispatch them to either a distributed training service or a job execution environment.

Advanced data science, part 2: Five ways to handle missing data in Jupyter notebooks

Matt Brems (General Assembly)

Download slides (ZIP)

Missing data plagues nearly every data science problem. Often, people just drop or ignore missing data. However, this usually ends up with bad results. Matt Brems explains how bad dropping or ignoring missing data can be and teaches you how to handle missing data the right way by leveraging Jupyter notebooks to properly reweight or impute your data.

All the cool kids are doing it; maybe we should too? Jupyter, gravitational waves, and the LIGO and Virgo Scientific Collaborations

Will M Farr (Stony Brook University)

Watch the keynote

Will Farr shares examples of Jupyter use within the LIGO and Virgo Scientific Collaborations and offers lessons about the (many) advantages and (few) disadvantages of Jupyter for large, global scientific collaborations. Along the way, Will speculates on Jupyter's future role in gravitational wave astronomy.

Beyond interactive: Scaling impact with notebooks at Netflix

Michelle Ufford (Netflix)

Watch the keynote

Netflix is reimagining what a Jupyter notebook is, who works with it, and what you can do with it. Michelle Ufford shares how Netflix leverages notebooks today and describes a brief vision for the future.

Building an Enterprise/Cloud Analytics Platform with Jupyter Enterprise Gateway

Moderated by: Kevin Bates

Download slides (PDF)

Data science and analytics departments are now commonplace in enterprises determined to maximize their operations. While Jupyter Notebooks have significantly decreased the cost of admission into this space, enterprises are finding that data science at scale is difficult within the current framework. Jupyter Enterprise Gateway is designed to address these scalability issues for the enterprise.

Canadians land on Jupyter

Ian Allison (Pacific Institute for the Mathematical Sciences), James Colliander (Pacific Institute for the Mathematical Sciences)

Download slides (PDF)

Over the past 18 months, Ian Allison and James Colliander have deployed Jupyter to more than 8,000 users at universities across Canada. Ian and James offer an overview of the Syzygy platform and explain how they plan to scale and deliver the service nationally and how they intend to make Jupyter integral to the working experience of students, researchers, and faculty members.

Citizen data science: An enterprise use case from inside the US intelligence community

Dave Stuart (Department of Defense)

Download slides (PPTX)

Dave Stuart explains how Jupyter was used inside the US Department of Defense and the greater intelligence community to empower thousands of "citizen data scientists" to build and share analytics in order to meet the community’s dynamic challenges.

Data science as a catalyst for scientific discovery

Michelle Gill (BenevolentAI)

Watch the keynote

View slides

Michelle Gill explains how data science methodologies and tools can be used to link information from different scientific fields and accelerate discovery in a variety of areas, including the biological sciences.

Data science in US and Canadian higher education

Laura Noren (Obsidian Security)

Download slides (PDF)

Laura Noren offers an overview of a research project on the various infrastructure models supporting data science in research settings in terms of funding, educational uses, and research utilization. Laura then shares some of the findings, comparing the national federation model currently established in Canada to the more grassroots efforts in many US universities.

Democratizing data

Tracy Teal (The Carpentries)

Watch the keynote

We are generating vast amounts of data, but it's not the data itself that is valuable—it's the information and knowledge that can come from this data. Tracy Teal explains how to bring people to data and empower them to address their questions, reach their potential, and solve issues that are important in science, scholarship, and society.

Design and analysis of the world’s most advanced microprocessors using Jupyter notebooks

Kerim Kalafala (IBM), Nicholai L'Esperance (IBM)

Download slides (PPTX)

Kerim Kalafala and Nicholai L'Esperance share their experiences using Jupyter notebooks as a critical aid in designing the next generation of IBM Power and Z processors, focusing on analytics on graphs consisting of hundreds of millions of nodes. Along the way, Kerim and Nicholai explain how they leverage Jupyter notebooks as part of their overall design system.

Designing for interaction

Scott Sanderson (Quantopian)

Download slides (PDF)

Scott Sanderson explores how interactivity can and should influence the design of software libraries, details how the needs of interactive users differ from the needs of application developers, and shares techniques for improving the usability of libraries in interactive environments without sacrificing robustness in noninteractive environments.

How JupyterLab and widgets enable interactive analysis of the Earth's past, present, and future

Tyler Erickson (Google)

View slides

Massive collections of data on the Earth's changing environment, collected by satellite sensors and generated by Earth system models, are being exposed via web APIs by multiple providers. Tyler Erickson highlights the use of JupyterLab and Jupyter widgets in analyzing complex high-dimensional datasets, providing insights into how our Earth is changing and what the future might look like.

I don't like notebooks.

Joel Grus (Allen Institute for Artificial Intelligence)

View slides

I have been using and teaching Python for many years. I wrote a best-selling book about learning data science. And here's my confession: I don't like notebooks. (There are dozens of us!) I'll explain why I find notebooks difficult, show how they frustrate my preferred pedagogy, demonstrate how I prefer to work, and discuss what Jupyter could do to win me over.

Jupyter for every high schooler

Rob Newton (Trinity School)

Download slides (PDF)

In an effort to broaden graduates' mathematical toolkit and address gender equity in STEM education, Rob Newton has led the implementation of Python projects across all of his school's ninth-grade math courses. Now every student in the ninth grade completes three Python projects that introduce programming and integrate it with the ideas developed in class.

Jupyter graduates

Douglas Blank (Comet.ML)

Download slides (PDF)

For the last four years, Douglas Blank has used nothing but Jupyter in the classroom—from a first-year writing course to a course on assembly language, from biology to computer science, from lectures to homework. Join in to learn how Douglas has leveraged Jupyter and discover the successes and failures he experienced along the way. Nicole Petrozzo then offers a student's perspective.

Jupyter in the enterprise

Luciano Resende (IBM)

Watch the keynote

IBM has leveraged the Jupyter stack in many of its products to offer industry-leading and business-critical services to its clients. Luciano Resende explores some of the open source initiatives that IBM is leading in the Jupyter ecosystem to address enterprise requirements in the community.

Jupyter in the modern enterprise data and analytics ecosystem: Trends, experiments, and opportunities

Gerald Rousselle (Teradata)

Download slides (PPTX)

Gerald Rousselle reviews some of the trends in modern data and analytics ecosystems for large enterprises and shares some of the key challenges and opportunities for Jupyter adoption. He also details some recent examples and experiments in incorporating Jupyter in commercial products and platforms.

Jupyter notebooks and the intersection of data science and data engineering

David Schaaf (Capital One)

Watch the keynote

Download slides (ZIP)

David Schaaf explains how data science and data engineering can work together in cross-functional teams—with Jupyter notebooks at the center of collaboration and the analytic workflow—to more effectively and more quickly deliver results to decision makers.

Jupyter trends in 2018

Paco Nathan (derwen.ai)

Watch the keynote

Download slides (PDF)

Jupyter is built on a set of extensible, reusable building blocks, expressed through various open protocols, APIs, and standards. For many use cases, these are combined to provide extensible software architecture for interactive computing with data. Paco Nathan shares a few somewhat unexpected things that emerged in 2018.

Jupyter's configuration system

Afshin Darian (Two Sigma | Project Jupyter), M Pacer (Netflix), Min Ragan-Kelley (Simula Research Laboratory), Matthias Bussonnier (UC Berkeley BIDS)

Download slides (ZIP)

Jupyter's straightforward, out-of-the-box experience has been important for its success in widespread adoption. But good defaults only go so far. Join Afshin Darian, M Pacer, Min Ragan-Kelley, and Matthias Bussonnier to go beyond the defaults and make Jupyter your own.

JupyterHub for domain-focused integrated learning modules

Mariah Rogers (UC Berkeley Division of Data Sciences), Julian Kudszus (UC Berkeley Division of Data Sciences)

Download slides (PDF)

The Data Science Modules program at UC Berkeley creates short explorations into data science using notebooks to allow students to work hands-on with a dataset relevant to their course. Mariah Rogers, Ronald Walker, and Julian Kudszus explain the logistics behind such a program and the indispensable features of JupyterHub that enable such a unique learning experience.

Keynote by Dan Romuald Mbanga

Dan Mbanga (Amazon Web Services)

Watch the keynote

Machine learning at scale with Kubernetes

Chris Cho (Google)

Download slides (PDF)

Christopher Cho demonstrates how Kubernetes can be easily leveraged to build a complete deep learning pipeline, including data ingestion and aggregation, preprocessing, ML training, and serving with the mighty Kubernetes APIs.

nbinteract: Shareable interactive web pages from notebooks

Sam Lau (UC Berkeley), Caleb Siu (UC Berkeley)

Download slides (ZIP)

The nbinteract package converts Jupyter notebooks with widgets into interactive, standalone HTML pages. Its built-in support for function-driven plotting makes authoring interactive pages simpler by allowing users to focus on data, not callbacks. Sam Lau and Caleb Siu offer an overview of nbinteract and walk you through the steps to publish an interactive web page from a Jupyter notebook.

Pangeo: Big data climate science in the cloud

Ryan Abernathey (Columbia University), Yuvi Panda (Data Science Education Program (UC Berkeley))

Download slides (PDF)

Climate science is being flooded with petabytes of data, overwhelming traditional modes of data analysis. The Pangeo project is building a platform to take big data climate science into the cloud using SciPy and large-scale interactive computing tools. Join Ryan Abernathey and Yuvi Panda to find out what the Pangeo team is building and why and learn how to use it.

PayPal Notebooks: Data science and machine learning at scale, powered by Jupyter

Romit Mehta (PayPal), Praveen Kanamarlapudi (PayPal)

Download slides (PPTX)

Hundreds of PayPal's data scientists, analysts, and developers use Jupyter to access data spread across filesystem, relational, document, and key-value stores, enabling complex analytics and an easy way to build, train, and deploy machine learning models. Romit Mehta and Praveen Kanamarlapudi explain how PayPal built its Jupyter infrastructure and powerful extensions.

Rapid data science exploration for cybersecurity

George Williams (GSI Technology), Harini Kannan (Capsule8), Alex Comerford (Capsule8)

Download slides (PDF)

The key to successful threat detection in cybersecurity is fast response. George Williams, Harini Kannan, and Alex Comerford offer an overview of specialized extensions they have built for data scientists working in cybersecurity that can be used and deployed via JupyterHub.

Reproducible data dependencies for Jupyter: Distributing massive, versioned image datasets from the Allen Institute for Cell Science

Jackson Brown (Allen Institute for Cell Science), Aneesh Karve (Quilt)

View slides

Reproducible data is essential for notebooks that work across time, across contributors, and across machines. Jackson Brown and Aneesh Karve demonstrate how to use an open source data registry to create reproducible data dependencies for Jupyter and share a case study in open science over terabyte-size image datasets.

Reproducible quantum chemistry in Jupyter

Chris Harris (Kitware)

Download slides (PDF)

In silico prediction of chemical properties has seen vast improvements in both veracity and volume of data but is currently hamstrung by a lack of transparent, reproducible workflows coupled with environments for visualization and analysis. Chris Harris offers an overview of a platform that uses Jupyter notebooks to enable an end-to-end workflow from simulation setup to visualizing the results.

Reproducible science with the Renku platform

Sandra Savchenko-de Jong (Swiss Data Science Center)

Download slides (PDF)

Sandra Savchenko-de Jong offers an overview of Renku, a highly scalable and secure open software platform designed to make (data) science reproducible, foster collaboration between scientists, and share resources in a federated environment.

Scheduled notebooks: A means for manageable and traceable code execution

Matthew Seal (Netflix)

Download slides (PPTX)

Using an nteract project, papermill, Matthew Seal walks you through how Netflix uses notebooks to track user jobs and make a simple interface for work submission. You’ll get an inside peek at how Netflix is tackling the scheduling problem for a range of users who want easily managed workflows.

Sea change: What happens when Jupyter becomes pervasive at a university?

Fernando Pérez (UC Berkeley and Lawrence Berkeley National Laboratory)

Watch the keynote

In 2018, UC Berkeley launched a new major in data science, anchored by two core courses that are the fastest-growing in the history of the university. Fernando Pérez discusses the program and explains how the core courses, which now reach roughly 40% of the campus population, are extending data science into specific domains that cover virtually all disciplinary areas of the campus.

Serverless machine learning with TensorFlow

Vijay Reddy (Google Cloud)

Download slides (PDF)

Vijay Reddy walks you through the process of building machine learning models with TensorFlow. You'll learn about data exploration, feature engineering, model creation, training, evaluation, deployment, and more.

SoS: A polyglot notebook and workflow system for both interactive multilanguage data analysis and batch data processing

Bo Peng (The University of Texas, MD Anderson Cancer Center)

Download slides (ZIP)

Bo Peng offers an overview of Script of Scripts (SoS), a Python 3-based workflow engine with a Jupyter frontend that allows the use of multiple kernels in one notebook. This unique combination enables users to analyze data using multiple scripting languages in one notebook and, if needed, convert scripts to workflows in situ to analyze large amounts of data on remote systems.

Supporting reproducibility in Jupyter through dataflow notebooks

David Koop (University of Massachusetts Dartmouth)

Download slides (PDF)

Dataflow notebooks build on the Jupyter Notebook environment by adding constructs to make dependencies between cells explicit and clear. David Koop offers an overview of the Dataflow kernel, shows how it can be used to robustly link cells as a notebook is developed, and demonstrates how that notebook can be reused and extended without impacting its reproducibility.

Sustaining wonder: Jupyter and the knowledge commons

Carol Willing (Cal Poly San Luis Obispo)

Watch the keynote

New challenges are emerging for Jupyter, open information, and investing in the future. You, the innovators of this growing knowledge commons, will determine how we meet these challenges and sustain the ecosystem. Carol Willing shows how you can start.

SWAN: CERN's Jupyter-based interactive data analysis service

Diogo Castro (CERN)

Download slides (PDF)

SWAN, CERN’s service for web-based analysis, leverages the power of Jupyter to provide the high energy physics community access to state-of-the-art infrastructure and services through a web service. Diogo Castro offers an overview of SWAN and explains how researchers and students are using it in their work.

Terraforming Jupyter: Changing JupyterLab to suit your needs

Stephanie Stattel (Bloomberg LP), Paul Ivanov (Bloomberg LP)

Download slides (PDF)

Stephanie Stattel and Paul Ivanov walk you through a series of extensions that demonstrate the power and flexibility of JupyterLab’s architecture, from targeted functionality modifications to more extreme atmospheric changes that require extensive decoupling and flexibility within JupyterLab.

The Emacs IPython Notebook

John Miller (Honeywell UOP)

Download slides (ZIP)

John Miller offers an overview of the Emacs IPython Notebook (EIN), a full-featured client for the Jupyter Notebook in Emacs, and shares a brief history of its development.

The future of data-driven discovery in the cloud

Ryan Abernathey (Columbia University)

Watch the keynote

Drawing on his experience with the Pangeo project, Ryan Abernathey makes the case for the large-scale migration of scientific data and research to the cloud. The cloud offers a way to make the largest datasets instantly accessible to the most sophisticated computational techniques. A global scientific data commons could usher in a golden age of data-driven discovery.

The reporter’s notebook

Mark Hansen (Columbia Journalism School | The Brown Institute for Media Innovation)

Watch the keynote

Beyond Twitter, Facebook, and similar networks, without question, data, code, and algorithms are forming systems of power in our society. Mark Hansen explains why it is crucial that journalists—explainers of last resort—be able to interrogate these systems, holding power to account.

Using Jupyter notebooks in highly regulated environments

David Schaaf (Capital One), Shivraj Ramanan (Capital One)

Download slides (ZIP)

In Capital One's recent exploration of "notebook" offerings, JupyterHub emerged as a top contender that could serve as a potential platform for analytics even in highly regulated industries like financial services. David Schaaf and Shivraj Ramanan discuss Capital One's journey and explain how Jupyter has become a part of the company's ever-growing analytics toolkit.

Using Jupyter to create a community for satellite imagery analysis and sharing

Sean Gorman (DigitalGlobe)

Download slides (PDF)

Satellite imagery can be a critical resource during disasters and humanitarian crises. While the community has improved data sharing, we still struggle to create reusable data science to solve problems on the ground. Sean Gorman offers an overview of GBDX Notebooks, a step toward creating an open data science community built around Jupyter to stream imagery and share analysis at scale.

Using the MapD kernel for the Jupyter Notebook

Randy Zwitch (MapD)

Download slides (PDF)

MapD Core is an open source analytical SQL engine that has been designed from the ground up to harness the parallelism inherent in GPUs. This enables queries on billions of rows of data in milliseconds. Randy Zwitch offers an overview of the MapD kernel extension for the Jupyter Notebook and explains how to use it in a typical machine learning workflow.

Visualizing high-dimensional biological data with Clustergrammer-Widget in the Jupyter Notebook

Nicolas Fernandez (Icahn School of Medicine at Mount Sinai)

View slides

Nicolas Fernandez offers an overview of Clustergrammer-Widget, an interactive heatmap Jupyter widget that enables users to easily explore high-dimensional data within a Jupyter notebook and share their interactive visualizations using nbviewer.

Visualizing machine learning models in the Jupyter Notebook (sponsored by Bloomberg LP)

Chakri Cherukuri (Bloomberg LP)

Download slides (PDF)

Chakri Cherukuri offers an overview of the interactive widget ecosystem available in the Jupyter notebook and illustrates how Jupyter widgets can be used to build rich visualizations of machine learning models. Along the way, Chakri walks you through algorithms like regression, clustering, and optimization and shares a wizard for building and training deep learning models with diagnostic plots.

Why contribute to open source?

Julia Meinwald (Two Sigma Investments)

Watch the keynote

Download slides (PDF)

Julia Meinwald outlines a few effective ways Two Sigma has identified to support the unseen labor maintaining a healthy open source ecosystem and details how the company’s thinking on this topic has evolved.
