Brought to you by NumFOCUS Foundation and O’Reilly Media
The official Jupyter Conference
Aug 21-22, 2018: Training
Aug 22-24, 2018: Tutorials & Conference
New York, NY

Speaker slides & video

Presentation slides will be made available after the session has concluded and the speaker has given us the files. Check back if you don't see the file you're looking for—it might be available later! (However, please note some speakers choose not to share their presentations.)

If you are looking for slides and video from 2017, visit the JupyterCon 2017 site.

Wenming Ye (Amazon Web Services), Miro Enev (Nvidia Corp.)
2-Day Training Please note: to attend, you must be registered for a Platinum pass.
Machine learning and IoT projects are now common for enterprises and startups alike. These advanced technologies have been the key innovation engine for businesses such as Amazon Go, Alexa, and Robotics. In this hands-on workshop, we will explore the AWS Machine Learning Platform, using the Project Jupyter-based Amazon SageMaker to build, train, and deploy ML/DL models to the cloud and to AWS DeepLens.
Enterprise and organizational adoption, Extensions and customization, Usage and application
Zachary Glassman (The Data Incubator)
2-Day Training Please note: to attend, you must be registered for a Platinum pass.
This course offers a foundation in building intelligent business applications using machine learning. We will walk through all the steps of developing a machine learning pipeline. We’ll look at data cleaning, feature engineering, model building/evaluation, and deployment. Students will extend these models into two applications from real-world datasets.
Extensions and customization, JupyterHub deployments, Reproducible research and open science
Adam Thornton (LSST)
LSST is an ambitious project to map the sky in the fastest, widest, and deepest survey ever made. This petabyte-scale, 7 trillion-row database disrupts traditional astronomical workflows. Our science platform requires a paradigm shift in how astronomy is done. Learn the challenges of providing production services on a notebook-based architecture and the compelling advantages of JupyterLab.
Data visualization, Reproducible research and open science, Training and education
Bruno Gonçalves (New York University)
Tutorial Please note: to attend, your registration must include Tutorials.
The fundamental concepts and ideas behind human visual perception and how it informs scientific data visualization are introduced in an intuitive and grounded manner. These concepts are illustrated through practical examples using matplotlib and seaborn, following a tutorial on these two libraries. Finally, the main ideas will be summarized in the form of rules of thumb for ease of reference.
Reproducible research and open science, Training and education, Usage and application
Matt Brems (General Assembly)
Tutorial Please note: to attend, your registration must include Tutorials.
Missing data plagues nearly every data science problem. Oftentimes, people just drop or ignore missing data. However, this usually ends up with bad results. We'll show how bad dropping or ignoring missing data can be, then we'll learn how to fix this - the right way! Leverage Jupyter notebooks to properly reweight or impute your data.
Community, Training and education
Jane Herriman (Julia Computing, Inc.)
Tutorial Please note: to attend, your registration must include Tutorials.
This introductory workshop assumes no prior exposure to Julia. It should be accessible (and hopefully useful!) to scientists, engineers, and anyone else with technical computing needs. We will use Jupyter notebooks to show you why Julia is special, demonstrate how easy it is to learn Julia, and get you writing your first Julia programs.
Community, JupyterHub deployments, Reproducible research and open science
Tim Head (Wild Tree Tech)
The Binder project drastically lowers the bar to sharing and re-using software. A user wanting to try out someone else’s work only needs to click a single link. This talk will introduce the audience to the concepts and ideas behind the Binder project. We will showcase examples from the community to show off the power of Binder.
Enterprise and organizational adoption, JupyterHub deployments, Usage and application
Ian Allison (Pacific Institute for the Mathematical Sciences), James Colliander (Pacific Institute for the Mathematical Sciences)
Over the past 18 months, we have deployed Jupyter to more than 8,000 users at universities across Canada. In this talk, we'll discuss how we did it, how we plan to scale and deliver the service nationally, how people are using the platform, and how we intend to make Jupyter integral to the working experience of students, researchers, and faculty members.
Enterprise and organizational adoption, Usage and application
Dave Stuart (Department of Defense)
This talk describes how Jupyter was used inside the U.S. Department of Defense (DOD) and the Intelligence Community (IC) to empower thousands of “Citizen Data Scientists” to build and share analytics in order to meet the community’s dynamic challenges. These Citizen Data Scientists have the aptitude, curiosity, and creativity to put their tradecraft into code but historically lacked the technical training to do so.
Extensions and customization, Training and education, Usage and application
Damián Avila (Anaconda Powered by Continuum Analytics)
RISE has evolved into the main slideshow machinery for live presentations within the Jupyter notebook. In this talk, we'll explain how to install/use RISE and how to customize it. Additionally, we will show some new capabilities. Finally, we'll show the beginning of the migration from RISE into a new jupyterlab-rise extension providing RISE-based capabilities in the new JupyterLab interface.
Enterprise and organizational adoption, JupyterHub deployments, Training and education
Laura Noren (NYU Center for Data Science)
This talk will be based on research into the various infrastructure models supporting data science in research settings, compared in terms of funding, educational uses, and research utilization. Specifically, we explore the national federation model currently established in Canada, with the support of the Canadian federal government, in comparison to the more grassroots efforts in many US universities.
JupyterHub deployments, Reproducible research and open science, Training and education
Carol Willing (Cal Poly San Luis Obispo), Min Ragan-Kelley (Simula Research Laboratory), Erik Sundell (IT-Gymnasiet Uppsala)
Tutorial Please note: to attend, your registration must include Tutorials.
This tutorial will let you provide a group of your colleagues or students with easy access to Jupyter notebooks and JupyterLab without asking them to install anything on their computers. You will configure and deploy a cloud-based JupyterHub using Kubernetes. You will learn how to customize and extend it for your needs.
Data visualization, Integrations with other Software, Usage and application
We will present our experiences using Jupyter notebooks as a critical aid in the design of the next generation of IBM Power and Z processors. Analytics on graphs consisting of hundreds of millions of nodes will be emphasized, along with leveraging Jupyter notebooks as part of our overall design system.
Integrations with other Software, Reproducible research and open science, Usage and application
Scott Sanderson (Quantopian)
This presentation explores how interactivity can and should influence the design of software libraries. We discuss ways that the needs of interactive users differ from the needs of application developers, and we describe techniques for improving the usability of libraries in interactive environments without sacrificing robustness in non-interactive environments.
Extensions and customization, Reproducible research and open science
Kevin Zielnicki (Stitch Fix)
Even with good intentions, analysis notebooks can quickly accumulate a mess of false starts and out-of-order statements. Best practices encourage cleaning up a notebook to ensure reproducibility, but many analyses will never reach this cleaned-up state. As an alternative, this talk will describe Nodebook, a Jupyter plugin that encourages reproducibility by preventing inconsistency.
Keynotes
Paco Nathan (O'Reilly Media), Fernando Perez (UC Berkeley and Lawrence Berkeley National Laboratory), Brian Granger (Cal Poly San Luis Obispo)
Friday Opening Remarks
Data visualization, Integrations with other Software, Kernels
Sylvain Corlay (QuantStack), Johan Mabille (QuantStack), Wolf Vollprecht (QuantStack), Loic Gouarin (CNRS, Laboratoire de Mathématiques d'Orsay)
In this talk, we present the latest features of the C++ Jupyter kernel, including live help, auto-completion, rich MIME type rendering, and interactive widgets, making it one of the most featureful implementations of the Jupyter kernel protocol, and bringing Jupyter closer to the metal.
Integrations with other Software, Reproducible research and open science, Usage and application
Tyler Erickson (Google)
Massive collections of data on the Earth's changing environment, collected by satellite sensors and generated by Earth system models, are being exposed via web APIs by multiple providers. This presentation will highlight the use of JupyterLab and Jupyter Widgets in analyzing complex high-dimensional datasets, providing insights into how our Earth is changing and what the future might look like.
JupyterHub deployments, Usage and application
Yuvi Panda (Data Science Education Program (UC Berkeley))
Running infrastructure is challenging for an open source community. A small community of individuals with varying skills operates mybinder.org. In this talk, we'll talk about our social & technical processes for keeping mybinder.org reliable in the most open, transparent & inclusive way possible. We'll also share pretty graphs about the state of mybinder.org that anyone can see in real time!
Data visualization, Reproducible research and open science, Training and education
Pramit Choudhary (DataScience.com)
Tutorial Please note: to attend, your registration must include Tutorials.
Just predicting the target labels for a data science use case is not enough. It is important to understand the “why”, “what” & “how” of the model’s behavior. In the tutorial, we will explore algorithms (post hoc and rule extraction) to faithfully interpret ML models globally and locally with Jupyter's interactivity and “Skater”, an open source library to demystify the inner workings of ML models.
Training and education
Rachael Tatman (Kaggle)
Tutorial Please note: to attend, your registration must include Tutorials.
A practical introduction to incorporating notebooks into the classroom using active learning techniques.
Training and education, Usage and application
Joel Grus (Allen Institute for Artificial Intelligence)
I have been using and teaching Python for many years. I wrote a bestselling book about learning data science. And here's my confession: I don't like notebooks. [There are dozens of us!] In this talk I'll explain why I find notebooks difficult, show how they frustrate my preferred pedagogy, demonstrate how I prefer to work, and discuss what Jupyter could do to win me over.
Training and education
Rob Newton (Trinity School)
In an effort to broaden our graduates' mathematical toolkit as well as address gender equity in STEM education, I've led the implementation of Python projects across all of our 9th grade math courses. Every student in the 9th grade completes 3 Python projects that introduce programming and integrate it with the ideas developed in class.
Extensions and customization, Training and education, Usage and application
Douglas Blank (Bryn Mawr College), Nicole Petrozzo (Bryn Mawr College)
For the last four years, I have used nothing but Jupyter in the classroom. From a first-year writing course to a course on assembly language; from Biology to Computer Science; from lectures to homework: everything has been in Jupyter. In this talk, I explore the ways I have leveraged Jupyter, and detail the successes and failures experienced along the way.
Gerald Rousselle (Teradata)
Gerald Rousselle reviews some of the trends in modern data and analytics ecosystems for large enterprises and shares some of the key challenges and opportunities for Jupyter adoption. He also shares some recent examples and experiments in incorporating Jupyter in commercial products and platforms.
JupyterCon Business Summit, Training and education, Usage and application
Catherine Ordun (Booz Allen Hamilton)
Many U.S. government agencies are just getting started in machine learning. As a result, data scientists need to de-"black box" models as much as possible. One simple way to do this is to transparently show how the model is coded and its results at each step. Notebooks do just this. We will walk through a notebook we built for RNNs and discuss how we think agencies can use Notebooks.
Julia Lane (Center for Urban Science and Progress and Wagner School, NYU)
As Lew Platt, CEO of Hewlett Packard, once said, “If only HP knew what HP knows, we would be three times more productive.” Yet government agencies have found it difficult to serve taxpayers because of the technical, bureaucratic, and ethical issues associated with access and use of sensitive data.
JupyterHub deployments, Training and education, Usage and application
Mariah Rogers (UC Berkeley Division of Data Sciences), Ronald Walker (UC Berkeley Division of Data Sciences)
The modules program at UC Berkeley creates short explorations into data science using notebooks to allow students to work hands-on with a dataset relevant to their course. We’ve served over 1500 students in over 25 different courses primarily in the social sciences, arts, and humanities by plugging in for 1-3 class periods, an impossibility without a JupyterHub eliminating installation time.
Brian Granger (Cal Poly San Luis Obispo)
1-Day Training Please note: you must be registered for a Platinum pass.
Brian Granger offers an in-depth view of JupyterLab, which enables users to work with the core building blocks of the classic Jupyter Notebook in a more flexible and integrated manner.
Brian Granger (Cal Poly San Luis Obispo)
Tutorial Please note: to attend, your registration must include Tutorials.
Brian Granger offers an in-depth view of JupyterLab, which enables users to work with the core building blocks of the classic Jupyter Notebook in a more flexible and integrated manner.
Data visualization, Kernels, Usage and application
Lindsay Richman (McKinsey & Co.)
JupyterLab and Plotly both provide a rich set of tools for working with data. When combined, they create a powerful computational environment that enables users to produce versatile, robust visualizations in a fast-paced setting. This session demonstrates how McKinsey uses JupyterLab and Plotly to create dynamic charts and web apps, including those that stream IoT data, in Python, Julia, and R.
Community, Jupyter subprojects, Training and education
Carol Willing (Cal Poly San Luis Obispo), Jessica Forde (Jupyter), Erik Sundell (IT-Gymnasiet Uppsala)
Students can learn by doing. In this talk, we will show how interactive content, using Jupyter Notebooks, Widgets, and visualization libraries, puts the student in charge. We will share notable examples of projects within the Jupyter community and offer ways in which educators can help students to develop data science literacy and use computational skills to build upon their interests.
Christopher Cho (Google)
1-Day Training Please note: you must be registered for a Platinum pass.
This talk will explore how Kubernetes can be easily leveraged to build complete deep learning pipelines, all the way from data ingestion/aggregation and pre-processing to ML training and serving, with the mighty Kubernetes APIs.
Core architecture, Data visualization, Extensions and customization
M Pacer (Project Jupyter | Berkeley Institute for Data Science)
Jupyter displays a rich array of media types out-of-the-box. In this talk, we will describe how to use these capabilities to their full potential. We will show how to add rich displays to existing and new Python classes. We will also show you how to customise the way notebooks are converted to other formats. These skills will enable anyone to make beautiful objects with Jupyter.
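As a minimal sketch of what adding a rich display to a new Python class can look like (an illustration, not material from the talk): IPython's display system checks objects for methods such as `_repr_html_` and, when one is present, a notebook frontend can render its return value instead of the plain-text `repr`. The `Temperature` class and its coloring rule below are hypothetical.

```python
from html import escape


class Temperature:
    """Hypothetical example class, used only to illustrate the display hook."""

    def __init__(self, celsius):
        self.celsius = celsius

    def __repr__(self):
        # Plain-text fallback, used by terminals and logs.
        return f"Temperature({self.celsius!r})"

    def _repr_html_(self):
        # Jupyter frontends call this method, when it exists, to obtain
        # an HTML representation of the object.
        color = "red" if self.celsius >= 30 else "blue"
        value = escape(str(self.celsius))
        return f'<b style="color:{color}">{value} &deg;C</b>'


reading = Temperature(31.5)
html = reading._repr_html_()
```

In a notebook, leaving `reading` as the last expression in a cell would show the styled HTML; outside a notebook, the same object still falls back to its ordinary `repr`.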
Community, Extensions and customization, Usage and application
Sam Lau (UC Berkeley), Caleb Siu (UC Berkeley)
The nbinteract package converts Jupyter notebooks with widgets into interactive, standalone HTML pages. nbinteract’s built-in support for function-driven plotting makes authoring interactive pages simpler by allowing users to focus on data, not callbacks. We will introduce nbinteract and walk through the steps to publish an interactive web page from a Jupyter notebook.
Community
Julia Meinwald (Two Sigma Investments)
The presentation will explain why Two Sigma, a company in a space notorious for protecting IP, thinks it's important to contribute to the open source community. I'll talk about the evolution of our thinking and policies over the past five years, and make a case for why other companies should make a commitment to the open source ecosystem.
Integrations with other Software, JupyterHub deployments, Reproducible research and open science
Ryan Abernathey (Columbia University), Yuvi Panda (Data Science Education Program (UC Berkeley))
Climate science is being flooded with petabytes of data, overwhelming traditional modes of data analysis. The Pangeo project is building a platform to take Big Data Climate Science into the cloud using scientific Python and large-scale interactive computing tools. Come find out what we are building, why we are building it, and how you can use it!
Enterprise and organizational adoption, Extensions and customization, Integrations with other Software
Romit Mehta (PayPal), Praveen Kanamarlapudi (PayPal)
Hundreds of data scientists, analysts and developers at PayPal use Jupyter to access data spread across filesystem, relational, document and key-value stores. Jupyter enables complex analytics and an easy way to build, train and deploy machine learning models at PayPal. Learn more about how we built the Jupyter infrastructure and powerful extensions at PayPal.
Community, Reproducible research and open science, Training and education
April Clyburne-Sherin (Code Ocean)
Tutorial Please note: to attend, your registration must include Tutorials.
This is a practical tutorial to prepare Jupyter notebooks for computationally reproducible publication. We start with introductory information about computational reproducibility but the bulk of the tutorial is guided work. Best practices for publishing notebooks are covered, with participants preparing their research for reuse, creating documentation, and submitting their notebook to share.
Data visualization, Extensions and customization, JupyterHub deployments
George Williams (Capsule8)
The key to successful threat detection in cybersecurity is fast response. Many actors are involved, including operations specialists, cybersecurity experts, developers, and more recently data scientists. We have built specialized extensions for data scientists working in cybersecurity that can be used and deployed via JupyterHub.
Documentation, Reproducible research and open science, Training and education
Elizabeth Wickes (School of Information Sciences, University of Illinois at Urbana-Champaign)
As practitioners of open science begin to migrate their educational material into public repositories, many of their common practices and platforms can be used to streamline the instruction material development process. This talk will compare how many open science practices can be used in an educational context, and how they are best facilitated by tools like the Jupyter Notebook.
Data visualization, Extensions and customization, Reproducible research and open science
Chris Harris (Kitware)
In-silico prediction of chemical properties has seen vast improvements in both veracity and volume of data, but is currently hamstrung by a lack of transparent, reproducible workflows coupled with environments for visualization and analysis. We have developed a platform that uses Jupyter notebooks to enable an end-to-end workflow, from simulation setup right through to visualizing the results.
Rachael Tatman (Kaggle)
1-Day Training Please note: you must be registered for a Platinum pass.
In this workshop, we’ll take an existing research project and make it fully reproducible using Kaggle Kernels. This workshop will include hands-on instruction and best practices for each of the three components necessary for completely reproducible research.
Reproducible research and open science
Renga is a highly-scalable and secure open software platform designed to make (data) science reproducible, to foster collaboration between scientists, and to share resources in a federated environment.
Extensions and customization, Jupyter subprojects, Usage and application
Matthew Seal (Netflix)
Using an nteract project, papermill, we’ll walk through how we use notebooks to track user jobs and make a simple interface for work submission. You’ll get an inside peek at how Netflix is tackling the scheduling problem for a range of users who want easily managed workflows.
Carl Osipov (Google)
1-Day Training Please note: you must be registered for a Platinum pass.
In this workshop, we walk through the process of building machine learning models with TensorFlow. We cover data exploration, feature engineering, model creation, training, evaluation and deployment.
Data visualization, Extensions and customization, Reproducible research and open science
David Koop (University of Massachusetts Dartmouth)
Dataflow notebooks build on the Jupyter Notebook environment by adding constructs to make dependencies between cells explicit and clear. In this environment, unique, persistent cell identifiers make references between cells more robust. In addition, support for recursive dataflow execution of cells allows users to better organize, reuse, and reproduce their work.
Enterprise and organizational adoption, Extensions and customization, Integrations with other Software
Diogo Castro (CERN)
SWAN, CERN’s Service for Web-based ANalysis, is leveraging the power of Jupyter to provide the High Energy Physics community with access to state-of-the-art infrastructure and services through a web service. This presentation details how this was possible and how it is being used by researchers and students.
Enterprise and organizational adoption, Extensions and customization
Stephanie Stattel (Bloomberg LP), Paul Ivanov (Bloomberg LP)
We will walk through a series of extensions that demonstrate the power and flexibility of JupyterLab’s architecture. From targeted functionality modifications to more extreme atmospheric changes that require extensive decoupling and flexibility within JupyterLab, we explore the complexity and stability of extensions and how they can combine to form custom, opinionated JupyterLab environments.
Integrations with other Software, Usage and application
John Miller (Honeywell UOP)
The Emacs IPython Notebook, or EIN, is a full-featured client for the Jupyter Notebook that runs in the venerable Emacs (https://www.gnu.org/software/emacs/) text editor. This presentation is intended to provide a general introduction to the tool along with a brief history of its development.
Community, Reproducible research and open science, Usage and application
Viral Shah (Julia Computing, Inc.), Jane Herriman (Julia Computing, Inc.)
Julia and Jupyter share a common evolution path. Julia is the language for modern technical computing, while Jupyter is the development and presentation environment for modern technical computing. This talk will explore the journey of Julia and the impact of Jupyter on Julia's growth.
Keynotes
Paco Nathan (O'Reilly Media), Fernando Perez (UC Berkeley and Lawrence Berkeley National Laboratory), Brian Granger (Cal Poly San Luis Obispo)
Thursday Opening Remarks
Enterprise and organizational adoption, JupyterCon Business Summit, JupyterHub deployments
David Schaaf (Capital One), Shivraj Ramanan (Capital One)
Capital One recently explored different "notebook" options, looking for new ways to support our information-based strategy. As an outcome, JupyterHub emerged as one of many options that can serve as a potential platform for analytics, even in highly regulated industries like Financial Services. Come learn more about our journey and how Jupyter has become a part of an ever-growing analytics toolkit!
Enterprise and organizational adoption, Reproducible research and open science, Usage and application
Sean Gorman (Timbr.io)
Satellite imagery can be a critical resource during disasters and humanitarian crises. While the community has improved data sharing, we still struggle to create reusable data science to solve on-the-ground problems. GBDX Notebooks is a step towards creating an open data science community built around Jupyter to stream imagery and share analysis at scale.
Data visualization, Reproducible research and open science, Usage and application
Seth Lawler (Dewberry)
Creating flood maps for coastal & riverine communities requires geospatial processing, statistical analysis, finite element modeling, and a team of specialists working together. This talk will demo how using the feature-rich JupyterLab to develop tools, share code with team members, and document workflows used in the creation of flood maps improves productivity and reproducibility.
Extensions and customization, Kernels, Usage and application
Veda Shankar (MapD Inc.)
The Jupyter MapD kernel allows seamless integration of the GPU-based MapD Core SQL engine into a machine learning pipeline.
Data visualization
Nicolas Fernandez (Icahn School of Medicine at Mount Sinai)
Exploring high-dimensional data requires the development of sophisticated interactive visualizations to enable users to easily discover complex patterns within their data. We developed Clustergrammer-widget, an interactive heatmap Jupyter widget that enables users to easily explore high-dimensional data within a Jupyter notebook and share their interactive visualizations using nbviewer.
Community, Usage and application
Holden Karau (Google), Matt Hunt (Bloomberg)
Many of us believe that gender diversity in open source projects is important (and if you don’t, this isn’t going to convince you), but what things are correlated with improved gender diversity, and what can we learn from similar historic industries? We will explore the diversity of different projects and possible factors. We’ll examine historic EEOC complaints & parallels + historic solutions.