Brought to you by NumFOCUS Foundation and O’Reilly Media
The official Jupyter Conference
Aug 21-22, 2018: Training
Aug 22-24, 2018: Tutorials & Conference
New York, NY

Speaker slides & video

Presentation slides will be made available after the session has concluded and the speaker has given us the files. Check back if you don't see the file you're looking for—it might be available later! (However, please note some speakers choose not to share their presentations.)

If you are looking for slides and video from 2017, visit the JupyterCon 2017 site.

Enterprise and organizational adoption, Extensions and customization, Usage and application
Zachary Glassman (The Data Incubator)
2-Day Training Please note: to attend, you must be registered for a Platinum pass.
Zachary Glassman leads a hands-on dive into building intelligent business applications using machine learning, walking you through all the steps of developing a machine learning pipeline. You'll explore data cleaning, feature engineering, model building and evaluation, and deployment, and then extend these models into two applications from real-world datasets.
Extensions and customization, JupyterHub deployments, Reproducible research and open science
Adam Thornton (LSST)
LSST is an ambitious project to map the sky in the fastest, widest, and deepest survey ever made. The project's database disrupts traditional astronomical workflows, and its science platform requires a paradigm shift in how astronomy is done. Adam Thornton discusses the challenges of providing production services on a notebook-based architecture and the compelling advantages of JupyterLab.
Data visualization, Reproducible research and open science, Training and education
Bruno Gonçalves (New York University)
Tutorial Please note: to attend, your registration must include Tutorials.
Bruno Gonçalves offers an overview of the fundamental concepts and ideas behind human visual perception and explains how it informs scientific data visualization. To illustrate these concepts, Bruno shares practical examples using matplotlib and seaborn.
Reproducible research and open science, Training and education, Usage and application
Matt Brems (General Assembly)
Tutorial Please note: to attend, your registration must include Tutorials.
Missing data plagues nearly every data science problem. Often, people just drop or ignore missing data. However, this usually ends up with bad results. Matt Brems explains how bad dropping or ignoring missing data can be and teaches you how to handle missing data the right way by leveraging Jupyter notebooks to properly reweight or impute your data.
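As a small illustration of the reweighting idea mentioned in this abstract (a sketch under stated assumptions, not material from the tutorial itself): the records, group labels, and function names below are hypothetical. When one group responds less often than another, dropping the missing rows pulls the overall mean toward the well-observed group. Weighting each observed value by the inverse of its group's response rate can correct for that, assuming the observed values in a group are representative of that group.

```python
from statistics import mean

# Hypothetical survey records: (group, value); None marks a missing response.
# Group "b" responds far less often than group "a".
records = [
    ("a", 10.0), ("a", 10.0), ("a", 10.0), ("a", 10.0),
    ("b", 20.0), ("b", None), ("b", None), ("b", None),
]


def mean_after_dropping(rows):
    """Average only the observed values, ignoring every missing row."""
    observed = [value for _, value in rows if value is not None]
    return mean(observed)


def mean_after_reweighting(rows):
    """Weight each observed value by group size / observed count in its group."""
    sizes = {}
    observed = {}
    for group, value in rows:
        sizes[group] = sizes.get(group, 0) + 1
        if value is not None:
            observed.setdefault(group, []).append(value)

    weighted_sum = 0.0
    weight_total = 0.0
    for group, values in observed.items():
        weight = sizes[group] / len(values)  # inverse of the response rate
        for value in values:
            weighted_sum += weight * value
            weight_total += weight
    return weighted_sum / weight_total


dropped = mean_after_dropping(records)
reweighted = mean_after_reweighting(records)
print(dropped, reweighted)
```

In this toy data both groups are the same size, so the reweighted estimate sits halfway between the two group values, while the dropped-rows estimate is dominated by group "a".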
Keynotes
Will Farr (Stony Brook University)
Will Farr shares examples of Jupyter use within the LIGO and Virgo Scientific Collaborations and offers lessons about the (many) advantages and (few) disadvantages of Jupyter for large, global scientific collaborations. Along the way, Will speculates on Jupyter's future role in gravitational wave astronomy.
Community, Training and education
Jane Herriman (Julia Computing)
Tutorial Please note: to attend, your registration must include Tutorials.
Jane Herriman uses Jupyter notebooks to show you why Julia is special, demonstrate how easy it is to learn Julia, and get you writing your first Julia programs.
Poster, Reproducible research and open science, Training and education, Usage and application
Alaa Moussawi offers an overview of anomaly detection algorithms that use data from phasor measurement unit sensors in the power grid. These algorithms are designed from first principles. They classify anomalies using fundamental classification algorithms such as decision trees and neural networks. Feature selection is used to identify the optimal set of parameters for the learning algorithms.
Community, JupyterHub deployments, Reproducible research and open science
Tim Head (Wild Tree Tech)
The Binder project drastically lowers the bar to sharing and reusing software. A user wanting to try out someone else’s work need only click a single link to do so. Tim Head offers an overview of the Binder project and explores the concepts and ideas behind it. Tim then showcases examples from the community to show off the power of Binder.
Data visualization, Integrations with other Software
Explore efforts to bring full ipywidget support to the plotly.py data visualization library. This work brings many exciting new features to Jupyter Notebook users working with plotly.py, including Python callbacks, offline image export, binary array serialization, and integration with the broader ipywidget ecosystem.
JupyterCon Business Summit
The Business Summit concludes with "unconference"-style breakout sessions that allow enterprise stakeholders to give input to Project Jupyter directly.
JupyterCon Business Summit
Joel Horwitz (IBM), David Schaaf (Capital One), Dan Romuald Mbanga (Amazon Web Services), Pramit Choudhary (DataScience.com), Dave Stuart (Department of Defense)
Join in for the Business Summit's roundtable discussion with participation from IBM, Capital One, the DoD, Amazon AWS, Oracle, and others. Speakers will discuss important issues in our current environment—everything from compliance and GDPR to ML models.
Enterprise and organizational adoption, JupyterHub deployments, Usage and application
Ian Allison (Pacific Institute for the Mathematical Sciences), James Colliander (Pacific Institute for the Mathematical Sciences)
Over the past 18 months, Ian Allison and James Colliander have deployed Jupyter to more than 8,000 users at universities across Canada. Ian and James offer an overview of the Syzygy platform and explain how they plan to scale and deliver the service nationally and how they intend to make Jupyter integral to the working experience of students, researchers, and faculty members.
Enterprise and organizational adoption, JupyterCon Business Summit, Usage and application
Dave Stuart (Department of Defense)
Dave Stuart explains how Jupyter was used inside the US Department of Defense and the greater intelligence community to empower thousands of "citizen data scientists" to build and share analytics in order to meet the community’s dynamic challenges.
Keynotes
Closing remarks
Keynotes
Closing remarks
Extensions and customization, Training and education, Usage and application
Damián Avila (Anaconda, Inc.)
RISE has evolved into the main slideshow machinery for live presentations within the Jupyter notebook. Damián Avila explains how to install and use RISE. You'll also discover how to customize it and see some of its new capabilities. Damián concludes by discussing the migration from RISE into a new JupyterLab-RISE extension providing RISE-based capabilities in the new JupyterLab interface.
Keynotes
Michelle Gill, Ph.D. (BenevolentAI)
Michelle Gill explains how data science methodologies and tools can be used to link information from different scientific fields and accelerate discovery in a variety of areas, including the biological sciences.
Enterprise and organizational adoption, JupyterHub deployments, Training and education
Laura Noren (NYU Center for Data Science)
Laura Noren offers an overview of a research project on the various infrastructure models supporting data science in research settings in terms of funding, educational uses, and research utilization. Laura then shares some of the findings, comparing the national federation model currently established in Canada to the more grassroots efforts in many US universities.
Keynotes
Tracy Teal (The Carpentries)
We are generating vast amounts of data, but it's not the data itself that is valuable—it's the information and knowledge that can come from this data. Tracy Teal explains how to bring people to data and empower them to address their questions, reach their potential, and solve issues that are important in science, scholarship, and society.
JupyterHub deployments, Reproducible research and open science, Training and education
Carol Willing (Cal Poly San Luis Obispo), Min Ragan-Kelley (Simula Research Laboratory), Erik Sundell (IT-Gymnasiet Uppsala)
Tutorial Please note: to attend, your registration must include Tutorials.
Carol Willing, Min Ragan-Kelley, and Erik Sundell demonstrate how to provide easy access to Jupyter notebooks and JupyterLab without requiring users to install anything on their computers. You'll learn how to configure and deploy a cloud-based JupyterHub using Kubernetes and how to customize and extend it for your needs.
Data visualization, Integrations with other Software, Usage and application
Kerim Kalafala and Nicholai L'Esperance share their experiences using Jupyter notebooks as a critical aid in designing the next generation of IBM Power and Z processors, focusing on analytics on graphs consisting of hundreds of millions of nodes. Along the way, Kerim and Nicholai explain how they leverage Jupyter notebooks as part of their overall design system.
Integrations with other Software, Reproducible research and open science, Usage and application
Scott Sanderson (Quantopian)
Scott Sanderson explores how interactivity can and should influence the design of software libraries, details how the needs of interactive users differ from the needs of application developers, and shares techniques for improving the usability of libraries in interactive environments without sacrificing robustness in noninteractive environments.
Data visualization, Poster, Reproducible research and open science, Training and education
Available building energy data analysis software doesn't meet the needs of building scientists and energy service professionals. Join in to explore a Python-based API and data visualization toolkit that can be used within a Jupyter notebook to create a powerful and flexible analysis tool and prototype code that can be plugged in to more robust applications.
JupyterCon Business Summit
Brian Granger (Cal Poly San Luis Obispo)
Over the past two years, we have seen a dramatic shift in Jupyter’s deployment, from ad hoc usage by individuals to production enterprise application at scale. Brian Granger explains how this has expanded the Jupyter community and revealed new use cases with new challenges and opportunities.
Community, Enterprise and organizational adoption, JupyterHub deployments
Join in to discover lessons learned utilizing JupyterHub and Jupyter notebooks to facilitate workshops for participants and demonstrators at the ESIP 2018 Summer Meeting in Tucson, Arizona.
Integrations with other Software, Poster, Reproducible research and open science, Usage and application
Today’s Balkanized “data cathedrals” force us to extract, transform, and load data before use, leaving us without a way to use data we don’t control. Join in to learn why this approach should be replaced by the “data bazaar,” allowing us to freely compose and build upon each other’s data much the way we do with software today, using Jupyter as a key tool.
Extensions and customization, Reproducible research and open science
Kevin Zielnicki (Stitch Fix)
Even with good intentions, analysis notebooks can quickly accumulate a mess of false starts and out-of-order statements. Best practices encourage cleaning up a notebook to ensure reproducibility, but many analyses will never reach this cleaned-up state. Kevin Zielnicki offers an overview of Nodebook, a Jupyter plugin that encourages reproducibility by preventing inconsistency.
Wenming Ye (Amazon Web Services), Miro Enev
2-Day Training Please note: to attend, you must be registered for a Platinum pass.
Machine learning and IoT projects are increasingly common at enterprises and startups alike and have been the key innovation engine for Amazon businesses such as Go, Alexa, and Robotics. Wenming Ye and Miro Enev lead a hands-on deep dive into the AWS machine learning platform, using Project Jupyter-based Amazon SageMaker to build, train, and deploy ML/DL models to the cloud and AWS DeepLens.
Training and education
Lorena Barba (George Washington University), Robert Talbert (Grand Valley State University)
In flipped learning, students encounter new material before class meetings, which helps them learn how to learn and frees up class time to focus on creative applications of the basic material. Lorena Barba and Robert Talbert discuss the use of Jupyter notebooks as a “tangible interface” for new material in a flipped course and share case studies from their own courses.
Keynotes
Paco Nathan (derwen.ai), Fernando Perez (UC Berkeley and Lawrence Berkeley National Laboratory), Brian Granger (Cal Poly San Luis Obispo)
JupyterCon cochairs Paco Nathan, Fernando Pérez, and Brian Granger open the second day of keynotes.
Integrations with other Software, Reproducible research and open science, Usage and application
Thorin Tabor (University of California, San Diego)
Making Jupyter accessible to all members of a research organization, regardless of their programming ability, empowers it to best utilize the latest analysis methods while avoiding bottlenecks. Thorin Tabor offers an overview of the GenePattern Notebook, which offers a wide suite of enhancements to the Jupyter environment to help bridge the gap between programmers and nonprogrammers.
Data visualization, Integrations with other Software, Kernels
Sylvain Corlay (QuantStack), Johan Mabille (QuantStack), Wolf Vollprecht (QuantStack), Martin Renou
Sylvain Corlay, Johan Mabille, Wolf Vollprecht, and Martin Renou share the latest features of the C++ Jupyter kernel, including live help, auto-completion, rich MIME type rendering, and interactive widgets. Join in to explore one of the most feature-rich implementations of the Jupyter kernel protocol that also brings Jupyter closer to the metal.
Enterprise and organizational adoption, Reproducible research and open science, Usage and application
Joshua Patterson (NVIDIA), Keith Kraus (NVIDIA)
The GPU Open Analytics Initiative (GoAi) is a collection of open source libraries, frameworks, and APIs that make leveraging GPUs easy for data scientists. Joshua Patterson and Keith Kraus demonstrate how to build Jupyter notebooks with GPU-accelerated data processing and visualizations, rapidly accelerating data exploration all without writing any low-level code.
Enterprise and organizational adoption, Extensions and customization, Integrations with other Software, Poster
Preferred Networks operates a cluster with 1,024 GPUs. Instant and flexible access to the cluster is essential for its researchers, but it's hard to provide exclusive access to GPU cores. Join in to learn how the company introduced JupyterHub on Mesos. Mesos is responsible for resource isolation, and using Docker images with shared home directories provides a highly flexible environment.
Integrations with other Software, Reproducible research and open science, Usage and application
Tyler Erickson (Google)
Massive collections of data on the Earth's changing environment, collected by satellite sensors and generated by Earth system models, are being exposed via web APIs by multiple providers. Tyler Erickson highlights the use of JupyterLab and Jupyter widgets in analyzing complex high-dimensional datasets, providing insights into how our Earth is changing and what the future might look like.
JupyterCon Business Summit, Usage and application
ED MA (Synchrony Financial)
In the corporate tax world, Microsoft Excel, the king of spreadsheets, is the default tool for tracking information and managing tasks, but tax professionals are often annoyed by linked or referenced cells, within or between spreadsheets, that update slowly or break. Jinli Ma explains how the Jupyter Notebook does a better job than Microsoft Excel with the original issue discount calculation process.
Core architecture
Kyle Kelley (Netflix)
Tutorial Please note: to attend, your registration must include Tutorials.
Kyle Kelley walks you through creating a new web application from the ground up, teaching you how to build on top of Jupyter's protocols in the process. Along the way, you'll learn about Jupyter's REST and streaming APIs, message spec, and the notebook format.
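For readers unfamiliar with the pieces named in this abstract, here is a rough sketch of two of them: a minimal version 4 notebook document and an `execute_request` message shaped after the Jupyter messaging specification. It is not taken from the tutorial. The helper names, the `"demo"` username, and the protocol version string are assumptions chosen for illustration, and a real client would also sign and serialize the message before sending it to a kernel.

```python
import json
import uuid
from datetime import datetime, timezone


def markdown_cell(source):
    """Build a markdown cell in the version 4 notebook format."""
    return {"cell_type": "markdown", "metadata": {}, "source": source}


def code_cell(source):
    """Build an unexecuted code cell: no execution count and no outputs yet."""
    return {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": source,
    }


def make_notebook(cells):
    """Wrap cells in the top-level notebook structure."""
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 2}


def execute_request(code, session):
    """Build the dictionary form of an execute_request message."""
    return {
        "header": {
            "msg_id": uuid.uuid4().hex,
            "session": session,
            "username": "demo",  # placeholder value
            "date": datetime.now(timezone.utc).isoformat(),
            "msg_type": "execute_request",
            "version": "5.3",  # assumed protocol version
        },
        "parent_header": {},
        "metadata": {},
        "content": {
            "code": code,
            "silent": False,
            "store_history": True,
            "user_expressions": {},
            "allow_stdin": False,
            "stop_on_error": True,
        },
        "buffers": [],
    }


notebook = make_notebook([markdown_cell("# Demo"), code_cell("print(1 + 1)")])
notebook_json = json.dumps(notebook, indent=1)  # what would be saved as .ipynb
message = execute_request("print(1 + 1)", session=uuid.uuid4().hex)
```

The notebook is plain JSON on disk, while messages like this one travel between a frontend and a kernel at run time.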
JupyterHub deployments, Usage and application
Yuvi Panda (Data Science Education Program (UC Berkeley))
Running infrastructure is challenging for an open source community. Yuvi Panda shares lessons drawn from the small community that operates MyBinder.org, covering the social and technical processes for keeping MyBinder.org reliable in the most open, transparent, and inclusive way possible, using pretty graphs about the state of MyBinder.org that anyone can see in real time.
Data visualization, Reproducible research and open science, Training and education
Pramit Choudhary (DataScience.com)
Tutorial Please note: to attend, your registration must include Tutorials.
Just predicting the target labels for a data science use case is not enough. It's important to understand the why, what, and how of a given model’s behavior. Pramit Choudhary explores algorithms (post hoc and rule extraction) to faithfully interpret ML models globally and locally with Jupyter's interactiveness and Skater, an open source library to demystify the inner workings of ML models.
Training and education
Rachael Tatman (Kaggle)
Tutorial Please note: to attend, your registration must include Tutorials.
Rachael Tatman offers a practical introduction to incorporating Jupyter notebooks into the classroom using active learning techniques.
Training and education, Usage and application
Joel Grus (Allen Institute for Artificial Intelligence)
Joel Grus has been using and teaching Python for many years. He wrote a best-selling book about learning data science. Here's his confession: he doesn't like notebooks. (And there are dozens just like him.) Joel explains why he finds notebooks difficult, demonstrates how they frustrate his preferred pedagogy, outlines how he prefers to work, and details what Jupyter could do to win him over.
Enterprise and organizational adoption, JupyterHub deployments, Usage and application
The Minnesota Supercomputing Institute has implemented JupyterHub and the Jupyter Notebook server as a general-purpose point of entry to interactive high-performance computing services. This mode of operation runs counter to traditional job-oriented HPC operations but offers significant advantages for ease of use, data exploration, prototyping, and workflow development.
Training and education
Rob Newton (Trinity School)
In an effort to broaden graduates' mathematical toolkit and address gender equity in STEM education, Rob Newton has led the implementation of Python projects across all of his school's ninth-grade math courses. Now every student in the ninth grade completes three Python projects that introduce programming and integrate it with the ideas developed in class.
Extensions and customization, Training and education, Usage and application
Douglas Blank (Bryn Mawr College), Nicole Petrozzo (Bryn Mawr College)
For the last four years, Douglas Blank has used nothing but Jupyter in the classroom—from a first-year writing course to a course on assembly language, from biology to computer science, from lectures to homework. Join in to learn how Douglas has leveraged Jupyter and discover the successes and failures he experienced along the way. Nicole Petrozzo then offers a student's perspective.
JupyterCon Business Summit
Gerald Rousselle (Teradata)
Gerald Rousselle reviews some of the trends in modern data and analytics ecosystems for large enterprises and shares some of the key challenges and opportunities for Jupyter adoption. He also details some recent examples and experiments in incorporating Jupyter in commercial products and platforms.
Enterprise and organizational adoption, Integrations with other Software, Usage and application
Explore IBM's Data Science Experience (DSX) and see how it leverages Jupyter to enable data scientists and AI professionals to create notebooks accessing cloud data services and Watson AI services to collaboratively analyze data and gain insights. Join in to see a demo of example Python notebooks created in Jupyter in DSX that apply AI to data and visualize the results.
Keynotes
David Schaaf (Capital One)
David Schaaf explains how data science and data engineering can work together in cross-functional teams—with Jupyter notebooks at the center of collaboration and the analytic workflow—to more effectively and more quickly deliver results to decision makers.
Maarten Breddels (Maarten Breddels), Sylvain Corlay (QuantStack)
Project Jupyter aims to provide a consistent set of tools for data science workflows, from the exploratory phase of the analysis to the sharing of the results. Maarten Breddels and Sylvain Corlay offer an overview of Jupyter's interactive widgets framework, which enables rich user interaction, including 2D and 3D interactive plotting, geographic data visualization, and much more.
Afshin Darian (Two Sigma, Project Jupyter), M Pacer (Netflix), Min Ragan-Kelley (Simula Research Laboratory), Matthias Bussonnier (UC Berkeley BIDS)
Jupyter's straightforward, out-of-the-box experience has been important for its success in widespread adoption. But good defaults only go so far. Join Afshin Darian, M Pacer, Min Ragan-Kelley, and Matthias Bussonnier to go beyond the defaults and make Jupyter your own.
JupyterCon Business Summit
Julia Lane (Center for Urban Science and Progress and Wagner School, NYU)
Government agencies have found it difficult to serve taxpayers because of the technical, bureaucratic, and ethical issues associated with access and use of sensitive data. Julia Lane explains how the Coleridge Initiative has partnered with Jupyter to design ways that can address the core problems such organizations face.
JupyterHub deployments, Training and education, Usage and application
Mariah Rogers (UC Berkeley Division of Data Sciences), Ronald Walker (UC Berkeley Division of Data Sciences), Julian Kudszus (Yelp)
The Data Science Modules program at UC Berkeley creates short explorations into data science using notebooks to allow students to work hands-on with a dataset relevant to their course. Mariah Rogers, Ronald Walker, and Julian Kudszus explain the logistics behind such a program and the indispensable features of JupyterHub that enable such a unique learning experience.
Ian Rose (UC Berkeley), Chris Colbert (Project Jupyter)
Ian Rose and Chris Colbert walk you through the JupyterLab interface and codebase and explain how it fits within the overall roadmap of Project Jupyter.
Data visualization, Kernels, Usage and application
Lindsay Richman (McKinsey & Co.)
JupyterLab and Plotly both provide a rich set of tools for working with data. When combined, they create a powerful computational environment that enables users to produce versatile, robust visualizations in a fast-paced setting. Lindsay Richman demonstrates how to use JupyterLab, Plotly, and Plotly's Python-based Dash framework to create dynamic charts and interactive reports.
Chris Colbert (Project Jupyter), Ian Rose (UC Berkeley), Saul Shanabrook (Quansight)
1-Day Training Please note: you must be registered for a Platinum pass.
Chris Colbert, Ian Rose, and Saul Shanabrook walk you through using, extending, and developing custom components for JupyterLab using PhosphorJS, React, JavaScript, TypeScript, and CSS. You'll learn how to make full use of the power features of JupyterLab, customize it to your needs, and develop custom extensions, making complete use of JupyterLab's current capabilities.
Jason Grout (Bloomberg), Matthias Bussonnier (UC Berkeley BIDS)
Tutorial Please note: to attend, your registration must include Tutorials.
JupyterLab—Jupyter's new frontend—goes beyond the classic Jupyter Notebook, providing a flexible and extensible web application with a set of reusable components. Jason Grout and Matthias Bussonnier walk you through using JupyterLab, explain how to transition from the classic Jupyter Notebook frontend to JupyterLab, and demonstrate the new powerful features of JupyterLab.
Keynotes
Keynote - To Be Announced
Keynotes
Keynote - To Be Announced
Keynotes
Keynotes - To Be Announced
Keynotes
Keynote - To Be Announced
Keynotes
Keynotes - To Be Announced Soon
Keynotes
Keynote - To Be Announced
Keynotes
Dan Romuald Mbanga (Amazon Web Services)
Keynote by Dan Romuald Mbanga
Keynotes
Fernando Perez (UC Berkeley and Lawrence Berkeley National Laboratory)
Keynote by Fernando Perez
Keynotes
Paco Nathan (derwen.ai)
Keynote by Paco Nathan
Community, Jupyter subprojects, Training and education
Carol Willing (Cal Poly San Luis Obispo), Jessica Forde (Jupyter), Erik Sundell (IT-Gymnasiet Uppsala)
Students learn by doing. Carol Willing, Jessica Forde, and Erik Sundell demonstrate the value of interactive content, using Jupyter notebooks, widgets, and visualization libraries, share notable examples of projects within the Jupyter community, and outline ways educators can help students develop data science literacy and use computational skills to build upon their interests.
Extensions and customization, Usage and application
Jupyter is useful for DevOps. It enables collaboration between experts and novices to accumulate infrastructure knowledge, and automation via notebooks enhances traceability and reproducibility. Learn how the National Institute of Informatics enhances Jupyter for reproducible infrastructure operation through literate computing practices, which aim to keep humans in the automated operational loop.
Christopher Cho (Google)
1-Day Training Please note: you must be registered for a Platinum pass.
Christopher Cho demonstrates how Kubernetes can be easily leveraged to build a complete deep learning pipeline, including data ingestion and aggregation, preprocessing, ML training, and serving with the mighty Kubernetes APIs.
Core architecture, Data visualization, Extensions and customization
M Pacer (Netflix)
Jupyter displays a rich array of media types out-of-the-box. M Pacer explains how to use these capabilities to their full potential, covering how to add rich displays to existing and new Python classes and how to customize the way notebooks are converted to other formats. These skills will enable anyone to make beautiful objects with Jupyter.
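One common way to add a rich display to a Python class is to define a `_repr_html_` method, which IPython's display machinery looks for when an object is the result of a cell. The sketch below is an illustration of that hook only, not code from the talk; the `Temperature` class and its fields are hypothetical.

```python
from html import escape


class Temperature:
    """Hypothetical class with both a plain-text and an HTML representation."""

    def __init__(self, city, celsius):
        self.city = city
        self.celsius = celsius

    def __repr__(self):
        # Plain-text fallback, used in terminals and logs.
        return f"Temperature({self.city!r}, {self.celsius!r})"

    def _repr_html_(self):
        # Used by notebook frontends that can render HTML.
        # Escape the city so user-supplied text cannot inject markup.
        return f"<b>{escape(self.city)}</b>: {self.celsius:.1f} &deg;C"


reading = Temperature("New York <NY>", 29.5)
html_fragment = reading._repr_html_()
```

In a notebook, ending a cell with `reading` would render the HTML fragment; outside one, the same object falls back to its `__repr__`.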
Community, Extensions and customization, Usage and application
Sam Lau (UC Berkeley), Caleb Siu (UC Berkeley)
The nbinteract package converts Jupyter notebooks with widgets into interactive, standalone HTML pages. Its built-in support for function-driven plotting makes authoring interactive pages simpler by allowing users to focus on data, not callbacks. Sam Lau and Caleb Siu offer an overview of nbinteract and walk you through the steps to publish an interactive web page from a Jupyter notebook.
Community, JupyterCon Business Summit
Matt Greenwood (Two Sigma Investments)
Matt Greenwood explains why Two Sigma, a company in a space notorious for protecting IP, thinks it's important to contribute to the open source community. Matt covers the evolution of Two Sigma's thinking and policies over the past five years and makes a case for why other companies should make a commitment to the open source ecosystem.
Community, Usage and application
Learn how to make pandas faster by changing a single line of your code. Pandas on Ray gives users a seamless way to transition into multiprocess computing and parallel execution of their data science pipelines.
Integrations with other Software, JupyterHub deployments, Reproducible research and open science
Ryan Abernathey (Columbia University), Yuvi Panda (Data Science Education Program (UC Berkeley))
Climate science is being flooded with petabytes of data, overwhelming traditional modes of data analysis. The Pangeo project is building a platform to take big data climate science into the cloud using SciPy and large-scale interactive computing tools. Join Ryan Abernathey and Yuvi Panda to find out what the Pangeo team is building and why and learn how to use it.
Enterprise and organizational adoption, Extensions and customization, Integrations with other Software, JupyterCon Business Summit
Romit Mehta (PayPal), Praveen Kanamarlapudi (PayPal)
Hundreds of PayPal's data scientists, analysts, and developers use Jupyter to access data spread across filesystem, relational, document, and key-value stores, enabling complex analytics and an easy way to build, train, and deploy machine learning models. Romit Mehta and Praveen Kanamarlapudi explain how PayPal built its Jupyter infrastructure and powerful extensions.
Community, Reproducible research and open science, Training and education
April Clyburne-Sherin (Code Ocean)
Tutorial Please note: to attend, your registration must include Tutorials.
April Clyburne-Sherin walks you through preparing Jupyter notebooks for computationally reproducible publication. You'll learn best practices for publishing notebooks and get hands-on experience preparing your own research for reuse, creating documentation, and submitting your notebook to share.
Data visualization, Extensions and customization, JupyterCon Business Summit, JupyterHub deployments
George Williams (Capsule8), Harini Kannan (Capsule8), Alex Comerford (Capsule8)
The key to successful threat detection in cybersecurity is fast response. George Williams, Harini Kannan, and Alex Comerford offer an overview of specialized extensions they have built for data scientists working in cybersecurity that can be used and deployed via JupyterHub.
Core architecture, Reproducible research and open science, Training and education
William Stein (SageMath, Inc. | University of Washington)
William Stein explains how CoCalc relates to Project Jupyter and shares how he implemented real-time collaborative editing of Jupyter notebooks in CoCalc.
Enterprise and organizational adoption, Extensions and customization, Reproducible research and open science
Jackson Brown (Allen Institute for Cell Science), Aneesh Karve (Quilt)
Reproducible data is essential for notebooks that work across time, across contributors, and across machines. Jackson Brown and Aneesh Karve demonstrate how to use an open source data registry to create reproducible data dependencies for Jupyter and share a case study in open science over terabyte-size image datasets.
Documentation, Reproducible research and open science, Training and education
Elizabeth Wickes (School of Information Sciences, University of Illinois at Urbana-Champaign)
As practitioners of open science begin to migrate their educational material into public repositories, many of their common practices and platforms can be used to streamline the instruction material development process. Elizabeth Wickes explains how open science practices can be used in an educational context and why they are best facilitated by tools like the Jupyter Notebook.
Data visualization, Extensions and customization, Reproducible research and open science
Chris Harris (Kitware)
In silico prediction of chemical properties has seen vast improvements in both veracity and volume of data but is currently hamstrung by a lack of transparent, reproducible workflows coupled with environments for visualization and analysis. Chris Harris offers an overview of a platform that uses Jupyter notebooks to enable an end-to-end workflow from simulation setup to visualizing the results.
Rachael Tatman (Kaggle)
1-Day Training Please note: you must be registered for a Platinum pass.
Rachael Tatman shows you how to take an existing research project and make it fully reproducible using Kaggle Kernels. You'll learn best practices for and get hands-on experience with each of the three components necessary for completely reproducible research.
Reproducible research and open science
Sandra Savchenko-de Jong (Swiss Data Science Center)
Sandra Savchenko-de Jong offers an overview of Renku, a highly scalable and secure open software platform designed to make (data) science reproducible, foster collaboration between scientists, and share resources in a federated environment.
Integrations with other Software, Reproducible research and open science, Usage and application
Ian Foster (Argonne National Laboratory | University of Chicago)
The Globus service simplifies the utilization of large and distributed data on the Jupyter platform. Ian Foster explains how to use Globus and Jupyter to seamlessly access notebooks using existing institutional credentials, connect notebooks with data residing on disparate storage systems, and make data securely available to business partners and research collaborators.
Luciano Resende outlines a pattern for building deep learning models using the Jupyter Notebook's interactive development on commodity hardware and leveraging platforms and services such as Fabric for Deep Learning (FfDL) for cost-effective full-dataset training of deep learning models.
Extensions and customization, Jupyter subprojects, Usage and application
Matthew Seal (Netflix)
Using papermill, an nteract project, Matthew Seal walks you through how Netflix uses notebooks to track user jobs and make a simple interface for work submission. You’ll get an inside peek at how Netflix is tackling the scheduling problem for a range of users who want easily managed workflows.
Carl Osipov (Google)
1-Day Training Please note: you must be registered for a Platinum pass.
Carl Osipov walks you through the process of building machine learning models with TensorFlow. You'll learn about data exploration, feature engineering, model creation, training, evaluation, deployment, and more.
Extensions and customization, Kernels, Reproducible research and open science
Bo Peng (The University of Texas, MD Anderson Cancer Center)
Script of Scripts (SoS) is a Python 3-based workflow engine with a Jupyter frontend that allows the use of multiple kernels in one notebook. This unique combination allows users to analyze data using multiple scripting languages in one notebook and, if needed, convert scripts to workflows in situ to analyze large amounts of data on remote systems.
Core architecture, Kernels, Reproducible research and open science
Versioning is easy when you only need a local versioning system (v1, v2, v3, etc.). It gets hard when versioning info needs to concisely say if upgrades are safe or risky and roughly what will change. Explore StabVS, a stabilizing versioning system developed for EvoSysBio research, which could help Jupyter open science users increase the long-term stability of their code.
Data visualization, Extensions and customization, Reproducible research and open science
David Koop (University of Massachusetts Dartmouth)
Dataflow notebooks build on the Jupyter Notebook environment by adding constructs to make dependencies between cells explicit and clear. David Koop offers an overview of the Dataflow kernel, shows how it can be used to robustly link cells as a notebook is developed, and demonstrates how that notebook can be reused and extended without impacting its reproducibility.
Keynotes
Carol Willing (Cal Poly San Luis Obispo)
New challenges are emerging for Jupyter, open information, and investing in the future. You, the innovators of this growing knowledge commons, will determine how we meet these challenges and sustain the ecosystem. Carol Willing shows how you can start.
Enterprise and organizational adoption, Extensions and customization, Integrations with other Software
Diogo Castro (CERN)
SWAN, CERN’s service for web-based analysis, leverages the power of Jupyter to provide the high energy physics community access to state-of-the-art infrastructure and services through a web service. Diogo Castro offers an overview of SWAN and explains how researchers and students are using it in their work.
Enterprise and organizational adoption, Extensions and customization
Stephanie Stattel (Bloomberg LP), Paul Ivanov (Bloomberg LP)
Stephanie Stattel and Paul Ivanov walk you through a series of extensions that demonstrate the power and flexibility of JupyterLab’s architecture, from targeted functionality modifications to more extreme atmospheric changes that require extensive decoupling and flexibility within JupyterLab.
Min Ragan-Kelley (Simula Research Laboratory), Carol Willing (Cal Poly San Luis Obispo), Yuvi Panda (Data Science Education Program (UC Berkeley))
JupyterHub is a multiuser server for Jupyter notebooks, focused on supporting deployments in research and education. Min Ragan-Kelley, Carol Willing, and Yuvi Panda discuss recent additions and future plans for the project.
Integrations with other Software, Usage and application
John Miller (Honeywell UOP)
John Miller offers an overview of the Emacs IPython Notebook (EIN), a full-featured client for the Jupyter Notebook in Emacs, and shares a brief history of its development.
Keynotes
Ryan Abernathey (Columbia University)
Drawing on his experience with the Pangeo project, Ryan Abernathey makes the case for the large-scale migration of scientific data and research to the cloud. The cloud offers a way to make the largest datasets instantly accessible to the most sophisticated computational techniques. A global scientific data commons could usher in a golden age of data-driven discovery.
Community, Reproducible research and open science, Usage and application
Viral Shah (Julia Computing), Jane Herriman (Julia Computing)
Julia and Jupyter share a common evolution path: Julia is the language for modern technical computing, while Jupyter is the development and presentation environment of choice for modern technical computing. Viral Shah and Jane Herriman discuss Julia's journey and the impact of Jupyter on Julia's growth.
JupyterCon Business Summit, Training and education, Usage and application
Catherine Ordun (Booz Allen Hamilton)
Many US government agencies are just getting started in machine learning. As a result, data scientists need to de-"black box" models as much as possible. One simple way to do this is to transparently show how the model is coded and its results at each step. Notebooks do just this. Catherine Ordun walks you through a notebook built for RNNs and explains how government agencies can use notebooks.
Extensions and customization, Reproducible research and open science, Usage and application
Tony Fast (Ronin), Nick Bollweg (Georgia Tech Research Institute)
Notebook authors often consider only the interactive experience of creating computable documents. However, the dynamic state of a notebook is a minor period in its lifecycle; the majority is spent as a file at rest. Tony Fast and Nick Bollweg explore conventions that create notebooks with value long past their inception as documents, software packages, test suites, and interactive applications.
Keynotes
Mark Hansen (Columbia Journalism School | The Brown Institute for Media Innovation)
Beyond Twitter and Facebook and similar networks, without question, data, code, and algorithms are forming systems of power in our society. Mark Hansen explains why it is crucial that journalists—explainers of last resort—be able to interrogate these systems, holding power to account.
Keynotes
Paco Nathan (derwen.ai), Fernando Pérez (UC Berkeley and Lawrence Berkeley National Laboratory), Brian Granger (Cal Poly San Luis Obispo)
JupyterCon cochairs Paco Nathan, Fernando Pérez, and Brian Granger open the first day of keynotes.
JupyterCon Business Summit, JupyterHub deployments
David Schaaf (Capital One), Shivraj Ramanan (Capital One)
In Capital One's recent exploration of "notebook" offerings, JupyterHub emerged as a top contender that could serve as a potential platform for analytics even in highly regulated industries like financial services. David Schaaf and Shivraj Ramanan discuss Capital One's journey and explain how Jupyter has become a part of the company's ever-growing analytics toolkit.
Enterprise and organizational adoption, Reproducible research and open science, Usage and application
Sean Gorman (DigitalGlobe)
Satellite imagery can be a critical resource during disasters and humanitarian crises. While the community has improved data sharing, we still struggle to create reusable data science to solve on-the-ground problems. Sean Gorman offers an overview of GBDX Notebooks, a step toward creating an open data science community built around Jupyter to stream imagery and share analysis at scale.
Data visualization, Reproducible research and open science, Usage and application
Seth Lawler (Dewberry)
Creating flood maps for coastal and riverine communities requires geospatial processing, statistical analysis, finite element modeling, and a team of specialists working together. Seth Lawler explains how using the feature-rich JupyterLab to develop tools, share code with team members, and document workflows used in the creation of flood maps improves productivity and reproducibility.
Extensions and customization, Kernels, Usage and application
MapD Core is an open source analytical SQL engine that has been designed from the ground up to harness the parallelism inherent in GPUs. This enables queries on billions of rows of data in milliseconds. Randy Zwitch offers an overview of the MapD kernel extension for the Jupyter Notebook and explains how to use it in a typical machine learning workflow.
Data visualization
Nicolas Fernandez (Icahn School of Medicine at Mount Sinai)
Nicolas Fernandez offers an overview of Clustergrammer-Widget, an interactive heatmap Jupyter widget that enables users to easily explore high-dimensional data within a Jupyter notebook and share their interactive visualizations using nbviewer.
Chakri Cherukuri (Bloomberg LP)
Chakri Cherukuri offers an overview of the interactive widget ecosystem available in the Jupyter notebook and illustrates how Jupyter widgets can be used to build rich visualizations of machine learning models. Along the way, Chakri walks you through algorithms like regression, clustering, and optimization and shares a wizard for building and training deep learning models with diagnostic plots.
Community, Usage and application
Holden Karau (Google), Matt Hunt (Bloomberg)
Many of us believe that gender diversity in open source projects is important. (If you don’t, this isn’t going to convince you.) But what things are correlated with improved gender diversity, and what can we learn from similar historic industries? Holden Karau and Matt Hunt explore the diversity of different projects, examine historic EEOC complaints, and detail parallels and historic solutions.
Keynotes
Julia Meinwald (Two Sigma Investments)
Julia Meinwald outlines a few effective ways Two Sigma has identified to support the unseen labor maintaining a healthy open source ecosystem and details how the company’s thinking on this topic has evolved.