Brought to you by NumFOCUS Foundation and O’Reilly Media
The official Jupyter Conference
Aug 21-22, 2018: Training
Aug 22-24, 2018: Tutorials & Conference
New York, NY

Schedule: Reproducible research and open science sessions

9:00am–12:30pm Wednesday, August 22, 2018
Location: Murray Hill A Level: Beginner
Carol Willing (Cal Poly San Luis Obispo), Min Ragan-Kelley (Simula Research Laboratory), Erik Sundell (IT-Gymnasiet Uppsala)
Average rating: 4.50 (2 ratings)
Carol Willing, Min Ragan-Kelley, and Erik Sundell demonstrate how to provide easy access to Jupyter notebooks and JupyterLab without requiring users to install anything on their computers. You'll learn how to configure and deploy a cloud-based JupyterHub using Kubernetes and how to customize and extend it for your needs.
9:00am–12:30pm Wednesday, August 22, 2018
Location: Gramercy B Level: Beginner
April Clyburne-Sherin (Code Ocean)
Average rating: 3.00 (2 ratings)
April Clyburne-Sherin walks you through preparing Jupyter notebooks for computationally reproducible publication. You'll learn best practices for publishing notebooks and get hands-on experience preparing your own research for reuse, creating documentation, and submitting your notebook to share.
1:30pm–5:00pm Wednesday, August 22, 2018
Location: Murray Hill B
Bruno Goncalves (JPMorgan Chase & Co.), Matt Brems (General Assembly)
This two-part tutorial presents a sequence of advanced data science topics using Jupyter.
11:05am–11:45am Thursday, August 23, 2018
Location: Nassau Level: Intermediate
Ian Foster (Argonne National Laboratory | University of Chicago)
Average rating: 5.00 (1 rating)
The Globus service simplifies the use of large and distributed data on the Jupyter platform. Ian Foster explains how to use Globus and Jupyter to seamlessly access notebooks using existing institutional credentials, connect notebooks with data residing on disparate storage systems, and make data securely available to business partners and research collaborators.
11:55am–12:35pm Thursday, August 23, 2018
Location: Murray Hill Level: Beginner
Adam Thornton (LSST)
LSST is an ambitious project to map the sky in the fastest, widest, and deepest survey ever made. The project's database disrupts traditional astronomical workflows, and its science platform requires a paradigm shift in how astronomy is done. Adam Thornton discusses the challenges of providing production services on a notebook-based architecture and the compelling advantages of JupyterLab.
11:55am–12:35pm Thursday, August 23, 2018
Location: Nassau Level: Beginner
Chris Harris (Kitware)
Average rating: 5.00 (1 rating)
In silico prediction of chemical properties has seen vast improvements in both veracity and volume of data but is currently hamstrung by a lack of transparent, reproducible workflows coupled with environments for visualization and analysis. Chris Harris offers an overview of a platform that uses Jupyter notebooks to enable an end-to-end workflow from simulation setup to visualizing the results.
1:50pm–2:30pm Thursday, August 23, 2018
Location: Murray Hill Level: Beginner
Ryan Abernathey (Columbia University), Yuvi Panda (Data Science Education Program (UC Berkeley))
Average rating: 5.00 (1 rating)
Climate science is being flooded with petabytes of data, overwhelming traditional modes of data analysis. The Pangeo project is building a platform to take big data climate science into the cloud using SciPy and large-scale interactive computing tools. Join Ryan Abernathey and Yuvi Panda to find out what the Pangeo team is building and why, and learn how to use it.
1:50pm–2:30pm Thursday, August 23, 2018
Location: Nassau Level: Intermediate
Tony Fast (Ronin), Nick Bollweg (Georgia Tech Research Institute)
Average rating: 2.00 (3 ratings)
Notebook authors often consider only the interactive experience of creating computable documents. However, the dynamic state of a notebook is a minor period in its lifecycle; the majority is spent as a file at rest. Tony Fast and Nick Bollweg explore conventions that create notebooks with value long past their inception as documents, software packages, test suites, and interactive applications.
2:40pm–3:20pm Thursday, August 23, 2018
Location: Murray Hill Level: Intermediate
William Stein (SageMath, Inc. | University of Washington)
Average rating: 4.50 (2 ratings)
William Stein explains how CoCalc relates to Project Jupyter and shares how he implemented real-time collaborative editing of Jupyter notebooks in CoCalc.
2:40pm–3:20pm Thursday, August 23, 2018
Location: Nassau Level: Non-technical
Viral Shah (Julia Computing), Jane Herriman (Julia Computing), Stefan Karpinski (Julia Computing, Inc.)
Julia and Jupyter share a common evolution path: Julia is the language for modern technical computing, while Jupyter is the development and presentation environment of choice for modern technical computing. Viral Shah and Jane Herriman discuss Julia's journey and the impact of Jupyter on Julia's growth.
4:10pm–4:50pm Thursday, August 23, 2018
Location: Murray Hill Level: Beginner
Thorin Tabor (University of California, San Diego)
Average rating: 4.50 (2 ratings)
Making Jupyter accessible to all members of a research organization, regardless of their programming ability, empowers it to best utilize the latest analysis methods while avoiding bottlenecks. Thorin Tabor offers an overview of the GenePattern Notebook, which provides a wide suite of enhancements to the Jupyter environment to help bridge the gap between programmers and nonprogrammers.
5:00pm–5:40pm Thursday, August 23, 2018
Location: Murray Hill Level: Beginner
Bo Peng (The University of Texas, MD Anderson Cancer Center)
Bo Peng offers an overview of Script of Scripts (SoS), a Python 3-based workflow engine with a Jupyter frontend that allows the use of multiple kernels in one notebook. This unique combination enables users to analyze data using multiple scripting languages in one notebook and, if needed, convert scripts to workflows in situ to analyze large amounts of data on remote systems.
5:00pm–5:40pm Thursday, August 23, 2018
Location: Nassau Level: Beginner
Tim Head (Wild Tree Tech)
Average rating: 4.33 (3 ratings)
The Binder project drastically lowers the bar to sharing and reusing software. Users wanting to try out someone else’s work need only click a single link to do so. Tim Head offers an overview of the Binder project and explores the concepts and ideas behind it. Tim then showcases examples from the community to show off the power of Binder.
11:05am–11:45am Friday, August 24, 2018
Location: Sutton Center/Sutton South Level: Intermediate
Jackson Brown (Allen Institute for Cell Science), Aneesh Karve (Quilt)
Average rating: 5.00 (3 ratings)
Reproducible data is essential for notebooks that work across time, across contributors, and across machines. Jackson Brown and Aneesh Karve demonstrate how to use an open source data registry to create reproducible data dependencies for Jupyter and share a case study in open science over terabyte-size image datasets.
11:05am–11:45am Friday, August 24, 2018
Location: Nassau Level: Beginner
Sandra Savchenko-de Jong (Swiss Data Science Center)
Average rating: 5.00 (2 ratings)
Sandra Savchenko-de Jong offers an overview of Renku, a highly scalable and secure open software platform designed to make (data) science reproducible, foster collaboration between scientists, and share resources in a federated environment.
11:55am–12:35pm Friday, August 24, 2018
Location: Murray Hill Level: Intermediate
Tyler Erickson (Google)
Massive collections of data on the Earth's changing environment, collected by satellite sensors and generated by Earth system models, are being exposed via web APIs by multiple providers. Tyler Erickson highlights the use of JupyterLab and Jupyter widgets in analyzing complex high-dimensional datasets, providing insights into how our Earth is changing and what the future might look like.
1:50pm–2:30pm Friday, August 24, 2018
Location: Murray Hill Level: Beginner
Seth Lawler (Dewberry)
Average rating: 5.00 (1 rating)
Creating flood maps for coastal and riverine communities requires geospatial processing, statistical analysis, finite element modeling, and a team of specialists working together. Seth Lawler explains how using the feature-rich JupyterLab to develop tools, share code with team members, and document workflows used in the creation of flood maps improves productivity and reproducibility.
1:50pm–2:30pm Friday, August 24, 2018
Location: Nassau Level: Intermediate
David Koop (University of Massachusetts Dartmouth)
Average rating: 4.50 (2 ratings)
Dataflow notebooks build on the Jupyter Notebook environment by adding constructs to make dependencies between cells explicit and clear. David Koop offers an overview of the Dataflow kernel, shows how it can be used to robustly link cells as a notebook is developed, and demonstrates how that notebook can be reused and extended without impacting its reproducibility.
2:40pm–3:20pm Friday, August 24, 2018
Location: Beekman/Sutton North Level: Non-technical
Elizabeth Wickes (School of Information Sciences, University of Illinois at Urbana-Champaign)
As practitioners of open science begin to migrate their educational material into public repositories, many of their common practices and platforms can be used to streamline the instruction material development process. Elizabeth Wickes explains how open science practices can be used in an educational context and why they are best facilitated by tools like the Jupyter Notebook.
2:40pm–3:20pm Friday, August 24, 2018
Location: Nassau Level: Intermediate
Kevin Zielnicki (Stitch Fix)
Average rating: 4.50 (2 ratings)
Even with good intentions, analysis notebooks can quickly accumulate a mess of false starts and out-of-order statements. Best practices encourage cleaning up a notebook to ensure reproducibility, but many analyses will never reach this cleaned-up state. Kevin Zielnicki offers an overview of Nodebook, a Jupyter plugin that encourages reproducibility by preventing inconsistency.
4:10pm–4:50pm Friday, August 24, 2018
Location: Murray Hill Level: Beginner
Sean Gorman (DigitalGlobe)
Satellite imagery can be a critical resource during disasters and humanitarian crises. While the community has improved data sharing, we still struggle to create reusable data science to solve problems on the ground. Sean Gorman offers an overview of GBDX Notebooks, a step toward creating an open data science community built around Jupyter to stream imagery and share analysis at scale.
5:00pm–5:40pm Friday, August 24, 2018
Location: Murray Hill Level: Intermediate
Joshua Patterson (NVIDIA), Keith Kraus (NVIDIA), Leo Meyerovich (Graphistry)
Joshua Patterson, Leo Meyerovich, and Keith Kraus demonstrate how to use PyGDF and other GoAi technologies to easily analyze and interactively visualize large datasets from standard Jupyter notebooks.
5:00pm–5:40pm Friday, August 24, 2018
Location: Nassau Level: Intermediate
Scott Sanderson (Quantopian)
Average rating: 4.67 (3 ratings)
Scott Sanderson explores how interactivity can and should influence the design of software libraries, details how the needs of interactive users differ from the needs of application developers, and shares techniques for improving the usability of libraries in interactive environments without sacrificing robustness in noninteractive environments.