Brought to you by NumFOCUS Foundation and O’Reilly Media Inc.

The official Jupyter Conference

August 22-23, 2017: Training

August 23-25, 2017: Tutorials & Conference

New York, NY

Speaker slides & video

Presentation slides will be made available after the session has concluded and the speaker has given us the files. Check back if you don't see the file you're looking for—it might be available later! (However, please note some speakers choose not to share their presentations.)

Accelerating data-driven culture at the largest media group in Latin America with Jupyter

Diogo Munaro Vieira (Globo.com), Felipe Ferreira (Globo.com)

View slides

JupyterHub is an important tool for research and data-driven decisions at Globo.com. Diogo Munaro Vieira and Felipe Ferreira explain how data scientists at Globo.com, the largest media group in Latin America and the second-largest television group in the world, use Jupyter notebooks for data analysis and machine learning, making decisions that impact 50 million users per month.

Beautiful networks and network analytics made simpler with Jupyter

Daina Bouquin (Harvard-Smithsonian Center for Astrophysics), John DeBlase (CUNY Building Performance Lab)

Download slides (PDF)

Performing network analytics with NetworkX and Jupyter often results in difficult-to-examine hairballs rather than useful visualizations. Meanwhile, more flexible tools like SigmaJS have high learning curves for people new to JavaScript. Daina Bouquin and John DeBlase share a simple, flexible architecture that can help create beautiful JavaScript networks without ditching the Jupyter Notebook.
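
As a rough illustration of the handoff this abstract describes (graph analysis in Python, rendering in JavaScript), the sketch below serializes an edge list to JSON with a simple circular layout. It is not the speakers' architecture. The `to_sigma_json` helper is hypothetical, and the `nodes`/`edges` key layout is an assumption modeled on the JSON shape commonly fed to JavaScript graph renderers such as SigmaJS; only the Python standard library is used.

```python
import json
import math


def to_sigma_json(edges):
    """Convert (source, target) pairs into a nodes/edges dictionary.

    Hypothetical helper: the key names are an assumed renderer format.
    """
    # Collect the distinct node ids in a stable order.
    nodes = sorted({name for edge in edges for name in edge})

    # Node degree is used as the display size.
    degree = {name: 0 for name in nodes}
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1

    # Place nodes evenly on a unit circle so the renderer has coordinates.
    node_list = []
    for index, name in enumerate(nodes):
        angle = 2 * math.pi * index / len(nodes)
        node_list.append({
            "id": name,
            "label": name,
            "x": round(math.cos(angle), 3),
            "y": round(math.sin(angle), 3),
            "size": degree[name],
        })

    edge_list = [
        {"id": f"e{index}", "source": source, "target": target}
        for index, (source, target) in enumerate(edges)
    ]
    return {"nodes": node_list, "edges": edge_list}


edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
graph = to_sigma_json(edges)
payload = json.dumps(graph)  # string that a JavaScript front end could load
```

In a notebook, a payload like this could be written to a file or passed to a JavaScript cell, which keeps the layout and styling decisions on the JavaScript side.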

Building a notebook platform for 100,000 users

Scott Sanderson (Quantopian)

Download slides (PDF)

Scott Sanderson describes the architecture of the Quantopian Research Platform, a Jupyter Notebook deployment serving a community of over 100,000 users, explaining how, using standard extension mechanisms, it provides robust storage and retrieval of hundreds of gigabytes of notebooks, integrates notebooks into an existing web application, and enables sharing notebooks between users.

Building a powerful data science IDE for R, Python, and SQL using JupyterLab

Ali Marami (R-Brain Inc)

Download slides (PDF)

JupyterLab provides a robust foundation for building flexible computational environments. Ali Marami explains how R-Brain leveraged the JupyterLab extension architecture to build a powerful IDE for data scientists, one of the few tools in the market that evenly supports R and Python in data science and includes features such as IntelliSense, debugging, and environment and data view.

Building interactive applications and dashboards in the Jupyter Notebook (sponsored by Bloomberg)

Romain Menegaux (Bloomberg LP), Chakri Cherukuri (Bloomberg LP)

Download slides (PDF)

Romain Menegaux and Chakri Cherukuri demonstrate how to develop advanced applications and dashboards using open source projects, illustrated with examples in machine learning, finance, and neuroscience.

Closing the gap between Jupyter and academic publishing

Mark Hahnel (figshare), Marius Tulbure (figshare)

Download slides (PDF)

Reports of a lack of reproducibility have led funders and others to require open data and code as the outputs of research they fund. Mark Hahnel and Marius Tulbure discuss the opportunities for Jupyter notebooks to be the final output of academic research, arguing that Jupyter could help disrupt the inefficiencies in cost and scale of open access academic publishing.

Collaboration and automated operation as literate computing for reproducible infrastructure

Yoshi Nobu Masatani (National Institute of Informatics)

View slides

Jupyter is useful for DevOps. It enables collaboration between experts and novices to accumulate infrastructure knowledge, while automation via notebooks enhances traceability and reproducibility. Yoshi Nobu Masatani shows how to combine Jupyter with Ansible for reproducible infrastructure and explores knowledge, workflow, and customer support as literate computing practices.

Data science at UC Berkeley: 2,000 undergraduates, 50 majors, no command line

Gunjan Baid (UC Berkeley), Vinitra Swamy (UC Berkeley)

Download slides (PDF)

Engaging critically with data is now a required skill for students in all areas, but many traditional data science programs aren’t easily accessible to those without prior computing experience. Gunjan Baid and Vinitra Swamy explore UC Berkeley's Data Science program—2,000 students across 50 majors—explaining how its pedagogy was designed to make data science accessible to everyone.

Data science without borders

Wes McKinney (Two Sigma Investments)

Download slides (PDF)

Wes McKinney makes the case for a shared infrastructure for data science, discusses the open source community's efforts on Apache Arrow, and offers a vision for seamless computation and data sharing across languages.

Defactoring pace of change: Reviewing computational research in the digital humanities

Matt Burton (University of Pittsburgh)

Download slides (PDF)

While Jupyter notebooks are a boon for computational science, they are also a powerful tool in the digital humanities. Matt Burton offers an overview of the digital humanities community, discusses defactoring—a novel use of Jupyter notebooks to analyze computational research—and reflects upon Jupyter’s relationship to scholarly publishing and the production of knowledge.

Deploying a reproducible course

Lindsey Heagy (University of British Columbia), Rowan Cockett (3point Science)

Download slides (PDF)

Web-based textbooks and interactive simulations built in Jupyter notebooks provide an entry point for course participants to reproduce content they are shown and dive into the code used to build them. Lindsey Heagy and Rowan Cockett share strategies and tools for developing an educational stack that emerged from the deployment of a course on geophysics and some lessons learned along the way.

Design for reproducibility

Lorena Barba (George Washington University)

Download slides (PDF)

Lorena Barba explores how to build the ability to support reproducible research into the design of tools like Jupyter and explains how better insights on designing for reproducibility might help extend this design to our research workflows, with the machine as our active collaborator.

Enhancing data journalism with Jupyter

Karlijn Willems (DataCamp)

Download slides (PDF)

Drawing inspiration from narrative theory and design thinking, Karlijn Willems walks you through effectively using Jupyter notebooks to guide the data journalism workflow and tackle some of the challenges that data can pose to data journalism.

GeoNotebook: An extension to the Jupyter Notebook for exploratory geospatial analysis

Christopher Kotfila (Kitware)

Download slides (PDF)

Chris Kotfila offers an overview of the GeoNotebook extension to the Jupyter Notebook, which provides interactive visualization and analysis of geospatial data. Unlike other geospatial extensions to the Jupyter Notebook, GeoNotebook includes a fully integrated tile server providing easy visualization of vector and raster data formats.

How Jupyter makes experimental and computational collaborations easy

Zach Sailer (University of Oregon)

View slides

Scientific research thrives on collaborations between computational and experimental groups, who work together to solve problems using their separate expertise. Zach Sailer highlights how tools like the Jupyter Notebook, JupyterHub, and ipywidgets can be used to make these collaborations smoother and more effective.

How JupyterHub tamed big science: Experiences deploying Jupyter at a supercomputing center

Shreyas Cholia (Lawrence Berkeley National Laboratory), Rollin Thomas (Lawrence Berkeley National Laboratory), Shane Canon (Lawrence Berkeley National Laboratory)

Download slides (PDF)

Shreyas Cholia, Rollin Thomas, and Shane Canon share their experience leveraging JupyterHub to enable notebook services for data-intensive supercomputing on the Cray XC40 Cori system at the National Energy Research Scientific Computing Center (NERSC).

How the Jupyter Notebook helped fast.ai teach deep learning to 50,000 students

Rachel Thomas (fast.ai)

Download slides (PDF)

Although some claim you must start with advanced math to use deep learning, the best way for any coder to get started is with code. Rachel Thomas explains how fast.ai's Practical Deep Learning for Coders course uses Jupyter notebooks to provide an environment that encourages students to learn deep learning through experimentation.

Humans in the loop: Jupyter notebooks as a frontend for AI pipelines at scale

Paco Nathan (derwen.ai)

Download slides (PDF)

Paco Nathan reviews use cases where Jupyter provides a frontend to AI as the means for keeping humans in the loop. This process enhances the feedback loop between people and machines, and the end result is that a smaller group of people can handle a wider range of responsibilities for building and maintaining a complex system of automation.

Jupyter and Anaconda: Shaking up the enterprise (sponsored by Anaconda Powered by Continuum Analytics)

Peter Wang (Anaconda)

Download slides (PDF)

In recent years, open source has emerged as a valuable player in the enterprise, and projects like Jupyter and Anaconda are leading the way. Peter Wang discusses the coevolution of these two major players in the new open data science ecosystem and shares next steps to a sustainable future.

Jupyter notebooks and production data science workflows

Andrew Therriault (City of Boston)

View slides

Jupyter notebooks are a great tool for exploratory analysis and early development, but what do you do when it's time to move to production? A few years ago, the obvious answer was to export to a pure Python script, but now there are other options. Andrew Therriault dives into real-world cases to explore alternatives for integrating Jupyter into production workflows.
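
For readers unfamiliar with the "export to a pure Python script" step mentioned above, here is a minimal sketch of what it involves, assuming a notebook stored in the nbformat 4 JSON layout (a top-level `cells` list whose entries carry `cell_type` and `source`). The `notebook_to_script` function is hypothetical and is not taken from the talk; in practice, `jupyter nbconvert --to script` performs this conversion.

```python
import json


def notebook_to_script(notebook_json: str) -> str:
    """Concatenate the code cells of a notebook into one Python script."""
    notebook = json.loads(notebook_json)
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # markdown and raw cells are not executable code
        source = cell.get("source", "")
        # The notebook format allows source as a string or a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        # Simplifying assumption: drop IPython magics and shell escapes,
        # which are not valid in a plain Python script.
        lines = [
            line for line in source.splitlines()
            if not line.lstrip().startswith(("%", "!"))
        ]
        if lines:
            chunks.append("\n".join(lines))
    return "\n\n".join(chunks) + "\n"


# A tiny in-memory notebook used only for illustration.
example = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 2,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "outputs": [], "source": ["%matplotlib inline\n", "x = 2\n"]},
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "outputs": [], "source": "y = x * 21\n"},
    ],
})

script = notebook_to_script(example)
```

The resulting string can be saved as a `.py` file and run outside Jupyter, which is the baseline that the alternatives discussed in the session are compared against.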

Labz 'N Da Wild 2.0: Teaching signal and data processing at scale using Jupyter notebooks in the cloud

Demba Ba (Harvard University)

Download slides (PDF)

Demba Ba discusses two new signal processing/statistical modeling courses he designed and implemented at Harvard, exploring his perspective as an educator and that of the students as well as the steps that led him to adopt the current cloudJHub architecture. Along the way, Demba outlines the potential of architectures such as cloudJHub to help to democratize data science education.

Learning to code isn’t enough: Training as a pathway to improve diversity

Kari Jordan (Data Carpentry)

Download slides (PPT)

Diversity can be achieved through sharing information among members of a community. Jupyter prides itself on being a community of dynamic developers, cutting-edge scientists, and everyday users, but is our platform being shared with diverse populations? Kari Jordan explains how training has the potential to improve diversity and drive usage of Jupyter notebooks in broader communities.

Mapping data in Jupyter notebooks with PixieDust (sponsored by IBM)

Raj Singh (IBM Cloud Data Services)

View slides

Raj Singh offers an overview of PixieDust, a Jupyter Notebook extension that provides an easy way to make interactive maps from DataFrames for visual exploratory data analysis. Raj explains how he built mapping into PixieDust, putting data from Apache Spark-based analytics on maps using Mapbox GL.

Meet the Expert with Yoshi Nobu Masatani (National Institute of Informatics)

Yoshi Nobu Masatani (National Institute of Informatics)

View slides

Interested in literate computing for reproducibility and nblineage? Or understanding the notebook lifecycle and the consequences of computational narratives? Grab this opportunity to meet Nobu.

Model interpretation guidelines for the enterprise: Using Jupyter’s interactiveness to build better predictive models (sponsored by DataScience.com)

Pramit Choudhary (h2o.ai)

Download slides (PDF)

Pramit Choudhary offers an overview of DataScience.com's model interpretation library Skater, explains how to use it to evaluate models using the Jupyter environment, and shares how it could help analysts, data scientists, and statisticians better understand their model behavior without compromising on the choice of algorithm.

Music and Jupyter: A combo for creating collaborative narratives for teaching

Carol Willing (Cal Poly San Luis Obispo)

View slides

Music engages and delights. Carol Willing explains how to explore and teach the basics of interactive computing and data science by combining music with Jupyter notebooks, using music21, a tool for computer-aided musicology, and Magenta, a TensorFlow project for making music with machine learning, to create collaborative narratives and publishing materials for teaching and learning.

Notebook narratives from industry: Inspirational real-world examples and reusable industry notebooks

Patty Ryan (Microsoft), Lee Stott (Microsoft), Michael Lanzetta (Microsoft)

View slides

Patty Ryan, Lee Stott, and Michael Lanzetta explore four industry examples of Jupyter notebooks that illustrate innovative applications of machine learning in manufacturing, retail, services, and education and share four reference industry Jupyter notebooks (available in both Python and R)—along with demo datasets—for practical application to your specific industry value areas.

Project Jupyter: From interactive Python to open science

Fernando Pérez (UC Berkeley and Lawrence Berkeley National Laboratory)

Download slides (PDF)

Fernando Pérez opens JupyterCon with an overview of Project Jupyter, describing how it fits into a vision of collaborative, community-based open development of tools applicable to research, education, and industry.

Scala: Why hasn't an official Scala kernel for Jupyter emerged yet?

Alexandre Archambault (Teads.tv)

Download slides (PDF)

Alexandre Archambault explores why an official Scala kernel for Jupyter has yet to emerge. Part of the answer lies in the fact that there is no user-friendly, easy-to-use Scala shell in the console (i.e., no IPython for Scala). But there's a new contender, Ammonite, although it still has to overcome a few challenges, not least being supported by big data frameworks like Spark, Scio, and Scalding.

Elite Sponsors

Strategic Sponsor

Bloomberg

Impact Sponsor

Domino Data Lab

©2017, O'Reilly Media, Inc. • (800) 889-8969 or (707) 827-7019 • Monday-Friday 7:30am-5pm PT • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners. • confreg@oreilly.com