Brought to you by NumFOCUS Foundation and O’Reilly Media Inc.
The official Jupyter Conference
August 22-23, 2017: Training
August 23-25, 2017: Tutorials & Conference
New York, NY
 
Beekman/Sutton North
11:05am Jupyter: Kernels, protocols, and the IPython reference implementation Matthias Bussonnier (UC Berkeley BIDS), Paul Ivanov (Bloomberg LP)
11:55am Jupyter and the changing rituals around computation Stuart Geiger (UC Berkeley Institute for Data Science), Charlotte Cabasse-Mazel (UC Berkeley Institute for Data Science)
1:50pm Notebook narratives from industry: Inspirational real-world examples and reusable industry notebooks Patty Ryan (Microsoft), Lee Stott (Microsoft), Michael Lanzetta (Microsoft)
2:40pm Teaching from Jupyter notebooks Christian Moscardi (The Data Incubator)
4:10pm The Jupyter Notebook as document: From structure to application M Pacer (Netflix), Jess Hamrick (UC Berkeley), Damián Avila (Anaconda, Inc.)
5:00pm Jupyter notebooks and the road to enabling data-driven teams Skipper Seabold (Civis Analytics), Lori Eich (Civis Analytics)
Sutton Center/Sutton South
11:05am Jupyter notebooks and production data science workflows Andrew Therriault (City of Boston)
11:55am Deploying a reproducible course Lindsey Heagy (University of British Columbia), Rowan Cockett (3point Science)
4:10pm Data science apps: Beyond notebooks Natalino Busa (DBS)
Murray Hill
11:05am Citing the Jupyter Notebook in the scientific publication process Bernie Randles (UCLA), Hope Chen (Harvard University)
5:00pm Xeus: A framework for writing native Jupyter kernels Sylvain Corlay (QuantStack), Johan Mabille (QuantStack)
Nassau
11:05am Accelerating data-driven culture at the largest media group in Latin America with Jupyter Diogo Munaro Vieira (Globo.com), Felipe Ferreira (Globo.com)
1:50pm Closing the gap between Jupyter and academic publishing Mark Hahnel (figshare), Marius Tulbure (figshare)
4:10pm Hosting Jupyter at scale Christopher Wilcox (Microsoft)
5:00pm Democratizing access to open data by providing open computational infrastructure Yuvi Panda (UC Berkeley Data Science Education Program)
Regent Parlor
Grand Ballroom
8:50am Friday opening welcome Fernando Pérez (UC Berkeley and Lawrence Berkeley National Laboratory), Andrew Odewahn (O'Reilly Media)
8:55am Making science happen faster Jeremy Freeman (Chan Zuckerberg Initiative)
9:20am Design for reproducibility Lorena Barba (George Washington University)
9:35am Jupyter at O'Reilly Andrew Odewahn (O'Reilly Media)
9:45am The give and take of open source Brett Cannon (Microsoft | Python Software Foundation)
10:00am Where money meets open source Nadia Eghbal (GitHub)
10:15am Closing remarks
8:00am Morning Coffee | Room: Sponsor Pavilion (Grand Ballroom Foyer)
8:00am Speed Networking | Room: 3rd floor promenade
10:30am Morning Break | Room: Sponsor Pavilion (Grand Ballroom Foyer)
3:20pm Afternoon Break | Room: Sutton Complex Foyer
11:05am-11:45am (40m) Programmatic
Jupyter: Kernels, protocols, and the IPython reference implementation
Matthias Bussonnier (UC Berkeley BIDS), Paul Ivanov (Bloomberg LP)
Matthias Bussonnier and Paul Ivanov walk you through the current Jupyter architecture and protocol and explain how kernels work (decoupled from but in communication with the environment for input and output, such as a notebook document). Matthias and Paul also offer an overview of a number of kernels developed by the community and show you how you can get started writing a new kernel.
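For attendees who want a concrete picture of what "writing a new kernel" involves, the documented ipykernel wrapper-kernel API lets a minimal kernel fit in a few dozen lines. The sketch below is illustrative only; the EchoKernel name and behavior are not taken from the talk.

    from ipykernel.kernelbase import Kernel

    class EchoKernel(Kernel):
        # Metadata reported to frontends in the kernel_info reply
        implementation = 'Echo'
        implementation_version = '1.0'
        language = 'no-op'
        language_version = '0.1'
        language_info = {'name': 'echo', 'mimetype': 'text/plain',
                         'file_extension': '.txt'}
        banner = 'Echo kernel: repeats its input'

        def do_execute(self, code, silent, store_history=True,
                       user_expressions=None, allow_stdin=False):
            if not silent:
                # Stream the input back to the frontend over the IOPub channel
                self.send_response(self.iopub_socket, 'stream',
                                   {'name': 'stdout', 'text': code})
            # Reply on the shell channel
            return {'status': 'ok', 'execution_count': self.execution_count,
                    'payload': [], 'user_expressions': {}}

    if __name__ == '__main__':
        from ipykernel.kernelapp import IPKernelApp
        IPKernelApp.launch_instance(kernel_class=EchoKernel)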
11:55am-12:35pm (40m) Usage and application
Jupyter and the changing rituals around computation
Stuart Geiger (UC Berkeley Institute for Data Science), Charlotte Cabasse-Mazel (UC Berkeley Institute for Data Science)
The concept of the ritual is useful for thinking about how the core technology of Jupyter notebooks is extended through other tools, platforms, and practices. R. Stuart Geiger, Brittany Fiore-Gartland, and Charlotte Cabasse-Mazel share ethnographic findings about various rituals performed with Jupyter notebooks.
1:50pm-2:30pm (40m) Usage and application
Notebook narratives from industry: Inspirational real-world examples and reusable industry notebooks
Patty Ryan (Microsoft), Lee Stott (Microsoft), Michael Lanzetta (Microsoft)
Patty Ryan, Lee Stott, and Michael Lanzetta explore four industry examples of Jupyter notebooks that illustrate innovative applications of machine learning in manufacturing, retail, services, and education and share four reference industry Jupyter notebooks (available in both Python and R)—along with demo datasets—for practical application to your specific industry value areas.
2:40pm-3:20pm (40m) Jupyter subprojects
Teaching from Jupyter notebooks
Christian Moscardi (The Data Incubator)
Christian Moscardi shares the practical solutions developed at the Data Incubator for using Jupyter notebooks for education. Christian explores some of the open source Jupyter extensions he has written to improve the learning experience as well as tools to clean notebooks before they are committed to version control.
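As a small taste of the kind of cleanup tooling discussed here, stripping outputs before a commit takes only a few lines of nbformat. This is a generic sketch with placeholder file names, not the Data Incubator's actual extension.

    import nbformat

    nb = nbformat.read('lecture.ipynb', as_version=4)   # hypothetical input path
    for cell in nb.cells:
        if cell.cell_type == 'code':
            cell.outputs = []            # drop rendered outputs
            cell.execution_count = None  # drop execution counters
    nbformat.write(nb, 'lecture.clean.ipynb')           # commit this version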
4:10pm-4:50pm (40m) Programmatic
The Jupyter Notebook as document: From structure to application
M Pacer (Netflix), Jess Hamrick (UC Berkeley), Damián Avila (Anaconda, Inc.)
M Pacer, Jess Hamrick, and Damián Avila explain how the structured nature of the notebook document format, combined with native tools for manipulation and creation, allows the notebook to be used across a wide range of domains and applications.
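Because a notebook is a JSON document with a published schema, it can be built and rewritten programmatically. A minimal sketch using nbformat (file names and cell contents are illustrative):

    import nbformat
    from nbformat.v4 import new_notebook, new_markdown_cell, new_code_cell

    nb = new_notebook()
    nb.cells.append(new_markdown_cell('# Generated report'))
    nb.cells.append(new_code_cell("print('hello from a generated cell')"))
    nbformat.write(nb, 'generated.ipynb')  # a valid .ipynb any frontend can open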
5:00pm-5:40pm (40m) Usage and application
Jupyter notebooks and the road to enabling data-driven teams
Skipper Seabold (Civis Analytics), Lori Eich (Civis Analytics)
It’s not enough just to give data scientists access to Jupyter notebooks in the cloud. Skipper Seabold and Lori Eich argue that to build truly data-driven organizations, everyone from data scientists and managers to business stakeholders needs to work in concert to bring data science out of the wilderness and into the core of decision-making processes.
11:05am-11:45am (40m) Usage and application
Jupyter notebooks and production data science workflows
Andrew Therriault (City of Boston)
Jupyter notebooks are a great tool for exploratory analysis and early development, but what do you do when it's time to move to production? A few years ago, the obvious answer was to export to a pure Python script, but now there are other options. Andrew Therriault dives into real-world cases to explore alternatives for integrating Jupyter into production workflows.
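One option now available—shown here only as a generic sketch, not necessarily one of the cases covered in the talk—is to execute a notebook headlessly from a scheduler or pipeline using nbconvert's ExecutePreprocessor (file names are placeholders):

    import nbformat
    from nbconvert.preprocessors import ExecutePreprocessor

    nb = nbformat.read('etl_report.ipynb', as_version=4)
    ep = ExecutePreprocessor(timeout=600, kernel_name='python3')
    ep.preprocess(nb, {'metadata': {'path': '.'}})       # run all cells in order
    nbformat.write(nb, 'etl_report.executed.ipynb')      # keep the executed copy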
11:55am-12:35pm (40m) Reproducible research and open science
Deploying a reproducible course
Lindsey Heagy (University of British Columbia), Rowan Cockett (3point Science)
Web-based textbooks and interactive simulations built in Jupyter notebooks provide an entry point for course participants to reproduce content they are shown and dive into the code used to build them. Lindsey Heagy and Rowan Cockett share strategies and tools for developing an educational stack that emerged from the deployment of a course on geophysics and some lessons learned along the way.
1:50pm-2:30pm (40m) Extensions and customization
Building a powerful data science IDE for R, Python, and SQL using JupyterLab
Ali Marami (R-Brain Inc)
JupyterLab provides a robust foundation for building flexible computational environments. Ali Marami explains how R-Brain leveraged the JupyterLab extension architecture to build a powerful IDE for data scientists—one of the few tools on the market that supports R and Python equally well for data science—with features such as IntelliSense, debugging, and environment and data views.
2:40pm-3:20pm (40m) Usage and application
Collaboration and automated operation as literate computing for reproducible infrastructure
Yoshi Nobu Masatani (National Institute of Informatics)
Jupyter is useful for DevOps. It enables collaboration between experts and novices to accumulate infrastructure knowledge, while automation via notebooks enhances traceability and reproducibility. Yoshi Nobu Masatani shows how to combine Jupyter with Ansible for reproducible infrastructure and explores knowledge, workflow, and customer support as literate computing practices.
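In practice the pairing can be as simple as driving ansible-playbook from a notebook cell so that the run and its log are preserved next to the narrative. A minimal sketch, assuming a playbook and inventory already exist on the host (site.yml and hosts.ini are hypothetical names), and not necessarily the approach shown in the talk:

    import subprocess

    # Run the playbook and keep its output in the notebook for traceability
    result = subprocess.run(
        ['ansible-playbook', '-i', 'hosts.ini', 'site.yml'],
        capture_output=True, text=True)
    print(result.stdout)
    if result.returncode != 0:
        print(result.stderr)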
4:10pm-4:50pm (40m) Usage and application
Data science apps: Beyond notebooks
Natalino Busa (DBS)
Jupyter notebooks are transforming the way we look at computing, coding, and science. But is this the only "data scientist experience" that this technology can provide? Natalino Busa explains how you can create interactive web applications for data exploration and analysis that in the background are still powered by the well-understood and well-documented Jupyter Notebook.
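One widely used building block for this kind of notebook-backed interactivity—mentioned as a generic illustration, not necessarily the speaker's stack—is ipywidgets, which turns a plain function into interactive controls:

    import ipywidgets as widgets

    def preview(n=100, column='price'):
        # Placeholder for real exploration logic over a DataFrame, API, etc.
        print(f'showing first {n} rows of column {column!r}')

    # Sliders and dropdowns are generated from the argument ranges and choices
    widgets.interact(preview, n=(10, 1000, 10), column=['price', 'volume'])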
5:00pm-5:40pm (40m) Development and community
Learning to code isn’t enough: Training as a pathway to improve diversity
Kari Jordan (Data Carpentry)
Diversity can be achieved through sharing information among members of a community. Jupyter prides itself on being a community of dynamic developers, cutting-edge scientists, and everyday users, but is our platform being shared with diverse populations? Kari Jordan explains how training has the potential to improve diversity and drive usage of Jupyter notebooks in broader communities.
11:05am-11:45am (40m) Reproducible research and open science
Citing the Jupyter Notebook in the scientific publication process
Bernie Randles (UCLA), Hope Chen (Harvard University)
Although researchers have traditionally cited code and data related to their publications, they are increasingly using the Jupyter Notebook to share the processes involved in the act of scientific inquiry. Bernie Randles and Hope Chen explore various aspects of citing Jupyter notebooks in publications, discussing benefits, pitfalls, and best practices for creating the "paper of the future."
11:55am-12:35pm (40m) Usage and application
Cloud Datalab: Jupyter with the power of BigQuery and TensorFlow
Kaz Sato (Google)
Kazunori Sato explains how you can use Google Cloud Datalab—a Jupyter environment from Google that seamlessly integrates BigQuery, TensorFlow, and other Google Cloud services—to run SQL queries from Jupyter against terabytes of data in seconds and train deep learning models with TensorFlow on tens of GPUs in the cloud, all with the usual tools available in Jupyter.
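Datalab wraps these queries in notebook magics; the same query can also be issued with the plain BigQuery Python client, shown below as a generic, non-Datalab sketch (the project ID is a placeholder, and the query runs against a public sample dataset):

    from google.cloud import bigquery

    client = bigquery.Client(project='my-project')   # hypothetical project ID
    query = """
        SELECT name, SUM(number) AS total
        FROM `bigquery-public-data.usa_names.usa_1910_2013`
        GROUP BY name ORDER BY total DESC LIMIT 10
    """
    for row in client.query(query).result():         # executed server-side
        print(row.name, row.total)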
1:50pm-2:30pm (40m) Development and community
Empower scientists; save humanity: NumFOCUS—Five years in, five hundred thousand to go
Leah Silen (NumFOCUS), Andy Terrel (NumFOCUS)
What do the discovery of the Higgs boson, the landing of the Philae robot, the analysis of political engagement, and the freedom of human trafficking victims have in common? NumFOCUS projects were there. Join Leah Silen and Andy Terrel to learn how we can empower scientists and save humanity.
2:40pm-3:20pm (40m) Usage and application
Using Jupyter at the intersection of robots and industrial biology
Danielle Chou (Zymergen)
Zymergen approaches biology with an engineering and data-driven mindset. Its platform integrates robotics, software, and biology to deliver predictability and reliability during strain design and development. Danielle Chou explains the integral role Jupyter notebooks play in providing a shared Python environment between Zymergen's software engineers and scientists.
4:10pm-4:50pm (40m) Reproducible research and open science
Postpublication peer review of Jupyter notebooks referenced in articles on PubMed Central
Daniel Mietchen (University of Virginia)
Jupyter notebooks are a popular option for sharing data science workflows. Daniel Mietchen shares best practices for reproducibility and other aspects of usability (documentation, ease of reuse, etc.) gleaned from analyzing Jupyter notebooks referenced in PubMed Central, an ongoing project that started at a hackathon earlier this year and is being documented on GitHub.
5:00pm-5:40pm (40m) Kernels
Xeus: A framework for writing native Jupyter kernels
Sylvain Corlay (QuantStack), Johan Mabille (QuantStack)
Xeus takes on the burden of implementing the Jupyter kernel protocol so that kernel authors can focus on the language-specific parts of a kernel and on supporting features such as autocompletion and interactive widgets. Sylvain Corlay and Johan Mabille showcase a new C++ kernel based on the Cling interpreter built with xeus.
11:05am-11:45am (40m) Usage and application
Accelerating data-driven culture at the largest media group in Latin America with Jupyter
Diogo Munaro Vieira (Globo.com), Felipe Ferreira (Globo.com)
JupyterHub is an important tool for research and data-driven decisions at Globo.com. Diogo Munaro Vieira and Felipe Ferreira explain how data scientists at Globo.com—the largest media group in Latin America and second largest television group in the world—use Jupyter notebooks for data analysis and machine learning, making decisions that impact 50 million users per month.
11:55am-12:35pm (40m) Kernels
Scala: Why hasn't an official Scala kernel for Jupyter emerged yet?
Alexandre Archambault (Teads.tv)
Alexandre Archambault explores why an official Scala kernel for Jupyter has yet to emerge. Part of the answer lies in the fact that there is no user-friendly, easy-to-use Scala shell in the console (i.e., no IPython for Scala). But there's a new contender, Ammonite—although it still has to overcome a few challenges, not least gaining support from big data frameworks like Spark, Scio, and Scalding.
1:50pm-2:30pm (40m) Reproducible research and open science
Closing the gap between Jupyter and academic publishing
Mark Hahnel (figshare), Marius Tulbure (figshare)
Reports of a lack of reproducibility have led funders and others to require open data and code as the outputs of research they fund. Mark Hahnel and Marius Tulbure discuss the opportunities for Jupyter notebooks to be the final output of academic research, arguing that Jupyter could help disrupt the inefficiencies in cost and scale of open access academic publishing.
2:40pm-3:20pm (40m) Reproducible research and open science
Defactoring pace of change: Reviewing computational research in the digital humanities
Matt Burton (University of Pittsburgh)
While Jupyter notebooks are a boon for computational science, they are also a powerful tool in the digital humanities. Matt Burton offers an overview of the digital humanities community, discusses defactoring—a novel use of Jupyter notebooks to analyze computational research—and reflects upon Jupyter’s relationship to scholarly publishing and the production of knowledge.
4:10pm-4:50pm (40m) Usage and application
Hosting Jupyter at scale
Christopher Wilcox (Microsoft)
Have you thought about what it takes to host 500+ Jupyter users concurrently? What about managing 17,000+ users and their content? Christopher Wilcox explains how Azure Notebooks does this daily and discusses the challenges faced in designing and building a scalable Jupyter service.
5:00pm-5:40pm (40m) JupyterHub deployments
Democratizing access to open data by providing open computational infrastructure
Yuvi Panda (UC Berkeley Data Science Education Program)
Open data by itself is not enough. You need open computational infrastructures as well. Yuvi Panda offers an overview of a volunteer-led open knowledge movement that makes all of its data available openly and explores the free, open, and public computational infrastructure recently set up for people to play with and build things on its data (using a JupyterHub deployment).
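Deployments like this are typically driven by a jupyterhub_config.py. The snippet below is a generic illustration of the kinds of knobs involved (spawner, resource caps, admin users), not the configuration of the deployment described in the talk:

    # jupyterhub_config.py -- illustrative values only
    c = get_config()  # injected by JupyterHub when it loads this file

    # Launch each user's notebook server in its own Docker container
    c.JupyterHub.spawner_class = 'dockerspawner.DockerSpawner'
    c.DockerSpawner.image = 'jupyter/scipy-notebook'

    # Cap per-user resources so one notebook can't starve the host
    c.Spawner.mem_limit = '1G'
    c.Spawner.cpu_limit = 1.0

    # Hub administrators
    c.Authenticator.admin_users = {'admin'}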
11:05am-11:45am (40m) Sponsored, Usage and application
Model interpretation guidelines for the enterprise: Using Jupyter’s interactiveness to build better predictive models (sponsored by DataScience.com)
Pramit Choudhary (h2o.ai)
Pramit Choudhary offers an overview of DataScience.com's model interpretation library Skater, explains how to use it to evaluate models using the Jupyter environment, and shares how it could help analysts, data scientists, and statisticians better understand their model behavior—without compromising on the choice of algorithm.
11:55am-12:35pm (40m) Sponsored
Data science encapsulation and deployment with Anaconda Project and JupyterLab (sponsored by Anaconda Powered by Continuum Analytics)
Christine Doig (Anaconda)
Christine Doig offers an overview of the Anaconda Project, an open source library created by Continuum Analytics that delivers lightweight, efficient encapsulation and portability of data science projects. A JupyterLab extension enables data scientists to install the necessary dependencies, download datasets, and set environment variables and deployment commands from a graphical interface.
8:50am-8:55am (5m)
Friday opening welcome
Fernando Pérez (UC Berkeley and Lawrence Berkeley National Laboratory), Andrew Odewahn (O'Reilly Media)
Program chairs Fernando Pérez and Andrew Odewahn open the second day of keynotes.
8:55am-9:10am (15m)
Making science happen faster
Jeremy Freeman (Chan Zuckerberg Initiative)
Modern biology is evolving quickly, but if we want to make our science more robust, more scalable, and more reproducible, the major bottleneck is computation. Jeremy Freeman offers an overview of a growing ecosystem of solutions to this challenge—many of which involve Jupyter—in the context of exciting scientific projects past, present, and future.
9:10am-9:20am (10m) Sponsored Keynote
Three movements driving enterprise adoption of Jupyter (sponsored by DataScience.com)
William Merchan (DataScience.com)
William Merchan outlines the fundamental trends driving the adoption of Jupyter and shares lessons learned deploying Jupyter in large organizations. Join in to learn best practices in developing a high-performing data science team and moving data science to the core and discover where data science platforms fit in.
9:20am-9:35am (15m)
Design for reproducibility
Lorena Barba (George Washington University)
Lorena Barba explores how to build the ability to support reproducible research into the design of tools like Jupyter and explains how better insights on designing for reproducibility might help extend this design to our research workflows, with the machine as our active collaborator.
9:35am-9:45am (10m)
Jupyter at O'Reilly
Andrew Odewahn (O'Reilly Media)
For almost five years, O’Reilly Media has centered its publishing processes around tools like Jupyter, Git, GitHub, Docker, and a host of open source packages. Andrew Odewahn explores how O'Reilly is using the Jupyter architecture to create the next generation of technical content and offers a preview of what's in store for the future.
9:45am-10:00am (15m)
The give and take of open source
Brett Cannon (Microsoft | Python Software Foundation)
Brett Cannon explains why, in order for open source projects to function long-term, a symbiotic relationship between user and project maintainer needs to exist. When users receive a useful piece of software and project maintainers receive useful help in maintaining the project, everyone is happy.
10:00am-10:15am (15m)
Where money meets open source
Nadia Eghbal (GitHub)
We know money has an important role to play in open source, but where does it help and where does it fall short? Nadia Eghbal explores how money can support open source development without changing its incentives—especially when grants are involved.
10:15am-10:30am (15m)
Closing remarks
Program chairs Fernando Pérez and Andrew Odewahn close the second day of keynotes.
8:00am-8:50am (50m)
Break: Morning Coffee
8:00am-8:30am (30m)
Speed Networking
Gather before keynotes on Thursday and Friday morning for a speed networking event. Enjoy casual conversation while meeting new attendees.
10:30am-11:05am (35m)
Break: Morning Break
12:35pm-1:50pm (1h 15m)
Lunch (sponsored by DataScience.com) and Friday Industry Tables
Industry Table discussions are a great way to informally network with people in similar industries or interested in the same topics. Industry Table discussions will happen during lunch on Thursday, August 24, and Friday, August 25.
3:20pm-4:10pm (50m)
Break: Afternoon Break