Brought to you by NumFOCUS Foundation and O’Reilly Media
The official Jupyter Conference
Aug 21-22, 2018: Training
Aug 22-24, 2018: Tutorials & Conference
New York, NY
 
Beekman/Sutton North
11:05am Data science in US and Canadian higher education Laura Noren (Obsidian Security)
11:55am Canadians land on Jupyter Ian Allison (Pacific Institute for the Mathematical Sciences), James Colliander (Pacific Institute for the Mathematical Sciences)
1:50pm Jupyter graduates Douglas Blank (Comet.ML)
2:40pm Reproducible education: What teaching can learn from open science practices Elizabeth Wickes (School of Information Sciences, University of Illinois at Urbana-Champaign)
4:10pm nbinteract: Shareable interactive web pages from notebooks Sam Lau (UC Berkeley), Caleb Siu (UC Berkeley)
5:00pm Current RISE capabilities and its evolution into the future Damián Avila (Anaconda, Inc.)
Sutton Center/Sutton South
11:55am Jupyter widgets Maarten Breddels (Maarten Breddels), Sylvain Corlay (QuantStack)
1:50pm What things are correlated with gender diversity: A dig through the ASF and Jupyter projects Holden Karau (Independent), Matthew Hunt (Bloomberg)
2:40pm Jupyter's configuration system Afshin Darian (Two Sigma | Project Jupyter), M Pacer (Netflix), Min Ragan-Kelley (Simula Research Laboratory), Matthias Bussonnier (UC Berkeley BIDS)
4:10pm JupyterLab Ian Rose (UC Berkeley), Chris Colbert (Project Jupyter)
5:00pm The Emacs IPython Notebook John Miller (Honeywell UOP)
Murray Hill
5:00pm GoAi and PyGDF: GPU-accelerated data science with Jupyter notebooks Joshua Patterson (NVIDIA), Keith Kraus (NVIDIA), Leo Meyerovich (Graphistry)
Nassau
11:05am Reproducible science with the Renku platform Sandra Savchenko-de Jong (Swiss Data Science Center)
11:55am I don't like notebooks. Joel Grus (Allen Institute for Artificial Intelligence)
1:50pm Supporting reproducibility in Jupyter through dataflow notebooks David Koop (University of Massachusetts Dartmouth)
2:40pm Explorations in reproducible analysis with Nodebook Kevin Zielnicki (Stitch Fix)
4:10pm JupyterLab and Plotly: A data visualization power couple Lindsay Richman (McKinsey & Co.)
5:00pm Designing for interaction Scott Sanderson (Quantopian)
Concourse A: Business Summit
1:50pm Rapid data science exploration for cybersecurity George Williams (GSI Technology), Harini Kannan (Capsule8), Alex Comerford (Capsule8)
4:10pm PayPal Notebooks: Data science and machine learning at scale, powered by Jupyter Romit Mehta (PayPal), Praveen Kanamarlapudi (PayPal)
5:00pm Business Summit discussion group Paco Nathan (derwen.ai)
8:00am-5:00pm Jupyter Usability Testing: Day 2 | Room: Gramercy
8:00am Morning Coffee | Room: Sponsor Pavilion (Grand Ballroom Foyer)
8:15am Speed Networking | Room: 3rd Floor Promenade
Grand Ballroom
8:50am Friday opening remarks Paco Nathan (derwen.ai), Fernando Pérez (UC Berkeley and Lawrence Berkeley National Laboratory), Brian Granger (Cal Poly San Luis Obispo)
8:55am Democratizing data Tracy Teal (The Carpentries)
9:10am The future of data-driven discovery in the cloud Ryan Abernathey (Columbia University)
9:50am Data science as a catalyst for scientific discovery Michelle Gill (BenevolentAI)
10:05am Sea change: What happens when Jupyter becomes pervasive at a university? Fernando Pérez (UC Berkeley and Lawrence Berkeley National Laboratory)
10:20am Closing remarks
12:35pm Lunch Friday Topic Tables at Lunch | Room: Americas Hall 1
10:30am Morning Break (Sponsored by Amazon Web Services, Inc.) | Room: Sponsor Pavilion (Grand Ballroom Foyer)
3:20pm Afternoon Break | Room: Sponsor Pavilion (Grand Ballroom Foyer)
11:05am-11:45am (40m) Enterprise and organizational adoption, JupyterHub deployments, Training and education
Data science in US and Canadian higher education
Laura Noren (Obsidian Security)
Laura Noren offers an overview of a research project on the various infrastructure models supporting data science in research settings in terms of funding, educational uses, and research utilization. Laura then shares some of the findings, comparing the national federation model currently established in Canada to the more grassroots efforts in many US universities.
11:55am-12:35pm (40m) Enterprise and organizational adoption, JupyterHub deployments, Usage and application
Canadians land on Jupyter
Ian Allison (Pacific Institute for the Mathematical Sciences), James Colliander (Pacific Institute for the Mathematical Sciences)
Over the past 18 months, Ian Allison and James Colliander have deployed Jupyter to more than 8,000 users at universities across Canada. Ian and James offer an overview of the Syzygy platform and explain how they plan to scale and deliver the service nationally and how they intend to make Jupyter integral to the working experience of students, researchers, and faculty members.
1:50pm-2:30pm (40m) Extensions and customization, Training and education, Usage and application
Jupyter graduates
Douglas Blank (Comet.ML)
For the last four years, Douglas Blank has used nothing but Jupyter in the classroom—from a first-year writing course to a course on assembly language, from biology to computer science, from lectures to homework. Join in to learn how Douglas has leveraged Jupyter and discover the successes and failures he experienced along the way. Nicole Petrozzo then offers a student's perspective.
2:40pm-3:20pm (40m) Documentation, Reproducible research and open science, Training and education
Reproducible education: What teaching can learn from open science practices
Elizabeth Wickes (School of Information Sciences, University of Illinois at Urbana-Champaign)
As practitioners of open science begin to migrate their educational material into public repositories, many of their common practices and platforms can be used to streamline the development of instructional material. Elizabeth Wickes explains how open science practices can be used in an educational context and why they are best facilitated by tools like the Jupyter Notebook.
4:10pm-4:50pm (40m) Community, Extensions and customization, Usage and application
nbinteract: Shareable interactive web pages from notebooks
Sam Lau (UC Berkeley), Caleb Siu (UC Berkeley)
The nbinteract package converts Jupyter notebooks with widgets into interactive, standalone HTML pages. Its built-in support for function-driven plotting makes authoring interactive pages simpler by allowing users to focus on data, not callbacks. Sam Lau and Caleb Siu offer an overview of nbinteract and walk you through the steps to publish an interactive web page from a Jupyter notebook.
5:00pm-5:40pm (40m) Extensions and customization, Training and education, Usage and application
Current RISE capabilities and its evolution into the future
Damián Avila (Anaconda, Inc.)
RISE has evolved into the main slideshow machinery for live presentations within the Jupyter notebook. Damián Avila explains how to install and use RISE. You'll also discover how to customize it and see some of its new capabilities. Damián concludes by discussing the migration from RISE into a new JupyterLab-RISE extension providing RISE-based capabilities in the new JupyterLab interface.
11:05am-11:45am (40m) Enterprise and organizational adoption, Extensions and customization, Reproducible research and open science
Reproducible data dependencies for Jupyter: Distributing massive, versioned image datasets from the Allen Institute for Cell Science
Jackson Brown (Allen Institute for Cell Science), Aneesh Karve (Quilt)
Reproducible data is essential for notebooks that work across time, across contributors, and across machines. Jackson Brown and Aneesh Karve demonstrate how to use an open source data registry to create reproducible data dependencies for Jupyter and share a case study in open science over terabyte-size image datasets.
11:55am-12:35pm (40m)
Jupyter widgets
Maarten Breddels (Maarten Breddels), Sylvain Corlay (QuantStack)
Project Jupyter aims to provide a consistent set of tools for data science workflows, from the exploratory phase of the analysis to the sharing of the results. Maarten Breddels and Sylvain Corlay offer an overview of Jupyter's interactive widgets framework, which enables rich user interaction, including 2D and 3D interactive plotting, geographic data visualization, and much more.
1:50pm-2:30pm (40m) Community, Usage and application
What things are correlated with gender diversity: A dig through the ASF and Jupyter projects
Holden Karau (Independent), Matthew Hunt (Bloomberg)
Many of us believe that gender diversity in open source projects is important. (If you don’t, this isn’t going to convince you.) But what things are correlated with improved gender diversity, and what can we learn from similar historic industries? Holden Karau and Matt Hunt explore the diversity of different projects, examine historic EEOC complaints, and detail parallels and historic solutions.
2:40pm-3:20pm (40m)
Jupyter's configuration system
Afshin Darian (Two Sigma | Project Jupyter), M Pacer (Netflix), Min Ragan-Kelley (Simula Research Laboratory), Matthias Bussonnier (UC Berkeley BIDS)
Jupyter's straightforward, out-of-the-box experience has been important for its success in widespread adoption. But good defaults only go so far. Join Afshin Darian, M Pacer, Min Ragan-Kelley, and Matthias Bussonnier to go beyond the defaults and make Jupyter your own.
4:10pm-4:50pm (40m)
JupyterLab
Ian Rose (UC Berkeley), Chris Colbert (Project Jupyter)
Ian Rose and Chris Colbert walk you through the JupyterLab interface and codebase and explain how it fits within the overall roadmap of Project Jupyter.
5:00pm-5:40pm (40m) Integrations with other Software, Usage and application
The Emacs IPython Notebook
John Miller (Honeywell UOP)
John Miller offers an overview of the Emacs IPython Notebook (EIN), a full-featured client for the Jupyter Notebook in Emacs, and shares a brief history of its development.
11:05am-11:45am (40m) Data visualization, Integrations with other Software, Usage and application
Design and analysis of the world’s most advanced microprocessors using Jupyter notebooks
Kerim Kalafala (IBM), Nicholai L'Esperance (IBM)
Kerim Kalafala and Nicholai L'Esperance share their experiences using Jupyter notebooks as a critical aid in designing the next generation of IBM Power and Z processors, focusing on analytics on graphs consisting of hundreds of millions of nodes. Along the way, Kerim and Nicholai explain how they leverage Jupyter notebooks as part of their overall design system.
11:55am-12:35pm (40m) Integrations with other Software, Reproducible research and open science, Usage and application
How JupyterLab and widgets enable interactive analysis of the Earth's past, present, and future
Tyler Erickson (Google)
Massive collections of data on the Earth's changing environment, collected by satellite sensors and generated by Earth system models, are being exposed via web APIs by multiple providers. Tyler Erickson highlights the use of JupyterLab and Jupyter widgets in analyzing complex high-dimensional datasets, providing insights into how our Earth is changing and what the future might look like.
1:50pm-2:30pm (40m) Data visualization, Reproducible research and open science, Usage and application
Using JupyterLab for flood map development: Approaches for improving productivity and reproducibility
Seth Lawler (Dewberry)
Creating flood maps for coastal and riverine communities requires geospatial processing, statistical analysis, finite element modeling, and a team of specialists working together. Seth Lawler explains how using the feature-rich JupyterLab to develop tools, share code with team members, and document workflows used in the creation of flood maps improves productivity and reproducibility.
2:40pm-3:20pm (40m) Extensions and customization, Kernels, Usage and application
Using the MapD kernel for the Jupyter Notebook
Randy Zwitch (MapD)
MapD Core is an open source analytical SQL engine that has been designed from the ground up to harness the parallelism inherent in GPUs. This enables queries on billions of rows of data in milliseconds. Randy Zwitch offers an overview of the MapD kernel extension for the Jupyter Notebook and explains how to use it in a typical machine learning workflow.
4:10pm-4:50pm (40m) Enterprise and organizational adoption, Reproducible research and open science, Usage and application
Using Jupyter to create a community for satellite imagery analysis and sharing
Sean Gorman (DigitalGlobe)
Satellite imagery can be a critical resource during disasters and humanitarian crises. While the community has improved data sharing, we still struggle to create reusable data science to solve problems on the ground. Sean Gorman offers an overview of GBDX Notebooks, a step toward creating an open data science community built around Jupyter to stream imagery and share analysis at scale.
5:00pm-5:40pm (40m) Enterprise and organizational adoption, Reproducible research and open science, Usage and application
GoAi and PyGDF: GPU-accelerated data science with Jupyter notebooks
Joshua Patterson (NVIDIA), Keith Kraus (NVIDIA), Leo Meyerovich (Graphistry)
Joshua Patterson, Leo Meyerovich, and Keith Kraus demonstrate how to use PyGDF and other GoAi technologies to easily analyze and interactively visualize large datasets from standard Jupyter notebooks.
11:05am-11:45am (40m) Reproducible research and open science
Reproducible science with the Renku platform
Sandra Savchenko-de Jong (Swiss Data Science Center)
Sandra Savchenko-de Jong offers an overview of Renku, a highly scalable and secure open software platform designed to make (data) science reproducible, foster collaboration between scientists, and share resources in a federated environment.
11:55am-12:35pm (40m) Training and education, Usage and application
I don't like notebooks.
Joel Grus (Allen Institute for Artificial Intelligence)
I have been using and teaching Python for many years. I wrote a best-selling book about learning data science. And here's my confession: I don't like notebooks. (There are dozens of us!) I'll explain why I find notebooks difficult, show how they frustrate my preferred pedagogy, demonstrate how I prefer to work, and discuss what Jupyter could do to win me over.
1:50pm-2:30pm (40m) Data visualization, Extensions and customization, Reproducible research and open science
Supporting reproducibility in Jupyter through dataflow notebooks
David Koop (University of Massachusetts Dartmouth)
Dataflow notebooks build on the Jupyter Notebook environment by adding constructs to make dependencies between cells explicit and clear. David Koop offers an overview of the Dataflow kernel, shows how it can be used to robustly link cells as a notebook is developed, and demonstrates how that notebook can be reused and extended without impacting its reproducibility.
2:40pm-3:20pm (40m) Extensions and customization, Reproducible research and open science
Explorations in reproducible analysis with Nodebook
Kevin Zielnicki (Stitch Fix)
Even with good intentions, analysis notebooks can quickly accumulate a mess of false starts and out-of-order statements. Best practices encourage cleaning up a notebook to ensure reproducibility, but many analyses will never reach this cleaned-up state. Kevin Zielnicki offers an overview of Nodebook, a Jupyter plugin that encourages reproducibility by preventing inconsistency.
4:10pm-4:50pm (40m) Data visualization, Kernels, Usage and application
JupyterLab and Plotly: A data visualization power couple
Lindsay Richman (McKinsey & Co.)
JupyterLab and Plotly both provide a rich set of tools for working with data. When combined, they create a powerful computational environment that enables users to produce versatile, robust visualizations in a fast-paced setting. Lindsay Richman demonstrates how to use JupyterLab, Plotly, and Plotly's Python-based Dash framework to create dynamic charts and interactive reports.
5:00pm-5:40pm (40m) Integrations with other Software, Reproducible research and open science, Usage and application
Designing for interaction
Scott Sanderson (Quantopian)
Scott Sanderson explores how interactivity can and should influence the design of software libraries, details how the needs of interactive users differ from the needs of application developers, and shares techniques for improving the usability of libraries in interactive environments without sacrificing robustness in noninteractive environments.
11:05am-11:45am (40m) JupyterCon Business Summit, Usage and application
How the Jupyter Notebook makes the corporate tax process easier and better
Jinli Ma (Synchrony Financial)
In the corporate tax world, Microsoft Excel, the king of spreadsheets, is the default tool for tracking information and managing tasks, but tax professionals are often annoyed by linked or referenced cells, within or between spreadsheets, that update slowly or break. Jinli Ma explains how the Jupyter Notebook does a better job than Microsoft Excel with the original issue discount calculation process.
11:55am-12:35pm (40m) JupyterCon Business Summit, Training and education, Usage and application
The Jupyter Notebook as a transparent way to document machine learning model development: A case study from a US defense agency
Catherine Ordun (Booz Allen Hamilton)
Many US government agencies are just getting started with machine learning. As a result, data scientists need to de-"black box" models as much as possible. One simple way to do this is to transparently show how the model is coded and its results at each step. Notebooks do just this. Catherine Ordun walks you through a notebook built for RNNs and explains how government agencies can use notebooks.
1:50pm-2:30pm (40m) Data visualization, Extensions and customization, JupyterCon Business Summit, JupyterHub deployments
Rapid data science exploration for cybersecurity
George Williams (GSI Technology), Harini Kannan (Capsule8), Alex Comerford (Capsule8)
The key to successful threat detection in cybersecurity is fast response. George Williams, Harini Kannan, and Alex Comerford offer an overview of specialized extensions they have built for data scientists working in cybersecurity that can be used and deployed via JupyterHub.
2:40pm-3:20pm (40m) JupyterCon Business Summit
Jupyter in the modern enterprise data and analytics ecosystem: Trends, experiments, and opportunities
Gerald Rousselle (Teradata)
Gerald Rousselle reviews some of the trends in modern data and analytics ecosystems for large enterprises and shares some of the key challenges and opportunities for Jupyter adoption. He also details some recent examples and experiments in incorporating Jupyter in commercial products and platforms.
4:10pm-4:50pm (40m) Enterprise and organizational adoption, Extensions and customization, Integrations with other Software, JupyterCon Business Summit
PayPal Notebooks: Data science and machine learning at scale, powered by Jupyter
Romit Mehta (PayPal), Praveen Kanamarlapudi (PayPal)
Hundreds of PayPal's data scientists, analysts, and developers use Jupyter to access data spread across filesystem, relational, document, and key-value stores, enabling complex analytics and an easy way to build, train, and deploy machine learning models. Romit Mehta and Praveen Kanamarlapudi explain how PayPal built its Jupyter infrastructure and powerful extensions.
5:00pm-5:40pm (40m) JupyterCon Business Summit
Business Summit discussion group
Paco Nathan (derwen.ai)
The Business Summit concludes with "unconference"-style breakout sessions that allow enterprise stakeholders to give input to Project Jupyter directly.
8:00am-5:00pm (9h)
Jupyter Usability Testing: Day 2
Help shape the future of Jupyter's user experience. We’ll be testing new UI ideas for JupyterLab, listening to your needs, and involving you in idea generation.
8:00am-9:00am (1h)
Break: Morning Coffee
8:15am-8:45am (30m)
Speed Networking
Ready, set, network! Meet fellow attendees who are looking to connect at JupyterCon. We'll gather before Friday keynotes for an informal speed networking event. Be sure to bring your business cards—and remember to have fun.
8:50am-8:55am (5m)
Friday opening remarks
Paco Nathan (derwen.ai), Fernando Pérez (UC Berkeley and Lawrence Berkeley National Laboratory), Brian Granger (Cal Poly San Luis Obispo)
JupyterCon cochairs Paco Nathan, Fernando Pérez, and Brian Granger open the second day of keynotes.
8:55am-9:10am (15m)
Democratizing data
Tracy Teal (The Carpentries)
We are generating vast amounts of data, but it's not the data itself that is valuable—it's the information and knowledge that can come from this data. Tracy Teal explains how to bring people to data and empower them to address their questions, reach their potential, and solve issues that are important in science, scholarship, and society.
9:10am-9:25am (15m)
The future of data-driven discovery in the cloud
Ryan Abernathey (Columbia University)
Drawing on his experience with the Pangeo project, Ryan Abernathey makes the case for the large-scale migration of scientific data and research to the cloud. The cloud offers a way to make the largest datasets instantly accessible to the most sophisticated computational techniques. A global scientific data commons could usher in a golden age of data-driven discovery.
9:25am-9:30am (5m)
Beyond interactive: Scaling impact with notebooks at Netflix
Michelle Ufford (Netflix)
Netflix is reimagining what a Jupyter notebook is, who works with it, and what you can do with it. Michelle Ufford shares how Netflix leverages notebooks today and briefly describes a vision for the future.
9:30am-9:45am (15m)
Jupyter notebooks and the intersection of data science and data engineering
David Schaaf (Capital One)
David Schaaf explains how data science and data engineering can work together in cross-functional teams—with Jupyter notebooks at the center of collaboration and the analytic workflow—to more effectively and more quickly deliver results to decision makers.
9:45am-9:50am (5m)
Disease prediction using the world's largest clinical lab dataset (sponsored by Amazon Web Services)
Cristian Capdevila (Prognos)
Cristian Capdevila explains how Prognos is predicting disease by applying a combination of modern machine learning techniques and clinical expertise to the world’s largest clinical lab database and how the company is leveraging Amazon SageMaker to accelerate model development, training, and deployment.
9:50am-10:05am (15m)
Data science as a catalyst for scientific discovery
Michelle Gill (BenevolentAI)
Michelle Gill explains how data science methodologies and tools can be used to link information from different scientific fields and accelerate discovery in a variety of areas, including the biological sciences.
10:05am-10:20am (15m)
Sea change: What happens when Jupyter becomes pervasive at a university?
Fernando Pérez (UC Berkeley and Lawrence Berkeley National Laboratory)
In 2018, UC Berkeley launched a new major in data science, anchored by two core courses that are the fastest-growing in the history of the university. Fernando Pérez discusses the program and explains how the core courses, which now reach roughly 40% of the campus population, are extending data science into specific domains that cover virtually all disciplinary areas of the campus.
10:20am-10:30am (10m)
Closing remarks
12:35pm-1:50pm (1h 15m)
Friday Topic Tables at Lunch
Topic Table discussions are a great way to informally network with people in similar industries or interested in the same topics.
10:30am-11:05am (35m)
Break: Morning Break (Sponsored by Amazon Web Services, Inc.)
3:20pm-4:10pm (50m)
Break: Afternoon Break