Brought to you by NumFOCUS Foundation and O’Reilly Media

The official Jupyter Conference

Aug 21-22, 2018: Training

Aug 22-24, 2018: Tutorials & Conference

New York, NY

Explorations in reproducible analysis with Nodebook

Kevin Zielnicki (Stitch Fix)

2:40pm–3:20pm Friday, August 24, 2018

Extensions and customization, Reproducible research and open science
Location: Nassau
Level: Intermediate

Who is this presentation for?

Data scientists, researchers, and Python developers

Prerequisite knowledge

Familiarity with the Jupyter Notebook

What you'll learn

Learn how to do reproducible analysis with Nodebook

Description

Tools like the Jupyter Notebook provide an excellent platform for quickly iterating on an analysis by interleaving code, text, and output. However, the flexibility of the notebook environment can also lend itself to code that, over the course of an analysis, becomes increasingly unwieldy and difficult to rerun or meaningfully build upon.

While the notebook model allows users to develop code and share results quickly, the prioritization of quick exploration can make the analyses difficult to reproduce. This is typically fixed in a final “clean-up” phase where a notebook is pared down and rerun to make sure it is logically consistent. However, this takes extra effort, and many analysis artifacts will never reach this state. To help address this problem before it happens, we can build tools to make reproducible analysis the most natural option.

As a step toward encouraging reproducibility, Kevin Zielnicki offers an overview of Nodebook, an extension to the Jupyter Notebook that imposes constraints on the notebook model in exchange for greater consistency, while keeping the exploratory interactivity that makes the notebook model so useful. Nodebook does this by maintaining a chain of cell execution in logical rather than temporal order. This contrasts with the standard notebook model, in which cells affect the global notebook state in the order they are executed, regardless of their logical position in the notebook. Because logical consistency is enforced with each cell execution, reproducibility is no longer deferred to a final clean-up but is maintained throughout the analysis.
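To make the idea of a logical chain concrete, here is a minimal Python sketch of one way such a model could work. It is an illustration only, not Nodebook's actual implementation: the `LogicalChain` class, its method names, and the use of deep-copied dictionary snapshots are all assumptions made for this example. Each cell runs on a copy of the state left by the cell above it, so its result depends on its position in the notebook rather than on the order in which cells happened to be run.

```python
import copy


class LogicalChain:
    """Toy notebook (hypothetical, not Nodebook's API) in which each cell
    runs on a copy of the state produced by the cell directly above it."""

    def __init__(self):
        self.sources = []  # cell source code, in notebook (logical) order
        self.states = []   # variables after each cell; None means stale

    def add_cell(self, source):
        """Append a cell and return its index."""
        self.sources.append(source)
        self.states.append(None)
        return len(self.sources) - 1

    def edit_cell(self, index, source):
        """Change a cell's source; it and every cell below become stale."""
        self.sources[index] = source
        for i in range(index, len(self.states)):
            self.states[i] = None

    def run_cell(self, index):
        """Run a cell, first re-running the cell above it if that is stale."""
        if index > 0 and self.states[index - 1] is None:
            self.run_cell(index - 1)
        parent = self.states[index - 1] if index > 0 else {}
        # Work on a copy so the parent's snapshot is never mutated.
        namespace = copy.deepcopy(parent)
        exec(self.sources[index], namespace)
        # exec() inserts __builtins__; keep only user-defined names.
        namespace.pop("__builtins__", None)
        self.states[index] = namespace
        # Anything below this cell may now be out of date.
        for i in range(index + 1, len(self.states)):
            self.states[i] = None
        return namespace


nb = LogicalChain()
first = nb.add_cell("x = 1")
second = nb.add_cell("x = x + 1")

# Re-running the second cell never compounds: it always starts from
# the state after the first cell, so x is 2 every time.
for _ in range(3):
    assert nb.run_cell(second)["x"] == 2

# Editing the first cell marks the second as stale; running the second
# cell re-runs the first automatically.
nb.edit_cell(first, "x = 10")
assert nb.run_cell(second)["x"] == 11
```

In a standard notebook, running a cell such as `x = x + 1` three times would leave `x` at 4, because every execution mutates the shared global state. Deep-copying the whole namespace for each cell is the simplest way to show the contrast, but it would be costly for large objects; a practical tool would need a more careful strategy for storing and comparing state.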

Kevin Zielnicki

Stitch Fix

Kevin Zielnicki is a data scientist on the styling algorithms team at Stitch Fix. Kevin holds a PhD in physics in the field of quantum information processing, but he now enjoys working with data that can be observed without changing its value.
