Tools like the Jupyter Notebook provide an excellent platform for quickly iterating on an analysis by interleaving code, text, and output. However, the flexibility of the notebook environment can also lend itself to code that, over the course of an analysis, becomes increasingly unwieldy and difficult to rerun or meaningfully build upon.
While the notebook model allows users to develop code and share results quickly, prioritizing quick exploration can make analyses difficult to reproduce. This is typically addressed in a final “clean-up” phase, in which a notebook is pared down and rerun to confirm that it is logically consistent. However, this takes extra effort, and many analysis artifacts never reach this state. To address the problem before it arises, we can build tools that make reproducible analysis the most natural option.
As a step toward encouraging reproducibility, Kevin Zielnicki offers an overview of Nodebook, an extension to the Jupyter Notebook that imposes constraints on the notebook model in exchange for greater consistency, while keeping the exploratory interactivity that makes the notebook model so useful. Nodebook does this by maintaining a chain of cell execution in logical rather than temporal order. In the standard notebook model, by contrast, cells affect the global notebook state in the order they are executed, regardless of their logical position in the notebook. Because logical consistency is enforced with each cell execution, reproducibility is no longer deferred to a final clean-up but is maintained throughout the analysis.
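The difference between temporal and logical execution order can be illustrated with a minimal Python sketch. This is not Nodebook's implementation or API; the function names `run_temporal` and `run_logical` and the three example cells are hypothetical, chosen only to show how the two models can diverge on the same notebook.

```python
# Hypothetical illustration only; this is not Nodebook's code or API.

def run_temporal(cells, execution_order):
    """Standard notebook model: one shared namespace, mutated in the
    order the user happens to run cells."""
    state = {}
    for index in execution_order:
        exec(cells[index], state)
    state.pop("__builtins__", None)  # exec adds this key; drop it for clarity
    return state


def run_logical(cells, upto=None):
    """Logical-order model: each cell sees only the state produced by the
    cells positioned above it, however the user arrived there."""
    state = {}
    for source in cells[:upto]:
        exec(source, state)
    state.pop("__builtins__", None)
    return state


# Three cells, listed in their position in the notebook.
cells = [
    "x = 1",
    "y = x + 1",
    "x = 10",
]

# The user runs every cell once, then goes back and reruns the second cell.
temporal = run_temporal(cells, execution_order=[0, 1, 2, 1])

# The same notebook evaluated strictly from top to bottom.
logical = run_logical(cells)

print("temporal:", temporal)
print("logical: ", logical)
```

In the temporal run, `y` ends up as 11 because the rerun second cell sees the `x = 10` assigned by a cell below it. In the logical run, `y` is 2, which is what a reader rerunning the notebook from the top would get. This sketch simply replays the cells above the current one; how a real tool tracks or stores intermediate state is not described in this abstract.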
Kevin Zielnicki is a data scientist on the styling algorithms team at Stitch Fix. Kevin holds a PhD in physics in the field of quantum information processing, but he now enjoys working with data that can be observed without changing its value.
©2018, O'Reilly Media, Inc.