Brought to you by NumFOCUS Foundation and O’Reilly Media
The official Jupyter Conference
Aug 21-22, 2018: Training
Aug 22-24, 2018: Tutorials & Conference
New York, NY

SWAN: CERN's Jupyter-based interactive data analysis service

Diogo Castro (CERN)
11:05am–11:45am Thursday, August 23, 2018

Prerequisite knowledge

  • Basic knowledge of Jupyter and JupyterHub

What you'll learn

  • Understand the challenges of interactive data analysis in high energy physics, and of keeping analysis interactive while offloading computations to external resources
  • Learn how Jupyter was integrated into the ecosystem of CERN technologies for mass storage, sync and share, and software distribution

Description

Both CERN and high energy physics (HEP) in general face unprecedented challenges in data storage, processing, and analysis. The experiments of the Large Hadron Collider (LHC) are expected to reach one exabyte of physics data this year. After this data is processed and filtered, interactivity becomes particularly important in the last phases of analysis, where the final results are produced, chiefly in the form of plots.

Jupyter notebooks combine code, text, and other media into a rich narrative, which allows CERN to offer a web-based service that addresses the needs of the community. This service, called SWAN (an acronym for Service for Web-based ANalysis), provides the HEP community with an interactive interface to data analysis tools, such as the ROOT framework. Moreover, SWAN integrates with CERN’s infrastructure: more precisely, with users’ synchronized storage (CERNBox), computing resources, and the experiments’ data and software.

Diogo Castro offers an overview of SWAN and explains how the service is being used by researchers and students, both inside and outside CERN. Diogo also discusses the evolution of the service, especially the new SWAN interface, developed on top of Jupyter, which makes it easy both to share work among users and to connect to Spark clusters.

Diogo Castro

CERN

Diogo Castro is a full-stack developer on the SWAN team within the Software Development for Experiments Group at CERN.