Brought to you by NumFOCUS Foundation and O’Reilly Media Inc.
The official Jupyter Conference
August 22-23, 2017: Training
August 23-25, 2017: Tutorials & Conference
New York, NY

Encrypting Notebooks for Data Science

Moderated by: Steven Anton

Who is this presentation for?

Data scientists and analysts, data engineers, and security and compliance teams

Prerequisite knowledge

Attendees should feel somewhat comfortable modifying Jupyter configurations and using command-line tools.

What you'll learn

Encrypting notebooks is easy and minimally disrupts common workflows, yet it can drastically increase security for anyone working with sensitive information.

Description

Jupyter notebooks are a core element in many data science workflows. However, analyzing sensitive data, such as personally identifiable information or health records, presents a security challenge because notebooks are not encrypted at rest.

By extending Jupyter’s FileContentsManager, we demonstrate how to transparently increase security and compliance by working directly with encrypted notebooks. The first encryption backend we discuss uses GPG, which works well for individuals and small teams. The second backend uses the encryption-as-a-service feature provided by Vault and is more appropriate for larger teams. We’ll also discuss the implications of managing encryption across a typical data science team, including keeping encrypted notebooks under version control.
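
As a rough illustration of the idea (not the presenter's actual implementation), the sketch below separates the encryption backend from the code that saves and loads a notebook. The names `GpgBackend`, `save_notebook`, and `load_notebook` are hypothetical and introduced only for this example. It assumes the `gpg` command-line tool is installed and that the recipient's key is already in the local keyring and trusted. In a real deployment, logic like this would presumably live in a custom contents manager registered through Jupyter's `contents_manager_class` setting rather than in standalone functions.

```python
import json
import subprocess
from pathlib import Path


class GpgBackend:
    """Hypothetical backend that pipes bytes through the `gpg` CLI."""

    def __init__(self, recipient):
        # Key ID or email of the recipient; assumed to be in the keyring.
        self.recipient = recipient

    def encrypt(self, plaintext: bytes) -> bytes:
        # With no file argument, gpg reads stdin and writes to stdout.
        result = subprocess.run(
            ["gpg", "--batch", "--yes", "--encrypt",
             "--recipient", self.recipient],
            input=plaintext, stdout=subprocess.PIPE, check=True,
        )
        return result.stdout

    def decrypt(self, ciphertext: bytes) -> bytes:
        # Any passphrase prompt is left to gpg and its agent.
        result = subprocess.run(
            ["gpg", "--quiet", "--decrypt"],
            input=ciphertext, stdout=subprocess.PIPE, check=True,
        )
        return result.stdout


def save_notebook(path, notebook, backend):
    """Serialize the notebook dict to JSON and write only ciphertext to disk."""
    plaintext = json.dumps(notebook).encode("utf-8")
    Path(path).write_bytes(backend.encrypt(plaintext))


def load_notebook(path, backend):
    """Read ciphertext from disk and return the decrypted notebook dict."""
    plaintext = backend.decrypt(Path(path).read_bytes())
    return json.loads(plaintext.decode("utf-8"))
```

Because the save and load functions only rely on `encrypt` and `decrypt`, a second backend (for example, one that calls Vault's encryption-as-a-service API) could be swapped in without touching the notebook-handling code. A call such as `save_notebook("analysis.ipynb.gpg", nb, GpgBackend("team@example.com"))` would then leave only ciphertext at rest; the file name and address here are placeholders.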

Attendees should leave with simple, actionable steps for hardening the security of Jupyter notebooks.