Brought to you by NumFOCUS Foundation and O’Reilly Media
The official Jupyter Conference
Aug 21-22, 2018: Training
Aug 22-24, 2018: Tutorials & Conference
New York, NY

Visualizing high-dimensional biological data with Clustergrammer-Widget in the Jupyter Notebook

Nicolas Fernandez (Icahn School of Medicine at Mount Sinai)
4:10pm–4:50pm Thursday, August 23, 2018
Data visualization
Location: Nassau
Level: Intermediate

Who is this presentation for?

  • Data analysts, data scientists, research scientists, biologists, and Jupyter widget and visualization developers

Prerequisite knowledge

  • A working knowledge of pandas and the Jupyter Notebook

What you'll learn

  • Understand how to explore high-dimensional data in Jupyter notebooks using the interactive heatmaps produced by Clustergrammer-Widget
  • Learn how to share Jupyter notebooks that contain interactive widgets with collaborators

Description

Biological data and other data collected from complex systems can have tens of thousands of variables that interact nonlinearly. Interactive visualizations enable users to develop an intuition about the global structure of their data and immediately identify patterns. While dimensionality reduction techniques are useful for obtaining a bird’s-eye view of data, they often obscure important information. Heatmaps, or clustergrams, are a powerful complementary technique that directly visualizes all variables in high-dimensional data. While many software tools can generate clustergrams, few are web based, fully interactive, or seamlessly integrated into Jupyter notebooks.

Nicolas Fernandez offers an overview of Clustergrammer-Widget, which enables users to easily visualize high-dimensional data (e.g., a pandas DataFrame) within a Jupyter notebook as an interactive hierarchically clustered heatmap. Clustergrammer-Widget generates highly interactive visualizations (e.g., reorderable and zoomable) that can be embedded within notebooks and shared using nbviewer. Clustergrammer-Widget was developed to analyze high-dimensional biological data but can be applied to high-dimensional data from any field. Nicolas explains how to use Jupyter notebooks and Clustergrammer-Widget to produce transparent and reproducible analyses for a wide variety of biological datasets and demonstrates how to share your results with collaborators.
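
To make the idea of a hierarchically clustered heatmap concrete, the sketch below shows the reordering step behind a clustergram: rows with similar profiles are merged repeatedly until a single ordering remains, so related rows end up next to each other. It is a minimal illustration in standard-library Python, not Clustergrammer-Widget's own implementation or API. The gene names, the values, the `cluster_order` helper, and the choice of average linkage with Euclidean distance are all assumptions made for the example.

```python
from itertools import combinations
from math import dist


def cluster_order(rows):
    """Return row indices ordered by average-linkage agglomerative clustering."""
    # Each cluster is a list of original row indices, kept in leaf order.
    clusters = [[i] for i in range(len(rows))]

    def avg_distance(a, b):
        # Mean pairwise Euclidean distance between the members of two clusters.
        return sum(dist(rows[i], rows[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        # Find the closest pair of clusters and merge them into one.
        a, b = min(combinations(clusters, 2), key=lambda pair: avg_distance(*pair))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)

    return clusters[0] if clusters else []


# Hypothetical expression values: one row per gene, one column per sample.
data = {
    "gene_a": [9.0, 8.5, 1.0, 0.5],
    "gene_b": [1.0, 0.5, 9.0, 8.0],
    "gene_c": [8.0, 9.0, 0.0, 1.0],
    "gene_d": [0.0, 1.0, 9.0, 9.0],
}

names = list(data)
order = cluster_order([data[name] for name in names])
clustered_names = [names[i] for i in order]

# Print the rows in clustered order, as they would appear down a heatmap.
for name in clustered_names:
    print(name, data[name])
```

In the reordered output, gene_a sits next to gene_c and gene_b next to gene_d, because each pair has a similar profile across samples. A clustergram applies the same kind of reordering to both rows and columns before drawing the heatmap.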

Nicolas Fernandez

Icahn School of Medicine at Mount Sinai

Nicolas Fernandez is a computational scientist at the Human Immune Monitoring Center at the Icahn School of Medicine at Mount Sinai. He is a computational biologist interested in the analysis and visualization of high-throughput biological data as a means of understanding biological regulatory networks.