A Tutorial on Data Visualization with Lightning
Basic data visualization techniques in interactive notebooks – I cover how to set up an interactive notebook environment with Jupyter, and go through some basic examples using Python with popular visualization libraries such as matplotlib and seaborn. I also show how to use custom kernels to integrate languages other than Python into the notebook environment (for example, Scala and R).
Interactive visualization with Lightning – I introduce the Lightning data visualization server and show how to include it in the notebook environment set up in the first portion of the session. I then go through a wide range of examples that show what the library can do.
Using these tools with large-scale analysis libraries – What is visualization without analysis? In this step I integrate Spark, a popular engine for large-scale interactive data analysis, and go through examples that use Spark in conjunction with the data visualization tools above.
Closing the feedback loop – Users of Lightning can set up visualizations to trigger callbacks on specific events. For example, a user could highlight a portion of an image and then run further analysis on the data underlying that region. I show how to set this up and use it to automatically run Spark jobs in response to user interaction with a visualization.
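The feedback loop in the last step can be sketched in plain Python. This is only an illustration of the callback pattern, not Lightning's API: `SelectionEvents`, `Region`, and `region_mean` are hypothetical names invented for this example, and the small mean computation stands in for the Spark job that would run in the session.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, List


@dataclass(frozen=True)
class Region:
    """A rectangular selection in image coordinates (end indices are exclusive)."""
    row_start: int
    row_end: int
    col_start: int
    col_end: int


class SelectionEvents:
    """Hypothetical stand-in for a visualization that emits selection events."""

    def __init__(self) -> None:
        self._callbacks: List[Callable[[Region], None]] = []

    def on_select(self, callback: Callable[[Region], None]) -> None:
        # Register a function to call whenever a region is selected.
        self._callbacks.append(callback)

    def select(self, region: Region) -> None:
        # Simulate the user highlighting a region of the image.
        for callback in self._callbacks:
            callback(region)


def region_mean(image: List[List[float]], region: Region) -> float:
    """Placeholder analysis: mean of the pixels under the selected region."""
    pixels = [
        image[r][c]
        for r in range(region.row_start, region.row_end)
        for c in range(region.col_start, region.col_end)
    ]
    return mean(pixels)


# A tiny 4x4 "image" whose pixel values are 0..15 in row-major order.
image = [[4 * r + c for c in range(4)] for r in range(4)]

results: List[float] = []
viz = SelectionEvents()
viz.on_select(lambda region: results.append(region_mean(image, region)))

# Selecting the central 2x2 block triggers the analysis callback.
viz.select(Region(row_start=1, row_end=3, col_start=1, col_end=3))
print(results)
```

In a real setup the registered callback would submit a job to the analysis engine rather than compute the result inline, and the selection event would come from the browser rather than a direct `select` call.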
Matthew Conlen is a software engineer and information designer in New York. He is a partner at the New York Data Company, the senior developer for Rhizome, and a computational journalist at FiveThirtyEight. Matthew collaborates with researchers from HHMI Janelia on the open source Lightning data visualization server. He graduated from the University of Michigan with degrees in computer science and applied mathematics.
©2015, O'Reilly Media, Inc. • (800) 889-8969 or (707) 827-7019 • Monday-Friday 7:30am-5pm PT • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners. • firstname.lastname@example.org
Apache Hadoop, Hadoop, Apache Spark, Spark, and Apache are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries, and are used with permission. The Apache Software Foundation has no affiliation with and does not endorse, or review the materials provided at this event, which is managed by O'Reilly Media and/or Cloudera.