The Power Of Visualizing Deforestation Data

Data Science
Location: Palace Suite - Buckingham Room
Level: Intermediate

The power of visualizing big data time-series derived from remote sensing products processed on Hadoop is hard to overstate. Visualization can give scientists, policy makers, journalists, and the public immediate insights into how the environment is changing over time, leading to quicker understanding and action. Yet putting big data time-series on maps effectively remains difficult. With the advent of Hadoop, CartoDB, and HTML5 APIs, our ability to create interactive, high-performance maps has greatly improved. New problems with the scale and complexity of near real-time data keep data visualization interesting and challenging. Here, we present our work to develop fast mapping solutions for the 500-meter-resolution deforestation data produced every 16 days by the FORMA algorithm for the Global Forest Watch 2.0 web portal.

Deforestation data contain rich temporal information that is often lost when visualized using static map tiles. To ensure that this temporal detail is effectively surfaced in the Global Forest Watch portal, we developed new methods for big data storage, query, transfer, and map-based visualization. The Forest Monitoring for Action (FORMA) project provides free and open forest clearing alert data derived from MODIS satellite imagery every 16 days, beginning in December 2005. FORMA is written in the Clojure programming language and builds on the Cascading and Cascalog APIs to process big spatial data on top of Hadoop using MapReduce. Here we will focus on the high-level FORMA algorithm and data workflow, with a particular emphasis on the visualization mechanisms for these data.

At a high level, deforestation events are converted from raster products to JSON data objects. Each JSON data object efficiently stores an index of the date and pixel locations of deforestation on quadtree map-tiles. On the client, these three-dimensional JSON objects are unpacked and used to render HTML5 canvas objects that are displayed on the map. In combination with user-interface controls, users can interact with the history of deforestation on the map.
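The packing and unpacking steps described above can be sketched roughly as follows. This is a minimal illustration, not the Global Forest Watch or Torque implementation: the function names, the `x`/`y`/`t` field names, the 256-pixel tile size, and the use of standard Web Mercator tile addressing are all assumptions made for the example. Each alert is a (longitude, latitude, period index) triple, where the period index counts 16-day steps.

```python
import json
import math
from collections import defaultdict

TILE_SIZE = 256  # assumed pixels per tile edge


def lonlat_to_tile_pixel(lon, lat, zoom):
    """Map a point to (tile_x, tile_y, pixel_x, pixel_y) in Web Mercator."""
    n = 2 ** zoom
    x = (lon + 180.0) / 360.0 * n
    y = (1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n
    tx, ty = int(x), int(y)
    px = int((x - tx) * TILE_SIZE)
    py = int((y - ty) * TILE_SIZE)
    return tx, ty, px, py


def pack_events(events, zoom):
    """Group (lon, lat, period) alerts into one JSON string per map tile.

    Each tile holds a list of pixels; each pixel lists the periods in
    which an alert fired there (hypothetical schema).
    """
    tiles = defaultdict(lambda: defaultdict(list))
    for lon, lat, period in events:
        tx, ty, px, py = lonlat_to_tile_pixel(lon, lat, zoom)
        tiles[(zoom, tx, ty)][(px, py)].append(period)
    packed = {}
    for key, pixels in tiles.items():
        rows = [{"x": px, "y": py, "t": sorted(set(ts))}
                for (px, py), ts in sorted(pixels.items())]
        packed[key] = json.dumps(rows)
    return packed


def unpack_frame(tile_json, period):
    """Return the pixels to draw for a cumulative view up to `period`."""
    return [(p["x"], p["y"])
            for p in json.loads(tile_json)
            if p["t"][0] <= period]
```

On the client, the equivalent of `unpack_frame` would run once per animation step, and the returned pixel coordinates would be painted onto the canvas for that tile. Points exactly on the eastern or polar edge of the projection are not handled in this sketch.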

The methods developed for the Global Forest Watch website have been further generalized in an open-source library called Torque (http://github.com/CartoDB/Torque). The library consists of a generalized SQL statement that compresses temporal-geospatial data into tile-based JSON objects, together with HTML5 canvas rendering functions; both will be expanded in the future to visualize the motion of multiple agents and ordered, non-temporal data. In this presentation we will describe in depth the analysis of deforestation data, the efficiency of the temporal JSON data schema, and finally the challenges and rewards of visualizing temporal data on the web.
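One way to see why ordered, non-temporal data fits the same scheme: any ordered value can be mapped to a step index in the same way a date is. The sketch below is an illustration only and does not reproduce Torque's SQL or API; the function name `to_steps` and the equal-width binning are assumptions.

```python
def to_steps(values, n_steps):
    """Map ordered numeric values to integer step indices 0..n_steps-1.

    Uses equal-width bins between the minimum and maximum value
    (an assumed binning rule, chosen for simplicity).
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: everything falls in the first step.
        return [0 for _ in values]
    span = hi - lo
    return [min(n_steps - 1, int((v - lo) / span * n_steps))
            for v in values]
```

For example, `to_steps([100, 250, 400], 3)` assigns three elevations to steps 0, 1, and 2, which could then be stored and animated exactly like period indices.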

Andrew Hill

Textile

Andrew is a biologist with a technology bent, focusing on informatics and the use of big data. He has worked in diverse domains within biology, including epidemiology, microbiology, biodiversity informatics, and phyloinformatics. Andrew is Senior Scientist at Vizzuality, where he works on enhancing data accessibility, enabling scientific discovery, and improving scientific outreach.

Robin Kraft

World Resources Institute

Robin is a data wrangler who specializes in being a generalist, having found his way to environmental data analysis from journalism and development economics. These days, as a member of WRI’s Data Lab, he’s tracking deforestation across the tropics, organizing EcoHack events and bending Hadoop to his will.

Javier de la Torre

Vizzuality

Javier is CEO and co-founder of Vizzuality, where he helps run project strategy and works closely with the team to coordinate CartoDB development.

Javier is a recognized expert on biodiversity informatics, open data and open source software. He has been featured in publications including NPR, CNET, The Guardian and El Pais and is frequently invited to speak on topics about data analysis, visualization, biodiversity, mapping and citizen science.

Believing that better access to data and tools will change the way we understand and preserve the environment, Javier has had a long-term interest in understanding and helping to communicate the distribution of species on Earth. He thinks data visualization provides a great opportunity to bridge the gap between science, policy-making and citizen awareness, where it is most needed.
