In the Big Data era, datasets are rapidly increasing in both size and complexity, thanks in large part to the pervasiveness of GPS-enabled devices. Billions of records are now the norm, with each record containing dozens of columns, including a geospatial location, a timestamp, and other domain-specific attributes. Analyzing this data requires aggregation over arbitrary regions of the domain and arbitrary attributes of the data. Many relational databases implement the well-known data cube aggregation operation, but without a careful representation these data cubes can take prohibitively large amounts of memory. The nanocubes project addresses this shortcoming.
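To make the memory concern concrete, here is a minimal sketch of a naive data cube that precomputes a count for every combination of dimensions. The records, dimension names, and the `build_cube` helper are made up for illustration; this is a toy model of the classic data cube operation, not the nanocubes encoding.

```python
from collections import Counter
from itertools import combinations

# Hypothetical dimensions and records, invented for this illustration.
DIMENSIONS = ("region", "device", "hour")
records = [
    ("NY", "android", 9),
    ("NY", "iphone", 9),
    ("NY", "android", 10),
    ("SF", "iphone", 9),
    ("SF", "iphone", 10),
]


def build_cube(rows, dims):
    """Precompute a count table for every subset of dimensions (every group-by)."""
    cube = {}
    for k in range(len(dims) + 1):
        for subset in combinations(range(len(dims)), k):
            names = tuple(dims[i] for i in subset)
            # Project each row onto the chosen dimensions and count duplicates.
            cube[names] = Counter(tuple(row[i] for i in subset) for row in rows)
    return cube


cube = build_cube(records, DIMENSIONS)

# Any aggregate is now a dictionary lookup instead of a scan over the records.
total = cube[()][()]
ny_count = cube[("region",)][("NY",)]
sf_iphone = cube[("region", "device")][("SF", "iphone")]

# Number of cells the naive cube stores across all group-bys.
stored_cells = sum(len(table) for table in cube.values())
print(total, ny_count, sf_iphone, len(cube), stored_cells)
```

With three dimensions the cube already holds 2^3 = 8 group-by tables, and the count of tables doubles with every added dimension, which is why a naive representation becomes impractical for records with dozens of columns.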
Nanocubes is an efficient in-memory encoding of traditional data cubes that can be used to visually explore today’s datasets at interactive rates using only a web browser. In many cases, a nanocube uses so little memory that it can run entirely on a laptop. The implementation is optimized for memory usage and query time, and uses a client-server architecture. A nanocube is first built from the underlying data and then answers HTTP queries from stand-alone or browser-based clients. Nanocubes supports many well-known visual encodings, including heatmaps, histograms, time series, and parallel sets, and is flexible enough to handle a wide variety of real-world datasets. Nanocubes is an open source project available for download on GitHub. Live demos can be found online at www.nanocubes.net.
Lauro Lins received a BSc/MSc in Computer Science and a PhD in Computational Mathematics from Universidade Federal de Pernambuco (Brazil, 1996-2007). During this period, Lauro also worked as the main software designer and developer at a small company that deployed optimization software for industrial operations (e.g. corrugated paper cutting for Klabin, pallet loading for WalMart’s local branch). From 2008 to 2010, he worked as a post-doc at the Scientific Computing and Imaging Institute at the University of Utah. From 2010 to 2012, Lauro worked as an Assistant Research Professor at NYU-Poly. At the end of 2012, Lauro joined AT&T Labs as a researcher in the information visualization department. Lauro’s current research is in the area of information visualization and is focused on three fronts: (1) data structures for interactive visualization of big data; (2) design of new visual encodings and interaction modes that are effective for specific data analysis tasks (coordinated tag lenses, timed word trees); (3) a model that brings a free-form and fluid experience to data analysis and visualization (defog system).