Astronomical catalogues containing more than a billion stars call for new ways to visualize and explore the data. At this volume, scatter plots become too slow to draw and too overplotted to read. One solution to both problems is to use binned statistics (e.g., histograms, density maps, and volume rendering in 3D). Maarten Breddels offers an overview of vaex, a Python library that calculates statistics on a regular n-dimensional grid at a rate of a billion samples per second, and ipyvolume, a library that enables volume and glyph rendering in Jupyter notebooks. Together, these libraries allow the interactive visualization and exploration of large, high-dimensional datasets in the Jupyter Notebook.
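To make the idea of binned statistics concrete, here is a minimal sketch that replaces a scatter plot with per-cell counts on a regular 2D grid. It uses plain NumPy rather than vaex, and the function name `density_map`, the grid size, the limits, and the synthetic data are all assumptions chosen for illustration.

```python
import numpy as np


def density_map(x, y, shape=64, limits=((-4, 4), (-4, 4))):
    """Count samples per cell on a regular 2D grid (hypothetical helper)."""
    counts, x_edges, y_edges = np.histogram2d(x, y, bins=shape, range=limits)
    return counts, x_edges, y_edges


# Synthetic stand-in for a catalogue: one million points in two dimensions.
rng = np.random.default_rng(seed=0)
x = rng.normal(size=1_000_000)
y = rng.normal(size=1_000_000)

counts, x_edges, y_edges = density_map(x, y)
```

The resulting grid has a fixed size regardless of how many samples went into it, so it can be drawn as an image at a cost that does not grow with the dataset.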
Vaex can process at least a billion samples per second, for instance to produce the mean of a quantity on a regular grid. Statistics can be calculated for any mathematical expression on the data (NumPy style), either on the full dataset or on subsets specified by queries or selections. However, no proper solution existed to interactively visualize higher-dimensional data in a notebook. This led to the development of ipyvolume, which can render 3D volumes and up to a million glyphs (scatter and quiver plots) in the Jupyter Notebook as a widget. With the browser as a platform and the release of ipywidgets 6.0, these 3D plots can also be embedded in static HTML files and rendered on nbviewer, enabling you to share them with colleagues, view them on your tablet (particularly great for paperless offices), and use them for outreach, press release material, etc. Full-screen stereo rendering allows for a virtual reality experience using your phone and Google Cardboard, a minor investment compared to other VR headsets. Overlaying 3D quiver plots on a 3D volume rendering allows exploring a 6D (or higher) space.
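The mean of a quantity on a regular grid, computed for an expression and restricted to a selection, can be sketched as follows. This is a plain NumPy illustration of the concept, not vaex's implementation or API; the helper `binned_mean`, the column names, the expression, and the selection are assumptions made for the example.

```python
import numpy as np


def binned_mean(x, y, values, mask=None, shape=32, limits=((-4, 4), (-4, 4))):
    """Mean of `values` per grid cell, optionally restricted to a boolean mask."""
    if mask is not None:
        x, y, values = x[mask], y[mask], values[mask]
    # Per-cell sample counts and per-cell sums of the quantity.
    counts, _, _ = np.histogram2d(x, y, bins=shape, range=limits)
    sums, _, _ = np.histogram2d(x, y, bins=shape, range=limits, weights=values)
    # Empty cells stay NaN instead of triggering a division by zero.
    mean = np.full(counts.shape, np.nan)
    np.divide(sums, counts, out=mean, where=counts > 0)
    return mean


# Synthetic stand-in data: positions of 200,000 points in three dimensions.
rng = np.random.default_rng(seed=1)
px = rng.normal(size=200_000)
py = rng.normal(size=200_000)
pz = rng.normal(size=200_000)

# A NumPy-style expression on the columns, and a selection (points with pz > 0).
radius = np.sqrt(px**2 + py**2 + pz**2)
grid_mean = binned_mean(px, py, radius, mask=pz > 0)
```

Dividing a grid of sums by a grid of counts keeps the work to two passes over the data and produces an output whose size depends only on the grid shape.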
Maarten Breddels is a postdoctoral researcher at the Kapteyn Astronomical Institute at the University of Groningen (RUG), Netherlands, where he works for the Gaia mission, combining astronomy and IT to enable visualization and exploration of the large dataset this satellite will yield. Maarten has experience with low-level languages, such as Assembly and C, and higher-level languages, including C++, Java, and Python. He holds a bachelor’s degree in information technology and a bachelor’s degree, master’s degree, and PhD in astronomy, where his research focused on the field of galactic dynamics.
©2017, O'Reilly Media, Inc.