Presented By O'Reilly and Cloudera
Make Data Work
Feb 17–20, 2015 • San Jose, CA

Dynamic Events in Massive Data Streams, from Astrophysics to Marketing Automation

Kirk Borne (George Mason University)
4:00pm–4:40pm Thursday, 02/19/2015
Machine Data / IoT
Location: 230 A

The Large Synoptic Survey Telescope (LSST.org), the most impressive astronomical sky survey ever designed, will deliver multi-year temporal coverage of the dynamic Universe, while generating ~30 TB of imaging data every night for 10 years. The final astronomical object catalog (database) is expected to be ~20 PB, comprising over 200 attributes for each one of trillions of source observations – all available to the public for exploration, education, and enjoyment.

Discovering dynamic, transient, changing events in the sky, and enabling rapid response to them, is one of the primary goals of the project. The big data stream analytics and massive event mining techniques that this requires also have many parallels in other domains, including business, climate change, biodiversity, social uprisings, health epidemics, seismology, cybersecurity, and marketing. In this session I will address these parallels, their big data applications, and some anticipated analytics solutions, including Decision Science-as-a-Service and Hadoop-based analytics. The greatest potential for discovery in all of these domains will come from data science use cases like these:

● Provide rapid characterization and probabilistic classification of high-velocity data;
● Find new correlations and associations of all kinds within high-variety data;
● Identify novel, unexpected behaviors and classes of events in temporal data streams;
● Validate new and existing hypotheses about events with strong statistical confidence, using millions of training samples; and
● Serendipity – discover rare (one-in-a-billion) events.
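
As a rough illustration of the last use case (rare-event discovery in a data stream), the following minimal Python sketch flags readings that sit far from a running baseline. It is not the LSST pipeline or the method presented in the session: the names RunningStats and flag_rare_events, the 5-sigma threshold, the warm-up length, and the synthetic readings are all assumptions made for illustration. The running mean and variance use Welford's online algorithm, so the stream is processed in one pass with constant memory.

```python
import math
import random


class RunningStats:
    """Welford's online algorithm for a running mean and sample variance."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the current mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def std(self):
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0


def flag_rare_events(stream, threshold=5.0, warmup=30):
    """Yield (index, value, z_score) for readings far from the running mean.

    threshold and warmup are illustrative values (assumptions),
    not tuned recommendations.
    """
    stats = RunningStats()
    for i, x in enumerate(stream):
        if stats.n >= warmup:
            sigma = stats.std()
            if sigma > 0:
                z = (x - stats.mean) / sigma
                if abs(z) > threshold:
                    yield i, x, z
                    continue  # keep the flagged reading out of the baseline
        stats.update(x)


if __name__ == "__main__":
    # Synthetic data only: Gaussian noise with one injected transient.
    rng = random.Random(0)
    readings = [rng.gauss(100.0, 1.0) for _ in range(1000)]
    readings[700] = 130.0
    for i, x, z in flag_rare_events(readings):
        print(f"index={i} value={x:.1f} z={z:.1f}")
```

A fixed z-score threshold is only one simple choice. A real event-mining system would more likely combine several features with probabilistic classification, in the spirit of the first use case in the list.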


Kirk Borne

George Mason University

Dr. Kirk Borne is a Transdisciplinary Data Scientist and an Astrophysicist. He is Professor of Astrophysics and Computational Science in the George Mason University School of Physics, Astronomy, and Computational Sciences. He has been at Mason since 2003, where he teaches and advises students in the graduate and undergraduate Computational Science, Informatics, and Data Science programs. Previously, he spent nearly 20 years in positions supporting NASA projects, including an assignment as NASA’s Data Archive Project Scientist for the Hubble Space Telescope, and as Project Manager in NASA’s Space Science Data Operations Office. He has extensive experience in big data and data science, including expertise in scientific data mining and data systems. He has published over 200 articles (research papers, conference papers, and book chapters), and given over 200 invited talks at conferences and universities worldwide. In these roles, he focuses on achieving big discoveries from big data through data science, and he promotes the use of information and data-centric experiences with big data in the STEM education pipeline at all levels. He believes in data literacy for all!