Presented By O'Reilly and Cloudera
Make Data Work
March 28–29, 2016: Training
March 29–31, 2016: Conference
San Jose, CA

Science conference sessions

2:40pm–3:20pm Thursday, 03/31/2016
Siddha Ganju (Deep Vision)
Siddha Ganju explains how CERN uses machine-learning models to predict which datasets will become popular over time. These predictions help replicate the most heavily accessed datasets, which improves the efficiency of physics analysis in CMS. Analyzing this data, in turn, yields useful information about the physical processes.
2:30pm–3:00pm Tuesday, 03/29/2016
Prabhat (Berkeley Lab)
Prabhat reviews the top data analytics problems in modern science—covering problems at all scales, from full-scale astronomy surveys to subatomic physics—and outlines Berkeley Lab's hardware and software strategy for dealing with these daunting challenges.
2:40pm–3:20pm Thursday, 03/31/2016
Brian Clark (Objectivity), Marco Ippolito (CGG GeoSoftware)
Oil and gas organizations are at the forefront of big data, adopting technologies such as Hadoop and Spark to develop next-generation fusion systems. Brian Clark and Marco Ippolito present a case study from CGG, which builds common data models to drive analytics of sensor data and associated metadata from fast-changing big data streams, and show how to derive richer value from big data assets.
10:00am–10:30am Tuesday, 03/29/2016
James Crawford (Orbital Insight)
Big data is exploding in space. Constellations of satellites are being launched to monitor the world in all wavelengths, tracking everything from ships to corn harvests. James Crawford explains how machine vision lets us see vast areas at once, while machine learning lets us analyze these images trillions of pixels at a time to recognize patterns that can help with world-changing projects.
1:30pm–1:50pm Tuesday, 03/29/2016
Aurelia Moser (Mozilla Science)
Aurelia Moser offers a brief introduction to open source mapping tools for scientific and environmental study, presenting case studies of terrestrial mapping with Global Forest Watch and planetary mapping with Where on Mars?
3:30pm–4:00pm Tuesday, 03/29/2016
Laura Waller (UC Berkeley)
Laura Waller gives an overview of new optical microscopes that employ simple experimental systems and efficient nonlinear inverse algorithms to achieve high-resolution 3D and phase images. By leveraging recent advances in data science, these microscopes can produce gigapixel-scale images at each time frame, computed efficiently and with good robustness to noise and model mismatch.
4:20pm–5:00pm Wednesday, 03/30/2016
Erik Andrejko (The Climate Corporation)
Best practices from scientific research can significantly increase the pace and quality of data science projects. Erik Andrejko discusses the benefits and challenges of reproducibility and collaboration, including review and inter-team communication, for data science work at The Climate Corporation.