Presented By O'Reilly and Cloudera
Make Data Work
March 28–29, 2016: Training
March 29–31, 2016: Conference
San Jose, CA

Geospatial conference sessions

2:00pm–2:30pm Tuesday, 03/29/2016
Alexander Gray (Skytree, Inc.)
Alex Gray presents a novel approach to score and detect anomalies in large-scale data based on probabilistic machine-learning models. Alex focuses on unsupervised learning and uses a real-world use case—finding outliers in geospatial behavior—to demonstrate how an outlier detection framework can be applied to find anomalies in a dataset with millions of instances.
2:40pm–3:20pm Thursday, 03/31/2016
Brian Clark (Objectivity), Marco Ippolito (CGG GeoSoftware)
Oil and gas organizations are at the forefront of big data, adopting technologies such as Hadoop and Spark to develop next-generation fusion systems. Brian Clark and Marco Ippolito introduce a case study from CGG, which builds common data models to drive analytics of sensor data and associated metadata from fast-changing big data streams, and show how to derive richer value from big data assets.
10:00am–10:30am Tuesday, 03/29/2016
James Crawford (Orbital Insight)
Big data is exploding in space. Constellations of satellites are being launched to monitor the world in all wavelengths, tracking everything from ships to corn harvests. James Crawford explains how machine vision lets us see vast areas at once, while machine learning lets us analyze these images trillions of pixels at a time to recognize patterns that can help with world-changing projects.
1:30pm–1:50pm Tuesday, 03/29/2016
Aurelia Moser (Mozilla Science)
Aurelia Moser offers a brief introduction to open source mapping tools for scientific and environmental study. Aurelia presents case studies of terrestrial mapping with Global Forest Watch and planet mapping with Where on Mars?
4:20pm–5:00pm Wednesday, 03/30/2016
Erik Andrejko (The Climate Corporation)
Best practices from scientific research can significantly increase the pace and quality of data science projects. Erik Andrejko discusses the benefits and challenges of reproducibility and collaboration, including review and inter-team communication, for data science work at the Climate Corporation.
2:40pm–3:20pm Thursday, 03/31/2016
Kelvin Chu (Uber), Evan Richards (Uber)
Schema plays a key role in the Hadoop architecture at Uber. Kelvin Chu and Evan Richards explain why schema is important and how it can make your Hadoop and Spark applications more reliable and efficient.
11:50am–12:30pm Wednesday, 03/30/2016
Vinoth Chandar explains how Uber revamped its foundational data infrastructure with Hadoop as the source-of-truth data lake, sharing lessons from the experience.