Fueling innovative software
July 15-18, 2019
Portland, OR

How NASA is building a petabyte-scale geospatial archive in the cloud

Aimee Barciauskas (Development Seed)
2:35pm-3:15pm Thursday, July 18, 2019
Secondary topics: Cloud Native
Average rating: 4.70 (10 ratings)

Who is this presentation for?

  • Software, cloud, and data engineers

Level

Intermediate

Description

NASA’s Earth Observing System Data and Information System (EOSDIS) is working toward a vision of a cloud-based, highly flexible system to meet its ever-growing and evolving data demands. Cumulus, a free and open source framework, supports this vision via configurable workflows to ingest, process, archive, manage, and distribute NASA’s Earth imagery. The Cumulus infrastructure is designed for scalability and reliability, using much of the AWS serverless platform, which enables Cumulus to scale in real time to be performant under the largest expected workloads.

Cumulus is poised to make a huge impact on how NASA manages and disseminates its Earth science imagery. In one notable case, the NASA-ISRO Synthetic Aperture Radar (NISAR) mission, Cumulus will be used to collect more data in a year than exists in NASA’s current archive. The NISAR mission will collect 45 PB a year and process that data at a rate of 1 GB per second. The need for Cumulus is proven through its application to NASA missions, but its application has extended beyond NASA’s Distributed Active Archive Centers (DAACs). It’s being used to monitor agriculture in Tanzania, apply machine learning models to estimate hurricane intensity, and generate air quality predictions using near-real-time forecast data.

Aimee Barciauskas outlines the motivation for Cumulus, the achievements and hurdles of the past two years, and its varied applications. You’ll learn about the availability of the open source software and how NASA intends to make its Earth observation geospatial data freely available to the public in the cloud.

Prerequisite knowledge

  • General knowledge of Node.js and AWS (useful but not required)

What you'll learn

  • Learn how Cumulus keeps pace with and takes advantage of AWS cloud offerings, and how it's being adapted to run on other cloud providers, specifically Google Cloud and Azure
  • Understand that Cumulus is being used outside NASA for organizations like the World Bank to support large-scale development projects
  • See that Cumulus Lite provides users with the cheapest way to get the most critical Cumulus features

Aimee Barciauskas

Development Seed

Aimee Barciauskas is a data engineer at Development Seed. She works with NASA and the European Space Agency (ESA) to make the vast collection of Earth Observation data cloud friendly and analysis ready and helps the machine learning team improve the understanding of and access to data. She’s experienced in building cloud service-oriented architecture, web APIs, and data processing pipelines. Previously, Aimee was a full stack web developer at Nava, Medidata, and Fundraise.com. She cares deeply about using data, data science, and machine learning to drive positive social change; she’s a chapter leader of DataKind DC, where she volunteers on projects ranging from a program referral portal for DC’s Child and Family Services Agency to the use of natural language processing to understand why people give philanthropically. When not coding, Aimee enjoys rock climbing and fancy beer. She holds an MS in data science from Barcelona Graduate School of Economics and a BA in economics and philosophy from Boston College.