Presented By O'Reilly and Cloudera
Make Data Work
March 13–14, 2017: Training
March 14–16, 2017: Tutorials & Conference
San Jose, CA

Where data science meets rocket science: Data platforms and predictive analytics for aerospace

Mike Koelemay (Lockheed Martin)
1:50pm–2:30pm Thursday, March 16, 2017
Secondary topics: Data Platform, Geospatial, Logistics
Average rating: 5.00 (2 ratings)

Who is this presentation for?

  • Data scientists, analysts, architects, and technology managers

Prerequisite knowledge

  • Basic knowledge of the Hadoop ecosystem and scalable computing technologies

What you'll learn

  • Explore Sikorsky's robust infrastructure and tools used to analyze data collected from fleets of machines in order to rapidly create value from massive datasets


Over the past three decades, the prevalence of dedicated advanced monitoring systems on aircraft has grown tremendously. These systems commonly include a large number of sensors monitoring everything from outside air temperature to gearbox pressures to the health of drivetrain components (via onboard calculated condition indicators). The motivation for monitoring and collecting this data is to enhance safety and enable intelligent decision making about the operation of the aircraft, including optimizing maintenance, maximizing availability, focusing troubleshooting, and enabling proactive and timely support.

The Health & Usage Monitoring System (HUMS) onboard most Sikorsky rotorcraft collects several different types of data (parametric, event, usage, regime, mechanical diagnostics, etc.). When this data is aggregated across an entire fleet over its full history, its sheer size can be difficult to manage, let alone to mine for useful and actionable insights in a reasonable time frame or to explore through interactive analysis and visualization. Datasets of this size (which can reach hundreds of terabytes and millions of files for a single fleet) pose several difficult problems: ingesting and storing files, batch processing algorithms and analytics, and serving the data to end users. In addition, Sikorsky uses several other datasets to drive decision making in the enterprise, including flight test data, supply chain, maintenance, safety, operator information, design specs, logistics, and external data sources such as global weather and economic information (oil and gas prices).

Mike Koelemay explores the data platform that Sikorsky has built in recent years using massively scalable tools such as Hadoop, Spark, and Cassandra to enable the ingestion, storage, serving, and processing of aircraft data. He shares several use cases showing how this technology has enabled decision support based on historical fleet data, something that was previously impractical, and discusses the challenges encountered along the way.


Mike Koelemay

Lockheed Martin

Mike Koelemay is a Fellow and Chief Data Scientist in the Chief Data & Analytics Office at Enterprise Operation, Lockheed Martin.