Presented By O’Reilly and Cloudera
Make Data Work
September 11, 2018: Training & Tutorials
September 12–13, 2018: Keynotes & Sessions
New York, NY

A deep learning approach for precipitation nowcasting with RNN using Analytics Zoo on BigDL

Alexander Heye (Cray), Ding Ding (Intel)
2:55pm–3:35pm Wednesday, 09/12/2018
Data science and machine learning
Location: 1A 15/16 Level: Intermediate
Secondary topics: Deep Learning, Temporal data and time-series analytics

Who is this presentation for?

  • Data scientists and big data, machine learning, and deep learning engineers

Prerequisite knowledge

  • A basic understanding of machine learning and deep learning concepts
  • A working knowledge of Apache Spark

What you'll learn

  • Explore a precipitation nowcasting system built with 3D recurrent neural networks using BigDL on Apache Spark
  • Gain insight into the process for developing a full end-to-end deep learning workflow including elements of big data and machine learning


Precipitation nowcasting has long been an important problem in weather forecasting. The goal is to predict future rainfall intensity in a local region over a relatively short timeframe (e.g., 0–6 hours) so that timely action can be taken (e.g., generating regional emergency rainfall alerts). The forecasting resolution and temporal accuracy required are much higher than for traditional forecasting tasks, which makes precipitation nowcasting quite challenging. The existing approach relies on numerical weather prediction, which requires a complex and meticulous simulation of the physical equations in the atmospheric model; this requirement limits performance and accessibility.
Alexander Heye and Ding Ding explain how to build a precipitation nowcasting system with recurrent neural networks using BigDL on Apache Spark. BigDL, a new distributed deep learning framework on Apache Spark, provides easy-to-use, seamlessly integrated big data and deep learning capabilities for big data users and data scientists. The precipitation nowcasting system uses a 3D convolutional long short-term memory (3D ConvLSTM) sequence-to-sequence learning framework provided by BigDL. Alexander and Ding walk you through this complex use case, covering data preparation, model development, training, and more. Along the way, they present the challenges they encountered and novel solutions to significant problems, and they share insight into the ultimate deployment of BigDL as a platform and tool for enabling and implementing the precipitation nowcasting solution.
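
To make the sequence-to-sequence framing concrete, here is a minimal sketch of one possible data preparation step: slicing a time-ordered series of rainfall frames into (past, future) pairs. It is an illustration only, not the speakers' pipeline and not BigDL code. The function name `make_windows`, the parameters `n_in` and `n_out`, and the toy 2x2 frames are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple

# Hypothetical layout: one frame is a 2D grid of rainfall intensities.
Frame = List[List[float]]


def make_windows(
    frames: Sequence[Frame], n_in: int, n_out: int
) -> List[Tuple[List[Frame], List[Frame]]]:
    """Split a time-ordered frame sequence into (past, future) pairs."""
    if n_in <= 0 or n_out <= 0:
        raise ValueError("n_in and n_out must be positive")
    pairs = []
    # Each window needs n_in input frames followed by n_out target frames.
    for start in range(len(frames) - n_in - n_out + 1):
        past = list(frames[start : start + n_in])
        future = list(frames[start + n_in : start + n_in + n_out])
        pairs.append((past, future))
    return pairs


# Toy data (assumed): six 2x2 frames whose cells all equal the time step.
toy_frames = [[[float(t)] * 2 for _ in range(2)] for t in range(6)]
pairs = make_windows(toy_frames, n_in=3, n_out=2)
print(len(pairs))  # number of (past, future) training pairs
```

In a setup like this, a sequence-to-sequence model would read each `past` block and be trained to emit the matching `future` block; the window lengths would depend on the frame interval and the forecast horizon.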

Alexander Heye


Alex Heye is a software engineer with the Analytics Group at Cray, where he focuses on deep learning technologies. His team develops applications that let HPC users readily incorporate data analytics and machine learning tools into their workflows.

Ding Ding


Ding Ding is a senior software engineer on Intel’s big data technology team, where she works on developing and optimizing distributed machine learning and deep learning algorithms on Apache Spark, focusing particularly on large-scale analytical applications and infrastructure on Spark.