Presented By O’Reilly and Cloudera
Make Data Work
March 5–6, 2018: Training
March 6–8, 2018: Tutorials & Conference
San Jose, CA

Automatic 3D MRI knee damage classification with 3D CNN using BigDL on Spark

Jiao (Jennie) Wang (Intel), Valentina Pedoia (UCSF), Berk Norman (UCSF), Yulia Tell (Intel)
11:50am–12:30pm Thursday, March 8, 2018

Who is this presentation for?

  • Data scientists and big data, machine learning, and deep learning engineers

Prerequisite knowledge

  • A basic understanding of machine learning and deep learning concepts
  • A working knowledge of Apache Spark

What you'll learn

  • Explore a medical classification system built with 3D convolutional neural networks using BigDL on Apache Spark


Damage to the meniscus has been proposed as a precipitating event for osteoarthritis, a degenerative disease that affects millions of people a year and significantly reduces their quality of life. Additionally, damaged menisci assessed by magnetic resonance imaging (MRI)-based grading have been associated with greater odds of longitudinal cartilage loss than intact menisci. Radiologists typically grade meniscus lesions using a semiquantitative system that indicates whether a lesion exists and how severe it is, which in turn informs further care. An automated system that can classify menisci based on the presence or absence of lesions has high clinical relevance: it would provide immediate, objective results at the time of the MRI scan, eliminate intra-user variability, and enable automated comparison over time.

Jennie Wang, Valentina Pedoia, Berk Norman, and Yulia Tell offer an overview of their classification system built with 3D convolutional neural networks using BigDL on Apache Spark. BigDL, a new distributed deep learning framework on Apache Spark, gives big data users and data scientists easy-to-use, seamlessly integrated big data and deep learning capabilities. For 3D imaging, BigDL offers 3D image convolutions, 3D max pooling, and a 3D image augmentation library. Jennie, Valentina, Berk, and Yulia walk you through this complex use case, covering data preparation, model development, training, and more. Along the way, they present the challenges they faced and novel solutions to significant problems, and share insight into the ultimate deployment of BigDL as a platform and tool for implementing the MRI classification solution.
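
To give a rough sense of what the 3D convolution and 3D max pooling operations mentioned above compute, here is a minimal NumPy sketch. It is not BigDL code and does not reflect the speakers' model: the function names conv3d and max_pool3d, the averaging kernel, and the toy 6x6x6 volume are all hypothetical, chosen only for illustration. A distributed framework would run the equivalent operations over batches of real MRI volumes.

```python
import numpy as np

def conv3d(volume, kernel):
    """Valid (no padding), stride-1 3D cross-correlation of one volume with one kernel."""
    d, h, w = volume.shape
    kd, kh, kw = kernel.shape
    out = np.zeros((d - kd + 1, h - kh + 1, w - kw + 1))
    for z in range(out.shape[0]):
        for y in range(out.shape[1]):
            for x in range(out.shape[2]):
                # Slide the kernel over the volume and sum the elementwise products.
                patch = volume[z:z + kd, y:y + kh, x:x + kw]
                out[z, y, x] = np.sum(patch * kernel)
    return out

def max_pool3d(volume, size):
    """Non-overlapping 3D max pooling; voxels that do not fill a full window are dropped."""
    d, h, w = (s // size for s in volume.shape)
    trimmed = volume[:d * size, :h * size, :w * size]
    # Split each axis into (block index, offset within block), then take the max per block.
    blocks = trimmed.reshape(d, size, h, size, w, size)
    return blocks.max(axis=(1, 3, 5))

# Toy stand-in for an MRI volume (depth, height, width); not real data.
volume = np.arange(6 * 6 * 6, dtype=float).reshape(6, 6, 6)
kernel = np.ones((3, 3, 3)) / 27.0  # hypothetical averaging filter

features = conv3d(volume, kernel)   # 6x6x6 input, 3x3x3 kernel -> 4x4x4 feature map
pooled = max_pool3d(features, 2)    # 2x2x2 windows -> 2x2x2 output
print(features.shape, pooled.shape)
```

A real 3D CNN stacks many such convolution and pooling layers, with learned kernels and multiple channels, before a classification layer; the sketch shows only the single-channel arithmetic.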

Jiao (Jennie) Wang


Jiao (Jennie) Wang is a software engineer on the big data technology team at Intel, where she works in the area of big data analytics. She’s engaged in developing and optimizing a distributed deep learning framework on Apache Spark.

Valentina Pedoia


Valentina Pedoia is a specialist in the Musculoskeletal and Imaging Research Group at UCSF and a data scientist who develops computer vision and machine learning algorithms to improve the use of noninvasive imaging as a diagnostic and prognostic tool. Her current research explores the role of machine learning in extracting contributors to osteoarthritis (OA). She is studying analytics that model the complex interactions between morphological, biochemical, and biomechanical aspects of the knee joint as a whole. She also applies deep learning convolutional neural networks to musculoskeletal tissue segmentation and to the extraction of salient features from quantitative relaxation maps, in support of a comprehensive study of the biochemical composition of articular cartilage. The ultimate goal is a completely data-driven model that can extract imaging features and use them to identify risk factors and predict outcomes. Previously, she was a postdoc in the Musculoskeletal and Imaging Research Group, where she provided support and expertise in medical computer vision with a focus on reducing human effort and extracting semantic features from MRIs to study degenerative joint disease. Valentina’s recent work on machine learning applied to OA was named an annual scientific highlight of the 25th conference of the International Society for Magnetic Resonance in Medicine (ISMRM 2017) and selected as best paper presented at the MRI drug discovery study group. Valentina holds a PhD in computer science; her doctoral research focused on feature extraction from functional and structural brain MRI in subjects with glial tumors.

Berk Norman


Berk Norman is a data scientist in the Department of Radiology and Biomedical Imaging at UC San Francisco, where he works on constructing deep learning models.

Yulia Tell


Yulia Tell is a technical program manager on the big data technologies team within the Software and Services Group at Intel, where she works on several open source projects and partner engagements in the big data domain. Yulia’s work focuses on Apache Hadoop and Apache Spark, including big data analytics applications that use machine learning and deep learning. Yulia holds an MSc in computer science from Moscow Power Engineering Technical University and has completed executive training on market-driving strategies at London Business School.