Engineering the Future of Software
November 13–14, 2016: Training
November 14–16, 2016: Tutorials & Conference
San Francisco, CA

Large-scale image processing of big medical image data using Amazon Web Services and its challenges

Razik Yousfi (HeartFlow)
1:15pm–2:05pm Wednesday, 11/16/2016
Integration architecture
Location: Georgian
Level: Intermediate
Average rating: 4.25 (4 ratings)

Prerequisite knowledge

  • Experience with Amazon Web Services
  • Familiarity with designing large-scale systems involving multiple subsystems such as APIs and asynchronous compute nodes
  • An understanding of data processing algorithms

What you'll learn

  • Learn how HeartFlow, a noninvasive medical imaging startup, designed its application pipeline and the architecture that supports it

Description

Coronary artery disease is the leading cause of death worldwide, affecting more than 15 million Americans and costing the US healthcare system more than $182B each year. HeartFlow’s mission is to achieve the triple aim of bettering the health of populations, improving the patient experience, and reducing per capita costs of care by becoming the standard of care for noninvasive coronary artery disease detection. To achieve this goal, HeartFlow uses a heterogeneous stack of technologies and frameworks that supports the deployment of a thin virtual appliance within the hospital’s infrastructure to transmit data, advanced machine-learning algorithms for image processing, a computational fluid dynamics (CFD) simulation of the heart’s physiology running over a distributed file system, and the secure ingestion and storage of protected health information (PHI) from around the globe.

For HeartFlow to commercialize and scale appropriately, it had to deliver a product that was robust, secure, and fast enough to accommodate the changing demands of the hospital environment. In particular, it needed to build an infrastructure capable of processing a very large number of images within a very short amount of time. Where typical large-scale infrastructures deal with data that can be split into smaller chunks, HeartFlow has to transfer and manipulate full images of the heart and the coronary arteries, which complicates several aspects of the infrastructure’s design.
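
Because these volumes cannot be split into independent chunks, moving them efficiently is itself a design problem. As a minimal sketch of one common approach (not necessarily HeartFlow's actual transfer mechanism, which the abstract does not describe), the Python snippet below uploads a large CT volume to Amazon S3 using boto3's multipart transfer support; the bucket name, object key, and tuning values are hypothetical.

    import boto3
    from boto3.s3.transfer import TransferConfig

    # Hypothetical bucket and key; the talk does not name HeartFlow's storage layout.
    BUCKET = "heartflow-ct-ingest"
    KEY = "studies/study-1234/ct-volume.dcm"

    # Split a single large CT volume into parallel parts on the wire
    # while it remains one object in S3.
    config = TransferConfig(
        multipart_threshold=64 * 1024 * 1024,  # switch to multipart above 64 MB
        multipart_chunksize=16 * 1024 * 1024,  # 16 MB parts
        max_concurrency=8,                     # upload parts in parallel
        use_threads=True,
    )

    s3 = boto3.client("s3")
    s3.upload_file("ct-volume.dcm", BUCKET, KEY, Config=config)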

Over the last year, HeartFlow’s objective has been to move from a typical data center schema to the public cloud for processing CT image datasets made up of very large files. This was a complete paradigm shift in how data moves from hospitals, through the cloud, to HeartFlow for image analysis using state-of-the-art 3D visualization techniques. HeartFlow saw this as an opportunity to revamp its technology stack by moving toward queue-based messaging pipelines using Amazon Simple Queue Service (SQS), asynchronous event processing with AWS Lambda, Docker containerization for algorithms, and offscreen image rendering with state-of-the-art WebGL techniques and Node.js.
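
To make the queue-based pattern concrete, here is a minimal sketch of a worker that long-polls an SQS queue for image-processing jobs. The queue name and the process_study function are hypothetical stand-ins; the real containerized algorithms are not described in the abstract.

    import json

    import boto3

    sqs = boto3.client("sqs", region_name="us-west-2")  # region is illustrative
    queue_url = sqs.get_queue_url(QueueName="image-processing-jobs")["QueueUrl"]

    def process_study(job: dict) -> None:
        """Placeholder for the containerized image-processing step."""
        print(f"processing study {job['study_id']}")

    while True:
        # Long-poll so idle workers don't busy-wait on an empty pipeline.
        resp = sqs.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=1,
            WaitTimeSeconds=20,
        )
        for msg in resp.get("Messages", []):
            process_study(json.loads(msg["Body"]))
            # Delete only after success; an unacknowledged message becomes
            # visible again after the visibility timeout and is retried.
            sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])

Deleting the message only after the job completes gives at-least-once processing: a worker that dies mid-study simply lets the message reappear for another node to pick up.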

Razik Yousfi explains the challenges and opportunities HeartFlow encountered during the migration of its health technology from a private data center to Amazon Web Services, including the solutions employed to support the exponential growth of the business. Razik reviews the pipeline and its supporting architecture, highlighting the design of the backend, which utilizes multiple databases in different geographic regions to comply with strict HIPAA regulations on the transmission of protected health information (PHI); how HeartFlow uses network file systems in AWS to share data among compute units; and how the system scales up and down to improve compute times and drive down processing costs. Razik also touches on the current and future developments around big data analytics that are now possible thanks to the elastic infrastructure and computational power the cloud enables.
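
One way to picture the scale-up-and-down behavior is a control loop that sizes the worker fleet from the queue backlog, sketched below in Python with boto3. The queue URL, Auto Scaling group name, and jobs-per-worker ratio are hypothetical; HeartFlow's actual scaling policy is not described in the abstract.

    import math

    import boto3

    sqs = boto3.client("sqs")
    autoscaling = boto3.client("autoscaling")

    # All names and ratios below are illustrative assumptions.
    QUEUE_URL = "https://sqs.us-west-2.amazonaws.com/123456789012/image-processing-jobs"
    ASG_NAME = "image-processing-workers"
    JOBS_PER_WORKER = 2
    MIN_WORKERS, MAX_WORKERS = 1, 50

    # Read the current backlog and size the worker fleet proportionally.
    attrs = sqs.get_queue_attributes(
        QueueUrl=QUEUE_URL,
        AttributeNames=["ApproximateNumberOfMessages"],
    )
    backlog = int(attrs["Attributes"]["ApproximateNumberOfMessages"])
    desired = max(MIN_WORKERS, min(MAX_WORKERS, math.ceil(backlog / JOBS_PER_WORKER)))

    autoscaling.set_desired_capacity(
        AutoScalingGroupName=ASG_NAME,
        DesiredCapacity=desired,
        HonorCooldown=True,  # respect the group's cooldown to avoid thrashing
    )

Driving fleet size from queue depth is what lets a pipeline like this pay for compute only while studies are actually waiting to be processed.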

Razik Yousfi

HeartFlow

Razik Yousfi is a principal software architect at HeartFlow, focusing on designing scalable architectures for its image-based processing pipeline. In his role, he builds products that satisfy customer needs and disrupt industry standards. Razik specializes in software excellence, code maintainability, and consistency of software architectures across the company. Previously, Razik was a research and development software engineer working on the design and architecture of an interactive workstation for 3D modeling of the coronary system. Before HeartFlow, Razik worked at Systran. He graduated from a French school of computer science after completing his final-year internship at Siemens Corporate Research in Princeton.