Presented By O'Reilly and Cloudera
Make Data Work
December 1–3, 2015 • Singapore

Apache SINGA: A flexible and scalable deep learning platform for big data analytics

Ju Fan (National University of Singapore), Wei Wang (National University of Singapore)
11:50am–12:30pm Wednesday, 12/02/2015
Data Science and Advanced Analytics
Location: 321-322 Level: Intermediate

Prerequisite Knowledge

  • Background in data analytics
  • Knowledge of machine learning

Description

As the volume, variety, and velocity of data continue to reach unprecedented levels, big data analytics has drawn significant interest. According to a recent IDC analysis, the worldwide enterprise data analytics market generated $37.7B in 2013 and is forecast to grow at 9.4% per year through 2018, reaching $59.2B. Many organizations are keen to adopt big data techniques to analyze huge volumes of data that conventional business intelligence solutions cannot handle, and to discover insights that support better decision making. Deep learning, which extracts high-level abstractions from data, has recently emerged and shows great potential for solving business problems.

In this talk, we will present Apache SINGA, a flexible and scalable deep learning platform. Three features make SINGA a valuable tool for big data analytics:

  1. SINGA supports various deep learning models, giving users the flexibility to customize models to fit their business requirements.
  2. SINGA provides a scalable architecture for training deep learning models on huge volumes of data.
  3. SINGA provides a simple programming model that makes the distributed training process transparent to users.
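
To make the idea of distributed training in points 2 and 3 more concrete, here is a minimal sketch of synchronous data-parallel training, one common way to spread training across workers. It is not SINGA's API: the workers are simulated in a single process, and the function names (shard, local_gradient, train), the linear model, and the toy data are all assumptions made for illustration.

```python
# Hypothetical sketch of synchronous data-parallel SGD (not SINGA's API).
# Workers are simulated in one process; each owns a shard of the data.
from typing import List, Tuple

Sample = Tuple[float, float]  # (x, y)


def shard(data: List[Sample], num_workers: int) -> List[List[Sample]]:
    """Split the data round-robin so each worker gets a similar share."""
    return [data[i::num_workers] for i in range(num_workers)]


def local_gradient(w: float, b: float, part: List[Sample]) -> Tuple[float, float]:
    """Mean-squared-error gradient of y ~ w*x + b on one worker's shard."""
    n = len(part)
    gw = sum(2.0 * (w * x + b - y) * x for x, y in part) / n
    gb = sum(2.0 * (w * x + b - y) for x, y in part) / n
    return gw, gb


def train(data: List[Sample], num_workers: int = 4,
          lr: float = 0.1, steps: int = 1000) -> Tuple[float, float]:
    """Each step: workers compute gradients, then one averaged update is applied."""
    w, b = 0.0, 0.0
    parts = shard(data, num_workers)
    for _ in range(steps):
        grads = [local_gradient(w, b, p) for p in parts]  # would run in parallel
        # With equal-sized shards, this average equals the full-batch gradient.
        gw = sum(g[0] for g in grads) / num_workers
        gb = sum(g[1] for g in grads) / num_workers
        w -= lr * gw
        b -= lr * gb
    return w, b


# Toy data generated from y = 2x + 1 (synthetic, for illustration only).
data = [(i / 10.0, 2.0 * (i / 10.0) + 1.0) for i in range(20)]
w, b = train(data)
print(f"w={w:.4f}, b={b:.4f}")
```

The caller only invokes train; sharding and gradient aggregation stay hidden inside it, which is the sense in which a programming model can make distribution transparent to the user.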

We have applied SINGA to healthcare data analytics in a hospital system. In this talk, we will present two applications that show how SINGA helps analyze electronic medical record (EMR) data.

  • The first application is predicting risk of readmission. Hospital readmissions account for a significant share of healthcare spending, yet a large proportion of them are potentially avoidable. Predicting the risk of readmission for potentially fatal diseases can help lower costs and improve healthcare quality.
  • The second application is chronic disease progression modeling. Chronic diseases tend to evolve and progress over a long time, and if they are not properly managed, more serious comorbidities and complications may ensue. Disease progression modeling can help with the early detection and management of chronic diseases.
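
As a rough illustration of what a readmission-risk predictor computes, the sketch below fits a plain logistic regression with gradient descent. This is a deliberately simpler stand-in for the deep learning models discussed in the talk, not the authors' method. The two features (prior admissions and length of stay), the records, and the function names are hypothetical, and the data is synthetic.

```python
# Toy sketch: logistic regression as a stand-in readmission-risk model.
# Records, feature names, and helpers are synthetic and hypothetical.
import math
from typing import List, Tuple

# (prior_admissions, length_of_stay_days, readmitted: 1 = yes, 0 = no)
RECORDS: List[Tuple[float, float, int]] = [
    (0, 2, 0), (0, 3, 0), (1, 2, 0), (1, 4, 0),
    (2, 5, 1), (3, 4, 1), (3, 7, 1), (4, 6, 1),
]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def fit(records: List[Tuple[float, float, int]],
        lr: float = 0.1, epochs: int = 20000) -> Tuple[List[float], float]:
    """Full-batch gradient descent on the mean logistic loss."""
    w, b = [0.0, 0.0], 0.0
    n = len(records)
    for _ in range(epochs):
        gw0 = gw1 = gb = 0.0
        for x0, x1, y in records:
            err = sigmoid(w[0] * x0 + w[1] * x1 + b) - y
            gw0 += err * x0
            gw1 += err * x1
            gb += err
        w[0] -= lr * gw0 / n
        w[1] -= lr * gw1 / n
        b -= lr * gb / n
    return w, b


def risk(w: List[float], b: float, prior_admissions: float, stay_days: float) -> float:
    """Predicted probability of readmission for one patient."""
    return sigmoid(w[0] * prior_admissions + w[1] * stay_days + b)


w, b = fit(RECORDS)
print(f"low-risk example:  {risk(w, b, 0, 2):.3f}")
print(f"high-risk example: {risk(w, b, 4, 6):.3f}")
```

Real EMR data is far higher dimensional and noisier than this toy table, which is one reason to use models that learn their own feature representations.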

We will also discuss examples of how SINGA could be useful for other data types and applications.

Ju Fan

National University of Singapore

Ju Fan received his PhD in computer science from Tsinghua University, China, in 2012. He is currently a research fellow in the School of Computing, National University of Singapore. His research interests include big data analytics, crowdsourcing, and database management.

Wei Wang

National University of Singapore

Wei Wang is a PhD student in the computer science department of the National University of Singapore. He is currently working on SINGA, an Apache Incubator project to develop a general distributed deep learning system.