Presented By O'Reilly and Cloudera
Make Data Work
March 28–29, 2016: Training
March 29–31, 2016: Conference
San Jose, CA

Deep learning conference sessions

11:00am–11:30am Tuesday, 03/29/2016
Alexander Ulanov (Hewlett Packard Labs)
Alexander Ulanov outlines a scalable implementation of deep neural networks for Spark, which uses batch BLAS operations to speed up the computations, employs Spark data parallelism for scaling, and provides friendly and extensible user and developer interfaces.
9:05am–9:30am Tuesday, 03/29/2016
Arno Candel
In recent years, deep learning has taken the lead in predictive accuracy in many fields of machine learning, and companies are struggling to keep up with the speed of innovation. Arno Candel demonstrates how successful enterprises can augment simple statistical models with more accurate data-driven models to gain a competitive edge.
11:30am–12:00pm Tuesday, 03/29/2016
John Canny (UC Berkeley)
GPUs have proven their value for machine learning, offering orders-of-magnitude speedups on dense and sparse data. They define the current performance limits for machine learning but have limited model capacity. John Canny explains how to mitigate that challenge and achieve linear speedups with GPUs on commodity networks. The result defines the hitherto unseen "outer limits" of ML performance.
2:30pm–3:00pm Tuesday, 03/29/2016
Mr Prabhat (Berkeley Lab)
Prabhat reviews the top data analytics problems in modern science—covering problems at all scales, from full-scale astronomy surveys to subatomic physics—and outlines Berkeley Lab's hardware and software strategy for dealing with these daunting challenges.
4:20pm–5:00pm Wednesday, 03/30/2016
Brandon Ballinger (Cardiogram), Johnson Hsieh (Cardiogram)
Each year, 15 million people suffer strokes, and at least a fifth of those are due to atrial fibrillation, the most common heart arrhythmia. Brandon Ballinger reports on a collaboration between UCSF cardiologists and ex-Google data scientists that detects atrial fibrillation with deep learning.
2:40pm–3:20pm Wednesday, 03/30/2016
Carlos Guestrin (Dato Inc.)
Machine learning is a hot topic. Recommenders, sentiment analysis, churn and click-through prediction, image recognition, and fraud detection are at the core of intelligent applications. However, developing these models is laborious. Carlos Guestrin shares a new approach that leverages massive amounts of data and applies machine learning at scale to create intelligent applications.
11:50am–12:30pm Thursday, 03/31/2016
Jeremy Howard (USF)
In his 20+ years of applying machine learning and data analysis to a wide range of industries, Jeremy Howard never felt that his work really changed anyone's life in a deep and positive way, so he spent a year researching ways he might effect real change. Jeremy outlines the impact that deep learning is going to make on the world and explains how you too can make a difference.
5:10pm–5:50pm Wednesday, 03/30/2016
Josh Patterson (Patterson Consulting), Dave Kale (Skymind), Zachary Lipton (University of California, San Diego)
Time series data is increasingly ubiquitous with both the adoption of electronic health record (EHR) systems in hospitals and clinics and the proliferation of wearable sensors. Josh Patterson, David Kale, and Zachary Lipton bring the open source deep learning library DL4J to bear on the challenge of analyzing clinical time series using recurrent neural networks (RNNs).
11:30am–12:00pm Tuesday, 03/29/2016
Naveen Rao (Intel)
Naveen Rao discusses deep learning, a form of machine learning loosely inspired by the brain. Naveen explores the benefits of deep learning over other machine-learning techniques, recent advances in the field, the deep learning workflow, challenges in developing and deploying deep learning-based solutions, and the need for standardized tools for building and scaling deep learning solutions.
12:00pm–12:30pm Tuesday, 03/29/2016
Stephen Merity (Salesforce Research), Caiming Xiong (Metamind)
Stephen Merity, Richard Socher, and Caiming Xiong discuss their recent work on extending the dynamic memory network (DMN) to question answering in both the textual and visual domains and explore how memory networks and attention mechanisms can allow for better interpretability of deep learning models.
4:20pm–5:00pm Thursday, 03/31/2016
Sreeni Iyer (Quad Analytix), Anurag Bhardwaj (Quad Analytix)
Typically, 8–10% of product URLs in ecommerce sites are misclassified. Sreeni Iyer and Anurag Bhardwaj discuss a machine-learning-based solution that relies on an innovative fusion of classifiers that are both text- and image-based, along with human touch to handle edge cases, to automatically classify product URLs according to a canonical taxonomic organization with a high F-score.
10:00am–10:30am Tuesday, 03/29/2016
James Crawford (Orbital Insight)
Big data is exploding in space. Constellations of satellites are being launched to monitor the world in all wavelengths, tracking everything from ships to corn harvests. James Crawford explains how machine vision lets us see vast areas at once, while machine learning lets us analyze these images trillions of pixels at a time to recognize patterns that can help with world-changing projects.
3:30pm–4:10pm Wednesday, 03/30/2016
Christopher Nguyen (Arimo), Anh Trinh (Arimo)
Christopher and Anh are happy to answer questions about Distributed DataFrame (The DDF Project), visual DDFs and their role in collaborative data visualization, and distributed deep learning on Spark/DDF.
4:20pm–5:00pm Wednesday, 03/30/2016
Rajat Monga (Google), Amy Unruh (Google), Kaz Sato (Google)
Googlers Rajat, Amy, and Kazunori can answer all your TensorFlow questions, including how to use it alongside interactive queries on petabyte-sized datasets, run large-scale distributed training of neural networks, and train and deploy machine-learning models.
11:00am–11:30am Tuesday, 03/29/2016
Kanu Gulati (Zetta Venture Partners)
Hardware-accelerated solutions are ready to meet challenges in data analytics with regard to data I/O, computational capacity, and interactive visualization. Data analytics and HPC evolution must go hand in hand. Kanu Gulati offers an overview of the advances in hardware acceleration and discusses the HPC applications enabling the next major wave of analytics innovation.
11:00am–11:40am Wednesday, 03/30/2016
Robert Nishihara (University of California, Berkeley)
Robert Nishihara offers an overview of SparkNet, a framework for training deep networks in Spark using existing deep learning libraries (such as Caffe) for the backend. SparkNet gets an order of magnitude speedup from distributed training relative to Caffe on a single GPU, even in the regime in which communication is extremely expensive.
1:50pm–2:30pm Thursday, 03/31/2016
Kaz Sato (Google), Amy Unruh (Google)
Kazunori Sato and Amy Unruh explore how you can use TensorFlow to drive large-scale distributed machine learning against your analytic data sitting in Google BigQuery, with data preprocessing driven by Dataflow (now Apache Beam). Kazunori and Amy dive into practical examples of how these technologies can work together to enable a powerful workflow for distributed machine learning.
4:00pm–4:30pm Tuesday, 03/29/2016
Rajat Monga (Google)
TensorFlow is an open source software library for numerical computation with a focus on machine learning. Rajat Monga offers an introduction to TensorFlow and explains how to use it to train and deploy machine-learning models to make your next application smarter.
10:00am–10:30am Tuesday, 03/29/2016
Alice Zheng (Amazon)
Feature engineering is widely practiced, but understanding the hows and whys behind this process often relies on folklore and guesswork. Alice Zheng offers a systematic view of feature engineering and discusses the underpinnings of a few popular methods.
10:10am–10:25am Wednesday, 03/30/2016
Alyosha Efros (UC Berkeley)
Alyosha Efros discusses using computer vision to understand big visual data.