Presented By O’Reilly and Cloudera
Make Data Work
September 11, 2018: Training & Tutorials
September 12–13, 2018: Keynotes & Sessions
New York, NY

Schedule: Deep Learning sessions

9:00am–12:30pm Tuesday, 09/11/2018
Location: 1A 21/22 Level: Intermediate
Garrett Hoffman (StockTwits)
Average rating: 4.75 (4 ratings)
Garrett Hoffman walks you through deep learning methods for natural language processing and natural language understanding tasks, using a live example in Python and TensorFlow with StockTwits data. Methods include word2vec, recurrent neural networks and variants (LSTM, GRU), and convolutional neural networks.
9:00am–5:00pm Tuesday, 09/11/2018
Location: 1A 03
Dylan Bargteil (The Data Incubator)
The TensorFlow library expresses numerical computations as data flow graphs and automatically parallelizes them across several CPUs or GPUs. This architecture makes it ideal for implementing neural networks and other machine learning algorithms. Dylan Bargteil introduces TensorFlow's capabilities through its Python interface.
9:00am–12:30pm Tuesday, 09/11/2018
Location: 1E 15/16 Level: Intermediate
Vijay Agneeswaran (Walmart Labs), Abhishek Kumar (Publicis Sapient)
Average rating: 4.40 (5 ratings)
Abhishek Kumar and Vijay Srinivas Agneeswaran offer an introduction to deep learning-based recommendation and learning-to-rank systems using TensorFlow. You'll learn how to build a deep learning recommender system based on intent prediction, drawing on a real-world implementation for an ecommerce client.
1:30pm–5:00pm Tuesday, 09/11/2018
Location: 1E 07/08 Level: Intermediate
Vartika Singh (Cloudera), Alan Silva (Cloudera), Alex Bleakley (Cloudera), Steven Totman (Cloudera), Mirko Kämpf (Cloudera), Syed Nasar (Cloudera)
Average rating: 1.00 (1 rating)
Vartika Singh, Alan Silva, Alex Bleakley, Steven Totman, Mirko Kämpf, and Syed Nasar outline approaches for preprocessing, training, inference, and deployment across datasets (time series, audio, video, text, etc.) that leverage Spark, its extended ecosystem of libraries, and deep learning frameworks.
1:30pm–5:00pm Tuesday, 09/11/2018
Location: 1A 12/14 Level: Intermediate
Bruno Gonçalves (Data For Science)
Average rating: 3.14 (7 ratings)
Time series are everywhere around us. Understanding them requires taking into account the sequence of values seen in previous steps and even long-term temporal correlations. Join Bruno Gonçalves to learn how to use recurrent neural networks to model and forecast time series and discover the advantages and disadvantages of recurrent neural networks with respect to more traditional approaches.
11:20am–12:00pm Wednesday, 09/12/2018
Location: 1A 15/16 Level: Intermediate
Mikio Braun (Zalando)
Average rating: 4.86 (7 ratings)
Time series data has many applications in industry, from analyzing server metrics to monitoring IoT signals and outlier detection. Mikio Braun offers an overview of time series analysis with a focus on modern machine learning approaches and practical considerations, including recommendations for what works and what doesn’t, and industry use cases.
11:20am–12:00pm Wednesday, 09/12/2018
Location: 1A 06/07 Level: Beginner
Shioulin Sam (Cloudera Fast Forward Labs)
Average rating: 3.25 (4 ratings)
Recent advances in deep learning allow us to use the semantic content of items in recommendation systems, addressing a weakness of traditional methods. Shioulin Sam explores the limitations of classical approaches and explains how using the content of items can help solve common recommendation pitfalls, such as the cold start problem, and open up new product possibilities.
1:15pm–1:55pm Wednesday, 09/12/2018
Location: 1A 15/16 Level: Intermediate
Longqi Yang (Cornell Tech, Cornell University)
State-of-the-art recommendation algorithms are increasingly complex and no longer one size fits all. Current monolithic development practice poses significant challenges to rapid, iterative, and systematic experimentation. Longqi Yang explains how to use OpenRec to easily customize state-of-the-art solutions for diverse scenarios.
2:05pm–2:45pm Wednesday, 09/12/2018
Location: 1A 15/16 Level: Intermediate
Ankit Jain (Uber)
Average rating: 3.00 (3 ratings)
Personalization is a common theme in social networks and ecommerce businesses. Personalization at Uber involves an understanding of how each driver and rider is expected to behave on the platform. Ankit Jain explains how Uber employs deep learning using LSTMs and its huge database to understand and predict the behavior of each and every user on the platform.
2:55pm–3:35pm Wednesday, 09/12/2018
Location: 1A 15/16 Level: Intermediate
Alex Heye (Cray), Ding Ding (Intel)
Precipitation nowcasting is used to predict the future rainfall intensity over a relatively short timeframe. The forecasting resolution and time accuracy required are much higher than for other traditional forecasting tasks. Alexander Heye and Ding Ding explain how to build a precipitation nowcasting system with recurrent neural networks using BigDL on Apache Spark.
4:35pm–5:15pm Wednesday, 09/12/2018
Location: 1A 15/16 Level: Intermediate
Patty Ryan (Microsoft), CY Yam (Microsoft), Elena Terenzi (Microsoft)
Average rating: 5.00 (1 rating)
Large online fashion retailers must efficiently maintain catalogues of millions of items. Due to human error, it's not unusual that some items have duplicate entries. Since manually trawling such a large catalogue is next to impossible, how can you find these entries? Patty Ryan, CY Yam, and Elena Terenzi explain how they applied deep learning for image segmentation and background removal.
11:20am–12:00pm Thursday, 09/13/2018
Location: 1A 15/16 Level: Beginner
Lars Hulstaert (Microsoft)
Average rating: 5.00 (1 rating)
Transfer learning allows data scientists to leverage insights from large labeled datasets. The general idea of transfer learning is to use knowledge learned from tasks for which a lot of labeled data is available in settings where little labeled data is available. Lars Hulstaert explains what transfer learning is and how it can boost your NLP or CV pipelines.
11:20am–12:00pm Thursday, 09/13/2018
Location: 1A 10 Level: Intermediate
Jonathan Hung (LinkedIn), Keqiu Hu (LinkedIn), Zhe Zhang (LinkedIn)
Jonathan Hung, Keqiu Hu, and Zhe Zhang offer an overview of TensorFlow on YARN (TonY), a framework to natively run TensorFlow on Hadoop. TonY enables running TensorFlow distributed training as a new type of Hadoop application. Its native Hadoop connector, together with other features, aims to run TensorFlow jobs as reliably and flexibly as other first-class citizens on Hadoop.
1:10pm–1:50pm Thursday, 09/13/2018
Location: 1A 10 Level: Intermediate
Wangda Tan (Cloudera)
Average rating: 4.50 (2 ratings)
In order to train deep learning and machine learning models, you must leverage applications such as TensorFlow, MXNet, Caffe, and XGBoost. Wangda Tan discusses new features in Apache Hadoop 3.x to better support deep learning workloads and demonstrates how to run these applications on YARN.
1:10pm–1:50pm Thursday, 09/13/2018
Location: 1A 15/16 Level: Intermediate
Moty Fania (Intel), Sergei Kom (Intel)
Average rating: 5.00 (1 rating)
Moty Fania and Sergei Kom share their experience and lessons learned implementing an AI inference platform to enable internal visual inspection use cases. The platform is based on open source technologies and was designed for real-time, streaming, and online actuation.
2:00pm–2:40pm Thursday, 09/13/2018
Location: 1A 15/16 Level: Intermediate
Guoqiong Song (Intel), Wenjing Zhan (Talroo), Jacob Eisinger (Talroo)
Can the talent industry make job search and matching more relevant and personalized for a candidate by leveraging deep learning techniques? Guoqiong Song, Wenjing Zhan, and Jacob Eisinger demonstrate how to use the distributed deep learning framework BigDL on Apache Spark to predict a candidate’s probability of applying to specific jobs based on their résumé.
3:30pm–4:10pm Thursday, 09/13/2018
Location: 1A 15/16 Level: Advanced
Ash Munshi (Pepperdata)
Ash Munshi outlines a technique for labeling applications using runtime measurements of CPU, memory, and network I/O along with a deep neural network. This labeling groups the applications into buckets that have understandable characteristics, which can then be used to reason about the cluster and its performance.
4:20pm–5:00pm Thursday, 09/13/2018
Location: 1A 15/16 Level: Beginner
Swetha Machanavajhala (Microsoft), Xiaoyong Zhu (Microsoft)
Average rating: 5.00 (3 ratings)
In this auditory world, the human brain processes and reacts effortlessly to a variety of sounds. While many of us take this for granted, there are over 360 million people in the world who are deaf or hard of hearing. Swetha Machanavajhala and Xiaoyong Zhu explain how to make the auditory world inclusive and meet the great demand in other sectors by applying deep learning to audio in Azure.
4:20pm–5:00pm Thursday, 09/13/2018
Location: 1A 21/22 Level: Intermediate
Nir Yungster (JW Player), Kamil Sindi (JW Player)
JW Player, the world’s largest network-independent video platform, representing 5% of global internet video, provides on-demand recommendations as a service to thousands of media publishers. Nir Yungster and Kamil Sindi explain how the company is systematically improving model performance while navigating the many engineering challenges and unique needs of the diverse publishers it serves.