Presented By O'Reilly and Cloudera
Make Data Work
Dec 4–5, 2017: Training
Dec 5–7, 2017: Tutorials & Conference
Singapore

Schedule: Data science and advanced analytics sessions

9:00am–5:00pm Monday, December 4 & Tuesday, December 5
Location: 336
Robert Schroll (The Data Incubator)
Robert Schroll demonstrates TensorFlow's capabilities through its Python interface and explores TFLearn, a high-level deep learning library built on TensorFlow. Join in to learn how to use TFLearn and TensorFlow to build machine learning models on real-world data.
9:00am–12:30pm Tuesday, December 5, 2017
Location: 321/322 Level: Intermediate
Jared Lander (Lander Analytics)
Modern statistics has become almost synonymous with machine learning, a collection of techniques that utilize today's incredible computing power. Jared Lander walks you through the available methods for implementing machine learning algorithms in R and explores underlying theories such as the elastic net, boosted trees, and cross-validation.
9:00am–12:30pm Tuesday, December 5, 2017
Location: 328/329 Level: Intermediate
Yufeng Guo (Google)
Yufeng Guo walks you through training and deploying a machine learning system using TensorFlow, a popular open source library. Yufeng takes you from a conceptual overview all the way to building complex classifiers and explains how you can apply deep learning to complex problems in science and industry.
1:30pm–5:00pm Tuesday, December 5, 2017
Location: 328/329 Level: Intermediate
Tim Seears (Think Big, a Teradata company), David Mueller (Teradata)
Tim Seears and David Mueller explain how to apply deep learning to improve consumer recommendations by training neural nets to learn categories of interest using embeddings. They then demonstrate how to extend this with WALS matrix factorization to achieve wide and deep learning, a process that is now used in production for the Google Play Store.
11:15am–11:55am Wednesday, December 6, 2017
Location: Summit 2 Level: Intermediate
Wolff Dobson (Google)
TensorFlow, the world's most popular machine learning framework, is fast, flexible, and production ready. Wolff Dobson, representing the Google Brain team, shares the latest developments in TensorFlow, including tensor processing units (TPUs), distributed training, new APIs and models, and mobile features. Join in to learn what's in store for TensorFlow and how ML can change your business.
12:05pm–12:45pm Wednesday, December 6, 2017
Location: Summit 1 Level: Intermediate
Jared Lander (Lander Analytics)
One common (but false) knock against R is that it doesn't scale well. Jared Lander shows how to use R in a performant manner, in terms of both speed and data size, and offers an overview of packages for running R at scale.
12:05pm–12:45pm Wednesday, December 6, 2017
Location: Summit 2 Level: Intermediate
Danielle Dean (Microsoft), Wee Hyong Tok (Microsoft)
Transfer learning enables you to use pretrained deep neural networks (e.g., AlexNet, ResNet, and Inception V3) and adapt them for custom image classification tasks. Danielle Dean and Wee Hyong Tok walk you through the basics of transfer learning and demonstrate how you can use the technique to bootstrap the building of custom image classifiers.
1:45pm–2:25pm Wednesday, December 6, 2017
Location: Summit 1 Level: Intermediate
Aki Ariga (Cloudera)
Aki Ariga explains how to put your machine learning model into production, discusses common issues and obstacles you may encounter, and shares best practices and typical architecture patterns for deploying ML models, with example designs from the Hadoop and Spark ecosystem using Cloudera Data Science Workbench.
1:45pm–2:25pm Wednesday, December 6, 2017
Location: Summit 2 Level: Intermediate
Bargava Subramanian and Harjinder Mistry share data engineering and machine learning strategies for building an efficient real-time recommendation engine when the transaction data is both big and wide. They also outline a novel way of generating frequent patterns using collaborative filtering and matrix factorization on Apache Spark and serving them using Elasticsearch in the cloud.
5:05pm–5:45pm Wednesday, December 6, 2017
Location: 321/322 Level: Intermediate
Anand Chitipothu (rorodata)
There are many challenges to deploying machine learning models in production, including managing multiple versions of models, maintaining staging and production models, keeping track of model performance, logging, and scaling. Anand Chitipothu explores the tools, techniques, and system architecture of a cloud platform built to solve these challenges and the new opportunities it opens up.
9:45am–10:00am Thursday, December 7, 2017
Location: Hall 404AXF
Secondary topics: ecommerce
Tony Lee (JD.com)
Details to come.
11:15am–11:55am Thursday, December 7, 2017
Location: Summit 1 Level: Beginner
Paco Nathan (O'Reilly Media)
Paco Nathan explains how O'Reilly employs AI, from the obvious (chatbots, case studies about other firms) to the less so (using AI to show the structure of content in detail, enhance search and recommendations, and guide editors for gap analysis, assessment, pathing, etc.). Approaches include vector embedding search, summarization, TDA for content gap analysis, and speech-to-text to index video.
12:05pm–12:45pm Thursday, December 7, 2017
Location: Summit 2 Level: Beginner
Xianyan Jia (Intel), Zhenhua Wang (JD.com)
Xianyan Jia and Zhenhua Wang explore deep learning applications built successfully with BigDL. They also teach you how to develop fast prototypes with BigDL's off-the-shelf deep learning toolkit and build end-to-end deep learning applications with flexibility and scalability using BigDL on Spark.
1:45pm–2:25pm Thursday, December 7, 2017
Location: Summit 1 Level: Intermediate
Teresa Tung (Accenture Labs), Ishmeet Grewal (Accenture Labs), Jurgen Weichenberger (Accenture Analytics)
As Accenture scaled to millions of predictive models, it required automation to ensure accuracy, prevent false alarms, and preserve trust. Teresa Tung, Ishmeet Grewal, and Jurgen Weichenberger explain how Accenture implemented a DevOps process for analytical models that's akin to software development, guaranteeing analytics modeling at scale and even in noncloud environments at the edge.
1:45pm–2:25pm Thursday, December 7, 2017
Location: Summit 2 Level: Beginner
YongLiang Xu (StarHub), Masaru Dobashi (NTT Data Corp.)
StarHub and NTT DATA have embarked on a partnership to design next-generation architecture to power the data products that will help generate new insights. YongLiang Xu and Masaru Dobashi explain how deep learning and other analytics models coexist within the same platform to address issues relating to smart cities.
2:35pm–3:15pm Thursday, December 7, 2017
Location: Summit 1 Level: Intermediate
Kazunori Sato (Google)
BigQuery is Google's fully managed, petabyte-scale data warehouse. Its user-defined functions enable "smart" queries with the power of machine learning, such as similarity searches or recommendations on images or documents with feature vectors and neural network prediction. Kazunori Sato demonstrates how BigQuery and TensorFlow together enable a powerful "data warehouse + ML" solution.
2:35pm–3:15pm Thursday, December 7, 2017
Location: Summit 2 Level: Intermediate
Chris Hausler (Zendesk), Arwen Griffioen (Zendesk)
Chris Hausler and Arwen Griffioen discuss Zendesk's experience with deep learning, using the example of Answer Bot, a question-answering system that resolves support tickets without agent intervention. They cover the benefits Zendesk has already seen and challenges encountered along the way.
4:15pm–4:55pm Thursday, December 7, 2017
Location: Summit 1 Level: Beginner
Prateek Nagaria (The Data Team)
Most data scientists use traditional forecasting methods, such as exponential smoothing or ARIMA, to forecast product demand. However, when a product experiences several periods of zero demand, approaches such as Croston's method may provide better accuracy than these traditional methods. Prateek Nagaria compares traditional and Croston methods in R on intermittent demand time series.
4:15pm–4:55pm Thursday, December 7, 2017
Location: 328/329 Level: Intermediate
Thomas Dinsmore (Cloudera), Johnson Poh (DBS)
Data science alone is easy. Data science with others, in the enterprise, on shared distributed systems, requires a bit more work. Thomas Dinsmore and Johnson Poh share common technology considerations and patterns for collaboration in large teams and best practices for moving machine learning into production at scale.
5:05pm–5:45pm Thursday, December 7, 2017
Location: 310/311 Level: Intermediate
Graham Dumpleton (Red Hat)
Jupyter notebooks provide a rich interactive environment for working with data. Running a single notebook is easy, but what if you need to provide a platform for many users at the same time? Graham Dumpleton demonstrates how to use JupyterHub to run a highly scalable environment for hosting Jupyter notebooks in education and business.