Presented By
O’Reilly + Cloudera
Make Data Work
March 25-28, 2019
San Francisco, CA

Data Science, Machine Learning & AI

If you're in data, you need to understand machine learning & AI

Machine learning lets you discover hidden insights in your data. It's a simple idea with phenomenal impact and sophisticated use cases like recommenders, text mining, real-time analytics, large-scale anomaly detection, and business forecasting.

At Strata, you’ll get a deeper and broader understanding of machine and deep learning—take a look at the sessions below.

Monday, Mar 25 - Tuesday, Mar 26: 2-Day Training (Platinum & Training passes)
Tuesday, Mar 26: Tutorials (Gold & Silver passes)
Wednesday, Mar 27: Keynotes & Sessions (Platinum, Gold, Silver & Bronze passes)
8:45am | Location: Ballroom
Strata Data Conference Keynotes
10:30am
Morning break
Thursday, Mar 28: Keynotes & Sessions (Platinum, Gold, Silver & Bronze passes)
8:45am | Location: Ballroom
Strata Data Conference Keynotes
10:30am
Morning break
9:00am - 5:00pm Monday, March 25 & Tuesday, March 26
Location: 2014
Secondary topics:  Deep Learning
Robert Schroll (The Data Incubator)
The TensorFlow library provides for the use of computational graphs, with automatic parallelization across resources. This architecture is ideal for implementing neural networks. Robert Schroll offers an overview of TensorFlow's capabilities in Python, demonstrating how to build machine learning algorithms piece by piece and how to use TensorFlow's Keras API with several hands-on applications. Read more.
9:00am - 5:00pm Monday, March 25 & Tuesday, March 26
Location: 2016
Don Fox (The Data Incubator)
Don Fox walks you through developing a machine learning pipeline, from prototyping to production. You'll learn about data cleaning, feature engineering, model building and evaluation, and deployment and then extend these models into two applications from real-world datasets. All work will be done in Python. Read more.
9:00am - 5:00pm Monday, March 25 & Tuesday, March 26
Location: 2020
Secondary topics:  Deep Learning
Ian Cook (Cloudera)
Advancing your career in data science requires learning new languages and frameworks—but learners face an overwhelming array of choices, each with different syntaxes, conventions, and terminology. Ian Cook simplifies the learning process by elucidating the abstractions common to these systems. Through hands-on exercises, you'll overcome obstacles to getting started using new tools. Read more.
9:00am - 5:00pm Monday, March 25 & Tuesday, March 26
Location: 3018
Secondary topics:  Deep Learning, Financial Services, Temporal data and time-series analytics
Francesca Lazzeri (Microsoft)
Francesca Lazzeri walks you through the core steps for using Azure Machine Learning services to train your machine learning models both locally and on remote compute resources. Read more.
9:00am - 12:30pm Tuesday, March 26, 2019
Location: 2002
Secondary topics:  Deep Learning, Temporal data and time-series analytics
Martin Gorner (Google)
Hands-on with Recurrent Neural Networks and TensorFlow. Discover what makes RNNs so powerful for time series analysis. Read more.
9:00am - 12:30pm Tuesday, March 26, 2019
Location: 2009
Secondary topics:  Deep Learning, Text and Language processing and analysis
David Talby (Pacific AI), Alex Thomas (Indeed), Claudiu Branzan (G2 Web Services)
This is a hands-on tutorial for scalable NLP using the highly performant, highly scalable open-source Spark NLP library. You’ll spend about half your time coding as you work through four sections, each with an end-to-end working codebase that you can change and improve. Read more.
9:00am - 12:30pm Tuesday, March 26, 2019
Location: 2011
Secondary topics:  AI and Data technologies in the cloud, Deep Learning, Media, Marketing, Advertising
David Arpin (Amazon Web Services)
Learn how to use the Amazon SageMaker platform to build a machine learning model to recommend products to customers based on their past preferences. Read more.
9:00am - 12:30pm Tuesday, March 26, 2019
Location: 2001
Secondary topics:  Ethics, Security and Privacy
Iman Saleh (Intel), Cory Ilo (Intel), Cindy Tseng (Intel)
From healthcare to smart homes to autonomous vehicles, new applications of autonomous systems are raising ethical concerns including bias, transparency, and privacy. In this tutorial, we will demonstrate tools and capabilities that can help data scientists address these concerns. The tools help bridge the gap between ethicists and regulators on one side and machine learning practitioners on the other. Read more.
1:30pm - 5:00pm Tuesday, March 26, 2019
Location: 2001
Secondary topics:  Ethics
Patrick Hall (H2O.ai | George Washington University), Pramit Choudhary (Oracle(Datascience.com))
If machine learning can lead to financial gains for your organization, why isn’t everyone doing it? One reason is that training machine learning systems with transparent inner workings and auditable predictions is difficult. This talk will present the good, bad, and downright ugly lessons learned from the presenters’ years of experience in implementing solutions for interpretable machine learning. Read more.
1:30pm - 5:00pm Tuesday, March 26, 2019
Location: 2002
Secondary topics:  Deep Learning, Media, Marketing, Advertising, Model lifecycle management
Abhishek Kumar (Publicis.Sapient), Vijay Agneeswaran (Publicis Sapient), Pramod Singh (Sapient Razorfish)
This tutorial describes deep learning-based recommender and personalisation systems that we have built for clients. It focuses on TensorFlow Serving and MLflow for end-to-end productionization, including model serving, Dockerization, reproducibility, and experimentation, and shows how to use Kubernetes for deployment and orchestration of ML-based micro-architectures. Read more.
1:30pm - 5:00pm Tuesday, March 26, 2019
Location: 2009
Secondary topics:  Deep Learning, Temporal data and time-series analytics
Jason Dai (Intel), Yuhao Yang (Intel), Jennie Wang (Intel), Guoqiong Song (Intel)
Jason Dai, Yuhao Yang, Jennie Wang, and Guoqiong Song explain how to build and productionize deep learning applications for big data with Analytics Zoo—a unified analytics and AI platform that seamlessly unites Spark, TensorFlow, Keras, and BigDL programs into an integrated pipeline—using real-world use cases from JD.com, MLSListings, the World Bank, Baosight, and Midea/KUKA. Read more.
1:30pm - 5:00pm Tuesday, March 26, 2019
Location: 2011
Secondary topics:  AI and machine learning in the enterprise
Chi-Yi Kuan (LinkedIn), Tiger Zhang (LinkedIn), Xiaojing Dong (LinkedIn), Burcu Baran (LinkedIn), Emily Huang (LinkedIn)
Thanks to the rapid growth in data resources, business leaders now appreciate the importance (and the challenge) of mining information from data. Join in as a group of LinkedIn's data scientists share their experiences successfully leveraging emerging techniques to assist in intelligent decision making. Read more.
11:00am - 11:40am Wednesday, March 27, 2019
Location: 2009
Secondary topics:  Ethics, Financial Services
Jari Koister (FICO)
Financial services firms are increasingly deploying AI for a wide range of applications such as the credit life cycle, fraud, and financial crimes. Such deployment requires models to be interpretable, explainable, and resilient to adversarial attacks. Regulatory requirements prohibit the application of black-box machine learning models. This talk describes what FICO has developed to support these needs. Read more.
11:00am - 11:40am Wednesday, March 27, 2019
Location: 2011
Secondary topics:  AI and Data technologies in the cloud
Tristan Zajonc (Cloudera), Tim Chen (Cloudera)
Data platforms are being asked to support an ever-increasing range of workloads and compute environments, including machine learning and elastic cloud platforms. In this talk, we will discuss some emerging capabilities, including running machine learning and Spark workloads on autoscaling container platforms, and share our vision of the road ahead for ML and AI in the cloud. Read more.
11:00am - 11:40am Wednesday, March 27, 2019
Location: 2014
Secondary topics:  AI and Data technologies in the cloud, Text and Language processing and analysis
Robert Horton (Microsoft), Mario Inchiosa (Microsoft), Ali Zaidi (Microsoft)
We show how three cutting-edge machine learning techniques can be used together to up your modeling game: (1) transfer learning from pre-trained language models, (2) active learning to make more effective use of a limited labeling budget, and (3) hyperparameter tuning to maximize model performance. We will apply these techniques to a growing business challenge: moderating public discussions. Read more.
11:00am - 11:40am Wednesday, March 27, 2019
Location: 2016
Secondary topics:  AI and Data technologies in the cloud, Deep Learning, Open Data, Data Generation and Data Networks
Jeremy Howard (platform.ai)
When deep learning can be easily applied by non-engineers who possess extensive domain expertise, we can accelerate not only the pace of industry adoption but also the rate at which we uncover interesting and relevant research problems. Read more.
11:00am - 11:40am Wednesday, March 27, 2019
Location: Expo Hall
Secondary topics:  AI and Data technologies in the cloud, Security and Privacy
Alon Kaufman (Duality), Vinod Vaikuntanathan (MIT and Duality Technologies)
Alon Kaufman and Vinod Vaikuntanathan discuss the challenges and opportunities of machine learning on encrypted data and describe the state of the art in this space. Read more.
11:50am - 12:30pm Wednesday, March 27, 2019
Location: 2009
Secondary topics:  Financial Services, Text and Language processing and analysis
Chakri Cherukuri (Bloomberg LP)
The main focus of the talk will be on understanding how these models work under the hood and on the interpretability of these models. We’ll look at novel interactive visualizations and diagnostic plots to help us better understand these models. Read more.
11:50am - 12:30pm Wednesday, March 27, 2019
Location: 2011
Secondary topics:  AI and Data technologies in the cloud, AI and machine learning in the enterprise, Automation in data science and big data
Sarah Aerni (Salesforce)
How does Salesforce manage to make data science an agile partner to over 100,000 customers? We will share the nuts and bolts of the platform and our agile process. From our open-source autoML library (TransmogrifAI) and experimentation to deployment and monitoring, we will cover how the tools make it possible for our data scientists to rapidly iterate and adopt a truly agile methodology. Read more.
11:50am - 12:30pm Wednesday, March 27, 2019
Location: 2014
Secondary topics:  Text and Language processing and analysis
Michael Johnson (Lockheed Martin), Norris Heintzelman (Lockheed Martin)
How do you train a machine learning model with no training data? Michael Johnson and Norris Heintzelman share their journey implementing multiple solutions to bootstrapping training data in the NLP domain, covering topics including weak supervision, building an active learning framework, and annotation adjudication for named-entity recognition. Read more.
11:50am - 12:30pm Wednesday, March 27, 2019
Location: 2016
Secondary topics:  AI and machine learning in the enterprise, Deep Learning, Media, Marketing, Advertising, Retail and e-commerce
Melinda Han Williams (Dstillery)
Customer segmentation based on coarse survey data is a staple of traditional market research. Melinda Han Williams explains how Dstillery uses neural networks to model the digital pathways of 100M consumers and uses the resulting embedding space to cluster customer populations into fine-grained behavioral segments and inform smarter consumer insights—in the process, creating a map of the internet. Read more.
11:50am - 12:30pm Wednesday, March 27, 2019
Location: Expo Hall
Secondary topics:  AI and Data technologies in the cloud, Deep Learning, Media, Marketing, Advertising, Retail and e-commerce
Ron Bodkin (Google)
Google uses deep learning extensively in new and existing products. Join Ron Bodkin to learn how Google has used deep learning for recommendations at YouTube, in the Play store, and for customers in Google Cloud. You'll explore the role of embeddings, recurrent networks, contextual variables, and wide and deep learning and discover how to do candidate generation and ranking with deep learning. Read more.
2:40pm - 3:20pm Wednesday, March 27, 2019
Location: 2009
Secondary topics:  Retail and e-commerce
Kapil Gupta (Airbnb)
In this talk, we will present how we approach personalization of travelers’ booking experience using machine learning. Read more.
2:40pm - 3:20pm Wednesday, March 27, 2019
Location: 2011
Secondary topics:  Model lifecycle management
Ted Dunning (MapR)
Evaluating machine learning models is surprisingly hard, particularly because these systems interact in very subtle ways. Ted Dunning breaks the problem of evaluation apart into operational and function evaluation, demonstrating how to do each without unnecessary pain and suffering. Along the way, he shares exciting visualization techniques that will help make differences strikingly apparent. Read more.
2:40pm - 3:20pm Wednesday, March 27, 2019
Location: 2014
Secondary topics:  Ethics
Sharad Goel (Stanford University)
The nascent field of fair machine learning aims to ensure that decisions guided by algorithms are equitable. Several formal definitions of fairness have gained prominence, but, as Sharad Goel argues, nearly all of them suffer from significant statistical limitations. Perversely, when used as a design constraint, they can even harm the very groups they were intended to protect. Read more.
2:40pm - 3:20pm Wednesday, March 27, 2019
Location: 2016
Secondary topics:  Deep Learning, Temporal data and time-series analytics
Chenhui Hu (Microsoft)
Dilated neural networks are a class of recently developed neural networks that achieve promising results in time series forecasting. We introduce representative network architectures of dilated neural networks. Then, we demonstrate their advantages in terms of training efficiency and forecast accuracy by applying them to solve sales forecasting and financial time series forecasting problems. Read more.
2:40pm - 3:20pm Wednesday, March 27, 2019
Location: Expo Hall
Secondary topics:  Deep Learning, Text and Language processing and analysis
Sonal Gupta (Facebook)
Sonal Gupta explores practical systems for building a conversational AI system for task-oriented queries and details a way to do more advanced compositional understanding, which can understand cross-domain queries, using hierarchical representations. Read more.
4:20pm - 5:00pm Wednesday, March 27, 2019
Location: 2009
Secondary topics:  Deep Learning, Health and Medicine, Text and Language processing and analysis
Yogesh Pandit (Roche), Saif Addin Ellafi (John Snow Labs), Vishakha Sharma (Roche Molecular Solutions)
We’ll show how Roche applies Spark NLP for Healthcare to extract clinical facts from pathology and radiology reports, and we’ll describe the design of the deep learning pipelines used to simplify training, optimization, and inference of such domain-specific models at scale. Read more.
4:20pm - 5:00pm Wednesday, March 27, 2019
Location: 2011
Secondary topics:  Automation in data science and big data, Financial Services, Model lifecycle management
Kelley Rivoire (Stripe)
Production ML applications benefit from reproducible, automated retraining and deployment of ever-more predictive models trained on ever-increasing amounts of data. Kelley Rivoire explains how Stripe built a flexible API for training machine learning models that's used to train thousands of models per week on Kubernetes, supporting automated deployment of new models with improved performance. Read more.
4:20pm - 5:00pm Wednesday, March 27, 2019
Location: 2014
Secondary topics:  Financial Services, Temporal data and time-series analytics
Ying Yau (AllianceBernstein)
Time series forecasting techniques are applied in a wide range of scientific disciplines, business scenarios, and policy settings. This presentation discusses the applications of statistical time series models, such as ARIMA, VAR, and Regime Switching Models, and machine learning models, such as random forest and neural network-based models, to forecasting problems. Read more.
4:20pm - 5:00pm Wednesday, March 27, 2019
Location: 2016
Secondary topics:  Deep Learning, Retail and e-commerce
Luyang Wang (Office Depot), Jing (Nicole) Kong (Office Depot), Guoqiong Song (Intel)
User-based real-time recommendation systems have become an important topic in ecommerce. Jennie Wang, Lu Wang, and Nicole Kong demonstrate how to build deep learning algorithms using Analytics Zoo with BigDL on Apache Spark and create an end-to-end system to serve real-time product recommendations. Read more.
4:20pm - 5:00pm Wednesday, March 27, 2019
Location: Expo Hall
Secondary topics:  Deep Learning, Graph technologies and analytics, Text and Language processing and analysis
Gungor Polatkan (LinkedIn)
Talent search systems at LinkedIn strive to match the potential candidates to the hiring needs of a recruiter expressed in terms of a search query. Gungor Polatkan shares the results of the company's deployment of deep learning models on a real-world production system serving 500M+ users through LinkedIn Recruiter. Read more.
5:10pm - 5:50pm Wednesday, March 27, 2019
Location: 2009
Secondary topics:  Data Platforms, Streaming, realtime analytics, and IoT, Transportation and Logistics
Rakesh Kumar (Lyft Inc), Thomas Weise (Lyft)
At the core of Lyft is how we dynamically price our rides: a combination of various data sources, ML models, and streaming infrastructure for low latency, reliability, and scalability. This allows the pricing system to be more adaptable to real-world changes. The streaming platform powers pricing by bringing together the best of both worlds: ML algorithms in Python and a JVM-based streaming engine. Read more.
5:10pm - 5:50pm Wednesday, March 27, 2019
Location: 2011
Secondary topics:  Automation in data science and big data, Model lifecycle management, Temporal data and time-series analytics
Ting-Fang Yen (DataVisor)
Ting-Fang Yen details an approach for monitoring production machine learning systems that handle billions of requests daily by discovering detection anomalies, such as spurious false positives, as well as gradual concept drifts when the model no longer captures the target concept. Join in to explore new tools for detecting undesirable model behaviors early in large-scale online ML systems. Read more.
5:10pm - 5:50pm Wednesday, March 27, 2019
Location: 2014
Secondary topics:  Security and Privacy
Mike Lee Williams (Cloudera Fast Forward Labs)
Imagine building a model whose training data is collected on edge devices such as cell phones or sensors. Each device collects data unlike any other, and the data cannot leave the device because of privacy concerns or unreliable network access. This challenging situation is known as federated learning. In this talk we’ll cover the algorithmic solutions and the product opportunities. Read more.
5:10pm - 5:50pm Wednesday, March 27, 2019
Location: 2016
Secondary topics:  Data Platforms, Deep Learning, Streaming, realtime analytics, and IoT
Zhenxiao Luo (Uber)
From determining the most convenient rider pickup points to predicting the fastest routes, Uber uses data-driven analytics to create seamless trip experiences. Inside Uber, analysts are using deep learning and big data to train models, make predictions, and run analytics in real time. This talk shares Uber’s engineering efforts in running real-time analytics with deep learning. Read more.
5:10pm - 5:50pm Wednesday, March 27, 2019
Location: Expo Hall
Secondary topics:  AI and Data technologies in the cloud, AI and machine learning in the enterprise, Automation in data science and big data, Data Platforms, Model lifecycle management
Kevin Moore (Salesforce)
Kevin Moore walks you through how TransmogrifAI, Salesforce's open source AutoML library built on Spark, generates models that are automatically customized to a company's dataset and use case and provides insights into why the model is making the predictions it does. Read more.
11:00am - 11:40am Thursday, March 28, 2019
Location: 2009
Secondary topics:  Media, Marketing, Advertising
Boris Yakubchik (Forbes)
Introducing Bertie, our new publishing platform at Forbes. Bertie is an AI assistant that learns from writers at all times and suggests improvements along the way. We will discuss Bertie’s features, architecture, and ultimate goals. We will give special attention to how we implement an ensemble of machine learning models that, together, make up the skill set and personality of the AI assistant. Read more.
11:00am - 11:40am Thursday, March 28, 2019
Location: 2011
Sourav Dey (Manifold)
Clustered data is all around us. The best way to attack it? Mixed effect models. This talk explains how the Mixed Effects Random Forests (MERF) model and Python package marry the world of classical mixed effect modeling with modern machine learning algorithms, and how MERF can be extended for use with other advanced modeling techniques like gradient boosting machines and deep learning. Read more.
11:00am - 11:40am Thursday, March 28, 2019
Location: 2014
Secondary topics:  AI and machine learning in the enterprise, Health and Medicine, Security and Privacy
Ram Shankar Kumar (Microsoft (Azure Security))
How can we guarantee to our customers that the ML system we develop is adequately protected from adversarial manipulation? Data scientists, program managers, and security experts will take away a framework and corresponding best practices to quantitatively assess the safety of their ML systems. Read more.
11:00am - 11:40am Thursday, March 28, 2019
Location: 2016
Secondary topics:  Deep Learning, Security and Privacy
Olivia Wang (DataVisor)
Online fraud flourishes as online services become ubiquitous in our daily lives. This talk will discuss how DataVisor leverages cutting-edge deep learning technologies to address the challenges in large-scale fraud detection. Read more.
11:00am - 11:40am Thursday, March 28, 2019
Location: Expo Hall
Secondary topics:  Security and Privacy, Storage
Alex Ingerman (Google)
Federated Learning is the approach of training ML models across a fleet of participating devices, without collecting their data in a central location. Alex Ingerman introduces Federated Learning, compares the traditional and federated ML workflows, and explores the current and upcoming use cases for decentralized machine learning, with examples from Google's deployment of this technology. Read more.
11:50am - 12:30pm Thursday, March 28, 2019
Location: 2009
Secondary topics:  Temporal data and time-series analytics
Jeff Chen (US Bureau of Economic Analysis)
Jeff Chen presents strategies for overcoming time series challenges at the intersection of macroeconomics and data science, drawing from machine learning research conducted at the Bureau of Economic Analysis aimed at improving its flagship product, the Gross Domestic Product. Read more.
11:50am - 12:30pm Thursday, March 28, 2019
Location: 2011
Secondary topics:  AI and Data technologies in the cloud, AI and machine learning in the enterprise, Media, Marketing, Advertising
Ken Johnston (Microsoft), Ankit Srivastava (Microsoft)
These days it’s not about normal growth; it’s about driving hockey-stick levels of growth. Sales and marketing orgs are looking to AI to help growth hack their way to new markets and segments. We have used mutual information for many years to help filter out noise and find critical insights into new cohorts of users, businesses, and networks, and now we can do it at scale across massive data sources. Read more.
11:50am - 12:30pm Thursday, March 28, 2019
Location: 2014
Secondary topics:  Security and Privacy, Temporal data and time-series analytics
David Rodriguez (Cisco Systems)
Malicious DNS traffic patterns are inconsistent, ranging from periodic to sporadic, and typically thwart anomaly detection. Using Apache Spark and Stripe’s Bayesian inference software, Rainier, we fit the underlying time-series distribution for millions of domains and outline techniques to identify artificial traffic volumes related to spam, malvertising, and botnets, which we call masquerading traffic. Read more.
11:50am - 12:30pm Thursday, March 28, 2019
Location: 2016
Secondary topics:  Deep Learning, Temporal data and time-series analytics
Sricharan Kumar (Intuit)
Machine learning is delivering immense value across industries. However, in some instances, machine learning models can produce overconfident results—with the potential for catastrophic outcomes. Kumar Sricharan explains how to address this challenge through Bayesian machine learning and highlights real-world examples to illustrate its benefits. Read more.
11:50am - 12:30pm Thursday, March 28, 2019
Location: Expo Hall
Secondary topics:  Open Data, Data Generation and Data Networks, Security and Privacy
Roger Chen (Computable)
Roger Chen explores new models for generating training data for AI. Read more.
11:50am - 12:30pm Thursday, March 28, 2019
Location: 2007
Secondary topics:  AI and Data technologies in the cloud, Data Integration and Data Pipelines, Data Platforms
Avner Braverman (Binaris)
What is serverless, and how can it be utilized for data analysis and AI? Avner Braverman outlines the benefits and limitations of serverless with respect to data transformation (ETL), AI inference and training, and real-time streaming. This is a technical talk, so expect demos and code. Read more.
1:50pm - 2:30pm Thursday, March 28, 2019
Location: 2009
Secondary topics:  Security and Privacy
Animesh Singh (IBM), Tommy Li (IBM)
In this talk, we discuss how to implement many state-of-the-art methods for attacking and defending classifiers using the open source Adversarial Robustness Toolbox. For AI developers, the library provides interfaces that support the composition of comprehensive defense systems using individual methods as building blocks. Read more.
1:50pm - 2:30pm Thursday, March 28, 2019
Location: 2011
Secondary topics:  Temporal data and time-series analytics
Jonathan Merriman (Verint Intelligent Self Service), Cynthia Freeman (Verint Intelligent Self Service)
An anomaly is a pattern that does not conform to past, expected behavior. Anomaly detection has many applications, such as tracking business KPIs or spotting fraud in credit card transactions. Unfortunately, there is no one best way to detect anomalies across a variety of domains. We introduce a framework to determine the best anomaly detection method for a given application based on time series characteristics. Read more.
1:50pm - 2:30pm Thursday, March 28, 2019
Location: 2014
Secondary topics:  Graph technologies and analytics, Security and Privacy
Louis DiValentin (Accenture Technology Labs), Dillon Cullinan (Accenture)
In this talk, we will show how Accenture's Cyber Security Lab built security analytics models to detect attempted lateral movement in networks by transforming enterprise-scale security data into a graph format, generating graph analytics for individual users, and building time series detection models that visualize the changing graph metrics for security operators. Read more.
1:50pm - 2:30pm Thursday, March 28, 2019
Location: 2016
Secondary topics:  Deep Learning, Financial Services, Temporal data and time-series analytics
Aashish Sheshadri (PayPal Inc)
Deep learning using sequence-to-sequence (Seq2Seq) networks has demonstrated unparalleled success in neural machine translation. Forecasting, a less explored but highly sought-after area, can leverage recent gains made in Seq2Seq networks. This talk will introduce the application of deep networks to monitoring and alerting intelligence at PayPal. Read more.
1:50pm - 2:30pm Thursday, March 28, 2019
Location: 2007
Secondary topics:  Deep Learning, Transportation and Logistics
Piero Molino (Uber AI)
Piero Molino offers an overview of Ludwig, a deep learning toolbox that allows you to train models and use them for prediction without the need to write code. It's unique in its ability to help make deep learning easier to understand for nonexperts and enable faster model improvement iteration cycles for experienced machine learning developers and researchers alike. Read more.
2:40pm - 3:20pm Thursday, March 28, 2019
Location: 2009
Secondary topics:  Health and Medicine
Kirstin Aschbacher (UCSF Cardiology)
Some people use digital devices to track their blood alcohol content (BAC). A BAC-tracking app that could anticipate when a person is likely to have a high BAC could offer coaching in a time of need. Kirstin Aschbacher shares a machine learning approach that predicts user BAC levels with good precision based on minimal information, thereby enabling targeted interventions. Read more.
2:40pm - 3:20pm Thursday, March 28, 2019
Location: 2011
Secondary topics:  Text and Language processing and analysis, Transportation and Logistics
Divya Choudhary (GOJEK)
Divya Choudhary explains how a random chat message or a note written in a local language sent by customers to their drivers while waiting for a ride to arrive can be utilized to carve out unparalleled information about pickup points and their names (which sometimes even Google Maps has no idea of) and help create a world-class customer pickup experience feature. Read more.
2:40pm - 3:20pm Thursday, March 28, 2019
Location: 2014
Secondary topics:  AI and Data technologies in the cloud, AI and machine learning in the enterprise, Automation in data science and big data
Till Bergmann (Salesforce)
A problem with predictive modeling data is label leakage. At enterprise companies such as Salesforce, this problem takes on monstrous proportions as the data is populated by diverse business processes, making it hard to distinguish cause from effect. We will describe how we tackled this problem at Salesforce, where we need to churn out thousands of customer-specific models for any given use case. Read more.
2:40pm - 3:20pm Thursday, March 28, 2019
Location: 2016
Secondary topics:  Data preparation, data governance, and data lineage, Deep Learning
Sridhar Alla (Comcast), Syed Nasar (Cloudera)
Any business, big or small, depends on analytics, whether the goal is revenue generation, churn reduction, or sales and marketing. No matter the algorithm and the techniques used, the result depends on the accuracy and consistency of the data being processed. In this talk, we will present some techniques used to evaluate the quality of data and the means to detect anomalies in the data. Read more.
3:50pm - 4:30pm Thursday, March 28, 2019
Location: 2009
Secondary topics:  Health and Medicine
Noah Gift (UC Davis), Michelle Davenport (Quantitative Nutrition)
Learn how to explore exciting ideas in nutrition using data science. In this presentation, we analyze the detrimental relationship between sugar and longevity, obesity, and chronic diseases. Read more.
3:50pm - 4:30pm Thursday, March 28, 2019
Location: 2011
Secondary topics:  Graph technologies and analytics
RAPIDS is the next big step in data science, combining the ease of use of common APIs with the power and scalability of GPUs. Bartley Richardson and Joshua Patterson offer an overview of RAPIDS and explore cuDF, cuGraph, and cuML, a trio of RAPIDS tools that enable data scientists to work with data in a familiar interface and apply graph analytics and traditional machine learning techniques. Read more.
3:50pm - 4:30pm Thursday, March 28, 2019
Location: 2014
Secondary topics:  AI and Data technologies in the cloud, AI and machine learning in the enterprise, Media, Marketing, Advertising
Patrick Miller (Civis Analytics)
Brands that test the content of ads before they are shown to an audience can avoid spending resources on the 11% of ads that cause backlash. Using a survey experiment to choose the best ad improves the effectiveness of marketing campaigns by 13% on average, and by up to 37% for particular demographics. We discuss data collection and statistical methods for analysis and reporting. Read more.
3:50pm - 4:30pm Thursday, March 28, 2019
Location: 2016
Secondary topics:  Data Platforms, Deep Learning
Yuhao Yang (Intel), Jennie Wang (Intel)
This talk shows how to run distributed TensorFlow on Apache Spark with the open source software package Analytics Zoo. Compared to other solutions, Analytics Zoo is built for production environments and encourages more industry users to run deep learning applications within the big data ecosystem. Read more.
4:40pm - 5:20pm Thursday, March 28, 2019
Location: 2009
Secondary topics:  Streaming, realtime analytics, and IoT, Temporal data and time-series analytics, Transportation and Logistics
Alex Gorbachev (Pythian), Paul Spiegelhalter (The Pythian Group)
Using the example of a mining haul truck at a leading Canadian mining company, we will cover mapping preventive maintenance needs to supervised machine learning problems, creating labeled datasets, engineering features from sensor and alert data, and evaluating models, and then converting it all into a complete AI solution on Google Cloud Platform that is integrated with existing on-premises systems. Read more.
4:40pm - 5:20pm Thursday, March 28, 2019
Location: 2011
Secondary topics:  Security and Privacy
Michael Gregory (Cloudera)
The General Data Protection Regulation (GDPR) enacted by the European Union restricts the use of machine learning practices in many cases. Michael Gregory offers an overview of the regulations, important considerations for both EU and non-EU organizations, and tools and technologies to ensure that you're appropriately using ML applications to drive continued transformation and insights. Read more.
4:40pm - 5:20pm Thursday, March 28, 2019
Location: 2014
Secondary topics:  Media, Marketing, Advertising
Shradha Agrawal (Adobe Systems Inc)
Decision making often struggles with the exploration-exploitation dilemma, and multi-armed bandits (MAB) are a popular reinforcement learning approach for tackling it. However, increasing the number of decision criteria leads to an exponential blowup in the complexity of MAB, and observational delays don't allow for optimal performance. This talk will introduce MAB and explain how to overcome these challenges. Read more.
4:40pm - 5:20pm Thursday, March 28, 2019
Location: 2016
Secondary topics:  Deep Learning, Retail and e-commerce
Christopher Lennan (idealo.de)
At idealo.de, we trained convolutional neural networks (CNNs) to predict aesthetic and technical image quality. We will present our training approach and practical insights and shed some light on what the trained models actually learned by visualising the convolutional filter weights and output nodes of our trained models. Read more.