Presented By
O’Reilly + Cloudera
Make Data Work
March 25-28, 2019
San Francisco, CA

Data Science, Machine Learning & AI

If you're in data, you need to understand machine learning & AI

Machine learning lets you discover hidden insight from your data. It's a simple idea with phenomenal impact and sophisticated use cases like recommenders, text mining, real-time analytics, large-scale anomaly detection, and business forecasting.

At Strata, you’ll get a deeper and broader understanding of machine and deep learning—take a look at the sessions below.

Monday, Mar 25 - Tuesday, Mar 26: 2-Day Training (Platinum & Training passes)
Tuesday Mar 26: Tutorials (Gold & Silver passes)
Wednesday Mar 27: Keynotes & Sessions (Platinum, Gold, Silver & Bronze passes)
8:45am | Location: Ballroom
Strata Data Conference Keynotes
10:30am
Morning break
Thursday Mar 28: Keynotes & Sessions (Platinum, Gold, Silver & Bronze passes)
8:45am | Location: Ballroom
Strata Data Conference Keynotes
10:30am
Morning break
9:00am - 5:00pm Monday, March 25 & Tuesday, March 26
Location: 2014
Secondary topics:  Deep Learning
Robert Schroll (The Data Incubator)
The TensorFlow library provides for the use of computational graphs, with automatic parallelization across resources. This architecture is ideal for implementing neural networks. This training will introduce TensorFlow's capabilities in Python. It will move from building machine learning algorithms piece by piece to using the Keras API provided by TensorFlow with several hands-on applications.
9:00am - 5:00pm Monday, March 25 & Tuesday, March 26
Location: 2016
Zachary Glassman (The Data Incubator)
We will walk through all the steps - from prototyping to production - of developing a machine learning pipeline. We’ll look at data cleaning, feature engineering, model building/evaluation, and deployment. Students will extend these models into two applications from real-world datasets. All work will be done in Python.
9:00am - 5:00pm Monday, March 25 & Tuesday, March 26
Location: 2020
Secondary topics:  Deep Learning
Ian Cook (Cloudera)
Advancing your career in data science requires learning new languages and frameworks, but learners face an overwhelming array of choices, each with different syntaxes, conventions, and terminology. Ian Cook simplifies the learning process by elucidating the abstractions common to these systems. Through hands-on exercises, you'll overcome obstacles to getting started using new tools.
9:00am - 5:00pm Monday, March 25 & Tuesday, March 26
Location: 3018
Secondary topics:  Deep Learning, Financial Services, Temporal data and time-series analytics
Francesca Lazzeri (Microsoft)
Francesca Lazzeri will walk you through the core steps for using Azure Machine Learning services to train your machine learning models both locally and on remote compute resources.
9:00am - 12:30pm Tuesday, March 26, 2019
Location: 2002
Secondary topics:  Deep Learning, Temporal data and time-series analytics
Martin Gorner (Google)
Hands-on with recurrent neural networks (RNNs) and TensorFlow. Discover what makes RNNs so powerful for time series analysis.
9:00am - 12:30pm Tuesday, March 26, 2019
Location: 2009
Secondary topics:  Deep Learning, Text and Language processing and analysis
David Talby (Pacific AI), Alexander Thomas (Indeed), Claudiu Branzan (G2 Web Services)
This is a hands-on tutorial for scalable NLP using the highly performant, highly scalable open-source Spark NLP library. You’ll spend about half your time coding as you work through four sections, each with an end-to-end working codebase that you can change and improve.
9:00am - 12:30pm Tuesday, March 26, 2019
Location: 2011
Secondary topics:  AI and Data technologies in the cloud, Deep Learning, Media, Marketing, Advertising
David Arpin (Amazon Web Services)
Learn how to use the Amazon SageMaker platform to build a machine learning model to recommend products to customers based on their past preferences.
9:00am - 12:30pm Tuesday, March 26, 2019
Location: 2001
Secondary topics:  Ethics, Security and Privacy
Iman Saleh (Intel), Cory Ilo (Intel), Cindy Tseng (Intel)
From healthcare to smart homes to autonomous vehicles, new applications of autonomous systems are raising ethical concerns including bias, transparency, and privacy. In this tutorial, we will demonstrate tools and capabilities that can help data scientists address these concerns. The tools help bridge the gap between ethicists and regulators on one side and machine learning practitioners on the other.
1:30pm - 5:00pm Tuesday, March 26, 2019
Location: 2001
Secondary topics:  Ethics
Patrick Hall (H2O.ai | George Washington University)
If machine learning can lead to financial gains for your organization, why isn’t everyone doing it? One reason is that training machine learning systems with transparent inner workings and auditable predictions is difficult. This talk will present the good, bad, and downright ugly lessons learned from the presenters’ years of experience in implementing solutions for interpretable machine learning.
1:30pm - 5:00pm Tuesday, March 26, 2019
Location: 2002
Secondary topics:  Deep Learning, Media, Marketing, Advertising, Model lifecycle management
Abhishek Kumar (Publicis.Sapient), Dr. Vijay Srinivas Agneeswaran (Publicis Sapient)
This tutorial describes deep learning based recommender and personalisation systems that we have built for clients. The tutorial focuses on TensorFlow Serving and MLflow for end-to-end productionization, including model serving, Dockerization, reproducibility, and experimentation, plus how to use Kubernetes for deployment and orchestration of ML-based micro-architectures.
1:30pm - 5:00pm Tuesday, March 26, 2019
Location: 2009
Secondary topics:  Deep Learning, Temporal data and time-series analytics
Jason Dai (Intel), Yuhao Yang (Intel), Jennie Wang (Intel), Guoqiong Song (Intel)
In this tutorial, we will show how to build and productionize deep learning applications for big data using Analytics Zoo (https://github.com/intel-analytics/analytics-zoo), a unified analytics + AI platform that seamlessly unites Spark, TensorFlow, Keras, and BigDL programs into an integrated pipeline. We draw on real-world use cases such as JD.com, MLSListings, World Bank, Baosight, and Midea/KUKA.
1:30pm - 5:00pm Tuesday, March 26, 2019
Location: 2011
Secondary topics:  AI and machine learning in the enterprise
Chi-Yi Kuan (LinkedIn), Yongzheng Zhang (LinkedIn), Julie Wang (LinkedIn), Xiaojing Dong (LinkedIn), Wei Di (LinkedIn)
Thanks to the rapid growth in data resources, business leaders commonly appreciate both the challenge and the importance of mining information from data. In this tutorial, a group of well-respected data scientists will share their experiences and successes in applying emerging techniques to support intelligent decisions that lead to impactful outcomes at LinkedIn.
11:00am - 11:40am Wednesday, March 27, 2019
Location: 2009
Secondary topics:  Ethics, Financial Services
Jari Koister (FICO)
Financial services organizations are increasingly deploying AI for a wide range of applications such as credit life cycle, fraud, and financial crimes. Such deployment requires models to be interpretable, explainable, and resilient to adversarial attacks. Regulatory requirements prohibit application of black-box machine learning models. This talk describes what FICO has developed to support these needs.
11:00am - 11:40am Wednesday, March 27, 2019
Location: 2011
Secondary topics:  AI and Data technologies in the cloud
Tristan Zajonc (Cloudera), Tim Chen (Cloudera)
Data platforms are being asked to support an ever-increasing range of workloads and compute environments, including machine learning and elastic cloud platforms. In this talk, we will discuss some emerging capabilities, including running machine learning and Spark workloads on autoscaling container platforms, and share our vision of the road ahead for ML and AI in the cloud.
11:00am - 11:40am Wednesday, March 27, 2019
Location: 2014
Secondary topics:  AI and Data technologies in the cloud, Text and Language processing and analysis
Robert Horton (Microsoft), Mario Inchiosa (Microsoft), Ali Zaidi (Microsoft)
We show how three cutting-edge machine learning techniques can be used together to up your modeling game: (1) transfer learning from pre-trained language models, (2) active learning to make more effective use of a limited labeling budget, and (3) hyperparameter tuning to maximize model performance. We will apply these techniques to a growing business challenge: moderating public discussions.
11:00am - 11:40am Wednesday, March 27, 2019
Location: 2016
Secondary topics:  AI and Data technologies in the cloud, Deep Learning, Open Data, Data Generation and Data Networks
Jeremy Howard (Enlitic)
When deep learning can be easily applied by non-engineers who possess extensive domain expertise, we can accelerate not only the pace of industry adoption but also the rate at which we uncover interesting and relevant research problems.
11:00am - 11:40am Wednesday, March 27, 2019
Location: Expo Hall
Secondary topics:  AI and Data technologies in the cloud, Security and Privacy
Alon Kaufman (Duality), Vinod Vaikuntanathan (MIT and Duality Technologies)
In this talk, we will discuss the challenges and opportunities of machine learning on encrypted data and describe the state of the art in this space.
11:50am - 12:30pm Wednesday, March 27, 2019
Location: 2009
Secondary topics:  Financial Services, Text and Language processing and analysis
Chakri Cherukuri (Bloomberg LP)
In this talk, we will see how machine learning and deep learning techniques can be applied in the field of quantitative finance. We will look at a few use cases in detail and see how machine learning techniques can supplement and sometimes even improve upon existing statistical models. We will also look at novel visualizations to help us better understand and interpret these models.
11:50am - 12:30pm Wednesday, March 27, 2019
Location: 2011
Secondary topics:  AI and Data technologies in the cloud, AI and machine learning in the enterprise, Automation in data science and big data
Sarah Aerni (Salesforce)
How does Salesforce manage to make data science an agile partner to over 100,000 customers? We will share the nuts and bolts of the platform and our agile process. From our open-source autoML library (TransmogrifAI) and experimentation to deployment and monitoring, we will cover how the tools make it possible for our data scientists to rapidly iterate and adopt a truly agile methodology.
11:50am - 12:30pm Wednesday, March 27, 2019
Location: 2014
Secondary topics:  Text and Language processing and analysis
Michael Johnson (Lockheed Martin), Norris Heintzelman (Lockheed Martin)
How do you train a machine learning model with no training data? We will present our journey implementing multiple solutions to bootstrapping training data in the NLP domain. We will cover topics including weak supervision, building an active learning framework, and annotation adjudication for Named Entity Recognition.
11:50am - 12:30pm Wednesday, March 27, 2019
Location: 2016
Secondary topics:  AI and machine learning in the enterprise, Deep Learning, Media, Marketing, Advertising, Retail and e-commerce
Melinda Williams (Dstillery)
Customer segmentation based on coarse survey data has long been a staple of traditional market research. We use deep learning to model the digital pathways of over a hundred million consumers and use this embedding to cluster customer populations into fine-grained behavioral segments and inform smarter consumer insights. Along the way, we create a map of the internet.
11:50am - 12:30pm Wednesday, March 27, 2019
Location: Expo Hall
Secondary topics:  AI and Data technologies in the cloud, Deep Learning, Media, Marketing, Advertising, Retail and e-commerce
Ron Bodkin (Google)
Google uses deep learning extensively in new and existing products. Come learn how Google has used deep learning for recommendations at YouTube, in the Play store, and for customers in Google Cloud. Learn about the role of embeddings, recurrent networks, contextual variables, and wide and deep learning, and how to do both candidate generation and ranking with deep learning.
2:40pm - 3:20pm Wednesday, March 27, 2019
Location: 2009
Secondary topics:  Retail and e-commerce
Kapil Gupta (Airbnb)
In this talk, we will present how we approach personalization of travelers’ booking experience using machine learning.
2:40pm - 3:20pm Wednesday, March 27, 2019
Location: 2011
Secondary topics:  Model lifecycle management
Ted Dunning (MapR)
Evaluating machine learning models is surprisingly hard. It gets even harder because these systems interact in very subtle ways. I will break the problem of evaluation apart into operational and function evaluation and show how each can be done without unnecessary pain and suffering. In particular, I will show some exciting visualization techniques that help make differences strikingly apparent.
2:40pm - 3:20pm Wednesday, March 27, 2019
Location: 2014
Secondary topics:  Ethics
Sharad Goel (Stanford University)
By highlighting these challenges in the foundation of fair machine learning, I hope to help researchers and practitioners productively advance the area.
2:40pm - 3:20pm Wednesday, March 27, 2019
Location: 2016
Secondary topics:  Deep Learning, Temporal data and time-series analytics
Chenhui Hu (Microsoft)
Dilated neural networks are a class of recently developed neural networks that achieve promising results in time series forecasting. We introduce representative network architectures of dilated neural networks. Then, we demonstrate their advantages in terms of training efficiency and forecast accuracy by applying them to solve sales forecasting and financial time series forecasting problems.
2:40pm - 3:20pm Wednesday, March 27, 2019
Location: Expo Hall
Secondary topics:  Deep Learning, Text and Language processing and analysis
Sonal Gupta (Facebook)
In this talk, I will describe practical approaches to building a conversational AI system for task-oriented queries. I will describe a way to do more advanced compositional understanding, which can handle cross-domain queries, using hierarchical representations.
4:20pm - 5:00pm Wednesday, March 27, 2019
Location: 2009
Secondary topics:  Deep Learning, Health and Medicine, Text and Language processing and analysis
Anish Kejariwal (Roche), David Talby (Pacific AI)
We’ll show how Roche applies Spark NLP for Healthcare to extract clinical facts from pathology and radiology reports, and the design of the deep learning pipelines used to simplify training, optimization, and inference of such domain-specific models at scale.
4:20pm - 5:00pm Wednesday, March 27, 2019
Location: 2011
Secondary topics:  Automation in data science and big data, Financial Services, Model lifecycle management
Kelley Rivoire (Stripe)
Production ML applications benefit from reproducible, automated retraining and deployment of ever-more predictive models trained on ever-increasing amounts of data. In this talk, I’ll describe how Stripe built a flexible API for training machine learning models that we use to train thousands of models per week on Kubernetes, supporting automated deployment of new models with improved performance.
4:20pm - 5:00pm Wednesday, March 27, 2019
Location: 2014
Secondary topics:  Financial Services, Temporal data and time-series analytics
Jeffrey Yau (AllianceBernstein)
Time series forecasting techniques are applied in a wide range of scientific disciplines, business scenarios, and policy settings. This presentation discusses the applications of statistical time series models, such as ARIMA, VAR, and Regime Switching Models, and machine learning models, such as random forest and neural network-based models, to forecasting problems.
4:20pm - 5:00pm Wednesday, March 27, 2019
Location: 2016
Secondary topics:  Deep Learning, Retail and e-commerce
Jennie Wang (Intel), Luyang Wang (OfficeDepot), Jing (Nicole) Kong (OfficeDepot)
User-based real-time recommendation systems have become an important topic in e-commerce. This talk demonstrates how to build deep learning algorithms using Analytics Zoo with BigDL on Apache Spark and create an end-to-end system to serve real-time product recommendations.
4:20pm - 5:00pm Wednesday, March 27, 2019
Location: Expo Hall
Secondary topics:  Deep Learning, Graph technologies and analytics, Text and Language processing and analysis
Gungor Polatkan (LinkedIn)
Talent search systems at LinkedIn strive to match potential candidates to the hiring needs of a recruiter, expressed in terms of a search query. In this talk, we present the results of our deployment of deep learning models on a real-world production system serving 500M+ users through LinkedIn Recruiter. The challenges and approaches discussed generalize to any multi-faceted search engine.
5:10pm - 5:50pm Wednesday, March 27, 2019
Location: 2011
Secondary topics:  Automation in data science and big data, Model lifecycle management, Temporal data and time-series analytics
Ting-Fang Yen (DataVisor)
We describe a monitor for production machine learning systems that handle billions of requests daily. Our approach discovers detection anomalies, such as spurious false positives, as well as gradual concept drifts when the model no longer captures the target concept. This session presents new tools for detecting undesirable model behaviors early in large-scale online ML systems.
5:10pm - 5:50pm Wednesday, March 27, 2019
Location: 2014
Secondary topics:  Security and Privacy
Mike Lee Williams (Cloudera Fast Forward Labs)
Imagine building a model whose training data is collected on edge devices such as cell phones or sensors. Each device collects data unlike any other, and the data cannot leave the device because of privacy concerns or unreliable network access. This challenging situation is known as federated learning. In this talk we’ll cover the algorithmic solutions and the product opportunities.
5:10pm - 5:50pm Wednesday, March 27, 2019
Location: 2016
Secondary topics:  Data Platforms, Deep Learning, Streaming and realtime analytics
Zhenxiao Luo (Uber)
From determining the most convenient rider pickup points to predicting the fastest routes, Uber uses data-driven analytics to create seamless trip experiences. Inside Uber, analysts are using deep learning and big data to train models, make predictions, and run analytics in real time. This talk will share Uber’s engineering efforts in running real-time analytics with deep learning.
5:10pm - 5:50pm Wednesday, March 27, 2019
Location: Expo Hall
Secondary topics:  AI and Data technologies in the cloud, AI and machine learning in the enterprise, Automation in data science and big data, Data Platforms, Model lifecycle management
Kevin Moore (Salesforce)
In this talk, I walk through how our open-source AutoML library built on Spark - TransmogrifAI - automatically generates these models and provides insights into why the model is making the predictions it does.
11:00am - 11:40am Thursday, March 28, 2019
Location: 2009
Secondary topics:  Media, Marketing, Advertising
Boris Yakubchik (Forbes)
Introducing Bertie, our new publishing platform at Forbes. Bertie is an AI assistant that learns from writers at all times and suggests improvements along the way. We will discuss Bertie’s features, architecture, and ultimate goals. We will give special attention to how we implement an ensemble of machine learning models that, together, make up the skill set and personality of the AI assistant.
11:00am - 11:40am Thursday, March 28, 2019
Location: 2011
Sourav Dey (Manifold)
Clustered data is all around us. The best way to attack it? Mixed effect models. This talk explains how the Mixed Effects Random Forests (MERF) model and Python package marry the world of classical mixed effect modeling with modern machine learning algorithms, and how MERF can be extended for use with other advanced modeling techniques like gradient boosting machines and deep learning.
11:00am - 11:40am Thursday, March 28, 2019
Location: 2014
Secondary topics:  AI and machine learning in the enterprise, Health and Medicine, Security and Privacy
Ram Shankar Kumar (Microsoft (Azure Security))
How can we guarantee to our customers that the ML system we develop is adequately protected from adversarial manipulation? Data scientists, program managers, and security experts will take away a framework and corresponding best practices to quantitatively assess the safety of their ML systems.
11:00am - 11:40am Thursday, March 28, 2019
Location: 2016
Secondary topics:  Deep Learning, Security and Privacy
Olivia Wang (Datavisor)
Online fraud flourishes as online services become ubiquitous in our daily lives. This talk will discuss how Datavisor leverages cutting-edge deep learning technologies to address the challenges in large-scale fraud detection.
11:00am - 11:40am Thursday, March 28, 2019
Location: Expo Hall
Secondary topics:  Security and Privacy
Alex Ingerman (Google)
Federated Learning is the approach of training ML models across a fleet of participating devices, without collecting their data in a central location. Alex Ingerman introduces Federated Learning, compares the traditional and federated ML workflows, and explores the current and upcoming use cases for decentralized machine learning, with examples from Google's deployment of this technology.
11:50am - 12:30pm Thursday, March 28, 2019
Location: 2009
Secondary topics:  Temporal data and time-series analytics
Jeff Chen (US Bureau of Economic Analysis)
Jeff Chen presents strategies for overcoming time series challenges at the intersection of macroeconomics and data science, drawing from machine learning research conducted at the Bureau of Economic Analysis aimed at improving its flagship product, the Gross Domestic Product.
11:50am - 12:30pm Thursday, March 28, 2019
Location: 2011
Secondary topics:  AI and Data technologies in the cloud, AI and machine learning in the enterprise, Media, Marketing, Advertising
Ken Johnston (Microsoft), Ankit Srivastava (Microsoft)
These days it’s not about normal growth, it’s about driving hockey-stick levels of growth. Sales and marketing orgs are looking to AI to help growth hack their way to new markets and segments. We have used Mutual Information for many years to help filter out noise and find critical insights into new cohorts of users, businesses, and networks, and now we can do it at scale across massive data sources.
11:50am - 12:30pm Thursday, March 28, 2019
Location: 2014
Secondary topics:  Security and Privacy, Temporal data and time-series analytics
David Rodriguez (Cisco Systems)
Malicious DNS traffic patterns are inconsistent, ranging from periodic to sporadic, and typically thwart anomaly detection. Using Apache Spark and Rainier, Stripe’s Bayesian inference software, we fit the underlying time-series distribution for millions of domains and outline techniques to identify artificial traffic volumes related to spam, malvertising, and botnets, which we call masquerading traffic.
11:50am - 12:30pm Thursday, March 28, 2019
Location: 2016
Secondary topics:  Deep Learning, Temporal data and time-series analytics
Sricharan Kumar (Intuit)
Machine learning is delivering immense value across industries. However, in some instances machine learning models can produce overconfident results, with the potential for catastrophic outcomes. In this talk, we'll describe how to address this challenge through Bayesian machine learning, and highlight real-world examples to illustrate its benefits.
11:50am - 12:30pm Thursday, March 28, 2019
Location: Expo Hall
Secondary topics:  Open Data, Data Generation and Data Networks, Security and Privacy
Roger Chen (Computable Labs)
New models for generating training data for AI.
1:50pm - 2:30pm Thursday, March 28, 2019
Location: 2009
Secondary topics:  Security and Privacy
Animesh Singh (IBM), Tommy Li (IBM)
In this talk, we discuss implementations of many state-of-the-art methods for attacking and defending classifiers using the open source Adversarial Robustness Toolbox. For AI developers, the library provides interfaces that support the composition of comprehensive defense systems using individual methods as building blocks.
1:50pm - 2:30pm Thursday, March 28, 2019
Location: 2011
Secondary topics:  Temporal data and time-series analytics
Jonathan Merriman (Verint Intelligent Self Service), Cynthia Freeman (Verint Intelligent Self Service)
An anomaly is a pattern that does not conform to past, expected behavior. Anomaly detection has many applications, such as tracking business KPIs or spotting fraud in credit card transactions. Unfortunately, there is no one best way to detect anomalies across a variety of domains. We introduce a framework to determine the best anomaly detection method for an application based on time series characteristics.
1:50pm - 2:30pm Thursday, March 28, 2019
Location: 2014
Secondary topics:  Graph technologies and analytics, Security and Privacy
Louis DiValentin (Accenture Technology Labs), Dillon Cullinan (Accenture)
In this talk, we will show how Accenture's Cyber Security Lab built security analytics models to detect attempted lateral movement in networks by transforming enterprise-scale security data into a graph format, generating graph analytics for individual users, and building time series detection models that visualize the changing graph metrics for security operators.
1:50pm - 2:30pm Thursday, March 28, 2019
Location: 2016
Secondary topics:  Deep Learning, Financial Services, Temporal data and time-series analytics
Aashish Sheshadri (PayPal Inc)
Deep learning using Sequence to Sequence networks (Seq2Seq) has demonstrated unparalleled success in neural machine translation. Forecasting, a less explored but highly sought-after area, can leverage recent gains made in Seq2Seq networks. This talk will introduce the application of deep networks to monitoring and alerting intelligence at PayPal.
2:40pm - 3:20pm Thursday, March 28, 2019
Location: 2009
Secondary topics:  Health and Medicine
Kirstin Aschbacher (UCSF Cardiology)
Some people use digital devices to track their blood alcohol content (BAC), for example, to avoid driving drunk. If a BAC-tracking app could anticipate when a person is likely to have a high BAC, it might offer coaching in a time of need. We offer a machine learning approach that predicts user BAC levels with good precision based on minimal information, thereby enabling targeted interventions.
2:40pm - 3:20pm Thursday, March 28, 2019
Location: 2011
Secondary topics:  Text and Language processing and analysis, Transportation and Logistics
Divya Choudhary (GOJEK)
Who would have imagined that a random chat message or note, written in a local language and sent by customers to their drivers while waiting for a ride, could be used to extract unparalleled information about pickup points, including names that even Google Maps sometimes does not know, and ultimately help create a world-class customer pickup experience?
2:40pm - 3:20pm Thursday, March 28, 2019
Location: 2014
Secondary topics:  AI and Data technologies in the cloud, AI and machine learning in the enterprise, Automation in data science and big data
Till Bergmann (Salesforce)
A problem with data in predictive modeling is label leakage. At enterprise companies such as Salesforce, this problem takes on monstrous proportions as the data is populated by diverse business processes, making it hard to distinguish cause from effect. We will describe how we tackled this problem at Salesforce, where we need to churn out thousands of customer-specific models for any given use case.
2:40pm - 3:20pm Thursday, March 28, 2019
Location: 2016
Secondary topics:  Data preparation, data governance, and data lineage, Deep Learning
Sridhar Alla (Comcast), Syed Nasar (Cloudera)
Any business, big or small, depends on analytics, whether the goal is revenue generation, churn reduction, or sales and marketing. No matter the algorithm and the techniques used, the result depends on the accuracy and consistency of the data being processed. In this talk, we will present some techniques used to evaluate the quality of data and the means to detect anomalies in the data.
3:50pm - 4:30pm Thursday, March 28, 2019
Location: 2009
Secondary topics:  Health and Medicine
Noah Gift (UC Davis), Michelle Davenport (Ritual)
Learn how to explore exciting ideas in nutrition using data science. In this presentation we analyze the detrimental relationship between sugar and longevity, obesity, and chronic diseases.
3:50pm - 4:30pm Thursday, March 28, 2019
Location: 2011
Secondary topics:  Graph technologies and analytics
The next big step in data science combines the ease of use of common Python APIs with the power and scalability of GPUs. This session highlights the progress that has been made on PyGDF, the first step to give data scientists access to familiar APIs while increasing speed. We also discuss how to get started doing data science on the GPU and provide use cases involving graph analytics.
3:50pm - 4:30pm Thursday, March 28, 2019
Location: 2014
Secondary topics:  AI and Data technologies in the cloud, AI and machine learning in the enterprise, Media, Marketing, Advertising
Patrick Miller (Civis Analytics)
Brands that test the content of ads before they are shown to an audience can avoid spending resources on the 11% of ads that cause backlash. Using a survey experiment to choose the best ad improves effectiveness of marketing campaigns by 13% on average, and up to 37% for particular demographics. We discuss data collection and statistical methods for analysis and reporting.
3:50pm - 4:30pm Thursday, March 28, 2019
Location: 2016
Secondary topics:  Data Platforms, Deep Learning
Yuhao Yang (Intel)
The talk introduces how to run distributed TensorFlow on Apache Spark with the open source software package Analytics Zoo. Compared to other solutions, Analytics Zoo is built for production environments and encourages more industry users to run deep learning applications within big data ecosystems.
4:40pm - 5:20pm Thursday, March 28, 2019
Location: 2009
Secondary topics:  Streaming and realtime analytics, Temporal data and time-series analytics, Transportation and Logistics
Alex Gorbachev (Pythian), Paul Speigelhalter (The Pythian Group)
Using the example of a mining haul truck at a leading Canadian mining company, we will cover mapping preventive maintenance needs to supervised machine learning problems, creating labeled datasets, feature engineering from sensors and alerts data, and evaluating models, and then converting it all into a complete AI solution on Google Cloud Platform that is integrated with existing on-premise systems.
4:40pm - 5:20pm Thursday, March 28, 2019
Location: 2011
Secondary topics:  Security and Privacy
Michael Gregory (Cloudera)
The General Data Protection Regulation (GDPR) enacted by the European Union can restrict the use of machine learning practices in many cases. This presentation will provide an overview of the regulations, important considerations for both EU and non-EU organizations, and tools and technologies to ensure that ML applications can appropriately be used to drive continued transformation and insights.
4:40pm - 5:20pm Thursday, March 28, 2019
Location: 2014
Secondary topics:  Media, Marketing, Advertising
Shradha Agrawal (Adobe Systems Inc)
Decision making often struggles with the exploration-exploitation dilemma, and multi-armed bandits (MAB) are a popular reinforcement learning approach for tackling it. However, increasing the number of decision criteria leads to an exponential blowup in the complexity of MAB, and observational delays do not allow for optimal performance. This talk will introduce MAB and explain how to overcome the above challenges.
4:40pm - 5:20pm Thursday, March 28, 2019
Location: 2016
Secondary topics:  Deep Learning, Retail and e-commerce
Christopher Lennan (idealo.de)
At idealo.de we trained Convolutional Neural Networks (CNN) for aesthetic and technical image quality predictions. We will present our training approach and practical insights, and shed some light on what the trained models actually learned by visualising the convolutional filter weights and output nodes of our trained models.