Mar 15–18, 2020

Data & AI Business Summit

The 2020 Data & AI Business Summit will give you a thorough understanding of how some of the world’s leading companies build successful data strategies.

Platinum pass holders have access to the Data & AI Business Summit Sun–Wed. Gold and Silver pass holders have access Mon–Wed. Bronze pass holders have access Tue–Wed.

Sunday-Monday, March 15-16: 2-Day Training (Platinum & Training passes)
Monday, March 16: Tutorials (Gold & Silver passes)
Tuesday, March 17: Keynotes & Sessions (Platinum, Gold, Silver & Bronze passes)
8:45am | Location: Grand Ballroom 220
Data & AI Conference Keynotes
10:30am
Morning break
Wednesday, March 18: Keynotes & Sessions (Platinum, Gold, Silver & Bronze passes)
8:45am | Location: Grand Ballroom 220
Data & AI Conference Keynotes
10:30am
Morning break
9:00am - 5:00pm Sunday, March 15 & Monday, March 16, 2020
Location: 211 C
Gonzalo Diaz (The Data Incubator), Michael Li (The Data Incubator)
The instructors provide a nontechnical overview of AI and data science. Learn common techniques, how to apply them in your organization, and common pitfalls to avoid. You’ll pick up the language and develop a framework to engage effectively with technical experts and use their input and analysis for your business’s strategic priorities and decision making.
9:00am - 5:00pm Monday, March 16, 2020
Location: LL20A
Jeffrey Vah (Dell Technologies), Gayathri Rau (Dell Technologies), Shuo Xiang (Robinhood), Grace Lu (Robinhood), Maureen Teyssier (Reonomy), Aaron Williams (OmniSci), Sriram Ravindran (Adobe Inc), Deepak Pai (Adobe), Shubranshu Shekhar (Carnegie Mellon University), Sherin Thomas (Lyft), Dan Gifford (Getty Images), Shondria Lopez-Merlos (Florida Conference of The United Methodist Church), Sandhya Raghavan (Virgin Hyperloop One), Patryk Oleniuk (Virgin Hyperloop One), Ian Beaver (Verint), Aryn Sargent (Verint)
From banking to biotech, retail to government, every business sector is changing in the face of abundant data. Get better at defining business problems and applying data solutions at Strata Data & AI.
9:00am - 5:00pm Monday, March 16, 2020
Location: LL20B
Alistair Croll (Solve For Interesting), Erich S. Huang, MD, PhD (Duke Forge), Michael Dulin (University of North Carolina at Charlotte | Gray Matter Analytics), Shannon Fuller (Gray Matter Analytics), Kasie Richards (American Red Cross), Cathy Tanimura (Strava), Yue Cathy Chang (TutumGene), Prashant Warier (Qure.ai), Jozo Dujmovic (San Francisco State University), Shilpa Arora (Atlan)
Dive into health, technology, and data in a day-long series of curated talks. Health Data Day at Strata Data & AI takes a closer look at how algorithms, sensors, and big data are changing healthcare forever.
1:30pm - 5:00pm Monday, March 16, 2020
Location: LL20C
Robert Horton (Microsoft), Mario Inchiosa (Microsoft), John-Mark Agosta (Microsoft)
Bob Horton, Mario Inchiosa, and John-Mark Agosta offer an overview of the fundamental concepts of machine learning (ML) for business and healthcare decision makers and software product managers, so you can make more effective use of ML results and better evaluate opportunities to apply ML in your industry.
11:00am - 11:40am Tuesday, March 17, 2020
Location: LL21B
George Chkadua (TBC Bank), Levan Borchkhadze (TBC Bank)
TBC Bank is transitioning from a product-centric to a client-centric approach. An obvious application of analytics is developing personalized next-best-product recommendations for clients. George Chkadua and Levan Borchkhadze explain why the bank decided to implement the ALS user-item matrix factorization method and a demographic model. As a result, the pilot increased sales conversion rates by 70%.
11:00am - 11:40am Tuesday, March 17, 2020
Location: LL21 D
Mudasir Ahmad (Cisco)
Artificial intelligence (AI) is a natural fit for supply chain operations, where decisions and actions need to be taken daily or even hourly about delivery, manufacturing, quality, logistics, and planning. Mudasir Ahmad explains how AI can be implemented in a scalable and cost-effective way in your business's supply chain operations, and he identifies benefits and potential challenges.
11:50am - 12:30pm Tuesday, March 17, 2020
Location: LL21B
Secondary topics: Streaming and IoT
Mark Grover (Lyft), Dev Tagare (Lyft)
Mark Grover and Dev Tagare offer you a glimpse at the end-to-end data architecture Lyft uses to reduce data lag appearing in its analytical systems from 24+ hours to under 5 minutes. You'll learn the what and why of tech choices, monitoring, and best practices. They outline the use cases Lyft has enabled, especially in ML model performance and evaluation.
11:50am - 12:30pm Tuesday, March 17, 2020
Location: LL21 C
Secondary topics: Technology Ethics
Guillaume Saint-Jacques (LinkedIn Corporation), Meg Garlinghouse (LinkedIn Corporation)
Most companies want to ensure their products and algorithms are fair. Guillaume Saint-Jacques and Meg Garlinghouse share LinkedIn's A/B testing approach to fairness and describe new methods that detect whether an experiment introduces bias or inequality. You'll learn about a scalable implementation on Spark and discover examples of use cases and impact at LinkedIn.
11:50am - 12:30pm Tuesday, March 17, 2020
Location: LL21 D
Jike Chong (LinkedIn), Yue Cathy Chang (TutumGene)
More than 85% of data science projects fail. This high failure rate is a main reason why data science is still a science. Jike Chong and Yue "Cathy" Chang outline how you can reduce this failure rate and improve teams' confidence in executing successful data science projects by applying data science technology to business problems: scenario mapping, pattern discovery, and success evaluation.
1:45pm - 2:25pm Tuesday, March 17, 2020
Location: LL21B
Joseph Sirosh (Compass)
Compass is changing real estate by leveraging its industry-leading software to build search and analytical tools that help real estate professionals find, market, and sell homes. Joseph Sirosh details how Compass leverages AWS services, including Amazon Elasticsearch Service, to deliver a complete, scalable home-search solution.
1:45pm - 2:25pm Tuesday, March 17, 2020
Location: LL21 C
Secondary topics: Streaming and IoT
Minal Mishra (Netflix)
Minal Mishra walks you through Netflix's video player release process, the challenges with deriving time series metrics from a firehose of events, and some of the oddities in running analysis on real-time metrics.
1:45pm - 2:25pm Tuesday, March 17, 2020
Location: LL21 D
Katie Malone (Civis Analytics), Michelangelo D'Agostino (ShopRunner)
Data science is relatively young, and the job of managing data scientists is younger still. Many people undertake this management position without the tools, mentorship, or role models they need to do it well. Katie Malone and Michelangelo D'Agostino review key themes from a recent Strata report that examines the steps necessary to build, manage, sustain, and retain a growing data science team.
2:35pm - 3:15pm Tuesday, March 17, 2020
Location: LL21B
Secondary topics: Security and Privacy
Sathya Chandran (DataVisor)
Sathya Chandran shares key insights into current trends of account takeover fraud by analyzing 52 billion events generated by 1.1 billion users and developing a set of user mobility features to capture suspicious device and IP-switching patterns. You'll learn to incorporate mobility features into an anomaly detection solution to detect suspicious account activity in real time.
2:35pm - 3:15pm Tuesday, March 17, 2020
Location: LL21 C
Ankit Jain (Uber AI), Piero Molino (Uber AI Labs)
Ankit Jain and Piero Molino detail how to generate better restaurant and dish recommendations in Uber Eats by learning entity embeddings using graph convolutional networks implemented in TensorFlow.
2:35pm - 3:15pm Tuesday, March 17, 2020
Location: LL21 D
Barr Moses (Monte Carlo)
Ever had your CEO or customer look at your report and tell you the numbers look way off? Barr Moses defines data downtime: periods of time when your data is partial, erroneous, missing, or otherwise inaccurate. Data downtime is highly costly for organizations, yet it is often addressed ad hoc. You'll explore why data downtime matters to the data industry and how best-in-class teams address it.
4:15pm - 4:55pm Tuesday, March 17, 2020
Location: LL21B
Mario A. Vinasco (Credit Sesame)
Uber spends hundreds of millions of dollars on marketing and constantly optimizes the allocation of these budgets. It deploys complex models built with Python and PyTorch, borrowing from machine learning (ML) to speed up the solvers that optimize marketing investment. Mario Vinasco explains the framework of the marketing spend problem and how it was implemented.
4:15pm - 4:55pm Tuesday, March 17, 2020
Location: LL21 C
Lior Gavish (Barracuda)
Lior Gavish breaks down a machine learning (ML)-based system that detects a highly evasive type of email-based fraud. The system combines innovative techniques for labeling and classifying highly unbalanced datasets with a distributed cloud application capable of processing high-volume communication in real time.
4:15pm - 4:55pm Tuesday, March 17, 2020
Location: LL21 D
Secondary topics: Security and Privacy
Kathy Winger (Law Offices of Kathy Delaney Winger)
Kathy Winger walks you through what business owners and technology professionals need to know about potential risks in the cybersecurity arena. You'll learn about current legal and data security issues and practices, along with what’s happening on the regulatory front. Along the way, you'll learn to mitigate the risks you face.
5:05pm - 5:45pm Tuesday, March 17, 2020
Location: LL21B
Harrison Wang (LiveRamp)
A migration to a new environment is never easy. You'll learn how LiveRamp tackled migrating its large-scale production workflows from its private data center to the cloud while maintaining high uptime. Harrison Wang examines the high-level steps and decisions involved, lessons learned, and what to realistically expect out of a migration.
5:05pm - 5:45pm Tuesday, March 17, 2020
Location: LL21 C
Karthik Ramasamy (Streamlio), Anand Madhavan (Narvar)
Narvar originally used a large collection of point technologies such as AWS Kinesis, Lambda, and Apache Kafka to satisfy its requirements for pub/sub messaging, message queuing, logging, and processing. Karthik Ramasamy and Anand Madhavan walk you through how Narvar moved away from a slew of technologies and consolidated its use cases on Apache Pulsar.
5:05pm - 5:45pm Tuesday, March 17, 2020
Location: LL21 D
Sanjeev Mohan (Gartner)
The accelerating migration of workloads to the cloud isn't a binary journey. Some workloads will still be on-premises and some will be on multiple cloud providers. Sanjeev Mohan identifies key data and analytics considerations in modern data architectures, including strategies to handle data latency, gravity, ingress transformation, compliance and governance needs, and data orchestration.
11:00am - 11:40am Wednesday, March 18, 2020
Location: LL21B
Secondary topics: Data Management and Storage
Maulik Soneji (Gojek), Dinesh Kumar (Gojek)
Maulik Soneji and Dinesh Kumar explore Gojek's event-processing library, which consumes events from Kafka and pushes them to BigQuery. All of Gojek's services are event sourced, with a load of 21K messages per second on a few topics and hundreds of topics in total.
11:00am - 11:40am Wednesday, March 18, 2020
Location: LL21 C
Joy Rimchala (Intuit), Diane Chang (Intuit)
Explainable AI (XAI) has gained industry traction, given the importance of explaining ML-assisted decisions in human terms and detecting undesirable ML defects before systems are deployed. Joy Rimchala and Diane Chang delve into XAI techniques, advantages and drawbacks of black box versus glass box models, concept-based diagnostics, and real-world examples using design thinking principles.
11:00am - 11:40am Wednesday, March 18, 2020
Location: LL21 D
Mark Donsky (Okera)
Privacy regulation is increasing worldwide with Europe's GDPR, the California Consumer Privacy Act (CCPA), and the New York Privacy Act (NYPA). Penalties for noncompliance are stiff, but many companies still aren't prepared. Mark Donsky shares how to establish best practices for holistic privacy readiness as part of your data strategy.
11:50am - 12:30pm Wednesday, March 18, 2020
Location: LL21B
Secondary topics: Streaming and IoT
Jeff Chao (Netflix)
Netflix has experienced an unprecedented global increase in membership over the last several years. Production outages today have a greater impact in less time than in years past. Jeff Chao details the open-sourced Mantis, which allows Netflix to continue providing great experiences for its members by enabling real-time, granular, cost-effective operational insights.
11:50am - 12:30pm Wednesday, March 18, 2020
Location: LL21 C
Patryk Oleniuk (Virgin Hyperloop One), Sandhya Raghavan (Virgin Hyperloop One)
Patryk Oleniuk and Sandhya Raghavan investigate how to use demand data to improve the design of the fifth mode of transport, Hyperloop. They discuss the passenger demand prediction methods and the tech stack (Spark, Koalas, Keras, MLflow) used to build a deep neural network (DNN)-based near-future demand prediction for simulation purposes.
1:45pm - 2:25pm Wednesday, March 18, 2020
Location: LL21B
Secondary topics: Data Management and Storage
Qorry Asfar (Pusat Demokrasi dan Hak Asasi Manusia), Muhammad Asfar (University of Airlangga)
With the disclosure of the Cambridge Analytica scandal, political practitioners have started to adopt big data technology to gain a better understanding and management of data. Qorry Asfar and Muhammad Asfar provide a big data case study to develop political strategy and examine how technological adoption will shape a better political landscape.
1:45pm - 2:25pm Wednesday, March 18, 2020
Location: LL21 C
Utkarsh B (Flipkart), Giridhar Yasa (Flipkart)
Utkarsh B. and Giridhar Yasa lead a deep dive into architectural patterns and the solutions Flipkart developed to ensure business continuity to millions of online customers, and how it leveraged technology to avert or mitigate risks from catastrophic failures. Solving for business continuity requires investments in application, data management, and infrastructure.
1:45pm - 2:25pm Wednesday, March 18, 2020
Location: LL21 D
Arvind Prabhakar (StreamSets)
DataOps is the best approach for enterprises to improve business and drive future revenue streams and competitive differentiation, which is why so many businesses are rethinking their data strategy. Arvind Prabhakar explains how DataOps solves all the problems that come along with managing data movement at scale.
2:35pm - 3:15pm Wednesday, March 18, 2020
Location: LL21B
Ravi Krishnaswamy (Autodesk)
Today’s applications interact with data in a distributed and decentralized world. Using graphs at scale, you can infer communities and your interaction by tracking access to common data across users and applications. Ravi Krishnaswamy displays a real-world product example with millions of users that uses the combined powers of Spark and graph databases to gain insights into customer workflows.
2:35pm - 3:15pm Wednesday, March 18, 2020
Location: LL21 C
Micah Wylde (Lyft)
Lyft processes millions of events per second in real time to compute prices, balance marketplace dynamics, and detect fraud, among many other use cases. Micah Wylde showcases how Lyft uses Kubernetes along with Flink, Beam, and Kafka to enable service engineers and data scientists to easily build real-time data applications.
2:35pm - 3:15pm Wednesday, March 18, 2020
Location: LL21 D
Steven Beales describes applications of NLP, machine learning, and the data-driven rules that generate significant productivity and quality improvements in the complex business workflows of drug safety and pharmacovigilance without large upfront investment. Pragmatic use of AI allows organizations to create immediate value and ROI before widening adoption as their capabilities with AI increase.
4:15pm - 4:55pm Wednesday, March 18, 2020
Location: LL21B
Kelly Zhiling Wan (LinkedIn), Jason Wang (LinkedIn), Lili Zhou (LinkedIn)
Studies show that good customer service accelerates customers' cohesion toward a product, which increases product engagement and revenue spending. It's traditional to use customer surveys to measure how customers feel about services and products. Kelly Wan, Jason Wang, and Lili Zhou examine LinkedIn's innovative data product for measuring customer happiness.
4:15pm - 4:55pm Wednesday, March 18, 2020
Location: LL21 C
Penghui Li (Zhaopin), Neng Lu (StreamNative)
Penghui Li and Neng Lu walk you through building an event streaming platform based on Apache Pulsar and simplifying a stream processing pipeline with Pulsar Functions, Pulsar Schema, and Pulsar SQL.
4:15pm - 4:55pm Wednesday, March 18, 2020
Location: LL21 D
Anand Rao (PwC), Joseph Voyles (PwC)
Anand Rao and Joseph Voyles introduce you to the core differences between software and machine learning model life cycles. They demonstrate how AI’s success also limits its scale, and they detail leading practices for establishing AIOps to overcome limitations by automating CI/CD, supporting continuous learning, and enabling model safety.

Contact us

confreg@oreilly.com

For conference registration information and customer service

partners@oreilly.com

For more information on community discounts and trade opportunities with O’Reilly conferences

Become a sponsor

For information on exhibiting or sponsoring a conference

pr@oreilly.com

For media/analyst press inquiries