Mar 15–18, 2020

Schedule: Data Science and Machine Learning sessions

9:00am–12:30pm Monday, March 16, 2020
Location: LL21 E/F
Mehrnoosh Sameki (MERS) (Microsoft), Sarah Bird (Microsoft)
Mehrnoosh Sameki and Sarah Bird examine six core principles of responsible AI: fairness, reliability and safety, privacy and security, inclusiveness, transparency, and accountability, focusing on transparency, fairness, and privacy. You'll discover best practices and state-of-the-art open source toolkits that empower researchers, data scientists, and stakeholders to build trustworthy AI systems.

9:00am–12:30pm Monday, March 16, 2020
Location: LL20D
Matt Harrison (MetaSnake)
You can use pandas to load data, inspect it, tweak it, visualize it, and do analysis with only a few lines of code. Matt Harrison leads a deep dive into plotting and Matplotlib integration, data quality, and issues such as missing data. Matt uses the split-apply-combine paradigm with groupby and pivot and explains stacking and unstacking data.

9:00am–5:00pm Monday, March 16, 2020
Location: LL20A
Jeffrey Vah (Dell Technologies), Gayathri Rau (Dell Technologies), Shuo Xiang (Robinhood), Maureen Teyssier (Reonomy), Aaron Williams (OmniSci), Sriram Ravindran (Adobe Inc), Deepak Pai (Adobe), Shubranshu Shekhar (Carnegie Mellon University), Sherin Thomas (Lyft), Dan Gifford (Getty Images), Shondria Lopez-Merlos (Florida Conference of The United Methodist Church), Sandhya Raghavan (Virgin Hyperloop One), Patryk Oleniuk (Virgin Hyperloop One), Ian Beaver (Verint - Next IT), Aryn Sargent (Verint)
From banking to biotech, retail to government, every business sector is changing in the face of abundant data. Get better at defining business problems and applying data solutions at Strata Data & AI.

1:30pm–5:00pm Monday, March 16, 2020
Location: LL21 E/F
Patrick Hall (H2O.ai | George Washington University)
Even if you've followed current best practices for model training and assessment, machine learning models can be hacked, socially discriminatory, or just plain wrong. Patrick Hall breaks down model debugging strategies to test and fix security vulnerabilities, unwanted social biases, and latent inaccuracies in models.

1:30pm–5:00pm Monday, March 16, 2020
Location: LL21 C
Boris Lublinsky (Lightbend), Dean Wampler (Anyscale)
Machine learning (ML) models are data, which means they require the same data governance considerations as the rest of your data. Boris Lublinsky and Dean Wampler outline metadata management for model serving and explore what information about running systems you need and why it's important. You'll also learn how Apache Atlas can be used for storing and managing this information.

1:30pm–5:00pm Monday, March 16, 2020
Location: LL20C
Robert Horton (Microsoft), Mario Inchiosa (Microsoft), John-Mark Agosta (Microsoft)
Bob Horton, Mario Inchiosa, and John-Mark Agosta offer an overview of the fundamental concepts of machine learning (ML) for business and healthcare decision makers and software product managers, so you can make more effective use of ML results and better evaluate opportunities to apply ML in your industry.

11:00am–11:40am Tuesday, March 17, 2020
Location: LL20D
Navinder Pal Singh Brar (Walmart Labs)
One of the major use cases for stream processing is real-time fraud detection. Ecommerce has to deal with fraud on a wider scale as more and more companies try to provide customers with incentives such as free shipping by moving to subscription-based models. Navinder Pal Singh Brar dives into the architecture, problems faced, and lessons learned from building such a pipeline.

11:00am–11:40am Tuesday, March 17, 2020
Location: LL21 D
Mudasir Ahmad (Cisco)
Artificial intelligence (AI) is a natural fit for supply chain operations, where decisions and actions need to be taken daily or even hourly about delivery, manufacturing, quality, logistics, and planning. Mudasir Ahmad explains how AI can be implemented in a scalable and cost-effective way in your business's supply chain operations and identifies benefits and potential challenges.

11:00am–11:40am Tuesday, March 17, 2020
Location: LL20A
Alasdair Allan (Babilim Light Industries)
Much of the data we collect is thrown away, but that's about to change; the power envelope needed to run machine learning models on embedded hardware has fallen dramatically, enabling you to put the smarts on the device rather than in the cloud. Alasdair Allan explains how the data you throw away can be processed in real time at the edge and why this has huge implications for how you deal with data.

11:50am–12:30pm Tuesday, March 17, 2020
Location: LL21 D
Jike Chong (LinkedIn), Yue Cathy Chang (TutumGene)
More than 85% of data science projects fail. This high failure rate is a main reason why data science is still a science. Jike Chong and Yue "Cathy" Chang outline how you can reduce this failure rate and improve teams' confidence in executing successful data science projects by applying data science technology to business problems: scenario mapping, pattern discovery, and success evaluation.

4:15pm–4:55pm Tuesday, March 17, 2020
Location: LL20D
Ebrahim Safavi (Mist Systems), Jisheng Wang (Mist Systems)
Anomaly detection models are essential to running data-driven businesses intelligently. At Mist Systems, the need for accuracy and the scale of the data pose challenges for building and automating ML pipelines. Ebrahim Safavi and Jisheng Wang explain how recurrent neural networks and novel statistical models allow Mist Systems to build a cloud-native solution and automate the anomaly detection workflow.

11:00am–11:40am Wednesday, March 18, 2020
Location: Expo Hall
Mehul Sheth (Druva)
Any software product needs to be tested against data, and it's difficult to have a random but realistic dataset representing production data. Mehul Sheth highlights using production data to generate models. Production data is accessed without exposing it or violating any customer privacy agreements, and the models then generate test data at scale in lower environments.

11:00am–11:40am Wednesday, March 18, 2020
Location: LL21 C
Talia Tron (Intuit), Joy Rimchala (Intuit)
Explainable AI (XAI) has gained industry traction, given the importance of explaining ML-assisted decisions in human terms and detecting undesirable ML defects before systems are deployed. Talia Tron and Joy Rimchala delve into XAI techniques, advantages and drawbacks of black box versus glass box models, concept-based diagnostics, and real-world examples using design thinking principles.

11:50am–12:30pm Wednesday, March 18, 2020
Location: LL20D
Liqun Shao (Microsoft)
Liqun Shao leads you through a new GitHub repository to show you how data scientists without NLP knowledge can quickly train, evaluate, and deploy state-of-the-art NLP models. Using Jupyter notebooks for Python, she focuses on two use cases with distributed training on Azure Machine Learning with Horovod: GenSen for sentence similarity and BERT for question answering.

11:50am–12:30pm Wednesday, March 18, 2020
Location: LL21 E/F
Alice Zheng (Amazon)
Alice Zheng shares four lessons from building and operating large-scale, production-grade machine learning systems at Amazon, useful for practitioners and would-be practitioners in the field.

4:15pm–4:55pm Wednesday, March 18, 2020
Location: LL21B
Kelly Zhiling Wan (LinkedIn), Jason Wang (LinkedIn), Lili Zhou (LinkedIn)
Studies show that good customer service accelerates customers' cohesion toward a product, which increases product engagement and revenue. Customer surveys are the traditional way to measure how customers feel about services and products. Kelly Wan, Jason Wang, and Lili Zhou examine an innovative data product from LinkedIn that measures customer happiness.

Contact us

confreg@oreilly.com

For conference registration information and customer service

partners@oreilly.com

For more information on community discounts and trade opportunities with O’Reilly conferences

Become a sponsor

For information on exhibiting or sponsoring a conference

pr@oreilly.com

For media/analyst press inquiries