Mar 15–18, 2020

Sunday, 03/15/2020

9:00am

9:00am–5:00pm Sunday, 03/15/2020
Training
Jesse Anderson (Big Data Institute)
Jesse Anderson leads a deep dive into Apache Kafka. You'll learn how Kafka works and how to create real-time systems with it. You'll also discover how to create consumers and publishers in Kafka and how to use Kafka Streams, Kafka Connect, and KSQL as you explore the Kafka ecosystem.
9:00am–5:00pm Sunday, 03/15/2020
Training
David Anderson (Ververica GmbH), Seth Wiesman (Ververica)
A hands-on introduction to Apache Flink for Java and Scala developers who want to learn to build streaming applications. The curriculum will focus on the core concepts of distributed streaming dataflows, event time, and key-partitioned state, while also looking in depth at the runtime, ecosystem, and use cases. The exercises help you understand how the pieces fit together to solve real problems.
9:00am–5:00pm Sunday, 03/15/2020
Training
Rich Ott (The Pragmatic Institute), Michael Li (The Data Incubator)
You’ll learn common techniques, how to apply them in your organization, and common pitfalls to avoid. Through this course, you’ll pick up the language and develop a framework for engaging effectively with technical experts and using their input and analysis in your business’s strategic priorities and decision making.
9:00am–5:00pm Sunday, 03/15/2020
Don Fox (The Pragmatic Institute)
We will walk through all the steps of developing a machine learning pipeline, from prototyping to production. We’ll look at data cleaning, feature engineering, model building/evaluation, and deployment. Students will extend these models into two applications from real-world datasets. All work will be done in Python.
9:00am–5:00pm Sunday, 03/15/2020
Robert Schroll (The Pragmatic Institute)
The TensorFlow library provides for the use of computational graphs, with automatic parallelization across resources. This architecture is ideal for implementing neural networks. This training will introduce TensorFlow's capabilities in Python. It will move from building machine learning algorithms piece by piece to using the Keras API provided by TensorFlow with several hands-on applications.
9:00am–5:00pm Sunday, 03/15/2020
Training
TBC
9:00am–5:00pm Sunday, 03/15/2020
Training
TBC

Monday, 03/16/2020

9:00am

9:00am–12:30pm Monday, 03/16/2020 TBC
9:00am–12:30pm Monday, 03/16/2020
Tutorial
Data Quality
TBC
9:00am–12:30pm Monday, 03/16/2020
Alice Zhao (Metis)
As data scientists, we are known to crunch numbers, but what happens when we run into text data? In this tutorial, I will walk through the steps to turn text data into a format that a machine can understand, share some of the most popular text analytics techniques, and showcase several natural language processing (NLP) libraries in Python, including NLTK, TextBlob, spaCy, and gensim.
9:00am–12:30pm Monday, 03/16/2020
Secondary topics:  Streaming and IoT
David Anderson (Ververica GmbH), Seth Wiesman (Ververica)
This tutorial demonstrates that building and managing scalable, stateful, event-driven applications can be easier and more straightforward than you might expect. We’ll provide a hands-on introduction to this topic as we implement a ridesharing application together.
9:00am–12:30pm Monday, 03/16/2020 TBC
9:00am–12:30pm Monday, 03/16/2020
Danilo Sato (ThoughtWorks)
We will walk you through applying continuous delivery (CD), pioneered by ThoughtWorks, to data science and machine learning. Join in to learn how to make changes to your models while safely integrating and deploying them into production, using testing and automation techniques to release reliably at any time and with a high frequency.
9:00am–12:30pm Monday, 03/16/2020
Mehrnoosh Sameki (MERS) (Microsoft), Sarah Bird (Microsoft)
The six core principles of responsible AI are fairness, reliability/safety, privacy/security, inclusiveness, transparency, and accountability. We will focus on transparency (interpretability), fairness, and privacy, covering best practices and state-of-the-art open source toolkits that empower researchers, data scientists, and stakeholders to build more trustworthy AI systems.

1:30pm

1:30pm–5:00pm Monday, 03/16/2020
Robert Horton (Microsoft), Mario Inchiosa (Microsoft), John-Mark Agosta (Microsoft)
This workshop introduces the fundamental concepts of ML to business and healthcare decision makers and software product managers so that they can make more effective use of machine learning results and better evaluate opportunities to apply ML in their industries. The optional exercises require a web browser and Microsoft Excel.
1:30pm–5:00pm Monday, 03/16/2020
Robert Nishihara (University of California, Berkeley), Ion Stoica (University of California, Berkeley), Philipp Moritz (University of California, Berkeley)
Surprisingly, there is no simple way to scale up Python applications from your laptop to the cloud. Ray is an open source framework for parallel and distributed computing that makes it easy to program and analyze data at any scale by providing general-purpose high-performance primitives. This tutorial will show how to use Ray to scale up Python applications, data processing, and machine learning.
1:30pm–5:00pm Monday, 03/16/2020
David Talby (Pacific AI), Alex Thomas (John Snow Labs), Claudiu Branzan (Accenture)
This is a hands-on tutorial on applying the latest advances in deep learning for common NLP tasks such as named entity recognition, document classification, sentiment analysis, spell checking, and OCR. Learn to build complete text analysis pipelines using the highly performant, highly scalable, open-source Spark NLP library in Python.
1:30pm–5:00pm Monday, 03/16/2020 TBC
1:30pm–5:00pm Monday, 03/16/2020
Boris Lublinsky (Lightbend), Dean Wampler (Lightbend)
Machine learning models are data, which means they require the same data governance considerations as the rest of your data. In this tutorial, we will concentrate on metadata management for model serving. We will discuss what information about running systems we need and why it is important. We will also show how Apache Atlas can be used for storing and managing this information.
1:30pm–5:00pm Monday, 03/16/2020
TBC
1:30pm–5:00pm Monday, 03/16/2020
Patrick Hall (H2O.ai | George Washington University)
Even if you've followed current best practices for model training and assessment, machine learning models can be hacked, socially discriminatory, or just plain wrong. This presentation introduces model debugging strategies to test and fix security vulnerabilities, unwanted social biases, and latent inaccuracies in models.

Tuesday, 03/17/2020

Wednesday, 03/18/2020

Contact us

confreg@oreilly.com

For conference registration information and customer service

partners@oreilly.com

For more information on community discounts and trade opportunities with O’Reilly conferences

Become a sponsor

For information on exhibiting or sponsoring a conference

pr@oreilly.com

For media/analyst press inquiries