Mar 15–18, 2020

Tutorials

These expert-led presentations on Monday, March 16, give you a chance to dive deep. To attend tutorials, you must register for a Gold or Silver pass (these passes do not include access to training courses on Sunday or Monday).

Monday, March 16

9:00am–12:30pm
Location: LL21A
Alice Zhao (Metis)
Data scientists are known to crunch numbers, but you may also run into text data. Alice Zhao teaches you to turn text data into a format that a machine can understand, identifies some of the most popular text analytics techniques, and showcases several natural language processing (NLP) libraries in Python, including the Natural Language Toolkit (NLTK), TextBlob, spaCy, and gensim.
9:00am–12:30pm
Location: LL21 C
Sourav Dey (Manifold), Alex Ng (Manifold)
Today, ML engineers are working at the intersection of data science and software engineering (that is, MLOps). Sourav Dey and Alex Ng highlight the six steps of the Lean AI process and explain how it helps ML engineers work as an integrated part of development and production teams. You'll go hands-on using real-world data so you can get up and running seamlessly.
9:00am–12:30pm
Location: LL21 E/F
Mehrnoosh Sameki (MERS) (Microsoft), Sarah Bird (Microsoft)
Mehrnoosh Sameki and Sarah Bird examine six core principles of responsible AI (fairness, reliability and safety, privacy and security, inclusiveness, transparency, and accountability), with a focus on transparency, fairness, and privacy. You'll discover best practices and state-of-the-art open source toolkits that empower researchers, data scientists, and stakeholders to build trustworthy AI systems.
9:00am–12:30pm
Location: LL21B
David Anderson (Ververica), Seth Wiesman (Ververica)
David Anderson and Seth Wiesman demonstrate how building and managing scalable, stateful, event-driven applications can be easier and more straightforward than you might expect. You'll go hands-on to implement a ride-sharing application together.
9:00am–12:30pm
Location: LL21 D
Danilo Sato (ThoughtWorks)
Danilo Sato leads you through applying continuous delivery (CD) to data science and machine learning (ML). Join in to learn how to make changes to your models while safely integrating and deploying them into production, using testing and automation techniques to release reliably at any time and at high frequency.
9:00am–12:30pm
Location: LL20D
Matt Harrison (MetaSnake)
You can use pandas to load data, inspect it, tweak it, visualize it, and do analysis with only a few lines of code. Matt Harrison leads a deep dive into plotting and Matplotlib integration, data quality, and issues such as missing data. Matt uses the split-apply-combine paradigm with groupby and pivot and explains stacking and unstacking data.
9:00am–12:30pm
Location: 210 D/H
Fatma Tarlaci (Quansight)
Language is at the heart of everything we humans do. Natural language processing (NLP) is one of the most challenging tasks of artificial intelligence, mainly due to the difficulty of detecting nuances and common sense reasoning in natural language. Fatma Tarlaci invites you to learn more about NLP and get a complete hands-on implementation of an NLP deep learning model.
9:00am–12:30pm
Location: 210 F
Catherine Nelson (Concur Labs, SAP Concur), Hannes Hapke (Wunderbar.ai)
Most deep learning models don’t get analyzed, validated, and deployed. Catherine Nelson and Hannes Hapke explain the necessary steps to release machine learning models for real-world applications. You'll view an example project using the TensorFlow ecosystem, focusing on how to analyze models and deploy them efficiently.
9:00am–12:30pm
Location: 210A
Jike Chong (LinkedIn), Yue Cathy Chang (TutumGene)
More than 85% of data science projects fail. This high failure rate is one main reason why data science is still a "science." For data science practitioners, reducing this failure rate is a priority. Jike Chong and Yue Cathy Chang explain the three key steps of applying data science technology to business problems and three concerns for applying domain insights in AI and ML initiatives.
9:00am–12:30pm
Location: 210 E
Paroma Varma (Snorkel)
Paroma Varma teaches you how to build and manage training datasets programmatically with Snorkel, an open source framework developed at the Stanford AI Lab, and demonstrates how this can make building and managing machine learning (ML) models more efficient in a range of practical settings.
9:00am–12:30pm
Location: 210 B
Robert Crowe (Google)
Putting together an ML production pipeline for training, deploying, and maintaining ML and deep learning applications is much more than just training a model. Robert Crowe outlines what's involved in creating a production ML pipeline and walks you through working code.
1:30pm–5:00pm
Location: LL21B
Arun Kejariwal (Independent), Karthik Ramasamy (Streamlio), Anurag Khandelwal (RISELab, UC Berkeley)
Arun Kejariwal, Karthik Ramasamy, and Anurag Khandelwal walk you through the landscape of streaming systems for each stage of an end-to-end data processing pipeline: messaging, compute, and storage. You'll get an overview of the inception and growth of the serverless paradigm. They explore Apache Pulsar, which provides native serverless support in the form of Pulsar Functions.
1:30pm–5:00pm
Location: LL21 E/F
Patrick Hall (H2O.ai | George Washington University)
Even if you've followed current best practices for model training and assessment, machine learning models can be hacked, socially discriminatory, or just plain wrong. Patrick Hall breaks down model debugging strategies to test and fix security vulnerabilities, unwanted social biases, and latent inaccuracies in models.
1:30pm–5:00pm
Location: LL21 C
Boris Lublinsky (Lightbend), Dean Wampler (Lightbend)
Machine learning (ML) models are data, which means they require the same data governance considerations as the rest of your data. Boris Lublinsky and Dean Wampler outline metadata management for model serving and explore what information about running systems you need and why it's important. You'll also learn how Apache Atlas can be used for storing and managing this information.
1:30pm–5:00pm
Location: LL20C
Robert Horton (Microsoft), Mario Inchiosa (Microsoft), John-Mark Agosta (Microsoft)
Robert Horton, Mario Inchiosa, and John-Mark Agosta offer an overview of the fundamental concepts of machine learning (ML) for business and healthcare decision makers and software product managers, so you'll be able to make more effective use of ML results and better evaluate opportunities to apply ML in your industry.
1:30pm–5:00pm
Location: LL20D
Robert Nishihara (University of California, Berkeley), Ion Stoica (University of California, Berkeley), Philipp Moritz (University of California, Berkeley)
There's no easy way to scale up Python applications to the cloud. Ray is an open source framework for parallel and distributed computing, making it easy to program and analyze data at any scale by providing general-purpose high-performance primitives. Robert Nishihara, Ion Stoica, and Philipp Moritz demonstrate how to use Ray to scale up Python applications, data processing, and machine learning.
1:30pm–5:00pm
Location: LL21A
David Talby (Pacific AI), Alex Thomas (John Snow Labs), Claudiu Branzan (Accenture)
David Talby, Alex Thomas, and Claudiu Branzan detail the application of the latest advances in deep learning for common natural language processing (NLP) tasks such as named entity recognition, document classification, sentiment analysis, spell checking, and OCR. You'll learn to build complete text analysis pipelines using the highly performant, scalable, open source Spark NLP library in Python.
1:30pm–5:00pm
Location: 210 F
Vijay Srinivas Agneeswaran (Walmart Labs), Pramod Singh (Walmart Labs), Akshay Kulkarni (Publicis Sapient)
Vijay Srinivas Agneeswaran, Pramod Singh, and Akshay Kulkarni demonstrate the in-depth process of building a text summarization model with an attention network using TensorFlow (TF) 2.0. You'll gain the practical hands-on knowledge to build and deploy a scalable text summarization model on top of Kubeflow.
1:30pm–5:00pm
Location: 210 B
Dennis Wei (IBM Research)
Dennis Wei, one of the creators of the new open source Python package AI Explainability 360, teaches you to use and contribute to it. Dennis translates new developments from research labs to data science practitioners in industry. You'll get an early look at the first comprehensive toolkit for explainable AI, including eight diverse and state-of-the-art methods from IBM Research.
1:30pm–5:00pm
Location: 210 E
Mars Geldard (University of Tasmania), Paris Buttfield-Addison (Secret Lab), Tim Nugent (lonely.coffee)
Mars Geldard, Tim Nugent, and Paris Buttfield-Addison are here to prove Swift isn't just for app developers. Swift for TensorFlow provides the power of TensorFlow with all the advantages of Python (and complete access to Python libraries) and of Swift, the safe, fast, incredibly capable open source programming language. Swift for TensorFlow is the perfect way to learn deep learning and Swift.
1:30pm–5:00pm
Location: 210A
Ira Cohen (Anodot)
While the role of the manager doesn't require deep knowledge of ML algorithms, it does require understanding how ML-based products should be developed. Ira Cohen explores the cycle of developing ML-based capabilities (or entire products) and the role of the (product) manager in each step of the cycle.
1:30pm–5:00pm
Location: 210 D/H
Lukas Biewald (Weights & Biases)
Join Lukas Biewald to build and deploy long short-term memory networks (LSTMs), gated recurrent units (GRUs), and other text classification techniques using Keras and scikit-learn.
