Sep 23–26, 2019

Schedule


1A 12/14

9:00am Tutorial Efficient ML engineering: Tools and best practices Sourav Dey (Manifold), Jakov Kucan (Manifold)

1:30pm Tutorial Deep learning methods for natural language processing Garrett Hoffman (StockTwits)

1A 15/16

9:00am 2-day Training Hands-on data science with Python (Day 2) Michael Cullan (Pragmatic Institute)

1A 21

9:00am Tutorial SOLD OUT: Managing the complete machine learning lifecycle with MLflow Jules Damji (Databricks)

1:30pm Tutorial Building a recommender system with Amazon ML services Karthik Sonti (Amazon Web Services), Emily Webber (Amazon Web Services), Varun Rao Bhamidimarri (Amazon Web Services)

1A 23/24

9:00am Tutorial Introduction to natural language processing in Python Alice Zhao (Metis)

1:30pm Tutorial Natural language understanding at scale with Spark NLP David Talby (Pacific AI), Alex Thomas (John Snow Labs), Saif Addin Ellafi (John Snow Labs), Claudiu Branzan (Accenture)

1E 09

9:00am Tutorial Serverless streaming architectures and algorithms for the enterprise Arun Kejariwal (Independent), Karthik Ramasamy (Streamlio), Anurag Khandelwal (Yale University)

1:30pm Tutorial From relational databases to cloud databases: Using the right tool for the right job Gowrishankar Balasubramanian (Amazon Web Services), Rajeev Srinivasan (Amazon Web Services)

1E 12/13

9:00am Tutorial Deep learning from scratch Bruno Goncalves (Data For Science)

1:30pm Tutorial Architecting a data platform for enterprise use Mark Madsen (Teradata), Todd Walter (Archimedata)

1E 14

9:00am Tutorial Running multidisciplinary big data workloads in the cloud with CDP James Morantus (Cloudera), Tony Huinker (Cloudera), Naren Koneru (Cloudera), Ramachandran Venkatesh (Cloudera), Gunther Hagleitner (Cloudera), Olli Draese (Cloudera)

1:30pm Tutorial Kafka and Streams Messaging Manager (SMM) crash course Purnima Reddy Kuchikulla (Cloudera), Dan Chaffelson (Cloudera), Attila Kanto (Cloudera), Tony Wu (Cloudera)

1A 01/02

9:00am 2-day Training SOLD OUT: Big data for managers (Day 2) Michael Li (The Data Incubator), Gonzalo Diaz (The Data Incubator)

1A 03

9:00am 2-day Training Recommendation systems using deep learning (Day 2) Bargava Subramanian (Binaize), Amit Kapoor (narrativeVIZ)

1A 04/05

9:00am 2-day Training Serverless machine learning with TensorFlow and BigQuery (sponsored by Google Cloud) (Day 2) Jeff Davis (Google Cloud)

1E 06

9:00am 2-day Training Professional Kafka development (Day 2) Jesse Anderson (Big Data Institute)

1E 15/16

9:00am Tutorial Getting ready for CCPA: Securing data lakes for heavy privacy regulation Mark Donsky (Okera), Lars George (Okera), Michael Ernest (Dataiku), Ifigeneia Derekli (Cloudera)

1:30pm Tutorial Hands-on machine learning with Kafka-based streaming pipelines Boris Lublinsky (Lightbend), Dean Wampler (Anyscale)

1A 17

9:00am 2-day Training SOLD OUT: Building a serverless big data application on AWS (Day 2) Jorge Lopez (Amazon Web Services), Radhika Ravirala (Amazon Web Services), Nikki Rouda (Amazon Web Services), Jesse Gebhardt (Amazon Web Services), Rajeev Chakrabarti (Amazon Web Services)

1A 18

9:00am 2-day Training Expand your data science and machine learning skills with Python, R, SQL, Spark, and TensorFlow (Day 2) Ian Cook (Cloudera)

1E 07

9:00am 2-day Training Machine learning from scratch in TensorFlow (Day 2) Dylan Bargteil (The Data Incubator)

1A 06

9:00am Tutorial Data Case Studies David Boyle (Audience Strategies), Richard Evans (Statistics Canada), Rosaria Silipo (KNIME), Leah Xu (Spotify), Arup Nanda (Capital One), Victoriya Kalmanovich (Navy), Tusharadri Mukherjee (Lenovo), Moise Convolbo (Rakuten), Martin Mendez-Costabel (Bayer Crop Science), Gloria Macia (F. Hoffmann-La Roche AG), Gwen Campbell (Revibe Technologies), Muhammed Idris (Capria VC | TeraCrunch)

1A 07

9:00am Day-Long Training Machine learning for the enterprise (sponsored by IBM) Matt Kirk (Your Chief Scientist), Miguel Maldonado (IBM)

1A 08

9:00am Tutorial Findata Day Alistair Croll (Solve For Interesting), Jennifer Yang (Wells Fargo ECS), Brian Lynch (TD Bank Group), Dan Barker (RSA Security), Rochelle March (Trucost), Catherine Gu (Stanford University), Karan Jaswal (Cinchy), Moto Tohda (Tokyo Century (USA)), Viridiana Lourdes (Ayasdi), Peter Swartz (Altana Trade), Mikheil Nadareishvili (TBC Bank)

1A 10

9:00am Tutorial Building and leading a successful AI practice for your organization Rossella Blatt Vital (Wonderlic), Ross Piper (Wonderlic), Daniel Schmerling (Wonderlic)

1:30pm Tutorial Managing data science in the enterprise Alexander Izydorczyk (Coatue Management), Benjamin Singleton (JetBlue), Joshua Poduska (Domino Data Lab)

1E 08

9:00am Tutorial Learning Presto: SQL on anything Matt Fuller (Starburst)

1:30pm Tutorial Apache Metron: Open source cybersecurity at scale Carolyn Duby (Cloudera), Madhan Neethiraj (Cloudera), Michael Gregory (Cloudera), Sangeeta Doraiswamy (Cloudera)

1E 10

9:00am Tutorial Real-time SQL stream processing at scale with Apache Kafka and KSQL Viktor Gamov (Confluent)

1:30pm Tutorial Foundations for successful data projects Ted Malaska (Capital One), Jonathan Seidman (Cloudera), Matthew Schumpert (Cloudera, Inc.), Raman Rajasekhar (Cloudera Inc), Krishna Maheshwari (Cloudera)

1E 11

9:00am Tutorial Cloudera Edge Management in the IoT Purnima Reddy Kuchikulla (Cloudera), Timothy Spann (Cloudera), Abdelkrim Hadjidj (Cloudera), Andre Araujo (Cloudera), Hemanth Yamijala (Cloudera)

1:30pm Tutorial Sketching data and other magic tricks Sophie Watson (Red Hat), William Benton (Red Hat)

5:00pm Opening Reception | Room: Expo Hall - 3B

12:30pm Lunch | Room: Lunch

10:30am Morning break sponsored by Microsoft | Room: Break

3:00pm Afternoon break sponsored by Dataiku | Room: Break

9:00am-12:30pm (3h 30m) Data Science, Machine Learning, & AI Culture and Organization, Model Development, Governance, Operations

Efficient ML engineering: Tools and best practices

Sourav Dey (Manifold), Jakov Kucan (Manifold)

Sourav Dey and Jakov Kucan walk you through the six steps of the Lean AI process and explain how it helps your ML engineers work as an integrated part of your development and production teams. You'll get a hands-on example using real-world data, so you can get up and running with Docker and Orbyter and see firsthand how streamlined they can make your workflow.

1:30pm-5:00pm (3h 30m) Data Science, Machine Learning, & AI Deep Learning, Financial Services, Text and Language processing and analysis

Deep learning methods for natural language processing

Garrett Hoffman (StockTwits)

Garrett Hoffman walks you through deep learning methods for natural language processing and natural language understanding tasks, using a live example in Python and TensorFlow with StockTwits data. Methods include Word2Vec, recurrent neural networks (RNNs) and variants (long short-term memory [LSTM] and gated recurrent unit [GRU]), and convolutional neural networks.

9:00am-5:00pm (8h) Data Science, Machine Learning, & AI

Hands-on data science with Python (Day 2)

Michael Cullan (Pragmatic Institute)

Michael Cullan walks you through developing a machine learning pipeline from prototyping to production. You'll learn about data cleaning, feature engineering, model building and evaluation, and deployment and then extend these models into two applications from real-world datasets. All work will be done in Python.

9:00am-12:30pm (3h 30m) Data Science, Machine Learning, & AI Model Development, Governance, Operations

SOLD OUT: Managing the complete machine learning lifecycle with MLflow

Jules Damji (Databricks)

ML development brings many new complexities beyond the software development lifecycle. Unlike in traditional software development, ML developers want to try multiple algorithms, tools, and parameters to get the best results, and they need to track this information. Jules Damji walks you through MLflow, an open source project that simplifies the entire ML lifecycle, to solve this problem.

1:30pm-5:00pm (3h 30m) Data Science, Machine Learning, & AI Cloud Platforms and SaaS, Deep dive into specific tools, platforms, or frameworks

Building a recommender system with Amazon ML services

Karthik Sonti (Amazon Web Services), Emily Webber (Amazon Web Services), Varun Rao Bhamidimarri (Amazon Web Services)

Karthik Sonti, Emily Webber, and Varun Rao Bhamidimarri introduce you to the Amazon SageMaker machine learning platform and provide a high-level discussion of recommender systems. You'll dig into different machine learning approaches for recommender systems, including common methods such as matrix factorization as well as newer embedding approaches.

9:00am-12:30pm (3h 30m) Data Science, Machine Learning, & AI Text and Language processing and analysis

Introduction to natural language processing in Python

Alice Zhao (Metis)

Data scientists are known for crunching numbers, but you need to decide what to do when you run into text data. Alice Zhao walks you through the steps to turn text data into a format that a machine can understand, explores some of the most popular text analytics techniques, and showcases several natural language processing (NLP) libraries in Python, including NLTK, TextBlob, spaCy, and gensim.

1:30pm-5:00pm (3h 30m) Data Science, Machine Learning, & AI Deep dive into specific tools, platforms, or frameworks, Text and Language processing and analysis

Natural language understanding at scale with Spark NLP

David Talby (Pacific AI), Alex Thomas (John Snow Labs), Saif Addin Ellafi (John Snow Labs), Claudiu Branzan (Accenture)

David Talby, Alex Thomas, Saif Addin Ellafi, and Claudiu Branzan walk you through state-of-the-art natural language processing (NLP) using the highly performant, highly scalable open source Spark NLP library. You'll spend about half your time coding as you work through four sections, each with an end-to-end working codebase that you can change and improve.

9:00am-12:30pm (3h 30m) Data Engineering and Architecture, Streaming and IoT Cloud Platforms and SaaS, Data, Analytics, and AI Architecture, Streaming and IoT, Temporal data and time-series analytics

Serverless streaming architectures and algorithms for the enterprise

Arun Kejariwal (Independent), Karthik Ramasamy (Streamlio), Anurag Khandelwal (Yale University)

Arun Kejariwal, Karthik Ramasamy, and Anurag Khandelwal walk you through the landscape of streaming systems and examine the inception and growth of the serverless paradigm. You'll take a deep dive into Apache Pulsar, which provides native serverless support in the form of Pulsar functions, and get a bird’s-eye view of the application domains where you can leverage Pulsar functions.

1:30pm-5:00pm (3h 30m) Data Engineering and Architecture BI, Interactive Analytics and Visualization, Cloud Platforms and SaaS, Data Management and Storage, Data, Analytics, and AI Architecture

From relational databases to cloud databases: Using the right tool for the right job

Gowrishankar Balasubramanian (Amazon Web Services), Rajeev Srinivasan (Amazon Web Services)

Enterprises adopt cloud platforms such as AWS for agility, elasticity, and cost savings. Database design and management requires a different mindset in AWS when compared to traditional RDBMS design. Gowrishankar Balasubramanian and Rajeev Srinivasan explore considerations in choosing the right database for your use case and access pattern while migrating or building a new application on the cloud.

9:00am-12:30pm (3h 30m) Data Science, Machine Learning, & AI Deep Learning

Deep learning from scratch

Bruno Goncalves (Data For Science)

You'll go hands-on to learn the theoretical foundations and principal ideas underlying deep learning and neural networks. Bruno Gonçalves structures the implementations in a way that closely resembles how Keras is organized, so that by the end of the course, you'll be prepared to dive deeper into the deep learning applications of your choice.

1:30pm-5:00pm (3h 30m) Data Engineering and Architecture BI, Interactive Analytics and Visualization, Cloud Platforms and SaaS, Data, Analytics, and AI Architecture

Architecting a data platform for enterprise use

Mark Madsen (Teradata), Todd Walter (Archimedata)

Building a data lake involves more than installing Hadoop or putting data into AWS. The goal in most organizations is to build a multiuse data infrastructure that isn't subject to past constraints. Mark Madsen and Todd Walter explore design assumptions and principles and walk you through a reference architecture to use as you work to unify your analytics infrastructure.

9:00am-12:30pm (3h 30m) Data Engineering and Architecture Cloud Platforms and SaaS, Data Management and Storage

Running multidisciplinary big data workloads in the cloud with CDP

James Morantus (Cloudera), Tony Huinker (Cloudera), Naren Koneru (Cloudera), Ramachandran Venkatesh (Cloudera), Gunther Hagleitner (Cloudera), Olli Draese (Cloudera)

Organizations now run diverse, multidisciplinary, big data workloads that span data engineering, data warehousing, and data science applications. Many of these workloads operate on the same underlying data, and the workloads themselves can be transient or long running in nature. There are many challenges with moving these workloads to the cloud. In this talk we start off with a technical deep...

1:30pm-5:00pm (3h 30m) Data Engineering and Architecture, Streaming and IoT Deep dive into specific tools, platforms, or frameworks, Streaming and IoT

Kafka and Streams Messaging Manager (SMM) crash course

Purnima Reddy Kuchikulla (Cloudera), Dan Chaffelson (Cloudera), Attila Kanto (Cloudera), Tony Wu (Cloudera)

Kafka is omnipresent and the backbone of streaming analytics applications and data lakes. The challenge is understanding what's going on overall in the Kafka cluster, including performance, issues, and message flows. Purnima Reddy Kuchikulla and Dan Chaffelson walk you through a hands-on experience to visualize the entire Kafka environment end-to-end and simplify Kafka operations via SMM.

9:00am-5:00pm (8h) Strata Business Summit

SOLD OUT: Big data for managers (Day 2)

Michael Li (The Data Incubator), Gonzalo Diaz (The Data Incubator)

Michael Li and Gonzalo Diaz provide a nontechnical overview of AI and data science. Learn common techniques, how to apply them in your organization, and common pitfalls to avoid. You’ll pick up the language and develop a framework to be able to effectively engage with technical experts and use their input and analysis for your business’s strategic priorities and decision making.

9:00am-5:00pm (8h) Data Science, Machine Learning, & AI

Recommendation systems using deep learning (Day 2)

Bargava Subramanian (Binaize), Amit Kapoor (narrativeVIZ)

Recommendation systems play a significant role: for users, they open a new world of options; for companies, they drive engagement and satisfaction. Amit Kapoor and Bargava Subramanian walk you through the different paradigms of recommendation systems and introduce you to deep learning-based approaches. You'll gain the practical hands-on knowledge to build, select, deploy, and maintain a recommendation system.

9:00am-5:00pm (8h) Sponsored

Serverless machine learning with TensorFlow and BigQuery (sponsored by Google Cloud) (Day 2)

Jeff Davis (Google Cloud)

Jeff Davis provides a hands-on introduction to designing and building machine learning models on structured data on Google Cloud Platform. Through a combination of presentations, demos, and hands-on labs, you'll learn machine learning (ML) concepts and how to implement them using both BigQuery Machine Learning and TensorFlow and Keras.

9:00am-5:00pm (8h) Data Engineering and Architecture

Professional Kafka development (Day 2)

Jesse Anderson (Big Data Institute)

Jesse Anderson offers you an in-depth look at Apache Kafka. You'll learn how Kafka works and how to create real-time systems with it, as well as how to create consumers and publishers. Jesse then walks you through Kafka’s ecosystem, demonstrating how to use tools like Kafka Streams, Kafka Connect, and KSQL.

9:00am-12:30pm (3h 30m) Security and Privacy Privacy and Security

Getting ready for CCPA: Securing data lakes for heavy privacy regulation

Mark Donsky (Okera), Lars George (Okera), Michael Ernest (Dataiku), Ifigeneia Derekli (Cloudera)

New regulations drive compliance, governance, and security challenges for big data. Infosec and security groups must ensure a secured and governed environment across workloads that span on-premises, private cloud, multicloud, and hybrid cloud. Mark Donsky, Lars George, Michael Ernest, and Ifigeneia Derekli outline hands-on best practices for meeting these challenges with special attention to CCPA.

1:30pm-5:00pm (3h 30m) Data Engineering and Architecture Model Development, Governance, Operations

Hands-on machine learning with Kafka-based streaming pipelines

Boris Lublinsky (Lightbend), Dean Wampler (Anyscale)

Boris Lublinsky and Dean Wampler examine ML use in streaming data pipelines, how to do periodic model retraining, and low-latency scoring in live streams. Learn about Kafka as the data backplane, the pros and cons of microservices versus systems like Spark and Flink, tips for TensorFlow and SparkML, performance considerations, metadata tracking, and more.

9:00am-5:00pm (8h) Data Engineering and Architecture

SOLD OUT: Building a serverless big data application on AWS (Day 2)

Jorge Lopez (Amazon Web Services), Radhika Ravirala (Amazon Web Services), Nikki Rouda (Amazon Web Services), Jesse Gebhardt (Amazon Web Services), Rajeev Chakrabarti (Amazon Web Services)

Serverless technologies let you build and scale applications and services rapidly without the need to provision or manage servers. Join the AWS team to learn how to incorporate serverless concepts into your big data architectures. You'll explore design patterns to ingest, store, and analyze your data as you build a big data application using AWS technologies such as S3, Athena, Kinesis, and more.

9:00am-5:00pm (8h) Data Science, Machine Learning, & AI

Expand your data science and machine learning skills with Python, R, SQL, Spark, and TensorFlow (Day 2)

Ian Cook (Cloudera)

Advancing your career in data science requires learning new languages and frameworks, but you face an overwhelming array of choices, each with different syntaxes, conventions, and terminology. Ian Cook simplifies the learning process by outlining the abstractions common to these systems. You'll work through hands-on exercises to overcome obstacles to getting started with new tools.

9:00am-5:00pm (8h) Data Science, Machine Learning, & AI

Machine learning from scratch in TensorFlow (Day 2)

Dylan Bargteil (The Data Incubator)

The TensorFlow library provides for the use of computational graphs with automatic parallelization across resources. This architecture is ideal for implementing neural networks. Dylan Bargteil explores TensorFlow's capabilities in Python, demonstrating how to build machine learning algorithms piece by piece and how to use TensorFlow's Keras API with several hands-on applications.

9:00am-5:00pm (8h)

Data Case Studies

David Boyle (Audience Strategies), Richard Evans (Statistics Canada), Rosaria Silipo (KNIME), Leah Xu (Spotify), Arup Nanda (Capital One), Victoriya Kalmanovich (Navy), Tusharadri Mukherjee (Lenovo), Moise Convolbo (Rakuten), Martin Mendez-Costabel (Bayer Crop Science), Gloria Macia (F. Hoffmann-La Roche AG), Gwen Campbell (Revibe Technologies), Muhammed Idris (Capria VC | TeraCrunch)

From banking to biotech, retail to government, every business sector is changing in the face of abundant data. Get better at defining business problems and applying data solutions at Strata.

9:00am-5:00pm (8h) Sponsored

Machine learning for the enterprise (sponsored by IBM)

Matt Kirk (Your Chief Scientist), Miguel Maldonado (IBM)

Note: This free workshop, courtesy of IBM, is open to the first 50 registrants. You'll take a fascinating deep dive into the power and applications of machine learning in the enterprise.

9:00am-5:00pm (8h)

Findata Day

Alistair Croll (Solve For Interesting), Jennifer Yang (Wells Fargo ECS), Brian Lynch (TD Bank Group), Dan Barker (RSA Security), Rochelle March (Trucost), Catherine Gu (Stanford University), Karan Jaswal (Cinchy), Moto Tohda (Tokyo Century (USA)), Viridiana Lourdes (Ayasdi), Peter Swartz (Altana Trade), Mikheil Nadareishvili (TBC Bank)

From analyzing risk and detecting fraud to predicting payments and improving customer experience, take a deep dive into the ways data technologies are transforming the financial industry.

9:00am-12:30pm (3h 30m) Executive Briefing and best practices, Strata Business Summit Culture and Organization

Building and leading a successful AI practice for your organization

Rossella Blatt Vital (Wonderlic), Ross Piper (Wonderlic), Daniel Schmerling (Wonderlic)

Creating and leading a successful ML strategy is an elegant orchestration of many components: mastering key ML concepts, operationalizing the ML workflow, prioritizing the highest-value projects, building a high-performing team, nurturing strategic partnerships, aligning with the company’s mission, and more. Rossella Blatt Vital details insights and lessons learned in how to create and lead a flourishing ML practice.

1:30pm-5:00pm (3h 30m) Executive Briefing and best practices, Strata Business Summit Culture and Organization

Managing data science in the enterprise

Alexander Izydorczyk (Coatue Management), Benjamin Singleton (JetBlue), Joshua Poduska (Domino Data Lab)

The honeymoon era of data science is ending and accountability is coming. Not content to wait for results that may or may not arrive, successful data science leaders must deliver measurable impact on an increasing share of an enterprise’s KPIs. The speakers explore how leading organizations take a holistic approach to people, process, and technology to build a sustainable advantage.

9:00am-12:30pm (3h 30m) Data Engineering and Architecture BI, Interactive Analytics and Visualization, Data Management and Storage, Deep dive into specific tools, platforms, or frameworks

Learning Presto: SQL on anything

Matt Fuller (Starburst)

Used by Facebook, Netflix, Airbnb, LinkedIn, Twitter, Uber, and others, Presto has become the ubiquitous open source software for SQL on anything. Presto was built from the ground up for fast interactive SQL analytics against disparate data sources ranging in size from GBs to PBs. Join Matt Fuller to learn how to use Presto and explore use cases and best practices you can implement today.

1:30pm-5:00pm (3h 30m) Security and Privacy Privacy and Security

Apache Metron: Open source cybersecurity at scale

Carolyn Duby (Cloudera), Madhan Neethiraj (Cloudera), Michael Gregory (Cloudera), Sangeeta Doraiswamy (Cloudera)

Bring your laptop, roll up your sleeves, and get ready to crunch some cybersecurity events with Apache Metron, an open source big data cybersecurity platform. Carolyn Duby walks you through how Metron finds actionable events in real time.

9:00am-12:30pm (3h 30m) Data Engineering and Architecture Data Integration and Data Processing, Deep dive into specific tools, platforms, or frameworks, Streaming and IoT

Real-time SQL stream processing at scale with Apache Kafka and KSQL

Viktor Gamov (Confluent)

Building stream processing applications is certainly one of the hot topics in the IT community. But if you've ever thought you needed to be a programmer to do stream processing and build stream processing data pipelines, think again. Viktor Gamov explores KSQL, the stream processing query engine built on top of Apache Kafka.

1:30pm-5:00pm (3h 30m) Data Engineering and Architecture Culture and Organization

Foundations for successful data projects

Ted Malaska (Capital One), Jonathan Seidman (Cloudera), Matthew Schumpert (Cloudera, Inc.), Raman Rajasekhar (Cloudera Inc), Krishna Maheshwari (Cloudera)

The enterprise data management space has changed dramatically in recent years, and this has led to new challenges for organizations in creating successful data practices. Ted Malaska and Jonathan Seidman detail guidelines and best practices from planning to implementation based on years of experience working with companies to deliver successful data projects.

9:00am-12:30pm (3h 30m) Data Engineering and Architecture, Streaming and IoT Deep dive into specific tools, platforms, or frameworks, Streaming and IoT

Cloudera Edge Management in the IoT

Purnima Reddy Kuchikulla (Cloudera), Timothy Spann (Cloudera), Abdelkrim Hadjidj (Cloudera), Andre Araujo (Cloudera), Hemanth Yamijala (Cloudera)

There are too many edge devices and agents, and you need to control and manage them. Purnima Reddy Kuchikulla, Timothy Spann, Abdelkrim Hadjidj, and Andre Araujo walk you through handling the difficulty in collecting real-time data and the trouble with updating a specific set of agents with edge applications. Get your hands dirty with CEM, which addresses these challenges with ease.

1:30pm-5:00pm (3h 30m) Data Science, Machine Learning, & AI Streaming and IoT, Temporal data and time-series analytics

Sketching data and other magic tricks

Sophie Watson (Red Hat), William Benton (Red Hat)

Go hands-on with Sophie Watson and William Benton to examine data structures that let you answer interesting queries about massive datasets in fixed amounts of space and constant time. This seems like magic, but they'll explain the key trick that makes it possible and show you how to use these structures for real-world machine learning and data engineering applications.

5:00pm-6:30pm (1h 30m)

Opening Reception

Enjoy delicious snacks and beverages with fellow Strata attendees, speakers, and sponsors at the Opening Reception, happening immediately after tutorials on Tuesday.

12:30pm-1:30pm (1h)

Break: Lunch

10:30am-11:00am (30m)

Break: Morning break sponsored by Microsoft

3:00pm-3:30pm (30m)

Break: Afternoon break sponsored by Dataiku