Mar 15–18, 2020

Schedule: Data Analytics: From Forecasting to Data Viz sessions

11:00am–11:40am Tuesday, March 17, 2020
Location: LL21A
Rashmina Menon (GumGum), Jatinder Assi (GumGum)
GumGum receives 30 billion programmatic inventory impressions, amounting to 25 TB of data, per day. By generating near-real-time inventory forecasts based on campaign-specific targeting rules, GumGum enables users to set up successful future campaigns. Rashmina Menon and Jatinder Assi highlight the architecture that enables forecasting in less than 30 seconds with Delta Lake and Databricks Delta caching.
11:50am–12:30pm Tuesday, March 17, 2020
Location: LL21A
Shankar Venkitachalam (Adobe), Megahanath Macha Yadagiri (Carnegie Mellon University), Deepak Pai (Adobe)
Identifying customer stages in a buying cycle enables you to perform personalized targeting depending on the stage. Shankar Venkitachalam, Megahanath Macha Yadagiri, and Deepak Pai identify ML techniques to analyze a customer's clickstream behavior to find the different stages of the buying cycle and quantify the critical click events that help transition a user from one stage to another.
1:45pm–2:25pm Tuesday, March 17, 2020
Location: LL21A
Secondary topics: Culture and Organization
Nancy Rausch (SAS)
For data to be meaningful, it needs to be presented in a way people can relate to. Nancy Rausch explains how SAS combined AI and art to tell a compelling data story, using streaming data from local beehives to forecast hive health. SAS visualized this data in a live-action art sculpture, which helped bring the data to life in a fun and compelling way.
2:35pm–3:15pm Tuesday, March 17, 2020
Location: LL21A
Secondary topics: Data Quality
David Kohn (TimescaleDB)
The sheer volume of time series data from servers, applications, or IoT devices introduces performance challenges, both in inserting data at high rates and in computing aggregates for later analysis. David Kohn demonstrates how systems can continuously maintain up-to-date aggregates, correctly handling even late or out-of-order data, to simplify data analysis.
11:00am–11:40am Wednesday, March 18, 2020
Location: LL21A
Secondary topics: Streaming and IoT
Paige Roberts (Vertica)
What works in production is the only technology criterion that matters. Companies with successful high-scale production IoT analytics programs like Philips, Anritsu, and OptimalPlus show remarkable similarities. IoT at production scale requires certain technology choices. Paige Roberts drills into the architectures of successful production implementations to identify what works and what doesn’t.
11:50am–12:30pm Wednesday, March 18, 2020
Location: LL21A
Claudiu Barbura (Blueprint)
Claudiu Barbura exposes a tech stack for BI tools and data science notebooks, using live demos to explain the lessons learned using Spark (CPU), BlazingSQL and Rapids.ai (GPU), and Apache Arrow in the quest to exponentially increase the performance of the data virtualizer, which enables real-time access to data sources across different cloud providers and on-premises databases and APIs.
1:45pm–2:25pm Wednesday, March 18, 2020
Location: LL21A
Secondary topics: Culture and Organization
Dave Stuart (Department of Defense)
Dave Stuart looks at how the US Intelligence Community (IC) uses Jupyter and Python to harness the subject matter expertise of analysts in a DIY analytic movement. He covers the technical and cultural challenges the community encountered in its quest to succeed at a large scale and the strategies used to mitigate them.
2:35pm–3:15pm Wednesday, March 18, 2020
Location: LL21A
Kshitij Wadhwa (Rockset), Dhruba Borthakur (Rockset)
Rockset is a serverless search and analytics engine that enables real-time search and analytics on raw data from Amazon DynamoDB, with full-featured SQL. Kshitij Wadhwa and Dhruba Borthakur explore how Rockset takes an entirely new approach to loading, analyzing, and serving data so you can run powerful SQL analytics on data from DynamoDB without ETL.
4:15pm–4:55pm Wednesday, March 18, 2020
Location: LL21A
Chendi Xue (Intel), Jian Zhang (Intel), Binwei Yang (Intel)
Chendi Xue and Jian Zhang explore how Intel accelerated Spark SQL with AVX-supported vectorization technology. They outline the design and evaluation, including how to enable columnar processing in Spark SQL, how to use Arrow as the intermediate data format, how to leverage AVX-enabled Gandiva for data processing, and a performance analysis with system metrics and a breakdown.
