Presented By
O’Reilly + Cloudera
Make Data Work
March 25-28, 2019
San Francisco, CA

Schedule: Streaming and IoT sessions

9:00am–12:30pm Tuesday, March 26, 2019
Location: 2004
Fabian Hueske (Ververica)
Average rating: *****
(5.00, 1 rating)
Fabian Hueske offers an overview of Apache Flink via the SQL interface, covering stream processing and Flink's various modes of use. Then you'll use Flink to run SQL queries on data streams and contrast this with the Flink DataStream API.
9:00am–12:30pm Tuesday, March 26, 2019
Location: 2007
Boris Lublinsky (Lightbend), Dean Wampler (Anyscale)
Average rating: ***..
(3.85, 13 ratings)
Boris Lublinsky and Dean Wampler walk you through using ML in streaming data pipelines, including periodic model retraining and low-latency scoring in live streams. You'll explore using Kafka as a data backplane, the pros and cons of microservices versus systems like Spark and Flink, tips for TensorFlow and SparkML, performance considerations, model metadata tracking, and other techniques.
5:10pm–5:50pm Wednesday, March 27, 2019
Location: 2006
Average rating: ****.
(4.50, 2 ratings)
GE produces a third of the world's power and 60% of its airplane engines, a critical portion of the world's infrastructure that requires meticulous monitoring of the hundreds of sensors streaming data from each turbine. June Andrews and John Rutherford explain how GE's monitoring and diagnostics teams released into production the first real-time ML systems used to determine turbine health.
11:00am–11:40am Thursday, March 28, 2019
Location: 2006
Sijie Guo (StreamNative), Penghui Li (Zhaopin)
Average rating: ****.
(4.00, 1 rating)
Using a messaging system to build an event bus is very common. However, certain use cases demand a messaging system with a specific set of features. Sijie Guo and Penghui Li discuss the event bus requirements for Zhaopin.com, one of China's biggest online recruitment service providers, and explain why the company chose Apache Pulsar.
11:50am–12:30pm Thursday, March 28, 2019
Location: 2004
Fabian Hueske (Ververica)
Average rating: ****.
(4.30, 10 ratings)
Processing streaming data with SQL is becoming increasingly popular. Fabian Hueske explains why SQL queries on streams should have the same semantics as SQL queries on static data. He then shares a selection of common use cases and demonstrates how easily they can be addressed with Flink SQL.
11:50am–12:30pm Thursday, March 28, 2019
Location: 2007
Avner Braverman (Binaris)
Average rating: ****.
(4.00, 3 ratings)
What is serverless, and how can it be utilized for data analysis and AI? Avner Braverman outlines the benefits and limitations of serverless with respect to data transformation (ETL), AI inference and training, and real-time streaming. This is a technical talk, so expect demos and code.
4:40pm–5:20pm Thursday, March 28, 2019
Location: 2006
Jinchul Kim (SK Telecom)
Average rating: **...
(2.17, 6 ratings)
Druid supports autoscaling for data ingestion, but only on AWS EC2, so you can't rely on the feature in a private cloud. Jinchul Kim demonstrates autoscaling out and in on Kubernetes, details the benefits of this approach, and discusses the development of Druid Helm charts, rolling updates, and custom metric usage for horizontal autoscaling.