Presented By O'Reilly and Cloudera
Make Data Work
September 25–26, 2017: Training
September 26–28, 2017: Tutorials & Conference
New York, NY

Schedule: Stream processing and analytics sessions

9:00am - 5:00pm Monday, September 25 & Tuesday, September 26
Location: 1A 03
Secondary topics:  Architecture, Cloud, Streaming
SOLD OUT
Jesse Anderson (Big Data Institute)
To handle real-time big data, you need to solve two difficult problems: how do you ingest that much data and how will you process that much data? Jesse Anderson explores the latest real-time frameworks (both open source and managed cloud services), discusses the leading cloud providers, and explains how to choose the right one for your company.
9:00am - 12:30pm Tuesday, September 26, 2017
Location: 1E 14 Level: Intermediate
Secondary topics:  Streaming
Ian Wrigley (StreamSets)
Ian Wrigley demonstrates how Kafka Connect and Kafka Streams can be used together to build real-world, real-time streaming data pipelines. Using Kafka Connect, you'll ingest data from a relational database into Kafka topics as the data is being generated and then process and enrich the data in real time using Kafka Streams before writing it out for further analysis.
9:00am - 5:00pm Tuesday, September 26, 2017
Location: 1E 09
Rose Winterton (Pitney Bowes), Audrey Spencer-Alvarado (Portland Trail Blazers), Amie Elcan (CenturyLink), Sean Power (Repable), Parisa Foster (Play The Future), Nick Selby (CJX, Inc. | Midlothian Police Department), Salema Rice (Allegis Group), Aneesh Karve (Quilt), Derek Ruths (CAI), Kristina Bergman (Integris Software), Natalia Adler (UNICEF HQ), Brandon O'Brien (Expedia, Inc)
In a series of 12 half-hour talks aimed at a business audience, you’ll hear data-themed case studies from household brands and global companies, explaining the challenges they wanted to tackle, the approaches they took, and the benefits (and drawbacks) of their solutions. If you want practical insights about applied data, look no further.
1:30pm - 5:00pm Tuesday, September 26, 2017
Location: 1E 14 Level: Beginner
Secondary topics:  Architecture, Streaming
Karthik Ramasamy (Streamlio), Sanjeev Kulkarni (Streamlio), Avrilia Floratau (Microsoft), Ashvin Agrawal (Microsoft), Arun Kejariwal (MZ), Sijie Guo (Streamlio)
Karthik Ramasamy, Sanjeev Kulkarni, Avrilia Floratau, Ashvin Agrawal, Arun Kejariwal, and Sijie Guo walk you through state-of-the-art streaming systems, algorithms, and deployment architectures, covering the typical challenges in modern real-time big data platforms and offering insights on how to address them.
11:20am - 12:00pm Wednesday, September 27, 2017
Location: 1E 07/08 Level: Intermediate
Secondary topics:  Streaming
Dean Wampler (Lightbend)
While stream processing is now popular, streaming architectures must be more reliable and scalable than ever before, more like microservice architectures in fact. Dean Wampler defines "stream" based on characteristics for such systems, using specific tools like Kafka, Spark, Flink, and Akka as examples, and argues that big data and microservices architectures are converging.
1:15pm - 1:55pm Wednesday, September 27, 2017
Location: 1E 07/08 Level: Intermediate
Secondary topics:  Architecture, IoT, Streaming
Michael Freedman (TimescaleDB | Princeton)
Michael Freedman offers an overview of TimescaleDB, a new open source scale-out database designed for time series workloads and engineered as a plugin to Postgres. Unlike most time series newcomers, TimescaleDB supports full SQL while achieving fast ingest and complex queries.
2:05pm - 2:45pm Wednesday, September 27, 2017
Location: 1E 07/08 Level: Intermediate
Secondary topics:  Streaming
Dustin Cote (Confluent)
Dustin Cote shares his experience troubleshooting Apache Kafka in production environments and explains how to avoid pitfalls like message loss or performance degradation in your environment.
2:55pm - 3:35pm Wednesday, September 27, 2017
Location: 1A 15/16/17 Level: Intermediate
Roy Ben-Alta (Amazon Web Services), Allan MacInnis (Amazon Web Services)
Speed matters. Today, decisions are made based on real-time insights, but in order to support the substantial growth of streaming data, companies are required to innovate. Roy Ben-Alta and Allan MacInnis explore AWS solutions powered by machine learning and artificial intelligence.
4:35pm - 5:15pm Wednesday, September 27, 2017
Location: 1E 07/08 Level: Intermediate
Secondary topics:  Streaming
Fabian Hueske (data Artisans)
Although it is the most widely used language for data analysis, SQL is only slowly being adopted by open source stream processors. One reason is that SQL's semantics and syntax were not designed with streaming data in mind. Fabian Hueske explores Apache Flink's two relational APIs for streaming analytics (standard SQL and the LINQ-style Table API), discussing their semantics and showcasing their usage.
5:25pm - 6:05pm Wednesday, September 27, 2017
Location: 1E 07/08 Level: Beginner
Secondary topics:  Financial services, Media, Streaming
Karthik Ramasamy (Streamlio), Supun Kamburugamuve (Indiana University)
Modern enterprises are data driven and want to move at light speed. To achieve real-time performance, financial applications use streaming infrastructures for low latency and high throughput. Twitter Heron is an open source streaming engine with low latency around 14 ms. Karthik Ramasamy and Supun Kamburugamuve explain how they ported Heron to InfiniBand to achieve latencies as low as 7 ms.
11:20am - 12:00pm Thursday, September 28, 2017
Location: 1A 23/24 Level: Beginner
Secondary topics:  Architecture, Cloud, Streaming
Gwen Shapira (Confluent)
Gwen Shapira explains how the three realities of modern programming (the explosion of data and data systems, building business processes as microservices instead of monolithic applications, and the rise of the public cloud) affect how developers and companies operate today and why companies across all industries are turning to streaming data and Apache Kafka for mission-critical applications.
11:20am - 12:00pm Thursday, September 28, 2017
Location: 1E 07/08 Level: Beginner
Secondary topics:  Streaming
Reuven Lax (Google)
Much as SQL stands as a lingua franca for declarative data analysis, Apache Beam aims to provide a portable standard for expressing robust, out-of-order data processing pipelines in a variety of languages across a variety of platforms. Reuven Lax offers an overview of Beam's basic concepts and demonstrates that portability in action.
11:20am - 12:00pm Thursday, September 28, 2017
Location: 1E 14
Secondary topics:  Streaming
Dean Wampler (Lightbend), Jun Rao (Confluent), Karthik Ramasamy (Streamlio), Pramod Immaneni (DataTorrent)
In a series of three 11-minute presentations, key members of Apache Kafka, Heron, and Apache Apex discuss their respective implementations of exactly-once delivery and semantics.
1:15pm - 1:55pm Thursday, September 28, 2017
Location: 1E 07/08 Level: Intermediate
Secondary topics:  Streaming
Tyler Akidau (Google)
What does it mean to execute streaming queries in SQL? What is the relationship of streaming queries to classic relational queries? Are streams and tables the same thing? And how does all of this relate to the programmatic frameworks we’re all familiar with? Tyler Akidau answers these questions and more as he walks you through key concepts underpinning data processing in general.
2:55pm - 3:35pm Thursday, September 28, 2017
Location: 1E 15/16 Level: Intermediate
Secondary topics:  Streaming
Sahaana Suri (Stanford University)
Sahaana Suri offers an overview of MacroBase, a new analytics engine from Stanford designed to prioritize the scarcest resource in large-scale, fast-moving data streams: human attention. MacroBase allows reconfigurable, real-time root-cause analyses that have already diagnosed issues in production streams in mobile, data center, and industrial applications.
4:35pm - 5:15pm Thursday, September 28, 2017
Location: 1E 07/08 Level: Intermediate
Shant Hovsepian (Arcadia Data)
Streaming visual analytics is a technique for visualizing and interacting with streaming data in near real time. Shant Hovsepian explains how lambda- and polling-based architectures are being disrupted by reactive visualization systems, as streaming engines embrace the CQRS pattern, and offers an analysis of visualizing streams from Apache Kafka, Apache Flink, and Apache Spark.