Presented By O'Reilly and Cloudera
Make Data Work
September 26–27, 2016: Training
September 27–29, 2016: Tutorials & Conference
New York, NY

Pulsar: Real-time analytics at scale leveraging Kafka, Kylin, and Druid

Tony Ng (WeWork)
5:25pm–6:05pm Wednesday, 09/28/2016
IoT & real-time
Location: River Pavilion
Level: Beginner

Prerequisite knowledge

  • A basic knowledge of SQL, OLAP, event processing, and messaging systems

What you'll learn

  • Understand how Pulsar integrates Kafka, Kylin, and Druid to provide flexibility and scalability in event and metrics consumption

Description

Enterprises are increasingly demanding real-time analytics and insights. Tony Ng offers an overview of Pulsar, an open source real-time streaming system used at eBay, which can scale to millions of events per second with 4GL SQL-like language support. Pulsar provides real-time sessionization, multidimensional metrics aggregation over time windows, and custom stream creation through data enrichment, filtering, and stateful processing. Tony explains how Pulsar integrates Kafka, Kylin, and Druid to provide flexibility and scalability in event and metrics consumption.

Topics include:

  • Real-time analytics and its applications, such as personalization, monitoring, and marketing
  • Pulsar’s real-time analytics pipeline
  • Pulsar’s architecture to support high scalability and availability
  • Pulsar’s event-processing framework and language
  • Integration of Pulsar with Kafka to support replay of unprocessed or undelivered events to avoid data loss
  • Integration of Pulsar with Kylin to provide multidimensional slice and dice of data
  • Integration of Pulsar with Druid to provide real-time metrics and dashboards

Tony Ng

WeWork

Tony Ng is a Sr. Director of Engineering at WeWork, where he is responsible for WeWork’s Data Platform.