Presented By O'Reilly and Cloudera
Make Data Work
September 26–27, 2016: Training
September 27–29, 2016: Tutorials & Conference
New York, NY

Powering real-time analytics on Xfinity using Kudu

Sridhar Alla (Comcast), Kiran Muglurmath (Comcast)
11:20am–12:00pm Wednesday, 09/28/2016
IoT & real-time
Location: 1 E 12/1 E 13
Level: Intermediate
Average rating: 4.17 (6 ratings)

Prerequisite knowledge

  • A basic understanding of Hadoop, Hive, Spark, Impala, or Kafka

What you'll learn

  • Understand the basics of the new Kudu way of storing data on Hadoop

Description

    Kudu is redefining the big data ecosystem and opening doors to capabilities not available before. Comcast is moving in the direction of adopting Kudu with Impala and Spark for several projects, including real-time processing of events from Xfinity devices. Sridhar Alla and Kiran Muglurmath explain how real-time analytics on Comcast Xfinity set-top boxes (STBs) help drive several customer-facing and internal data-science-oriented applications. They also explain how Comcast uses Kudu to fill the gaps in its batch and real-time storage and computation needs, allowing Comcast to process high-speed data without the elaborate solutions needed until now.

    Sridhar and Kiran showcase the platform Comcast is testing using Kudu: real-time STB events (tunes) are streamed from Kafka to Spark, which updates Kudu tables at high speed (~5,000 events per second) while also sessionizing and maintaining state for tens of millions of devices in Kudu. While the Spark platform updates the transactions in real time directly on HDFS, the middle tier accesses Kudu tables (through Impala) to generate subsecond real-time dashboards while still having the power of Hadoop to deliver batch analytics and integrations with other platforms. This is key to the success of the platform, as previously Comcast had to rely on a variety of multitiered architectures to both provide fast storage and support updates just as NoSQL engines do, but without the slowness caused by several thousand updates per second. Sridhar and Kiran also explore how Comcast stores half a trillion events using Kudu and still gets great performance analyzing the data.
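
    The sessionizing step described above can be illustrated with a minimal sketch. It is not Comcast's implementation: the abstract does not give the session rule, the schema, or the Spark and Kudu code, so everything here is an assumption. A plain Python dict keyed by device ID stands in for the Kudu state table, the 30-minute inactivity gap is a hypothetical choice, and the names TuneEvent, DeviceState, and upsert_event are invented for illustration. Events are assumed to arrive in timestamp order for each device.

```python
from dataclasses import dataclass
from typing import Dict

# Hypothetical inactivity timeout; the talk abstract does not state the real rule.
SESSION_GAP_SECONDS = 30 * 60


@dataclass
class TuneEvent:
    """One set-top-box tune event (hypothetical schema)."""
    device_id: str
    ts: int        # event time, epoch seconds
    channel: str


@dataclass
class DeviceState:
    """Per-device row; in the described platform this would live in a Kudu table."""
    session_id: int
    session_start: int
    last_seen: int
    channel: str
    events_in_session: int


def upsert_event(table: Dict[str, DeviceState], ev: TuneEvent) -> DeviceState:
    """Insert or update the state row for ev.device_id and return the new row."""
    prev = table.get(ev.device_id)
    if prev is None:
        # First event ever seen for this device: open session 1.
        state = DeviceState(1, ev.ts, ev.ts, ev.channel, 1)
    elif ev.ts - prev.last_seen > SESSION_GAP_SECONDS:
        # Device was idle longer than the gap: start a new session.
        state = DeviceState(prev.session_id + 1, ev.ts, ev.ts, ev.channel, 1)
    else:
        # Still inside the current session: extend it.
        state = DeviceState(
            prev.session_id,
            prev.session_start,
            max(prev.last_seen, ev.ts),
            ev.channel,
            prev.events_in_session + 1,
        )
    table[ev.device_id] = state  # keyed overwrite, i.e. an upsert
    return state


if __name__ == "__main__":
    table: Dict[str, DeviceState] = {}
    for event in [
        TuneEvent("stb-1", 0, "news"),
        TuneEvent("stb-1", 600, "sports"),
        TuneEvent("stb-1", 2401, "movies"),  # 1,801 s after the previous event
    ]:
        print(upsert_event(table, event))
```

    Keeping exactly one row per device and overwriting it on every event is what makes the state table cheap to query for dashboards: a reader only ever needs the latest row for each key.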

    Sridhar Alla

    Comcast

    Sridhar Alla is director of data science and engineering at Comcast. A big data expert, Sridhar has helped companies large and small solve complex problems in areas such as data warehousing, governance, security, real-time processing, and high-frequency trading, and has helped them establish large-scale data science practices. Previously, he was the chief technology officer at cybersecurity firm eIQNetworks and a storage software engineer at Network Appliance. Sridhar is a certified Agile DevOps practitioner and implementer. He is an avid presenter at conferences including Strata + Hadoop World and Spark Summit. Sridhar also provides onsite and online training for several technologies. He has several patents filed with the USPTO on large-scale computing and distributed systems. Sridhar holds a bachelor’s degree in computer science from JNTU in Hyderabad, India. He lives with his wife in New Jersey.

    Kiran Muglurmath

    Comcast

    Kiran Muglurmath is the executive director of big data analytics at Comcast, where he manages a team of data scientists and big data engineers for machine learning, data mining, and predictive analytics. Prior to Comcast, Kiran was a consulting big data platform architect and data scientist at T-Mobile and Boeing. He holds an MBA from the Kellogg School at Northwestern University and a computer science degree from Bangalore University.

    Comments

    09/29/2016 7:31am EDT

    are slides available?