Nielsen Marketing Cloud (NMC) provides customers (both marketers and publishers) with real-time analytics tools to profile their target audiences. To achieve that, the company needs to ingest billions of events per day into its big data stores in a scalable, cost-efficient, and consistent way, with no data loss or duplication.
When working with Spark and Kafka, data consistency depends on managing your consumer offsets correctly.
Simona Meriam explains how NMC used to manage its Kafka consumer offsets with the Spark-Kafka 0.8 consumer and why the company decided to upgrade to the Spark-Kafka 0.10 consumer. Simona reviews the problems encountered during the upgrade and details the process that led to the solution.
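To make the link between offset management and consistency concrete, here is a minimal, self-contained sketch in plain Python. It is not NMC's implementation and does not use the Spark or Kafka APIs; `OffsetStore`, `IdempotentSink`, and `process_batch` are hypothetical names invented for illustration. It assumes one common pattern: write a batch to the sink first, commit the offset only afterwards, and key sink rows by (partition, offset) so that a replay after a crash overwrites rows rather than duplicating them.

```python
from typing import Dict, List, Tuple

Record = Tuple[int, int, str]  # (partition, offset, payload)


class OffsetStore:
    """Hypothetical durable store holding the next offset to read per partition."""

    def __init__(self) -> None:
        self.committed: Dict[int, int] = {}

    def next_offset(self, partition: int) -> int:
        return self.committed.get(partition, 0)

    def commit(self, partition: int, next_offset: int) -> None:
        self.committed[partition] = next_offset


class IdempotentSink:
    """Hypothetical sink keyed by (partition, offset); rewriting a row is harmless."""

    def __init__(self) -> None:
        self.rows: Dict[Tuple[int, int], str] = {}

    def write(self, batch: List[Record]) -> None:
        for partition, offset, payload in batch:
            self.rows[(partition, offset)] = payload


def process_batch(
    log: List[str],
    partition: int,
    offsets: OffsetStore,
    sink: IdempotentSink,
    batch_size: int,
    fail_before_commit: bool = False,
) -> int:
    """Read one batch from the last committed offset, write it, then commit."""
    start = offsets.next_offset(partition)
    payloads = log[start:start + batch_size]
    batch = [(partition, start + i, p) for i, p in enumerate(payloads)]
    if not batch:
        return 0
    sink.write(batch)  # write first: a crash here loses nothing
    if fail_before_commit:
        raise RuntimeError("simulated crash after write, before commit")
    offsets.commit(partition, start + len(batch))  # commit only after the write
    return len(batch)
```

If the process fails between the write and the commit, the next run re-reads the same batch from the last committed offset. Nothing is lost because the offset never advanced past unwritten data, and nothing is duplicated because the sink overwrites rows with the same key.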
Simona Meriam is a big data engineer at Nielsen Marketing Cloud, where she specializes in research and development of solutions for big data infrastructures using cutting-edge technologies such as Spark, Kafka, and Elasticsearch.