NMC (Nielsen Marketing Cloud) provides customers (both marketers and publishers) with real-time analytics tools to profile their target audiences. To achieve that, the company needs to ingest billions of events per day into its big data stores in a scalable, cost-efficient way.
Itai Yaffe explains how NMC continuously transforms its data infrastructure to support these goals. Itai details how the company went from CSV files and standalone Java applications to multiple Kafka and Spark clusters, performing a mixture of streaming and batch ETLs, and supporting 10x data growth. Join in to hear about the company’s experience as an early adopter of Spark Streaming and Spark Structured Streaming and how it overcame the technical barriers it faced (and there were plenty).
Itai concludes by sharing a rather unusual solution: using Kafka to imitate streaming over NMC’s data lake while significantly reducing cloud service costs.
Itai Yaffe is a big data tech lead at Nielsen Identity Engine, where he deals with big data challenges using tools like Spark, Druid, Kafka, and others. He’s also part of the core team of the Israeli chapter of Women in Big Data. Itai is keen on sharing his knowledge and has presented his real-life experience at various forums.
©2019, O’Reilly UK Ltd • (800) 889-8969 or (707) 827-7019 • Monday-Friday 7:30am-5pm PT • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners. • confreg@oreilly.com