MyHeritage collects billions of events every day, including request logs from web servers and backend services, events describing user activities across different platforms, and change data capture logs recording every change made to its database records.
Delivering these events to analytics is a complex task, requiring a robust and scalable data pipeline.
Ofir Sharony shares MyHeritage’s journey to find a reliable and efficient way to achieve real-time analytics and offers an overview of the system the company decided on: shipping events to Apache Kafka and loading them into Google BigQuery for analysis. Along the way, Ofir compares several data loading techniques, helping you make better choices when building your next data pipeline.
For more information, take a look at Ofir’s recent blog post on the subject.
Ofir Sharony is a senior member of MyHeritage’s backend team, where he is currently focused on building pipelines on-premises and in the cloud using batch and streaming technologies. An expert in building data pipelines, Ofir acquired most of his experience planning and developing scalable server-side solutions.
©2017, O'Reilly Media, Inc.