Transitioning a full company to the cloud is hard. Starting a data lake halfway through is even harder. Integrating data sources from both the cloud and legacy data centers has to accommodate the past while keeping the future in mind.
Max Schultze details Zalando’s end-to-end data integration platform, which serves analytical use cases and machine learning throughout the company, covering raw data collection, standardized data preparation (binary conversion, partitioning, etc.), user-driven analytics, and machine learning. Along the way, he shares the technical challenges and lessons learned from building and operating data pipelines and computation as a service for hundreds of teams operating at petabyte scale.
Max Schultze is a data engineer working on building a data lake at Zalando, Europe’s biggest online fashion retailer. His focus lies on building data pipelines at a scale of terabytes per day and productionizing Spark and Presto as analytical platforms inside the company. He graduated from the Humboldt University of Berlin, where he actively took part in the university’s initial development of Apache Flink.
©2019, O’Reilly UK Ltd
Comments
Hi everyone,
the slides are now available :)
It was an amazing presentation. Can you upload or link to the slides? Thanks!
Would it be possible to upload the presentation that you made?