In early 2016, Spotify decided that it didn’t want to be in the data center business. The future was the cloud. Josh Baer and Alison Gilles explain what it took to move Spotify to the cloud, covering Spotify’s technology choices, challenges faced, and the lessons Spotify learned along the way.
Before migrating to Google Cloud Platform (GCP), data processing at Spotify was mainly done in Hadoop with products in the larger Apache Hadoop ecosystem: Hive for ad hoc analysis, MapReduce for daily batch jobs, Storm for real-time processing, and Spark for machine learning. Josh and Alison describe how Spotify’s data processing platform has evolved through the cloud migration to incorporate GCP offerings like BigQuery, Cloud Dataflow, Cloud Pub/Sub, and TensorFlow.
Josh and Alison also explore some of the organizational changes and culture shifts that the cloud migration has brought: training highly skilled engineers, who are used to solving their own problems, to solve problems alongside a provider; leveraging Spotify’s relationship with Google as a vendor; and leading an engineering organization through a transition to focus higher up the stack. They also cover some of the beneficial and not-so-beneficial changes the company has made during the still-in-progress migration.
Josh Baer is a data infrastructure product lead at Spotify, where he is leading the data processing track of Spotify’s migration to Google Cloud Platform. During his time at Spotify, Josh has worked on growing Spotify’s Hadoop footprint from 180 machines to 2,000, enabling everyday real-time processing and providing infrastructure for advanced machine learning tasks.
Alison Gilles is director of engineering for data infrastructure at Spotify, where she coaches and leads teams in backend services and data infrastructure. Previously, she led engineering teams at nonprofit organizations in education and corporate social responsibility.
©2017, O'Reilly Media, Inc. • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.