Did you know that your Uber rides are powered by Apache Spark? Join Felix Cheung to learn how Uber is building its data platform with Apache Spark at enormous scale and discover the unique challenges the company faced and overcame.
Felix starts with how Uber built out its emerging big data platform. He looks at how the data stack has evolved to keep pace with the explosive growth of the last few years and inspects the latest overall architecture, diving into the current internal service and tooling offerings, including a few pipeline-as-a-service implementations. Throughout, he highlights the role Apache Spark plays in the platform.
Felix then walks you through a few intelligent systems built on this data platform, exploring the design choices made to empower distributed machine learning at scale and adapt it to Uber’s distinctive setting. He concludes by analyzing a few unique challenges with reliability, resource utilization, and observability at this volume and shares lessons learned from building a data platform at such enormous scale. Along the way, he details best practices for developing, testing, validating, and deploying to production in a multi-data-center environment and explains how Uber balances business reality with the idealism of free and open source software (FOSS) to overcome the hurdles of engaging the open source community.
Felix Cheung is an engineer at Uber and a PMC member and committer for Apache Spark. Felix started his journey in the big data space about five years ago with the then state-of-the-art MapReduce. Since then, he’s (re-)built Hadoop clusters from metal more times than he would like, created a Hadoop distro from two dozen or so projects, and juggled hundreds to thousands of cores in the cloud or in data centers. He built a few interesting apps with Apache Spark and ended up contributing to the project. In addition to building stuff, he frequently presents at conferences, meetups, and workshops. He was also a teaching assistant for the first set of edX MOOCs on Apache Spark.
View a complete list of Strata Data Conference contacts
©2018, O'Reilly Media, Inc. • (800) 889-8969 or (707) 827-7019 • Monday-Friday 7:30am-5pm PT • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners. • confreg@oreilly.com