With its scalable data store, elastic compute, and pay-as-you-go cost model, cloud infrastructure is well suited to large-scale data engineering workloads, especially batch workloads such as ETL and model training that run on the Hive and Spark compute engines. Kostas Sakellis explains how data engineers can use the cloud to run these workloads successfully. Kostas explores the latest cloud technologies, focusing on data engineering workloads and the cost, security, and ease-of-use implications for data engineers. He also covers the advantages of the managed service deployment model and security best practices, demonstrating how to apply these technologies in your own projects.
Kostas Sakellis is the lead and engineering manager of the Apache Spark team at Cloudera. Kostas holds a bachelor’s degree in computer science from the University of Waterloo, Canada.
Philip Langdale is the engineering lead for cloud at Cloudera. He joined the company as one of the first engineers building Cloudera Manager and served as an engineering lead for that project until moving to work on cloud products. Previously, Philip worked at VMware, developing various desktop virtualization technologies. Philip holds a bachelor’s degree with honors in electrical engineering from the University of Texas at Austin.
©2017, O'Reilly Media, Inc.