Apache Impala has offered fast SQL analytics over big data since its initial beta release in 2012. As Impala deployments grow in popularity and utilization, clusters often become victims of their own success when demand for resources exceeds the supply.
Tim Armstrong dives into the latest resource management features in Impala for maintaining high cluster availability and optimal performance, and provides examples of how to configure them in your Impala deployment. Tim also discusses ongoing work on Impala’s admission control to make workload management simpler, more flexible, and more automatic, including how the setup of admission control has been streamlined and what is being done to make out-of-memory errors a thing of the past.
Tim Armstrong is an engineer at Cloudera, where he works on making Apache Impala faster and more robust via improvements to query execution and resource management. He holds a PhD focused on the intersection of high-performance computing and programming language implementation.
©2019, O'Reilly Media, Inc.