Resource Management with YARN: Past, Present and Future
YARN is the resource management platform introduced in Hadoop 2.0. This talk will cover the background of YARN and its current architecture, and discuss current and long-term improvements. The talk will start by going over the architecture of YARN and how it achieves effective cluster utilization, shares resources fairly, and allows different types of applications to use the cluster. It will also cover work done to support low-latency scheduling, namely Llama.
We will then go over recent improvements in YARN. We will discuss in depth high availability of various components, support for multi-resource scheduling, moving jobs between queues, and preemption warnings. We will also look at features coming down the pipeline: work-preserving restart, support for long-running applications, and generic application history support. We will also touch on long-term features such as container resizing, I/O scheduling, and deadline-based scheduling.
Anubhav is a software engineer working on resource management in Hadoop at Cloudera. Previously, he worked for 10 years at Microsoft, building distributed system components across different platforms, including Bing, Azure, and AppFabric. His most recent project was adding event processing to Cosmos, the big data platform used at Microsoft.