So far, we have covered setting up ZooKeeper and HDFS; now we will see how to set up YARN, which stands for Yet Another Resource Negotiator. YARN (MRv2) introduces new daemons that are responsible for job scheduling/monitoring and resource management. When you submit a MapReduce job, or any other job, to the cluster, the cluster's resources should be shared efficiently. To address the challenges of managing these resources, a generic job management tool was developed that can share resources between "mapreduce"-based distributed processing tools as well as "non mapreduce"-based ones – this tool is YARN.
The following daemon processes are involved as part of YARN. The master process is the Resource Manager, and the slave processes, called Node Managers, run on the data nodes. Since YARN can run generic jobs of any type, every job is considered an application, and the App Timeline Server gives the details of each application.
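If you install the App Timeline Server, it is enabled and located through yarn-site.xml. The snippet below is a minimal sketch; timeline-host.example.com is a placeholder for whichever node you assign the role to.

```xml
<!-- yarn-site.xml: minimal sketch for enabling the Application Timeline Server -->
<!-- "timeline-host.example.com" is a placeholder for your own node -->
<property>
  <name>yarn.timeline-service.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.timeline-service.hostname</name>
  <value>timeline-host.example.com</value>
</property>
```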
- Select YARN (MR v2) to install from the Add a Service menu
- Assign roles, i.e., select the hosts on which each component will be installed
- Review and modify the important configuration for each component
- Complete the installation – run the installer and start each component
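If you are curious what the installer writes on your behalf, the core of a minimal YARN setup looks roughly like the sketch below (rm-host.example.com is a placeholder; the values shown are the stock Hadoop 2 ones).

```xml
<!-- yarn-site.xml: core properties behind a minimal YARN install -->
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>rm-host.example.com</value> <!-- placeholder: the host running the Resource Manager -->
</property>
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value> <!-- lets the MapReduce shuffle run on each Node Manager -->
</property>

<!-- mapred-site.xml: run MapReduce on YARN instead of the classic framework -->
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
```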
How does YARN handle resource allocation?
It uses the Resource Manager, the Node Managers, and a per-job Application Master (unlike the JobTracker and TaskTrackers in MRv1/classic). We need to define several parameters for resource allocation (CPU/cores and memory); these parameters live in the configuration files listed below, and a sketch of typical values follows the list.
The following configuration files are used as part of YARN.
- yarn-site.xml – parameters related to the Resource Manager and Node Managers (node level)
- mapred-site.xml – parameters related to mappers and reducers (task level)
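As a sketch of the allocation parameters mentioned above (the values are illustrative, not recommendations; tune them to your hardware):

```xml
<!-- yarn-site.xml (node level): resources each Node Manager offers for containers -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>8192</value> <!-- illustrative: 8 GB of RAM per node -->
</property>
<property>
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>8</value> <!-- illustrative: 8 virtual cores per node -->
</property>

<!-- mapred-site.xml (task level): memory each mapper/reducer container requests -->
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>1024</value>
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>2048</value>
</property>
```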
We also have the following web user interfaces:
- Resource Manager – http://<RM-node>:8088/cluster
- Job History Server – http://<RM-node>:19888/jobhistory
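Both ports above are the stock defaults. If you need to move them, they are controlled by the following properties (host names are placeholders):

```xml
<!-- yarn-site.xml: Resource Manager web UI address (default port 8088) -->
<property>
  <name>yarn.resourcemanager.webapp.address</name>
  <value>rm-host.example.com:8088</value>
</property>

<!-- mapred-site.xml: Job History Server web UI address (default port 19888) -->
<property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>rm-host.example.com:19888</value>
</property>
```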
MapReduce v2 (MRv2/YARN) – Fault Tolerance
- Task failure (mostly the same as in classic/MRv1)
- Application Master failure
  - If the Application Master fails and no attempts are left, the job fails. The number of attempts is controlled by yarn.resourcemanager.am.max-retries (default 1).
- Node Manager failure
  - If the Resource Manager receives no heartbeats from a Node Manager for 10 minutes (the default), that Node Manager is removed from the pool.
- Resource Manager failure
  - Although the probability of a Resource Manager failure is relatively low, no jobs can be submitted until the RM is brought back up and running.
  - High availability can be configured in YARN, meaning multiple RMs run in the cluster; there is no high availability in MRv1 (only one JobTracker). A configuration sketch follows this list.
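As a configuration sketch of the fault-tolerance knobs above – the rm1/rm2 ids, host names, and cluster id are placeholders, and this is a starting point rather than a complete HA recipe:

```xml
<!-- yarn-site.xml: Application Master attempts and Node Manager liveness -->
<property>
  <name>yarn.resourcemanager.am.max-retries</name>
  <value>2</value> <!-- default is 1; raising it lets a job survive one AM failure -->
</property>
<property>
  <name>yarn.resourcemanager.nm.liveness-monitor.expiry-interval-ms</name>
  <value>600000</value> <!-- default: drop a Node Manager after 10 minutes without heartbeats -->
</property>

<!-- yarn-site.xml: Resource Manager high availability (multiple RMs) -->
<property>
  <name>yarn.resourcemanager.ha.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.cluster-id</name>
  <value>yarn-cluster</value> <!-- placeholder cluster id -->
</property>
<property>
  <name>yarn.resourcemanager.ha.rm-ids</name>
  <value>rm1,rm2</value> <!-- placeholder ids for the two Resource Managers -->
</property>
<property>
  <name>yarn.resourcemanager.hostname.rm1</name>
  <value>rm1-host.example.com</value> <!-- placeholder host -->
</property>
<property>
  <name>yarn.resourcemanager.hostname.rm2</name>
  <value>rm2-host.example.com</value> <!-- placeholder host -->
</property>
<property>
  <name>yarn.resourcemanager.zk-address</name>
  <value>zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181</value>
  <!-- the ZooKeeper ensemble set up earlier in this series, used for RM leader election -->
</property>
```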