Decommission a Yarn node with minimal impact on running applications.
Display running Yarn nodes.
$ yarn node -list
2021-05-26 21:41:02,009 INFO client.RMProxy: Connecting to ResourceManager at resourcemanager.example.org/192.168.8.172:8032
Total Nodes:3
                    Node-Id  Node-State           Node-Http-Address  Number-of-Running-Containers
datanode3.example.org:41963     RUNNING  datanode3.example.org:8042                             0
datanode1.example.org:36073     RUNNING  datanode1.example.org:8042                             0
datanode2.example.org:43967     RUNNING  datanode2.example.org:8042                             0
Create an empty yarn.nodes.exclude file on a resourcemanager node.
$ sudo -u hadoop touch /opt/hadoop/hadoop-3.2.2/etc/hadoop/yarn.nodes.exclude
Define the yarn.resourcemanager.nodes.exclude-path option inside yarn-site.xml on a resourcemanager node.
$ sudo -u hadoop vim /opt/hadoop/hadoop-3.2.2/etc/hadoop/yarn-site.xml
<?xml version="1.0"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<configuration>

  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.nodes.exclude-path</name>
    <value>/opt/hadoop/hadoop-3.2.2/etc/hadoop/yarn.nodes.exclude</value>
  </property>

</configuration>
Restart the ResourceManager service on a resourcemanager node so that it picks up the new option.
$ sudo systemctl restart hadoop-yarn-resourcemanager.service
Append the node to be decommissioned to the exclude file (one host name per line) on a resourcemanager node.
$ echo datanode3.example.org | sudo -u hadoop tee -a /opt/hadoop/hadoop-3.2.2/etc/hadoop/yarn.nodes.exclude
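Because tee -a appends unconditionally, running the command above twice leaves a duplicate entry in the file. A minimal sketch of an idempotent variant follows. The function name add_exclude is hypothetical (it is not part of Hadoop), and the sketch assumes that the invoking user can write to the file, for example when run as the hadoop user.

```shell
#!/bin/sh
# add_exclude HOST FILE
# Append HOST to FILE only if no line in FILE already matches it exactly.
# (add_exclude is an illustrative name, not a Hadoop command.)
add_exclude() {
    host=$1
    file=$2
    # -q: quiet, -x: match the whole line, -F: fixed string (dots are literal)
    if grep -qxF -- "$host" "$file" 2>/dev/null; then
        return 0
    fi
    printf '%s\n' "$host" >> "$file"
}
```

It could be called as add_exclude datanode3.example.org /opt/hadoop/hadoop-3.2.2/etc/hadoop/yarn.nodes.exclude and is safe to repeat.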
Re-read the exclude file on a resourcemanager node.
$ yarn rmadmin -refreshNodes
2021-05-26 21:54:52,613 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8033
Display running Yarn nodes.
$ yarn node -list
2021-05-26 21:56:25,345 INFO client.RMProxy: Connecting to ResourceManager at resourcemanager.example.org/192.168.8.172:8032
Total Nodes:2
                    Node-Id  Node-State           Node-Http-Address  Number-of-Running-Containers
datanode1.example.org:36073     RUNNING  datanode1.example.org:8042                             0
datanode2.example.org:43967     RUNNING  datanode2.example.org:8042                             0
Display all Yarn nodes.
$ yarn node -list -all
2021-05-26 21:56:29,658 INFO client.RMProxy: Connecting to ResourceManager at resourcemanager.example.org/192.168.8.172:8032
Total Nodes:3
                    Node-Id      Node-State           Node-Http-Address  Number-of-Running-Containers
datanode1.example.org:36073         RUNNING  datanode1.example.org:8042                             0
datanode2.example.org:43967         RUNNING  datanode2.example.org:8042                             0
datanode3.example.org:41963  DECOMMISSIONED  datanode3.example.org:8042                             0
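To check the state of a single node from a script, the listing can be filtered by host name. A minimal sketch follows, assuming the column layout shown in the output above (Node-Id first, Node-State second). The helper name node_state is hypothetical, and it reads the listing from standard input.

```shell
#!/bin/sh
# node_state HOST
# Read `yarn node -list -all` output on stdin and print the Node-State
# of the first node whose Node-Id has the form HOST:<port>.
# (node_state is an illustrative name, not a Hadoop command.)
node_state() {
    awk -v host="$1" '
        {
            # Node-Id is host:port; compare the host part literally.
            # Log, "Total Nodes" and header lines do not split into two parts.
            n = split($1, id, ":")
            if (n == 2 && id[1] == host) { print $2; exit }
        }
    '
}
```

For example, yarn node -list -all | node_state datanode3.example.org prints the state of that node, or nothing if the node is not listed.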