How to create a Yarn node whitelist

Create a Yarn node whitelist, so that only the listed nodes can register with the ResourceManager.

Display running Yarn nodes.

$ yarn node -list
2021-06-06 22:40:12,371 INFO client.RMProxy: Connecting to ResourceManager at resourcemanager.example.org/192.168.8.172:8032
Total Nodes:3
         Node-Id             Node-State Node-Http-Address       Number-of-Running-Containers
datanode3.example.org:35637             RUNNING datanode3.example.org:8042                                 0
datanode1.example.org:36305             RUNNING datanode1.example.org:8042                                 0
datanode2.example.org:33917             RUNNING datanode2.example.org:8042                                 0

Create a yarn.nodes.include file on the resourcemanager node, with one allowed hostname per line.

$ cat <<EOF | sudo -u hadoop tee /opt/hadoop/hadoop-3.2.2/etc/hadoop/yarn.nodes.include
datanode1.example.org
datanode2.example.org
EOF

Define the yarn.resourcemanager.nodes.include-path property in yarn-site.xml on the resourcemanager node and point it at the include file.

$ sudo -u hadoop vim /opt/hadoop/hadoop-3.2.2/etc/hadoop/yarn-site.xml 
<?xml version="1.0"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<configuration>

<!-- Site specific YARN configuration properties -->

 <property>
   <name>yarn.nodemanager.aux-services</name>
   <value>mapreduce_shuffle</value>
 </property>
 <property>
   <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
   <value>org.apache.hadoop.mapred.ShuffleHandler</value>
 </property>
 <property>
   <name>yarn.resourcemanager.nodes.include-path</name>
   <value>/opt/hadoop/hadoop-3.2.2/etc/hadoop/yarn.nodes.include</value>
 </property>
</configuration>
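
Before restarting anything, it can help to confirm that the property really points at the intended file. The following is a minimal sketch, not part of Hadoop: get_site_property is a hypothetical helper name, and the awk script assumes that each <name> and <value> element sits on its own line, as in the file above. It is not a general XML parser.

```shell
# Hypothetical helper: print the <value> that follows a given <name>
# in a Hadoop *-site.xml file laid out one element per line.
get_site_property() {
    local file="$1" name="$2"
    awk -v name="$name" '
        # Remember that the requested property name was seen.
        index($0, "<name>" name "</name>") { found = 1; next }
        # Print the text between <value> and </value> (7 and 8 characters long).
        found && match($0, /<value>.*<\/value>/) {
            print substr($0, RSTART + 7, RLENGTH - 15)
            exit
        }
        # Stop looking once the enclosing property block ends.
        /<\/property>/ { found = 0 }
    ' "$file"
}
```

For example, get_site_property /opt/hadoop/hadoop-3.2.2/etc/hadoop/yarn-site.xml yarn.resourcemanager.nodes.include-path should print the include file path defined above, and nothing if the property is missing.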

Restart the ResourceManager service on the resourcemanager node to apply the change.

$ sudo systemctl restart hadoop-yarn-resourcemanager.service

Display running Yarn nodes. datanode3.example.org is not on the whitelist, so it is no longer listed.

$ yarn node -list
2021-06-06 22:49:17,827 INFO client.RMProxy: Connecting to ResourceManager at resourcemanager.example.org/192.168.8.172:8032
Total Nodes:2
         Node-Id             Node-State Node-Http-Address       Number-of-Running-Containers
datanode1.example.org:36305             RUNNING datanode1.example.org:8042                                 0
datanode2.example.org:33917             RUNNING datanode2.example.org:8042                                 0

To whitelist another Yarn node, append its hostname to the include file on the resourcemanager node.

$ echo datanode3.example.org | sudo -u hadoop tee -a /opt/hadoop/hadoop-3.2.2/etc/hadoop/yarn.nodes.include
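
Running the command above twice would leave a duplicate entry in the include file. A minimal sketch of an idempotent variant follows; add_to_include is a hypothetical helper name, and it assumes the calling user can write to the include file (for example, when run as the hadoop user).

```shell
# Hypothetical helper: append a hostname to the include file
# only if that exact hostname is not already listed.
add_to_include() {
    local include_file="$1" host="$2"
    # -x matches the whole line, -F treats the hostname as a literal string.
    if ! grep -qxF -- "$host" "$include_file" 2>/dev/null; then
        printf '%s\n' "$host" >> "$include_file"
    fi
}
```

For example, add_to_include /opt/hadoop/hadoop-3.2.2/etc/hadoop/yarn.nodes.include datanode3.example.org leaves the file unchanged when the host is already present.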

Make the ResourceManager re-read the include file, without restarting the service.

$ yarn rmadmin -refreshNodes
2021-06-06 22:52:19,317 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8033

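Display running Yarn nodes. The newly whitelisted node shows up only after it has registered with the ResourceManager again, so it can still be missing right after the refresh.

$ yarn node -list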
2021-06-06 22:52:49,604 INFO client.RMProxy: Connecting to ResourceManager at resourcemanager.example.org/192.168.8.172:8032
Total Nodes:2
         Node-Id             Node-State Node-Http-Address       Number-of-Running-Containers
datanode1.example.org:36305             RUNNING datanode1.example.org:8042                                 0
datanode2.example.org:33917             RUNNING datanode2.example.org:8042                                 0
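
To see which whitelisted hosts have not registered yet, the include file can be compared with the node listing. This is a minimal sketch under stated assumptions: node_hosts and unregistered_nodes are hypothetical helper names, and the parsing relies on the host:port layout of the Node-Id column shown in the listings here.

```shell
# Hypothetical helper: read "yarn node -list" output on stdin and
# print the hostname part of every Node-Id (host:port) entry.
node_hosts() {
    awk '$1 ~ /^[^:]+:[0-9]+$/ { sub(/:[0-9]+$/, "", $1); print $1 }'
}

# Hypothetical helper: read a node listing on stdin and print the hosts
# that are in the include file but absent from the listing.
unregistered_nodes() {
    local include_file="$1" registered
    registered="$(node_hosts | sort -u)"
    # Skip empty lines, then report every include entry with no exact match.
    sort -u "$include_file" | grep -v '^$' | while IFS= read -r host; do
        printf '%s\n' "$registered" | grep -qxF -- "$host" || printf '%s\n' "$host"
    done
}
```

For example, yarn node -list | unregistered_nodes /opt/hadoop/hadoop-3.2.2/etc/hadoop/yarn.nodes.include prints one hostname per line for every whitelisted node that is not currently running, and nothing when all of them are.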

Display all Yarn nodes, regardless of state. datanode3.example.org is registered again, this time on a new port.

$ yarn node -list -all
2021-06-06 22:53:37,559 INFO client.RMProxy: Connecting to ResourceManager at resourcemanager.example.org/192.168.8.172:8032
Total Nodes:3
         Node-Id             Node-State Node-Http-Address       Number-of-Running-Containers
datanode3.example.org:46125             RUNNING datanode3.example.org:8042                                 0
datanode1.example.org:36305             RUNNING datanode1.example.org:8042                                 0
datanode2.example.org:33917             RUNNING datanode2.example.org:8042                                 0

Simple as that.