Decommission an HDFS datanode with minimal impact on running applications.
Decommissioning takes a datanode out of service gracefully, which is useful when the node will be unavailable for a long-term operation that could take days.
Display HDFS report.
$ hdfs dfsadmin -report
Configured Capacity: 63010750464 (58.68 GB)
Present Capacity: 52174749112 (48.59 GB)
DFS Remaining: 48046993408 (44.75 GB)
DFS Used: 4127755704 (3.84 GB)
DFS Used%: 7.91%
Replicated Blocks:
    Under replicated blocks: 0
    Blocks with corrupt replicas: 0
    Missing blocks: 0
    Missing blocks (with replication factor 1): 0
    Low redundancy blocks with highest priority to recover: 0
    Pending deletion blocks: 0
Erasure Coded Block Groups:
    Low redundancy block groups: 0
    Block groups with corrupt internal blocks: 0
    Missing block groups: 0
    Low redundancy blocks with highest priority to recover: 0
    Pending deletion blocks: 0
-------------------------------------------------
Live datanodes (3):

Name: 192.168.8.173:9866 (datanode1.example.org)
Hostname: datanode1.example.org
Decommission Status : Normal
Configured Capacity: 21003583488 (19.56 GB)
DFS Used: 1375665487 (1.28 GB)
Non DFS Used: 2522518193 (2.35 GB)
DFS Remaining: 16014880768 (14.92 GB)
DFS Used%: 6.55%
DFS Remaining%: 76.25%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Wed May 26 20:37:47 UTC 2021
Last Block Report: Wed May 26 19:48:56 UTC 2021
Num of Blocks: 10859

Name: 192.168.8.174:9866 (datanode2.example.org)
Hostname: datanode2.example.org
Decommission Status : Normal
Configured Capacity: 21003583488 (19.56 GB)
DFS Used: 1382729504 (1.29 GB)
Non DFS Used: 2514303200 (2.34 GB)
DFS Remaining: 16016031744 (14.92 GB)
DFS Used%: 6.58%
DFS Remaining%: 76.25%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Wed May 26 20:37:45 UTC 2021
Last Block Report: Wed May 26 19:48:57 UTC 2021
Num of Blocks: 10859

Name: 192.168.8.175:9866 (datanode3.example.org)
Hostname: datanode3.example.org
Decommission Status : Normal
Configured Capacity: 21003583488 (19.56 GB)
DFS Used: 1369360713 (1.28 GB)
Non DFS Used: 2527622839 (2.35 GB)
DFS Remaining: 16016080896 (14.92 GB)
DFS Used%: 6.52%
DFS Remaining%: 76.25%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Wed May 26 20:37:45 UTC 2021
Last Block Report: Wed May 26 19:48:57 UTC 2021
Num of Blocks: 10859
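Before excluding a node, it may be worth confirming that enough datanodes remain to hold every replica, since decommissioning can stall when the remaining nodes are fewer than the replication factor. The following is a minimal sketch, not part of the original procedure: the function names count_normal_datanodes and can_decommission_one are hypothetical helpers, and the replication factor is passed in by hand (2 in this setup, per dfs.replication).

```shell
# Count datanodes reported with "Decommission Status : Normal".
# Reads `hdfs dfsadmin -report` output from stdin.
count_normal_datanodes() {
  grep -o 'Decommission Status : Normal' | wc -l | tr -d '[:space:]'
}

# Succeed when removing one node still leaves at least as many
# normal datanodes as the replication factor.
# $1 = number of normal datanodes, $2 = replication factor
can_decommission_one() {
  [ $(( $1 - 1 )) -ge "$2" ]
}
```

A possible usage, assuming the replication factor of 2 used here: normal=$(hdfs dfsadmin -report | count_normal_datanodes); can_decommission_one "$normal" 2 && echo "safe to proceed".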
Create a hosts.exclude file on the namenode.
$ sudo -u hadoop touch /opt/hadoop/hadoop-3.2.2/etc/hadoop/hosts.exclude
Define the dfs.hosts.exclude option inside hdfs-site.xml on the namenode.
$ sudo -u hadoop vim /opt/hadoop/hadoop-3.2.2/etc/hadoop/hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/opt/hadoop/local_data/namenode</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>https://secondarynamenode.example.org:9870</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.hosts.exclude</name>
    <value>/opt/hadoop/hadoop-3.2.2/etc/hadoop/hosts.exclude</value>
  </property>
</configuration>
Restart the namenode service so that it picks up the new option.
$ sudo systemctl restart hadoop-namenode.service
Inspect decommissioning datanodes.
$ hdfs dfsadmin -report -decommissioning
Configured Capacity: 63010750464 (58.68 GB)
Present Capacity: 52337483776 (48.74 GB)
DFS Remaining: 48046972928 (44.75 GB)
DFS Used: 4290510848 (4.00 GB)
DFS Used%: 8.20%
Replicated Blocks:
    Under replicated blocks: 0
    Blocks with corrupt replicas: 0
    Missing blocks: 0
    Missing blocks (with replication factor 1): 0
    Low redundancy blocks with highest priority to recover: 0
    Pending deletion blocks: 0
Erasure Coded Block Groups:
    Low redundancy block groups: 0
    Block groups with corrupt internal blocks: 0
    Missing block groups: 0
    Low redundancy blocks with highest priority to recover: 0
    Pending deletion blocks: 0
-------------------------------------------------
Decommissioning datanodes (0):
Append the node to be decommissioned to the hosts.exclude file (one hostname per line) on the namenode.
$ echo datanode3.example.org | sudo -u hadoop tee -a /opt/hadoop/hadoop-3.2.2/etc/hadoop/hosts.exclude
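Because tee -a appends unconditionally, running the command twice leaves a duplicate entry in the file. A minimal sketch of an idempotent variant follows; add_to_exclude is a hypothetical helper name, and the function is assumed to run as a user allowed to write the file (the hadoop user in this setup).

```shell
# Append a hostname to an exclude file only if it is not already listed.
# $1 = hostname, $2 = path to the exclude file
add_to_exclude() {
  # -x matches the whole line, -F treats the hostname as a fixed string
  grep -qxF -- "$1" "$2" 2>/dev/null || printf '%s\n' "$1" >> "$2"
}
```

For example: add_to_exclude datanode3.example.org /opt/hadoop/hadoop-3.2.2/etc/hadoop/hosts.exclude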
Re-read the exclude file on a namenode.
$ hdfs dfsadmin -refreshNodes
Refresh nodes successful
Display HDFS report.
$ hdfs dfsadmin -report
Configured Capacity: 42007166976 (39.12 GB)
Present Capacity: 34891247616 (32.50 GB)
DFS Remaining: 32030896128 (29.83 GB)
DFS Used: 2860351488 (2.66 GB)
DFS Used%: 8.20%
Replicated Blocks:
    Under replicated blocks: 10859
    Blocks with corrupt replicas: 0
    Missing blocks: 0
    Missing blocks (with replication factor 1): 0
    Low redundancy blocks with highest priority to recover: 0
    Pending deletion blocks: 0
Erasure Coded Block Groups:
    Low redundancy block groups: 0
    Block groups with corrupt internal blocks: 0
    Missing block groups: 0
    Low redundancy blocks with highest priority to recover: 0
    Pending deletion blocks: 0
-------------------------------------------------
Live datanodes (3):

Name: 192.168.8.173:9866 (datanode1.example.org)
Hostname: datanode1.example.org
Decommission Status : Normal
Configured Capacity: 21003583488 (19.56 GB)
DFS Used: 1430147072 (1.33 GB)
Non DFS Used: 2468044800 (2.30 GB)
DFS Remaining: 16014872576 (14.92 GB)
DFS Used%: 6.81%
DFS Remaining%: 76.25%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Wed May 26 20:58:29 UTC 2021
Last Block Report: Wed May 26 20:52:57 UTC 2021
Num of Blocks: 10859

Name: 192.168.8.174:9866 (datanode2.example.org)
Hostname: datanode2.example.org
Decommission Status : Normal
Configured Capacity: 21003583488 (19.56 GB)
DFS Used: 1430204416 (1.33 GB)
Non DFS Used: 2466836480 (2.30 GB)
DFS Remaining: 16016023552 (14.92 GB)
DFS Used%: 6.81%
DFS Remaining%: 76.25%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Wed May 26 20:58:29 UTC 2021
Last Block Report: Wed May 26 20:52:57 UTC 2021
Num of Blocks: 10859

Name: 192.168.8.175:9866 (datanode3.example.org)
Hostname: datanode3.example.org
Decommission Status : Decommission in progress
Configured Capacity: 21003583488 (19.56 GB)
DFS Used: 1430159360 (1.33 GB)
Non DFS Used: 2466828288 (2.30 GB)
DFS Remaining: 16016076800 (14.92 GB)
DFS Used%: 6.81%
DFS Remaining%: 76.25%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Wed May 26 20:58:29 UTC 2021
Last Block Report: Wed May 26 20:52:57 UTC 2021
Num of Blocks: 10859

Decommissioning datanodes (1):

Name: 192.168.8.175:9866 (datanode3.example.org)
Hostname: datanode3.example.org
Decommission Status : Decommission in progress
Configured Capacity: 21003583488 (19.56 GB)
DFS Used: 1430159360 (1.33 GB)
Non DFS Used: 2466828288 (2.30 GB)
DFS Remaining: 16016076800 (14.92 GB)
DFS Used%: 6.81%
DFS Remaining%: 76.25%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Wed May 26 20:58:29 UTC 2021
Last Block Report: Wed May 26 20:52:57 UTC 2021
Num of Blocks: 10859
Display HDFS report for currently decommissioning nodes.
$ hdfs dfsadmin -report -decommissioning
Configured Capacity: 42007166976 (39.12 GB)
Present Capacity: 34891247616 (32.50 GB)
DFS Remaining: 32030896128 (29.83 GB)
DFS Used: 2860351488 (2.66 GB)
DFS Used%: 8.20%
Replicated Blocks:
    Under replicated blocks: 0
    Blocks with corrupt replicas: 0
    Missing blocks: 0
    Missing blocks (with replication factor 1): 0
    Low redundancy blocks with highest priority to recover: 0
    Pending deletion blocks: 0
Erasure Coded Block Groups:
    Low redundancy block groups: 0
    Block groups with corrupt internal blocks: 0
    Missing block groups: 0
    Low redundancy blocks with highest priority to recover: 0
    Pending deletion blocks: 0
-------------------------------------------------
Decommissioning datanodes (1):

Name: 192.168.8.175:9866 (datanode3.example.org)
Hostname: datanode3.example.org
Decommission Status : Decommission in progress
Configured Capacity: 21003583488 (19.56 GB)
DFS Used: 1430159360 (1.33 GB)
Non DFS Used: 2466828288 (2.30 GB)
DFS Remaining: 16016076800 (14.92 GB)
DFS Used%: 6.81%
DFS Remaining%: 76.25%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Wed May 26 20:58:11 UTC 2021
Last Block Report: Wed May 26 20:52:57 UTC 2021
Num of Blocks: 10859
Wait until the node's status changes to Decommissioned before taking any further action.
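The waiting step can be scripted by polling the report until the status changes. The sketch below is one possible approach, not part of the original procedure: decommission_status and wait_for_decommission are hypothetical helper names, and the 60-second default interval is an arbitrary assumption.

```shell
# Print the "Decommission Status" value for one hostname.
# Reads `hdfs dfsadmin -report` output from stdin; $1 = hostname.
decommission_status() {
  awk -v host="$1" '
    $1 == "Hostname:" { current = $2 }
    /^[ \t]*Decommission Status :/ && current == host && !seen {
      sub(/^[ \t]*Decommission Status :[ \t]*/, "")
      sub(/[ \t]+$/, "")
      print
      seen = 1   # a node may be listed twice (live and decommissioning)
    }
  '
}

# Poll until the given datanode reports "Decommissioned".
# $1 = hostname, $2 = seconds between polls (assumed default: 60)
wait_for_decommission() {
  until [ "$(hdfs dfsadmin -report | decommission_status "$1")" = "Decommissioned" ]; do
    sleep "${2:-60}"
  done
}
```

For example: wait_for_decommission datanode3.example.org 120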
Display HDFS report.
$ hdfs dfsadmin -report
Configured Capacity: 42007166976 (39.12 GB)
Present Capacity: 34891239424 (32.49 GB)
DFS Remaining: 32030887936 (29.83 GB)
DFS Used: 2860351488 (2.66 GB)
DFS Used%: 8.20%
Replicated Blocks:
    Under replicated blocks: 0
    Blocks with corrupt replicas: 0
    Missing blocks: 0
    Missing blocks (with replication factor 1): 0
    Low redundancy blocks with highest priority to recover: 0
    Pending deletion blocks: 0
Erasure Coded Block Groups:
    Low redundancy block groups: 0
    Block groups with corrupt internal blocks: 0
    Missing block groups: 0
    Low redundancy blocks with highest priority to recover: 0
    Pending deletion blocks: 0
-------------------------------------------------
Live datanodes (3):

Name: 192.168.8.173:9866 (datanode1.example.org)
Hostname: datanode1.example.org
Decommission Status : Normal
Configured Capacity: 21003583488 (19.56 GB)
DFS Used: 1430147072 (1.33 GB)
Non DFS Used: 2468048896 (2.30 GB)
DFS Remaining: 16014868480 (14.92 GB)
DFS Used%: 6.81%
DFS Remaining%: 76.25%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Wed May 26 21:18:34 UTC 2021
Last Block Report: Wed May 26 21:15:58 UTC 2021
Num of Blocks: 10859

Name: 192.168.8.174:9866 (datanode2.example.org)
Hostname: datanode2.example.org
Decommission Status : Normal
Configured Capacity: 21003583488 (19.56 GB)
DFS Used: 1430204416 (1.33 GB)
Non DFS Used: 2466840576 (2.30 GB)
DFS Remaining: 16016019456 (14.92 GB)
DFS Used%: 6.81%
DFS Remaining%: 76.25%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Wed May 26 21:18:34 UTC 2021
Last Block Report: Wed May 26 21:15:58 UTC 2021
Num of Blocks: 10859

Name: 192.168.8.175:9866 (datanode3.example.org)
Hostname: datanode3.example.org
Decommission Status : Decommissioned
Configured Capacity: 21003583488 (19.56 GB)
DFS Used: 1430159360 (1.33 GB)
Non DFS Used: 2466832384 (2.30 GB)
DFS Remaining: 16016072704 (14.92 GB)
DFS Used%: 6.81%
DFS Remaining%: 76.25%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Wed May 26 21:18:34 UTC 2021
Last Block Report: Wed May 26 21:15:58 UTC 2021
Num of Blocks: 10859
Display HDFS report for currently decommissioning nodes.
$ hdfs dfsadmin -report -decommissioning
Configured Capacity: 42007166976 (39.12 GB)
Present Capacity: 34891239424 (32.49 GB)
DFS Remaining: 32030887936 (29.83 GB)
DFS Used: 2860351488 (2.66 GB)
DFS Used%: 8.20%
Replicated Blocks:
    Under replicated blocks: 0
    Blocks with corrupt replicas: 0
    Missing blocks: 0
    Missing blocks (with replication factor 1): 0
    Low redundancy blocks with highest priority to recover: 0
    Pending deletion blocks: 0
Erasure Coded Block Groups:
    Low redundancy block groups: 0
    Block groups with corrupt internal blocks: 0
    Missing block groups: 0
    Low redundancy blocks with highest priority to recover: 0
    Pending deletion blocks: 0
-------------------------------------------------
Decommissioning datanodes (0):
In this specific case, all datanodes are included by default because the dfs.hosts option is not used.