How to decommission an HDFS datanode

Decommission an HDFS datanode with minimal impact on running applications.

Decommissioning lets the namenode re-replicate the node's blocks to the remaining datanodes before the node is taken out of service, which is useful for long-term operations on the node that could take days.

Display the HDFS report to check the current state of the cluster.

$ hdfs dfsadmin -report
Configured Capacity: 63010750464 (58.68 GB)
Present Capacity: 52174749112 (48.59 GB)
DFS Remaining: 48046993408 (44.75 GB)
DFS Used: 4127755704 (3.84 GB)
DFS Used%: 7.91%
Replicated Blocks:
        Under replicated blocks: 0
        Blocks with corrupt replicas: 0
        Missing blocks: 0
        Missing blocks (with replication factor 1): 0
        Low redundancy blocks with highest priority to recover: 0
        Pending deletion blocks: 0
Erasure Coded Block Groups: 
        Low redundancy block groups: 0
        Block groups with corrupt internal blocks: 0
        Missing block groups: 0
        Low redundancy blocks with highest priority to recover: 0
        Pending deletion blocks: 0

-------------------------------------------------
Live datanodes (3):

Name: 192.168.8.173:9866 (datanode1.example.org)
Hostname: datanode1.example.org
Decommission Status : Normal
Configured Capacity: 21003583488 (19.56 GB)
DFS Used: 1375665487 (1.28 GB)
Non DFS Used: 2522518193 (2.35 GB)
DFS Remaining: 16014880768 (14.92 GB)
DFS Used%: 6.55%
DFS Remaining%: 76.25%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Wed May 26 20:37:47 UTC 2021
Last Block Report: Wed May 26 19:48:56 UTC 2021
Num of Blocks: 10859


Name: 192.168.8.174:9866 (datanode2.example.org)
Hostname: datanode2.example.org
Decommission Status : Normal
Configured Capacity: 21003583488 (19.56 GB)
DFS Used: 1382729504 (1.29 GB)
Non DFS Used: 2514303200 (2.34 GB)
DFS Remaining: 16016031744 (14.92 GB)
DFS Used%: 6.58%
DFS Remaining%: 76.25%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Wed May 26 20:37:45 UTC 2021
Last Block Report: Wed May 26 19:48:57 UTC 2021
Num of Blocks: 10859


Name: 192.168.8.175:9866 (datanode3.example.org)
Hostname: datanode3.example.org
Decommission Status : Normal
Configured Capacity: 21003583488 (19.56 GB)
DFS Used: 1369360713 (1.28 GB)
Non DFS Used: 2527622839 (2.35 GB)
DFS Remaining: 16016080896 (14.92 GB)
DFS Used%: 6.52%
DFS Remaining%: 76.25%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Wed May 26 20:37:45 UTC 2021
Last Block Report: Wed May 26 19:48:57 UTC 2021
Num of Blocks: 10859
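
The full report is long, so a short filter can help when only the decommission state matters. The following is a minimal sketch, assuming the report layout shown above; the function name datanode_status is hypothetical and not part of Hadoop.

```shell
# Print "<hostname> <decommission status>" for every datanode found in
# `hdfs dfsadmin -report` output read from standard input.
# Assumes the "Hostname: ..." and "Decommission Status : ..." line layout
# shown in the report above.
datanode_status() {
    awk -F': ' '
        /^Hostname: /             { host = $2 }
        /^Decommission Status : / { print host, $2 }
    '
}
```

Usage: hdfs dfsadmin -report | datanode_status. A node that the report lists in more than one section is printed once per section.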

Create an empty hosts.exclude file on the namenode.

$ sudo -u hadoop touch /opt/hadoop/hadoop-3.2.2/etc/hadoop/hosts.exclude

Define the dfs.hosts.exclude property in hdfs-site.xml on the namenode and point it at that file.

$ sudo -u hadoop vim /opt/hadoop/hadoop-3.2.2/etc/hadoop/hdfs-site.xml 
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
        <property>
                <name>dfs.name.dir</name>
                <value>/opt/hadoop/local_data/namenode</value>
        </property>
        <property>
                <name>dfs.namenode.secondary.http-address</name>
                <value>https://secondarynamenode.example.org:9870</value>
        </property>

        <property>
                <name>dfs.replication</name>
                <value>2</value>
        </property>
        <property>
                <name>dfs.hosts.exclude</name>
                <value>/opt/hadoop/hadoop-3.2.2/etc/hadoop/hosts.exclude</value>
        </property>
</configuration>
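
Before restarting anything, it may be worth confirming that the property resolves to a file that actually exists. The sketch below relies on hdfs getconf -confKey, which prints the value of a configuration key as seen on the host where it runs; the function name check_exclude_file is hypothetical.

```shell
# Check that the file named by dfs.hosts.exclude exists and is readable.
# Run it on the namenode, as a user that can read the Hadoop configuration.
check_exclude_file() {
    local path
    # `hdfs getconf -confKey` prints the configured value of a single key.
    path="$(hdfs getconf -confKey dfs.hosts.exclude)" || return 1
    if [ -n "$path" ] && [ -r "$path" ]; then
        echo "exclude file OK: $path"
    else
        echo "exclude file missing or unreadable: $path" >&2
        return 1
    fi
}
```

Usage: check_exclude_file. A non-zero exit status means the property is unset or the file cannot be read.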

Restart the namenode service so that it picks up the new property.

$ sudo systemctl restart hadoop-namenode.service

Inspect decommissioning datanodes. None are listed yet.

$ hdfs dfsadmin -report -decommissioning
Configured Capacity: 63010750464 (58.68 GB)
Present Capacity: 52337483776 (48.74 GB)
DFS Remaining: 48046972928 (44.75 GB)
DFS Used: 4290510848 (4.00 GB)
DFS Used%: 8.20%
Replicated Blocks:
        Under replicated blocks: 0
        Blocks with corrupt replicas: 0
        Missing blocks: 0
        Missing blocks (with replication factor 1): 0
        Low redundancy blocks with highest priority to recover: 0
        Pending deletion blocks: 0
Erasure Coded Block Groups: 
        Low redundancy block groups: 0
        Block groups with corrupt internal blocks: 0
        Missing block groups: 0
        Low redundancy blocks with highest priority to recover: 0
        Pending deletion blocks: 0

-------------------------------------------------
Decommissioning datanodes (0):
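
For scripting, the number in that last line is usually all that is needed. A minimal sketch, assuming the "Decommissioning datanodes (N):" line format shown above; the function name decommissioning_count is hypothetical.

```shell
# Extract N from the "Decommissioning datanodes (N):" line of a report
# read from standard input. Prints nothing if the line is absent.
decommissioning_count() {
    sed -n 's/^Decommissioning datanodes (\([0-9][0-9]*\)):.*/\1/p'
}
```

Usage: hdfs dfsadmin -report -decommissioning | decommissioning_count.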

On the namenode, append the node that needs to be decommissioned to the hosts.exclude file (one hostname per line).

$ echo datanode3.example.org | sudo -u hadoop tee -a /opt/hadoop/hadoop-3.2.2/etc/hadoop/hosts.exclude
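
Running the append twice leaves a duplicate line in the file. A guarded variant could look like the sketch below; the function name add_to_exclude is hypothetical, and unlike the tee command above it writes as the calling user, so it has to run as a user that can write the file (hadoop in this setup).

```shell
# Append a hostname to an exclude file unless that exact line is already
# present. Arguments: <hostname> <path to exclude file>.
add_to_exclude() {
    local host="$1" file="$2"
    # -x matches the whole line, -F treats the hostname as a fixed string.
    if grep -qxF -- "$host" "$file" 2>/dev/null; then
        echo "$host already listed in $file"
    else
        echo "$host" >> "$file"
        echo "$host added to $file"
    fi
}
```

Usage: add_to_exclude datanode3.example.org /opt/hadoop/hadoop-3.2.2/etc/hadoop/hosts.exclude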

Make the namenode re-read the exclude file.

$ hdfs dfsadmin -refreshNodes
Refresh nodes successful

Display the HDFS report. The node is now marked as "Decommission in progress".

$ hdfs dfsadmin -report
Configured Capacity: 42007166976 (39.12 GB)
Present Capacity: 34891247616 (32.50 GB)
DFS Remaining: 32030896128 (29.83 GB)
DFS Used: 2860351488 (2.66 GB)
DFS Used%: 8.20%
Replicated Blocks:
        Under replicated blocks: 10859
        Blocks with corrupt replicas: 0
        Missing blocks: 0
        Missing blocks (with replication factor 1): 0
        Low redundancy blocks with highest priority to recover: 0
        Pending deletion blocks: 0
Erasure Coded Block Groups: 
        Low redundancy block groups: 0
        Block groups with corrupt internal blocks: 0
        Missing block groups: 0
        Low redundancy blocks with highest priority to recover: 0
        Pending deletion blocks: 0

-------------------------------------------------
Live datanodes (3):

Name: 192.168.8.173:9866 (datanode1.example.org)
Hostname: datanode1.example.org
Decommission Status : Normal
Configured Capacity: 21003583488 (19.56 GB)
DFS Used: 1430147072 (1.33 GB)
Non DFS Used: 2468044800 (2.30 GB)
DFS Remaining: 16014872576 (14.92 GB)
DFS Used%: 6.81%
DFS Remaining%: 76.25%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Wed May 26 20:58:29 UTC 2021
Last Block Report: Wed May 26 20:52:57 UTC 2021
Num of Blocks: 10859


Name: 192.168.8.174:9866 (datanode2.example.org)
Hostname: datanode2.example.org
Decommission Status : Normal
Configured Capacity: 21003583488 (19.56 GB)
DFS Used: 1430204416 (1.33 GB)
Non DFS Used: 2466836480 (2.30 GB)
DFS Remaining: 16016023552 (14.92 GB)
DFS Used%: 6.81%
DFS Remaining%: 76.25%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Wed May 26 20:58:29 UTC 2021
Last Block Report: Wed May 26 20:52:57 UTC 2021
Num of Blocks: 10859


Name: 192.168.8.175:9866 (datanode3.example.org)
Hostname: datanode3.example.org
Decommission Status : Decommission in progress
Configured Capacity: 21003583488 (19.56 GB)
DFS Used: 1430159360 (1.33 GB)
Non DFS Used: 2466828288 (2.30 GB)
DFS Remaining: 16016076800 (14.92 GB)
DFS Used%: 6.81%
DFS Remaining%: 76.25%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Wed May 26 20:58:29 UTC 2021
Last Block Report: Wed May 26 20:52:57 UTC 2021
Num of Blocks: 10859


Decommissioning datanodes (1):

Name: 192.168.8.175:9866 (datanode3.example.org)
Hostname: datanode3.example.org
Decommission Status : Decommission in progress
Configured Capacity: 21003583488 (19.56 GB)
DFS Used: 1430159360 (1.33 GB)
Non DFS Used: 2466828288 (2.30 GB)
DFS Remaining: 16016076800 (14.92 GB)
DFS Used%: 6.81%
DFS Remaining%: 76.25%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Wed May 26 20:58:29 UTC 2021
Last Block Report: Wed May 26 20:52:57 UTC 2021
Num of Blocks: 10859
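
In the report above, the under-replicated block count equals the block count of the decommissioning node. To track that number on its own, something like the following sketch could be used, assuming the "Under replicated blocks:" line format shown above; the function name under_replicated_blocks is hypothetical.

```shell
# Print the value of the first "Under replicated blocks:" line found in
# `hdfs dfsadmin -report` output read from standard input.
under_replicated_blocks() {
    awk -F': ' '
        /^[[:space:]]*Under replicated blocks: / && !seen { print $2; seen = 1 }
    '
}
```

Usage: hdfs dfsadmin -report | under_replicated_blocks.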

Display the HDFS report for currently decommissioning nodes.

$ hdfs dfsadmin -report -decommissioning
Configured Capacity: 42007166976 (39.12 GB)
Present Capacity: 34891247616 (32.50 GB)
DFS Remaining: 32030896128 (29.83 GB)
DFS Used: 2860351488 (2.66 GB)
DFS Used%: 8.20%
Replicated Blocks:
        Under replicated blocks: 0
        Blocks with corrupt replicas: 0
        Missing blocks: 0
        Missing blocks (with replication factor 1): 0
        Low redundancy blocks with highest priority to recover: 0
        Pending deletion blocks: 0
Erasure Coded Block Groups: 
        Low redundancy block groups: 0
        Block groups with corrupt internal blocks: 0
        Missing block groups: 0
        Low redundancy blocks with highest priority to recover: 0
        Pending deletion blocks: 0

-------------------------------------------------
Decommissioning datanodes (1):

Name: 192.168.8.175:9866 (datanode3.example.org)
Hostname: datanode3.example.org
Decommission Status : Decommission in progress
Configured Capacity: 21003583488 (19.56 GB)
DFS Used: 1430159360 (1.33 GB)
Non DFS Used: 2466828288 (2.30 GB)
DFS Remaining: 16016076800 (14.92 GB)
DFS Used%: 6.81%
DFS Remaining%: 76.25%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Wed May 26 20:58:11 UTC 2021
Last Block Report: Wed May 26 20:52:57 UTC 2021
Num of Blocks: 10859

Wait until the node is decommissioned before taking any further action.
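
Since this can take a long time, the waiting can be left to a small polling loop. This is a minimal sketch, assuming the report layout shown in this article; the function name wait_for_decommission and the 60-second default interval are illustrative choices, not Hadoop defaults.

```shell
# Poll `hdfs dfsadmin -report` until the given host reports the status
# "Decommissioned". Arguments: <hostname> [poll interval in seconds].
# Returns 1 if the host does not appear in the report at all.
wait_for_decommission() {
    local host="$1" interval="${2:-60}" status
    while :; do
        # Take the first status reported for this host; the report lists a
        # decommissioning node in more than one section.
        status="$(hdfs dfsadmin -report | awk -F': ' -v h="$host" '
            /^Hostname: /             { cur = $2 }
            /^Decommission Status : / { if (cur == h && !seen) { print $2; seen = 1 } }
        ')"
        if [ -z "$status" ]; then
            echo "$host not found in report" >&2
            return 1
        fi
        echo "$(date -u +%H:%M:%S) $host: $status"
        [ "$status" = "Decommissioned" ] && return 0
        sleep "$interval"
    done
}
```

Usage: wait_for_decommission datanode3.example.org 300. The loop has no timeout, so it keeps polling for as long as the node stays in "Decommission in progress".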

Display the HDFS report. The node is now marked as "Decommissioned".

$ hdfs dfsadmin -report
Configured Capacity: 42007166976 (39.12 GB)
Present Capacity: 34891239424 (32.49 GB)
DFS Remaining: 32030887936 (29.83 GB)
DFS Used: 2860351488 (2.66 GB)
DFS Used%: 8.20%
Replicated Blocks:
        Under replicated blocks: 0
        Blocks with corrupt replicas: 0
        Missing blocks: 0
        Missing blocks (with replication factor 1): 0
        Low redundancy blocks with highest priority to recover: 0
        Pending deletion blocks: 0
Erasure Coded Block Groups: 
        Low redundancy block groups: 0
        Block groups with corrupt internal blocks: 0
        Missing block groups: 0
        Low redundancy blocks with highest priority to recover: 0
        Pending deletion blocks: 0

-------------------------------------------------
Live datanodes (3):

Name: 192.168.8.173:9866 (datanode1.example.org)
Hostname: datanode1.example.org
Decommission Status : Normal
Configured Capacity: 21003583488 (19.56 GB)
DFS Used: 1430147072 (1.33 GB)
Non DFS Used: 2468048896 (2.30 GB)
DFS Remaining: 16014868480 (14.92 GB)
DFS Used%: 6.81%
DFS Remaining%: 76.25%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Wed May 26 21:18:34 UTC 2021
Last Block Report: Wed May 26 21:15:58 UTC 2021
Num of Blocks: 10859


Name: 192.168.8.174:9866 (datanode2.example.org)
Hostname: datanode2.example.org
Decommission Status : Normal
Configured Capacity: 21003583488 (19.56 GB)
DFS Used: 1430204416 (1.33 GB)
Non DFS Used: 2466840576 (2.30 GB)
DFS Remaining: 16016019456 (14.92 GB)
DFS Used%: 6.81%
DFS Remaining%: 76.25%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Wed May 26 21:18:34 UTC 2021
Last Block Report: Wed May 26 21:15:58 UTC 2021
Num of Blocks: 10859


Name: 192.168.8.175:9866 (datanode3.example.org)
Hostname: datanode3.example.org
Decommission Status : Decommissioned
Configured Capacity: 21003583488 (19.56 GB)
DFS Used: 1430159360 (1.33 GB)
Non DFS Used: 2466832384 (2.30 GB)
DFS Remaining: 16016072704 (14.92 GB)
DFS Used%: 6.81%
DFS Remaining%: 76.25%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Wed May 26 21:18:34 UTC 2021
Last Block Report: Wed May 26 21:15:58 UTC 2021
Num of Blocks: 10859

Display the HDFS report for currently decommissioning nodes. The list is empty again.

$ hdfs dfsadmin -report -decommissioning
Configured Capacity: 42007166976 (39.12 GB)
Present Capacity: 34891239424 (32.49 GB)
DFS Remaining: 32030887936 (29.83 GB)
DFS Used: 2860351488 (2.66 GB)
DFS Used%: 8.20%
Replicated Blocks:
        Under replicated blocks: 0
        Blocks with corrupt replicas: 0
        Missing blocks: 0
        Missing blocks (with replication factor 1): 0
        Low redundancy blocks with highest priority to recover: 0
        Pending deletion blocks: 0
Erasure Coded Block Groups: 
        Low redundancy block groups: 0
        Block groups with corrupt internal blocks: 0
        Missing block groups: 0
        Low redundancy blocks with highest priority to recover: 0
        Pending deletion blocks: 0

-------------------------------------------------
Decommissioning datanodes (0):

In this specific case, all datanodes are included by default because the dfs.hosts option is not used.