Reindex directories located in the Ext4 filesystem after removing a huge number of files to optimize directory sizes.
I will use an ext4 filesystem located on the /dev/sdb1 device in this example to illustrate what happens when you delete a huge number of files.
$ lsblk -o NAME,FSTYPE,UUID,MOUNTPOINT /dev/sdb1
NAME FSTYPE UUID                                 MOUNTPOINT
sdb1 ext4   68eebaf9-0d11-45ed-a75c-6fdc44e4c8e7 /srv
Create huge_directory directory for sample data.
$ sudo mkdir /srv/huge_directory
Inspect directory size.
$ ls -lh /srv/
total 20K
drwxr-xr-x 2 root root 4.0K Nov 10 02:21 huge_directory
drwx------ 2 root root  16K Nov 10 02:41 lost+found
Create a huge number (50000) of files.
$ for i in $(seq 1 50000); do sudo touch /srv/huge_directory/empty_${i}.file; done
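The loop above starts sudo and touch once per file, which can take a while. A possible alternative is sketched below, assuming GNU seq and xargs are available; create_empty_files is a hypothetical helper name used only for this illustration, and it expects to be called from a shell that is already allowed to write to the target directory.

```shell
#!/bin/sh
# Hypothetical helper: create COUNT empty files named empty_<n>.file in DIR.
# Assumes GNU seq (-f) and that DIR contains no '%' character, because the
# path is used as part of the seq format string.
create_empty_files() {
    dir=$1
    count=$2
    # seq prints one full path per line; xargs passes them to touch in batches.
    seq -f "${dir%/}/empty_%g.file" 1 "$count" | xargs touch
}
```

For the example in this article it would be called from a root shell (for instance after sudo -s) as: create_empty_files /srv/huge_directory 50000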
Inspect directory size.
$ ls -lh /srv/
total 1.8M
drwxr-xr-x 2 root root 1.8M Nov 10 02:49 huge_directory
drwx------ 2 root root  16K Nov 10 02:41 lost+found
Delete recently created files.
$ sudo find /srv/huge_directory/ -maxdepth 1 -type f -name "empty_*.file" -delete
Inspect huge_directory directory to confirm that these files have been deleted.
$ ls -lh /srv/huge_directory/
total 0
Inspect directory size.
$ ls -lh /srv/
total 1.8M
drwxr-xr-x 2 root root 1.8M Nov 10 02:51 huge_directory
drwx------ 2 root root  16K Nov 10 02:41 lost+found
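The directory is empty, but it still occupies 1.8M. To look for other directories in the same state, something like the following sketch may help. It assumes GNU find (for -printf); find_bloated_dirs is a hypothetical helper name and the 1024 KiB default threshold is an arbitrary choice for illustration.

```shell
#!/bin/sh
# Hypothetical helper: list directories whose own size (the directory file
# itself, not its contents) is above a threshold, with their entry count.
# Assumes GNU find; paths containing newlines are not handled.
find_bloated_dirs() {
    root=$1
    min_kib=${2:-1024}   # threshold in KiB, arbitrary default
    # -xdev stays on one filesystem; -size tests the directory's own size.
    find "$root" -xdev -type d -size +"${min_kib}"k -printf '%s\t%p\n' 2>/dev/null |
    while IFS="$(printf '\t')" read -r size dir; do
        # Count the entries that are still present in the directory.
        entries=$(find "$dir" -mindepth 1 -maxdepth 1 2>/dev/null | wc -l)
        printf '%s bytes, %s entries: %s\n' "$size" "$entries" "$dir"
    done
}
```

For this example it could be run as: sudo sh -c '. ./script.sh; find_bloated_dirs /srv' (script.sh being whatever file holds the function). A large size combined with a small entry count suggests a candidate for fsck -D.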
Unmount Ext4 filesystem.
$ sudo umount /srv
Dry-run fsck to get summary information.
Force fsck (-f) to perform this operation, as the Ext4 filesystem is clean.
$ sudo fsck.ext4 -nvf /dev/sdb1
e2fsck 1.44.5 (15-Dec-2018)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
          12 inodes used (0.02%, out of 65536)
           0 non-contiguous files (0.0%)
           0 non-contiguous directories (0.0%)
             # of inodes with ind/dind/tind blocks: 0/0/0
             Extent depth histogram: 4
        9302 blocks used (3.55%, out of 261883)
           0 bad blocks
           1 large file
           0 regular files
           3 directories
           0 character device files
           0 block device files
           0 fifos
           0 links
           0 symbolic links (0 fast symbolic links)
           0 sockets
------------
           3 files
Use fsck with -D parameter to optimize directories in the Ext4 filesystem.
Force fsck (-f) to perform this operation, as the Ext4 filesystem is clean.
$ sudo fsck.ext4 -Dvf /dev/sdb1
e2fsck 1.44.5 (15-Dec-2018)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 3A: Optimizing directories
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/sdb1: ***** FILE SYSTEM WAS MODIFIED *****
          12 inodes used (0.02%, out of 65536)
           0 non-contiguous files (0.0%)
           0 non-contiguous directories (0.0%)
             # of inodes with ind/dind/tind blocks: 0/0/0
             Extent depth histogram: 4
        8860 blocks used (3.38%, out of 261883)
           0 bad blocks
           1 large file
           0 regular files
           3 directories
           0 character device files
           0 block device files
           0 fifos
           0 links
           0 symbolic links (0 fast symbolic links)
           0 sockets
------------
           3 files
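The two summaries differ in the "blocks used" line. As a rough check, the difference can be converted to bytes. The block size of 4096 bytes is an assumption here (it matches the 4.0K size of an empty directory above); the actual value is reported by sudo tune2fs -l /dev/sdb1.

```shell
#!/bin/sh
# Values copied from the two fsck summaries in this article.
blocks_before=9302   # "blocks used" in the dry run
blocks_after=8860    # "blocks used" after fsck -D
block_size=4096      # assumption: 4 KiB blocks, verify with tune2fs -l

freed_blocks=$((blocks_before - blocks_after))
freed_bytes=$((freed_blocks * block_size))
echo "freed ${freed_blocks} blocks = ${freed_bytes} bytes"
# prints: freed 442 blocks = 1810432 bytes
```

Under that assumption the result is about 1.7 MiB, which is in line with huge_directory shrinking from 1.8M to 4.0K.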
Mount Ext4 filesystem.
$ sudo mount /srv
Inspect directory size.
$ ls -lh /srv/
total 20K
drwxr-xr-x 2 root root 4.0K Nov 10 02:51 huge_directory
drwx------ 2 root root  16K Nov 10 02:41 lost+found
Additional notes
Excerpt from the (unofficial) Linux Kernel Mailing List archive.
Date: Fri, 15 May 2009 06:58:15 -0400
From: Theodore Tso <>
Subject: Re: ext3/ext4 directories don't shrink after deleting lots of files

On Thu, May 14, 2009 at 08:45:38PM -0400, Timo Sirainen wrote:
>
> I was rather thinking something that I could run while the system was
> fully operational. Otherwise just moving the files to a temp directory +
> rmdir() + rename() would have been fine too.
>
> I just tested that xfs, jfs and reiserfs all shrink the directories
> immediately. Is it more difficult to implement for ext* or has no one
> else found this to be a problem?

It's probably fairest to say no one has thought it worth the effort. It would require some fancy games to swap out block locations in the extent trees (life would be easier with non-extent-using inodes), and in the case of htree, we would have to keep track of the index block so we could remove it from the htree index. So it's all doable, if a bit tricky in terms of the technical details; it's just that the people who could do it have been busy enough with other things.

It's hasn't been considered high priority because most of the time directories don't go from holding thousands of files down to a small handful.

- Ted
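The quoted message mentions an alternative that does not require unmounting: move the files to a temporary directory, rmdir() the old one, and rename(). A minimal sketch of that idea follows, assuming GNU coreutils (mv -t, chown/chmod --reference); recreate_dir is a hypothetical helper name. The swap is not atomic: processes that keep the old directory open, or that create files in it during the move, are not handled, and rmdir fails if the old directory is not empty at that point.

```shell
#!/bin/sh
# Hypothetical helper: replace DIR with a freshly created directory that
# holds the same entries, so the directory file itself starts out small.
# Assumes GNU coreutils. Not atomic; see the caveats in the text.
recreate_dir() {
    old=${1%/}
    new="${old}.new.$$"   # sibling path, so it stays on the same filesystem
    mkdir "$new" || return 1
    # Copy owner, group, and mode from the old directory.
    chown --reference="$old" "$new" && chmod --reference="$old" "$new" || return 1
    # Move every remaining entry, including dotfiles, into the new directory.
    find "$old" -mindepth 1 -maxdepth 1 -exec mv -t "$new" -- {} + || return 1
    rmdir "$old" || return 1
    mv "$new" "$old"
}
```

Unlike fsck -D, this touches only one directory and keeps the filesystem mounted, at the cost of a short window in which the original path does not exist.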