Yesterday, I left a long-running copy process running between two external USB hard drives. I did not want to assume anything, so I wrote a simple shell script to verify that the whole process went smoothly.

The idea

The idea is to recursively compute file checksums in the source directory.

$ cd /source/directory
$ find . -type f -exec sha256sum '{}' \; > /tmp/checksums.file
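The `\;` form starts one sha256sum process per file. A minimal sketch of a batched alternative: with `'{}' +`, find passes many file names to each sha256sum invocation, which is usually faster on trees with many small files. The throwaway directory and file names below are invented for the demonstration.

```shell
# Build a small throwaway tree (hypothetical names, for illustration only).
src=$(mktemp -d)
mkdir -p "$src/sub dir"
printf 'alpha\n' > "$src/a.txt"
printf 'beta\n'  > "$src/sub dir/b.txt"

# The checksum list lives outside the tree, so find does not pick it up.
sums=$(mktemp)

# '{}' + batches file names, so sha256sum is started far fewer times
# than with '{}' \; while the output format stays the same.
(cd "$src" && find . -type f -exec sha256sum '{}' + > "$sums")

cat "$sums"
```

The resulting list can be fed to `sha256sum -c` exactly like the one produced by the per-file form.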

Copy the required data.

$ cp -r /source/directory /destination/directory

Verify file checksums in the destination directory.

$ cd /destination/directory
$ cat /tmp/checksums.file | sha256sum -c
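To see the three steps catch a bad copy, here is a small self-contained sketch on throwaway directories. The paths come from mktemp and the file names are invented; one copied file is deliberately overwritten to simulate corruption.

```shell
src=$(mktemp -d)
dst_parent=$(mktemp -d)
dst="$dst_parent/copy"          # must not exist yet, so cp creates it
sums=$(mktemp)

printf 'first\n'  > "$src/one.txt"
printf 'second\n' > "$src/two.txt"

# 1. compute checksums relative to the source directory
(cd "$src" && find . -type f -exec sha256sum '{}' \; > "$sums")

# 2. copy the data
cp -r "$src" "$dst"

# simulate a damaged file in the destination
printf 'garbage\n' > "$dst/two.txt"

# 3. verify from inside the destination; a non-zero status means
#    at least one file did not match its recorded checksum
verify_status=0
(cd "$dst" && sha256sum -c "$sums" > /dev/null 2>&1) || verify_status=$?

echo "sha256sum -c exit status: $verify_status"
```

Because the checksum list stores relative paths, the verification has to run from the top of the copied tree, which is why both steps change directory first.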

Shell script

The following shell script can be divided into four distinct sections: compute checksums, copy data, verify checksums, and send a mail message.

#!/bin/sh
# simple "copy [directories] and verify [files]" shell script
# Sample usage: copy.sh /from_directory /to_directory

# used commands
find_command=$(which find)
shasum_command=$(which sha256sum)
cat_command=$(which cat)
unlink_command=$(which unlink)

# copy command with additional arguments
copy_command=$(which cp)
copy_arguments="-rp" # recursive mode
                     # preserve mode, ownership, timestamps

# mail command and recipient address
mail_command=$(which mail)
mail_subject_argument="-s"
mail_address="milosz"

if [ -d "$1" ] && [ ! -d "$2" ]; then
  # first  directory          exists
  # second directory does not exist

  # compute 256-bit checksums
  shasum_log=$(mktemp)
  # quote the paths so directories containing spaces work
  (cd "$1" && $find_command . -type f -exec $shasum_command '{}' \; > "$shasum_log")

  # copy data
  copy_log=$(mktemp)
  # cp reports problems on stderr, so capture it as well
  $copy_command $copy_arguments "$1" "$2" > "$copy_log" 2>&1

  # verify computed checksums
  verify_log=$(mktemp)
  (cd "$2" && $cat_command "$shasum_log" | $shasum_command -c > "$verify_log")
  shasum_exit_code="$?"

  # prepare message and send mail message
  mail_file=$(mktemp)
  # the -s option already sets the Subject header, so no "Subject:" prefix is needed
  if [ "$shasum_exit_code" -eq "0" ]; then
    mail_subject="${0}: Success"
  else
    mail_subject="${0}: Error"
  fi
  echo                                     >  $mail_file
  echo "Command-line: ${0} ${1} ${2}"      >> $mail_file

  if [ -s "$copy_log" ]; then
    echo                                   >> $mail_file
    echo "Copy process"                    >> $mail_file
    $cat_command $copy_log                 >> $mail_file
  fi

  if [ "$shasum_exit_code" -ne "0" ]; then
    echo                                   >> $mail_file
    echo "Verify process"                  >> $mail_file
    $cat_command $verify_log | grep -v OK$ >> $mail_file
  fi

  $mail_command $mail_subject_argument "${mail_subject}" $mail_address < $mail_file

  # cleanup temporary files
  $unlink_command $mail_file
  $unlink_command $verify_log
  $unlink_command $copy_log
  $unlink_command $shasum_log
else
  # printf handles \n consistently, while echo does not in every shell
  printf 'Problem with parameters\nCommand-line: %s %s %s\n' "${0}" "${1}" "${2}" | $mail_command $mail_subject_argument "${0}" $mail_address
  exit 5
fi

I have omitted the check for the mktemp exit code, but it can easily be added with code similar to the following.

# compute 256-bit checksums
shasum_log=$(mktemp)
if [ "$?" -ne "0" ]; then
  echo "Cannot create temporary file" | $mail_command $mail_subject_argument "${0}" $mail_address
  exit 1
fi
(cd "$1" && $find_command . -type f -exec $shasum_command '{}' \; > "$shasum_log")
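Since the script calls mktemp four times, the check could also be wrapped in a small helper. The sketch below is one possible shape, not the original script's code: `make_temp` is a hypothetical function name, and the error goes to stderr instead of being mailed so that the example runs anywhere.

```shell
# make_temp: hypothetical helper; prints the new file name on success,
# or reports the problem on stderr and returns a non-zero status.
make_temp() {
  tmp_file=$(mktemp) || {
    echo "Cannot create temporary file" >&2
    return 1
  }
  printf '%s\n' "$tmp_file"
}

# Each call either yields a usable file name or stops the script.
shasum_log=$(make_temp) || exit 1
copy_log=$(make_temp)   || exit 1

echo "logs: $shasum_log $copy_log"
```

In the real script the `echo ... >&2` line would be replaced by the mail command shown above.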

The whole idea is quite simple, and you are not limited to sha256sum: any available file integrity checker can be used.
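For instance, a tool that follows the same `-c` checklist convention can be swapped in through a single variable. A minimal sketch, assuming sha512sum from GNU coreutils is installed; `checksum_tool` is a name introduced here and is not part of the original script.

```shell
checksum_tool=sha512sum        # assumption: md5sum or sha1sum would work the same way

src=$(mktemp -d)
dst_parent=$(mktemp -d)
sums=$(mktemp)
printf 'payload\n' > "$src/data.bin"

# write the checksum list with the chosen tool
(cd "$src" && find . -type f -exec "$checksum_tool" '{}' \; > "$sums")

cp -rp "$src" "$dst_parent/copy"

# the same tool that wrote the list must read it back
verify_status=0
(cd "$dst_parent/copy" && "$checksum_tool" -c "$sums" > /dev/null) || verify_status=$?

echo "$checksum_tool -c exit status: $verify_status"
```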

Update – 18.02.2015

Thanks to Fernand Ka I have updated the shell script to grep through the right file.

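Please note that locale settings play an important role here, as I am using the grep command to exclude lines ending with the string OK. In a non-English locale, sha256sum may print a translated status word instead, and the pattern would no longer match.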
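One way to make the filtering independent of the locale, sketched here under the assumption that sha256sum comes from GNU coreutils: either force the C locale for the verification so the status words stay in English, or use the `--quiet` option, which suppresses the OK lines so that no grep is needed. The file names are invented for the demonstration.

```shell
dir=$(mktemp -d)
sums=$(mktemp)
printf 'good\n' > "$dir/good.txt"
printf 'bad\n'  > "$dir/bad.txt"
(cd "$dir" && sha256sum good.txt bad.txt > "$sums")

# damage one file so there is something to report
printf 'changed\n' > "$dir/bad.txt"

# Option 1: pin the locale so "OK" is never translated, then filter as before
failed_c=$(cd "$dir" && LC_ALL=C sha256sum -c "$sums" 2>/dev/null | grep -v 'OK$')

# Option 2: --quiet prints only the files that did not verify.
# sha256sum exits non-zero on a mismatch, hence the "|| true".
failed_quiet=$(cd "$dir" && sha256sum -c --quiet "$sums" 2>/dev/null) || true

printf '%s\n' "$failed_c" "$failed_quiet"
```

With the second option, the report lines themselves may still be translated, but nothing depends on matching them.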