These are easy mistakes to make, because the bash builtin read command returns a non-zero exit status when it reaches end-of-file and uses the Internal Field Separator (IFS) to split the line into words. It is a little tricky, but it is worth knowing how to deal with such problems.
Skip to the third (while loop) or fourth step (until loop) to see the final solution.
I will use the following input sample to show you each issue separately.
$ printf "first\tline\n\nthird line \n  fourth line"
first	line

third line 
  fourth line$ _
Notice, the first line contains a tab character in the middle. The second line is empty (single newline character), the third line contains a trailing white space character, the fourth line contains two leading white space characters and does not contain the newline character.
It contains 4 lines and 37 characters.
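The character count can be double-checked with wc. The following is a minimal sketch, assuming the sample has two spaces before "fourth" as described above; the sample function name is only an illustration. Note that wc -l counts newline characters, so it does not count the unterminated last line.

```shell
#!/bin/bash
# The sample input used throughout this article (two spaces before "fourth").
sample() {
    printf "first\tline\n\nthird line \n  fourth line"
}

# wc -c counts bytes; tr strips the padding that some wc implementations add.
n_chars=$(sample | wc -c | tr -d ' ')
# wc -l counts newline characters, so the unterminated last line is not included.
n_newlines=$(sample | wc -l | tr -d ' ')

echo "Characters: $n_chars"
echo "Newline characters: $n_newlines"
```

This should report 37 characters and 3 newline characters, which is why the input has 4 lines even though wc -l reports one fewer.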
The first step – look at the while loop
Create the most straightforward shell script to count lines/characters.
#!/bin/bash

# set initial number of lines
n_lines=0
# set initial number of characters
n_chars=0

while read -r -u 0 line; do
  n_lines=$(expr $n_lines \+ 1)
  l_chars=$(printf "%s\n" "$line" | wc -c)
  n_chars=$(expr $n_chars \+ $l_chars)
  printf "Parsed line '%s'\n" "$line"
done

echo "Lines: $n_lines"
echo "Characters: $n_chars"
Check it out.
$ printf "first\tline\n\nthird line \n  fourth line" | bash ./read_inside_while_loop_1.sh
Parsed line 'first	line'
Parsed line ''
Parsed line 'third line'
Lines: 3
Characters: 23
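There are two problems. The last line is missing because it does not end with a newline character: read hits end-of-file, returns a non-zero exit status, and the loop body never runs for that line. The leading/trailing white space characters also disappeared, because read trims them using the Internal Field Separator.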
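The first problem can be seen in isolation. This is a small sketch, not part of the scripts in this article, that feeds read a single unterminated line and prints both the exit status and the value that was stored anyway.

```shell
#!/bin/bash
# Feed read a single line with two leading spaces and no trailing newline.
result=$(printf "  no newline" | {
    read -r line
    status=$?
    # read failed because it reached end-of-file before a newline,
    # yet the text it managed to read is still stored in the variable.
    printf "status=%s value='%s'" "$status" "$line"
})

echo "$result"
```

The status is non-zero, but the variable still holds the text (with the leading spaces trimmed by the default Internal Field Separator), so the data is not lost and can be handled after the loop ends.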
The second step – fix the last line without the newline character
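Update the shell script to parse the last line. When the last line does not end with a newline character, read still stores it in the variable before returning the error exit code, so it can be processed right after the loop.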
#!/bin/bash

# set initial number of lines
n_lines=0
# set initial number of characters
n_chars=0

while read -r -u 0 line; do
  n_lines=$(expr $n_lines \+ 1)
  l_chars=$(printf "%s\n" "$line" | wc -c)
  n_chars=$(expr $n_chars \+ $l_chars)
  printf "Parsed line '%s'\n" "$line"
done

if [ -n "$line" ]; then
  n_lines=$(expr $n_lines \+ 1)
  l_chars=$(printf "%s" "$line" | wc -c)
  n_chars=$(expr $n_chars \+ $l_chars)
  printf "Parsed line '%s'\n" "$line"
fi

echo "Lines: $n_lines"
echo "Characters: $n_chars"
Check it out.
$ printf "first\tline\n\nthird line \n  fourth line" | bash ./read_inside_while_loop_2.sh
Parsed line 'first	line'
Parsed line ''
Parsed line 'third line'
Parsed line 'fourth line'
Lines: 4
Characters: 34
It looks better, one problem solved, but leading/trailing white spaces are still missing.
The third step – fix the missing leading/trailing white spaces
Update the shell script to preserve leading/trailing white space characters. All you need to do now is set the Internal Field Separator to an empty string, as it contains the characters that read uses to split the line into words and to trim both ends of the line.
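The effect of the Internal Field Separator on read can be checked on its own first. This is a minimal sketch that assumes the shell starts with the default IFS (space, tab, newline); the variable names are only for illustration.

```shell
#!/bin/bash
# With the default IFS, read strips leading and trailing white space.
default_ifs=$(printf "  padded  \n" | { read -r line; printf "'%s'" "$line"; })

# With an empty IFS for this one read call, the line is kept as it was read.
empty_ifs=$(printf "  padded  \n" | { IFS= read -r line; printf "'%s'" "$line"; })

echo "Default IFS: $default_ifs"
echo "Empty IFS:   $empty_ifs"
```

The first value comes back without the surrounding spaces, the second one keeps them.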
#!/bin/bash

# set initial number of lines
n_lines=0
# set initial number of characters
n_chars=0

OLDIFS="$IFS"
IFS=""
while read -r -u 0 line; do
  n_lines=$(expr $n_lines \+ 1)
  l_chars=$(printf "%s\n" "$line" | wc -c)
  n_chars=$(expr $n_chars \+ $l_chars)
  printf "Parsed line '%s'\n" "$line"
done

if [ -n "$line" ]; then
  n_lines=$(expr $n_lines \+ 1)
  l_chars=$(printf "%s" "$line" | wc -c)
  n_chars=$(expr $n_chars \+ $l_chars)
  printf "Parsed line '%s'\n" "$line"
fi
IFS="$OLDIFS"

echo "Lines: $n_lines"
echo "Characters: $n_chars"
Check it out.
$ printf "first\tline\n\nthird line \n  fourth line" | bash ./read_inside_while_loop_3.sh
Parsed line 'first	line'
Parsed line ''
Parsed line 'third line '
Parsed line '  fourth line'
Lines: 4
Characters: 37
Success. Shell script using the while loop displays correct results.
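Both fixes are often combined into a single loop condition. The following is a sketch of that common idiom rather than a replacement for the script above: parse_input is a hypothetical helper name, and it only counts lines, because the combined condition no longer tells the loop body whether the line ended with a newline character.

```shell
#!/bin/bash
# Hypothetical helper: prints every line from standard input and counts them.
parse_input() {
    local line n_lines=0
    # IFS= applies only to this read call, so nothing has to be restored later.
    # The || test runs the body once more for an unterminated last line.
    while IFS= read -r line || [ -n "$line" ]; do
        n_lines=$((n_lines + 1))
        printf "Parsed line '%s'\n" "$line"
    done
    echo "Lines: $n_lines"
}

output=$(printf "first\tline\n\nthird line \n  fourth line" | parse_input)
echo "$output"
```

Prefixing the assignment to read keeps the change to the Internal Field Separator local to that one command, which avoids the OLDIFS bookkeeping.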
The fourth step – an alternative solution
If you are looking for something slightly different, then use the until loop. I will skip the adjustment process as it is very similar to the previous one.
#!/bin/bash

# set initial number of lines
n_lines=0
# set initial number of characters
n_chars=0

OLDIFS="$IFS"
IFS=""
file_read=false
until $file_read; do
  read -r -u 0 line || file_read=true
  if [ "$file_read" == false ]; then
    n_lines=$(expr $n_lines \+ 1)
    l_chars=$(printf "%s\n" "$line" | wc -c)
    n_chars=$(expr $n_chars \+ $l_chars)
    printf "Parsed line '%s'\n" "$line"
  elif [ "$file_read" == true ] && [ -n "$line" ]; then
    n_lines=$(expr $n_lines \+ 1)
    l_chars=$(printf "%s" "$line" | wc -c)
    n_chars=$(expr $n_chars \+ $l_chars)
    printf "Parsed line '%s'\n" "$line"
  fi
done
IFS="$OLDIFS"

echo "Lines: $n_lines"
echo "Characters: $n_chars"
Check it out.
$ printf "first\tline\n\nthird line \n  fourth line" | bash ./read_inside_while_loop_3.sh
Parsed line 'first	line'
Parsed line ''
Parsed line 'third line '
Parsed line '  fourth line'
Lines: 4
Characters: 37
Success. Shell script using the until loop displays correct results.
Want to know more? Read the bash manual page and look for the IFS special variable and the read builtin command.