The article is dangerously wrong in its discussion of IFS.
What you should do to avoid mishandling spaces is use proper quoting (for i in "$@"; do ...), not change IFS; setting IFS to \n\t will still break on embedded tabs and newlines.
In general, in bash scripts any use of $ should be inside double quotes unless you have a specific reason to do otherwise.
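A minimal sketch of the difference (the filenames here are made up for illustration):

```shell
# Simulate positional parameters that contain spaces.
set -- "one file.txt" "two file.txt"

# Unquoted $@: word splitting breaks each parameter apart -> 4 words
split_count=0
for i in $@; do split_count=$((split_count+1)); done

# Quoted "$@": each parameter stays intact -> 2 words
quoted_count=0
for i in "$@"; do quoted_count=$((quoted_count+1)); done

echo "$split_count $quoted_count"   # 4 2
```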
Seconded. It is quite off the mark. This will break code which depends on splitting, like when you have some variable called FOO_FLAG which contains "--blah arg" that's supposed to expand to two arguments. Observing proper quoting is the way (except for internal data representations that you can guarantee not to contain spaces).
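A sketch of the kind of code that deliberately relies on splitting (FOO_FLAG is the hypothetical variable from the comment above):

```shell
FOO_FLAG="--blah arg"

# Unquoted: word splitting turns the value into two arguments,
# which is what this style of code intends.
set -- $FOO_FLAG
split_count=$#

# Quoted: one single argument "--blah arg", which would break the caller.
set -- "$FOO_FLAG"
quoted_count=$#

echo "$split_count $quoted_count"   # 2 1
```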
And also, the newline and tab part is not explained! What's with that?
"We don't want accidental field splitting of interpolated expansions on spaces, ... but we do want it on embedded tabs or newlines?"
Huh?
If you don't want field splitting, set IFS to empty! (And then you don't need the dollar sign Bash extension for \t and \n):
$ VAR="a b c d"
$ for x in $VAR ; do echo $x ; done
a
b
c
d
$ IFS='' ; for x in $VAR ; do echo $x ; done
a b c d
No splitting on anything: not spaces, tabs or newlines!
Agreed. In addition to still having trouble with tabs and newlines, setting IFS still leaves the other big problem with unquoted variables: unexpected expansion of wildcards. The shell considers any unquoted string that contains *, ?, or [ to be a glob expression, and will replace it with a list of matching files. This can cause some really strange bugs.
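A quick sketch of that failure mode, run in a throwaway directory (mktemp and the filenames are just for the demo) so the glob matches are predictable:

```shell
# Accidental glob expansion of an unquoted variable.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch aa.txt ab.txt

pattern="a?.txt"            # meant as plain data, not a glob

unquoted=$(echo $pattern)   # unquoted: the shell expands it against the cwd
quoted=$(echo "$pattern")   # quoted: stays literal

echo "$unquoted"   # aa.txt ab.txt
echo "$quoted"     # a?.txt
```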
Also, an unquoted variable that happens to be null will essentially vanish from the argument list of any command it's used with, which can cause another class of weird bugs. Consider the shell statement:
if [ -n $var ]; then
... which looks like it should execute the condition if $var is nonblank, but in fact will execute it even if $var is blank (the reason is complex, I'll leave it as a puzzle for the reader).
Setting IFS is a crutch that only partly solves the problem; putting double-quotes around variable references fully solves it.
The test command has certain rules depending on the number of arguments.
The most pertinent rule is:
For one argument, the expression is true if, and only if, the argument is not null.
In this case
[ -n $var ]
is the same as
test -n $var
$var is not quoted, so when $var is empty, the expansion is removed entirely during word splitting, leaving test with just one argument, -n. That falls under the one-argument rule above.
> the reason is complex, I'll leave it as a puzzle for the reader
[ -n ]
is the same as
test -n
In this case -n has no argument, so it cannot be parsed as "-n STRING", instead it is parsed as "STRING", where STRING is "-n", with the behaviour "True if string is not empty.".
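A small demonstration of the one-argument rule in action:

```shell
var=""

# Unquoted: the empty $var vanishes, so test sees the single argument "-n",
# which is a non-empty string -> true.
if [ -n $var ]; then unquoted_result=true; else unquoted_result=false; fi

# Quoted: test sees two arguments, -n and the empty string -> false.
if [ -n "$var" ]; then quoted_result=true; else quoted_result=false; fi

echo "$unquoted_result $quoted_result"   # true false
```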
A for loop over a glob pattern (for filename in *.txt; do ...) has no issue with spaces in filenames. If *.txt matches "foo bar.txt", then that's exactly what the filename variable is set to. In the body of the loop you just have to make sure you write "$filename".
You don't need to play games with IFS to correctly process filesystem entry names expanded from a pattern.
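A sketch of that, again in a throwaway directory with made-up filenames:

```shell
# Glob expansion keeps each matched name whole, spaces and all;
# only the loop body needs quotes.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch "foo bar.txt" "baz.txt"

found=0
for filename in *.txt; do
    # "$filename" is quoted, so "foo bar.txt" is tested as one name
    [ -f "$filename" ] && found=$((found+1))
done
echo "$found"   # 2
```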
Wildcards are not variables. Wildcards don't get expanded in quotes. Variables get expanded in double quotes but not single quotes. $@ obeys all the same expansion rules as all other variables. Command substitution with both $() and `` follow the same rules as variables.
Quoted "$*" separates the parameters using the first character stored in IFS: with a space if IFS is unset, or with nothing if IFS is null. Usually the first character of IFS is a space, giving the impression that "$*" means "separate with spaces"; i.e. that it's just an ordinary quote job around $* (i.e. that it is "regular", in your words).
> $@ obeys all the same rules as all other variables.
That's hardly the case. Most other variables do not represent the positional parameters, and they don't have the logic of "$@", which effectively produces "$1" "$2" "$3" ...
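A sketch of both behaviors side by side, using a comma as the (non-default) first IFS character:

```shell
set -- "a b" "c"

star_default="$*"   # joined with a space (default IFS): "a b c"

IFS=,
star_comma="$*"     # joined with the first char of IFS: "a b,c"

# "$@" still yields exactly one word per parameter, regardless of IFS.
count=0
for word in "$@"; do count=$((count+1)); done

unset IFS
echo "$star_default / $star_comma / $count"   # a b c / a b,c / 2
```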
How about some naming convention? If a variable contains a word with no spaces, call it $foo_w or something. Shell linting programs can be patched to recognize that and suppress their warnings.
Heck the shell language itself should have a declaration for this!
IMAGINARY FEATURE:
typeset -w foo # foo is not expected to contain spaces
(Or more generally, expansions of foo are not expected to undergo field splitting by IFS regardless of content.)
Now, if you have an unquoted $foo that undergoes field splitting, bash produces an error if that splitting actually breaks the contents of foo into two or more pieces.
Furthermore, a way could be provided to declare that a variable requires splitting, maybe "typeset -W". This could even assert how many pieces: "typeset -W3 foo" would mean that expansions of foo are expected to undergo splitting, and into exactly three fields.
Then there could be a global diagnostic option (similar to set -u and set -e) which diagnoses all unquoted expansions of variables, except for the -W and -w ones. The -w ones are diagnosed if they are subject to splitting and splitting actually occurs. The -W ones are diagnosed if they are quoted, or if they are unquoted and splitting doesn't produce the required number of pieces, when specified.