The article is dangerously wrong in its discussion of IFS.
What you should do to avoid mishandling spaces is use proper quoting (for i in "$@"; do ...), not change IFS; setting IFS to \n\t will still break on embedded tabs and newlines.
In general, in bash scripts any use of $ should be inside double quotes unless you have a specific reason to do otherwise.
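A minimal sketch of the difference (the filenames here are made up for illustration):

```shell
# Simulate positional parameters that contain spaces.
set -- "one file.txt" "two file.txt"

# Unquoted $@: word splitting breaks each parameter apart -> 4 words
split_count=0
for i in $@; do split_count=$((split_count+1)); done

# Quoted "$@": each parameter stays intact -> 2 words
quoted_count=0
for i in "$@"; do quoted_count=$((quoted_count+1)); done

echo "$split_count $quoted_count"   # 4 2
```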
Seconded. It is quite off the mark. This will break code which depends on splitting, like when you have some variable called FOO_FLAG which contains "--blah arg" that's supposed to expand to two arguments. Observing proper quoting is the way (except for internal data representations that you can guarantee not to contain spaces).
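A sketch of the kind of code that deliberately relies on splitting (FOO_FLAG is the hypothetical variable from the comment above):

```shell
FOO_FLAG="--blah arg"

# Unquoted: word splitting turns the value into two arguments,
# which is what this style of code intends.
set -- $FOO_FLAG
split_count=$#

# Quoted: one single argument "--blah arg", which would break the caller.
set -- "$FOO_FLAG"
quoted_count=$#

echo "$split_count $quoted_count"   # 2 1
```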
And also, the newline and tab part is not explained! What's with that?
"We don't want accidental field splitting of interpolated expansions on spaces, ... but we do want it on embedded tabs or newlines?"
Huh?
If you don't want field splitting, set IFS to empty! (And then you don't need the dollar sign Bash extension for \t and \n):
$ VAR="a b c d"
$ for x in $VAR ; do echo $x ; done
a
b
c
d
$ IFS='' ; for x in $VAR ; do echo $x ; done
a b c d
No splitting on anything: not spaces, tabs or newlines!
Agreed. In addition to still having trouble with tabs and newlines, setting IFS still leaves the other big problem with unquoted variables: unexpected expansion of wildcards. The shell considers any unquoted string that contains *, ?, or [ to be a glob expression, and will replace it with a list of matching files. This can cause some really strange bugs.
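A quick sketch of that failure mode, run in a throwaway directory (mktemp and the filenames are just for the demo) so the glob matches are predictable:

```shell
# Accidental glob expansion of an unquoted variable.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch aa.txt ab.txt

pattern="a?.txt"            # meant as plain data, not a glob

unquoted=$(echo $pattern)   # unquoted: the shell expands it against the cwd
quoted=$(echo "$pattern")   # quoted: stays literal

echo "$unquoted"   # aa.txt ab.txt
echo "$quoted"     # a?.txt
```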
Also, an unquoted variable that happens to be null will essentially vanish from the argument list of any command it's used with, which can cause another class of weird bugs. Consider the shell statement:
if [ -n $var ]; then
... which looks like it should execute the condition if $var is nonblank, but in fact will execute it even if $var is blank (the reason is complex, I'll leave it as a puzzle for the reader).
Setting IFS is a crutch that only partly solves the problem; putting double-quotes around variable references fully solves it.
The test command has certain rules depending on the number of arguments.
The most pertinent rule is:
For one argument, the expression is true if, and only if, the argument is not null.
In this case
[ -n $var ]
is the same as
test -n $var
$var is not quoted, so when $var is empty, the expansion is removed entirely during word splitting, leaving test with just one argument, -n. That falls under the one-argument rule above.
> the reason is complex, I'll leave it as a puzzle for the reader
[ -n ]
is the same as
test -n
In this case -n has no argument, so it cannot be parsed as "-n STRING", instead it is parsed as "STRING", where STRING is "-n", with the behaviour "True if string is not empty.".
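A small demonstration of the one-argument rule in action:

```shell
var=""

# Unquoted: the empty $var vanishes, so test sees the single argument "-n",
# which is a non-empty string -> true.
if [ -n $var ]; then unquoted_result=true; else unquoted_result=false; fi

# Quoted: test sees two arguments, -n and the empty string -> false.
if [ -n "$var" ]; then quoted_result=true; else quoted_result=false; fi

echo "$unquoted_result $quoted_result"   # true false
```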
A for loop over a glob pattern (for filename in *.txt; do ...) has no issue with spaces in filenames. If *.txt matches "foo bar.txt", then that's exactly what the filename variable is set to. In the body of the loop you just have to make sure you write "$filename".
You don't need to play games with IFS to correctly process filesystem entry names expanded from a pattern.
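A sketch of that, again in a throwaway directory with made-up filenames:

```shell
# Glob expansion keeps each matched name whole, spaces and all;
# only the loop body needs quotes.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch "foo bar.txt" "baz.txt"

found=0
for filename in *.txt; do
    # "$filename" is quoted, so "foo bar.txt" is tested as one name
    [ -f "$filename" ] && found=$((found+1))
done
echo "$found"   # 2
```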
Wildcards are not variables. Wildcards don't get expanded in quotes. Variables get expanded in double quotes but not single quotes. $@ obeys all the same expansion rules as all other variables. Command substitution with both $() and `` follow the same rules as variables.
Quoted "$*" separates the parameters using the first character stored in IFS: with a space if IFS is unset, or with nothing if IFS is null. Usually the first character of IFS is a space, giving the impression that "$*" means "separate with spaces"; i.e. that it's just an ordinary quote job around $* (i.e. that it is "regular", in your words).
> $@ obeys all the same rules as all other variables.
That's hardly the case. Most other variables do not represent the positional parameters, and they don't have the logic of "$@", which effectively produces "$1" "$2" "$3" ...
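A sketch of both behaviors side by side, using a comma as the (non-default) first IFS character:

```shell
set -- "a b" "c"

star_default="$*"   # joined with a space (default IFS): "a b c"

IFS=,
star_comma="$*"     # joined with the first char of IFS: "a b,c"

# "$@" still yields exactly one word per parameter, regardless of IFS.
count=0
for word in "$@"; do count=$((count+1)); done

unset IFS
echo "$star_default / $star_comma / $count"   # a b c / a b,c / 2
```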
How about some naming convention? If a variable contains a word with no spaces, call it $foo_w or something. Shell linting programs can be patched to recognize that and suppress their warnings.
Heck the shell language itself should have a declaration for this!
IMAGINARY FEATURE:
typeset -w foo # foo is not expected to contain spaces
(Or more generally, expansions of foo are not expected to undergo field splitting by IFS regardless of content.)
Now, if you have an unquoted $foo that undergoes field splitting, bash produces an error if that splitting actually breaks the contents of foo into two or more pieces.
Furthermore, a way could be provided to declare that a variable requires splitting, maybe "typeset -W". This could even assert how many pieces: "typeset -W3 foo" would mean that expansions of foo are expected to undergo splitting, and into exactly three fields.
Then there could be a global diagnostic option (similar to set -u and set -e) which diagnoses all unquoted expansions of variables, except for the -W and -w ones. The -w ones are diagnosed if they are subject to splitting and splitting actually occurs. The -W ones are diagnosed if they are quoted, or if they are unquoted and splitting doesn't produce the required number of pieces, when specified.