Bash: How to tokenize a string variable?

Solution 1:

Assign the unquoted variable to an array; the shell splits it on whitespace:

$ string="john is 17 years old"
$ tokens=( $string )
$ echo "${tokens[*]}"

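For other delimiters, such as ';', set IFS around the assignment and restore it afterwards (a bare IFS=';' prefix on an assignment stays in effect for the rest of the session):

$ string="john;is;17;years;old"
$ oldIFS=$IFS; IFS=';'; tokens=( $string ); IFS=$oldIFS
$ echo "${tokens[*]}"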
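
One caveat with an unquoted expansion such as tokens=( $string ): the shell also performs pathname expansion on each token, so a token like * would be replaced by matching filenames. Here is a minimal sketch of an alternative that uses Bash's read -a, which limits the IFS change to a single command and does no pathname expansion. The sample string is made up for illustration.

```bash
string="john;*;17"

# IFS is set only for this one read command, so the global IFS is untouched.
# -r keeps backslashes literal; -a stores the fields in the array "tokens".
IFS=';' read -r -a tokens <<< "$string"

echo "${#tokens[@]}"            # number of tokens
printf '%s\n' "${tokens[@]}"    # one token per line; the * stays a literal *
```

Note that read stops at the first newline, so this form suits single-line strings.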

Solution 2:

Use the shell's automatic tokenization of unquoted variables:

$ string="john is 17 years old"
$ for word in $string; do echo "$word"; done
john
is
17
years
old

If you want to change the delimiter, set the IFS variable (short for "internal field separator"). Its default value is " \t\n" (space, tab, newline).
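
To see the default for yourself, something like the following should work in Bash. It is only a sketch; the sample string is made up to show that tabs, newlines, and repeated spaces all act as separators under the default IFS.

```bash
# Print the current IFS in an escaped, readable form.
printf '%q\n' "$IFS"

# Tabs and newlines separate words just like spaces do,
# and runs of whitespace count as a single separator.
string=$'john\tis\n17 years  old'
count=0
for word in $string; do
    count=$((count + 1))
done
echo "$count"
```

With an unmodified IFS, the loop counts five words.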

$ string="john_is_17_years_old"
$ (IFS='_'; for word in $string; do echo "$word"; done)
john
is
17
years
old

(Note that in this second example I added parentheses around the second line. This creates a sub-shell so that the change to $IFS doesn't persist. You generally don't want to permanently change $IFS as it can wreak havoc on unsuspecting shell commands.)
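
A sub-shell also discards any variables set inside it, so if you need to keep the tokens, one option is a function with a local IFS. This is a sketch only: split_words and the words array are hypothetical names chosen for illustration.

```bash
# split_words DELIM STRING
# Splits STRING on DELIM and stores the pieces in the global array "words".
split_words() {
    local IFS="$1"    # local: the caller's IFS is restored on return
    words=( $2 )      # unquoted on purpose so the shell splits on IFS
}

split_words '_' "john_is_17_years_old"
printf '%s\n' "${words[@]}"
```

The array survives the call, while IFS outside the function keeps its previous value. Pathname expansion still applies to the unquoted $2, so this assumes the tokens contain no glob characters.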

Solution 3:

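Use set -- to load the tokens into the positional parameters (this overwrites any existing ones):

$ string="john is 17 years old"
$ set -- $string
$ echo "$1"
john
$ echo "$2"
is
$ echo "$3"
17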
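
Once the tokens are in the positional parameters, $# gives their count and "$@" lets you loop over them. A short sketch along those lines, where count is just an illustrative variable name:

```bash
string="john is 17 years old"

# "--" ends option parsing, so a first token starting with "-" is still data.
set -- $string

count=$#                 # number of tokens
echo "$count"
for word in "$@"; do     # "$@" expands to each token as a separate word
    echo "$word"
done
```

Inside a function, set -- only replaces that function's own parameters, which is a convenient way to leave the script's arguments alone.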