How would you separate fields with multiple spaces and store them in an array?
In my file mytxt:
field1                    field2
------                    -------
this are numbers          12345
this letters              abc def ghi
Let's say I want to store the first field in an array:
i=0
while read line; do
    field_one[$i]=$(echo $line | awk '{print $1}')
    echo ${field_one[i]}
    ((i++))
done < mytxt
That prints the word this twice in the output.
Any ideas on how I could store them in an array and get this output:
this are numbers
this letters
I have tried changing delimiters, squeezing spaces, and using sed, but I'm stuck. Any hint would be appreciated.
My final goal is to store both fields in an array.
Solution 1:
Using colrm to remove columns from file.
#!/bin/bash
shopt -s extglob
a=()
while read; do
    a+=("${REPLY%%*( )}")
done < <(colrm 26 < text.txt)
printf %s\\n "${a[@]:2:3}"
(Bash builtin version):
#!/bin/bash
shopt -s extglob
a=()
while read; do
    b="${REPLY::26}"; a+=("${b%%*( )}")
done < text.txt
printf %s\\n "${a[@]:2:3}"
Solution 2:
Moving my comment here (based on this source). To show just one column of a table whose columns are separated by multiple spaces:
awk -F '  +' '{print $2}' mytxt.txt  # Or with -F ' {2,}'
Note that this won't work if you wrap the awk program in double quotes, because the shell would then expand $2 before awk sees it.
I found it particularly useful to find duplicates, using something like:
somelist... | sort | uniq -c | sort -rn | grep -vP "^ +1 " | awk -F ' +' '{print $3}'
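To get from that one-liner to the arrays the question asks for, something like the following might work. It is only a sketch: the temporary file stands in for mytxt, field_two is a name made up here, and the sample data assumes the columns are separated by at least two spaces.

```shell
#!/bin/bash
# Sketch only: a temporary file stands in for mytxt; field_two is a hypothetical name.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
field1                    field2
------                    -------
this are numbers          12345
this letters              abc def ghi
EOF

# Split on runs of two or more spaces, skip the two header lines (NR > 2),
# and read each column into its own array, one element per line.
mapfile -t field_one < <(awk -F '  +' 'NR > 2 {print $1}' "$tmp")
mapfile -t field_two < <(awk -F '  +' 'NR > 2 {print $2}' "$tmp")

printf '%s | %s\n' "${field_one[0]}" "${field_two[0]}"
printf '%s | %s\n' "${field_one[1]}" "${field_two[1]}"
rm -f "$tmp"
```

Single spaces inside a field (as in "abc def ghi") are left alone because the separator needs at least two spaces.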
Solution 3:
You could use the bash builtin mapfile (aka readarray) with a callback that uses parameter expansion to trim the longest trailing substring starting with two spaces:
mapfile -c 1 -C 'f() { field_one[$1]="${2%%  *}"; }; f' < mytxt
Ex. given
$ cat mytxt
field1                    field2
------                    -------
this are numbers          12345
this letters              abc def ghi
then
$ mapfile -c 1 -C 'f() { field_one[$1]="${2%%  *}"; }; f' < mytxt
$
$ printf '%s\n' "${field_one[@]}" | cat -A
field1$
------$
this are numbers$
this letters$
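Since the final goal is to store both fields, the same parameter-expansion idea can be stretched to keep the second column and drop the two header lines. This is a rough sketch rather than part of the answer above: the here-document stands in for mytxt, and field_two, first, rest and lead are names invented for the example.

```shell
#!/bin/bash
# Sketch only: the here-document stands in for mytxt; field_two is a hypothetical second array.
field_one=() field_two=()
n=0
while IFS= read -r line; do
    (( ++n <= 2 )) && continue      # skip the header line and the dashes line
    first=${line%%  *}              # text before the first run of two spaces
    rest=${line#"$first"}           # what remains: the separator plus the second field
    lead=${rest%%[! ]*}             # the leading spaces of that remainder
    field_one+=("$first")
    field_two+=("${rest#"$lead"}")  # second field with the separator stripped
done <<'EOF'
field1                    field2
------                    -------
this are numbers          12345
this letters              abc def ghi
EOF

printf '[%s] [%s]\n' "${field_one[0]}" "${field_two[0]}"
printf '[%s] [%s]\n' "${field_one[1]}" "${field_two[1]}"
```

Like the mapfile version, this relies on the columns being separated by at least two spaces.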
Solution 4:
This answer focuses on removing the two heading lines from the array to match the required output.
$ cat fieldone.txt
field1                    field2
------                    -------
this are numbers          12345
this letters              abc def ghi
$ fieldone
this are numbers
this letters
Here is the script:
#!/bin/bash
# NAME: fieldone
# PATH: $HOME/askubuntu/
# DESC: Answer for: https://askubuntu.com/questions/1194620/
# how-would-you-separate-fields-with-multiple-spaces-and-store-them-in-an-array
# DATE: December 8, 2019.
i=0                                     # Current 0-based array index number
while read line; do                     # Read all lines from input file
    ((LineNo++))                        # Current line number of input file
    [[ $LineNo -eq 1 ]] && continue     # "Field 1 Field 2" skip first line
    if [[ $LineNo -eq 2 ]] ; then       # Is this line 2?
        # Grab the second column position explained in:
        # https://unix.stackexchange.com/questions/153339/
        # how-to-find-a-position-of-a-character-using-grep
        Len="$(grep -aob ' -' <<< "$line" | \grep -oE '[0-9]+')"
        continue                        # Loop back for first field
    fi
    field_one[$i]="${line:0:$Len}"      # Extract line position 0 for Len
    echo "${field_one[i]}"              # Display array index just added
    ((i++))                             # Increment for next array element
done < fieldone.txt                     # Input filename fed into read loop
Hopefully the code and comments are self-explanatory. If not, don't hesitate to comment.
The script still works if only one space separates the two columns, whereas some other answers will break:
field1         field2
------         ------
this is letter abcdef
this is number 123456
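For the single-space case, the column position can also be worked out from the dashes line with parameter expansion alone, and used to fill both arrays. The following is a sketch, not the script above: it assumes the first column is padded so that the dashes line marks where the second column starts, and field_two, dashes, gap, start and pad are names made up for the example.

```shell
#!/bin/bash
# Sketch only: the here-document stands in for the input file; field_two is a hypothetical name.
field_one=() field_two=()
n=0
while IFS= read -r line; do
    (( ++n == 1 )) && continue          # skip the header line
    if (( n == 2 )); then
        dashes=${line%% *}              # first run of dashes
        gap=${line#"$dashes"}           # spaces plus the second run of dashes
        gap=${gap%%-*}                  # keep only the spaces in between
        start=$(( ${#dashes} + ${#gap} ))   # 0-based offset of the second column
        continue
    fi
    first=${line:0:start}               # first column, possibly with trailing padding
    pad=${first##*[! ]}                 # the trailing spaces, if any
    field_one+=("${first%"$pad"}")      # first column without the padding
    field_two+=("${line:start}")        # everything from the second column on
done <<'EOF'
field1         field2
------         ------
this is letter abcdef
this is number 123456
EOF

printf '[%s] [%s]\n' "${field_one[0]}" "${field_two[0]}"
printf '[%s] [%s]\n' "${field_one[1]}" "${field_two[1]}"
```

Because the split point comes from the dashes line rather than from the width of the gap, a single space between the columns is enough.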