Replacing/removing excess white space between columns in a file

I am trying to parse a file with contents similar to the following:

I am a string         12831928  
I am another string           41327318   
A set of strings      39842938  
Another string           3242342  

I want the output file to be tab-delimited:

I am a string\t12831928  
I am another string\t41327318   
A set of strings\t39842938  
Another string\t3242342 

I have tried the following:

sed 's/\s+/\t/g' filename > outfile

I have also tried cut and awk.


Solution 1:

Just use awk:

$ awk -F'  +' -v OFS='\t' '{sub(/ +$/,""); $1=$1}1' file
I am a string   12831928
I am another string     41327318
A set of strings        39842938
Another string  3242342

Breakdown:

-F'  +'           # tell awk that input fields (FS) are separated by 2 or more blanks
-v OFS='\t'       # tell awk that output fields are separated by tabs
'{sub(/ +$/,"");  # remove all trailing blank spaces from the current record (line)
$1=$1}            # rebuild the current record (line), replacing each FS with OFS
1'                # idiomatic: any true condition invokes the default action of "print"
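
For comparison, the same two steps (strip trailing blanks, then turn each run of two or more spaces into a tab) can probably be done in sed as well. The sed attempt in the question likely failed for two reasons: in sed's default basic regular expressions a bare + is a literal plus sign rather than a quantifier, and even with a working "one or more" match, the g flag would also convert the single spaces inside each string. The following is only a sketch, not a replacement for the awk command above. The file name sample.txt and the variables tab and out are made up for illustration, and the literal tab is built with printf because \t in a sed replacement is not portable:

```shell
# Build a sample file matching the question's layout (trailing spaces included).
# sample.txt is a hypothetical name used only for this illustration.
printf '%s\n' \
  'I am a string         12831928  ' \
  'I am another string           41327318   ' \
  'A set of strings      39842938  ' \
  'Another string           3242342  ' > sample.txt

# A literal tab character, since "\t" in a sed replacement is not portable.
tab=$(printf '\t')

# 1st expression: strip trailing whitespace from each line.
# 2nd expression: replace each run of two or more spaces with a single tab.
out=$(sed -e 's/[[:space:]]*$//' -e "s/ \{2,\}/${tab}/g" sample.txt)

printf '%s\n' "$out"
```

Like the awk version, this relies on the columns being separated by at least two spaces and on the strings themselves never containing two consecutive spaces.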

I highly recommend the book Effective Awk Programming, 4th Edition, by Arnold Robbins.