AWK and file names with spaces in them

I'm trying to parse file names with awk in order to rename the files. Everything went well until I started doing this with files that have a space in their name. The file names look like 11237_712312955_2012-01-04 18_31_03.wav, and I want to replace the wav extension in the file name. This is an example of my code:

ls | awk -F\. '{print $1}'

After I run this in the console, everything is OK and I get the file name without the extension.

Example: file 11237_712312955_2012-01-04 18_31_03.wav

after ls | awk -F\. '{print $1}' in the console I get:

11237_712312955_2012-01-04 18_31_03 and this is correct.

But when I put this in my script:


#!/bin/bash
for i in $(ls);
do
  FILENAME=`echo $i | awk -F\. '{print $1}'`; #problematic line
  echo $FILENAME
done

The script splits the file name in two at the place where the space occurs.

Output from script is:

11237_712312955_2012-01-04
18_31_03

How can I make my script work properly?


Solution 1:

The issue here is parsing the output of ls. Take a look at: Why you shouldn't parse the output of ls.

The reason you shouldn't do it is that UNIX allows almost any character in a file name, including whitespace, newlines, commas, pipe symbols, and pretty much anything else you'd ever try to use as a delimiter, except NUL. In its default mode, if standard output isn't a terminal, ls separates file names with newlines. This is fine until you have a file with a newline in its name. In your script, the unquoted $(ls) is additionally split by the shell on spaces, which is why one file name turns into two loop items.
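As a rough sketch of the alternative (not taken from the linked page), the loop can let a shell glob produce the names and quote every expansion, so the awk command from the question receives the whole name. The scratch directory created with mktemp and the variable names demo_dir, count and name are assumptions made only for this illustration.

```shell
#!/bin/bash
# Scratch directory so the demo does not touch real files (illustrative only).
demo_dir=$(mktemp -d)
touch "$demo_dir/11237_712312955_2012-01-04 18_31_03.wav"

count=0
for f in "$demo_dir"/*.wav; do
  [ -e "$f" ] || continue                       # skip the literal pattern if nothing matches
  # Quoting "$f" keeps the name in one piece; awk then cuts at the first dot.
  name=$(basename "$f" | awk -F. '{print $1}')
  count=$((count + 1))
  echo "$name"
done

rm -rf "$demo_dir"
```

The glob expands to one word per file regardless of spaces, so the loop body runs once for this file instead of twice.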

Solution 2:

Oh my god, that's awful.

Your script uses bash; I suggest you do this instead:

#!/bin/bash
for i in *.wav; do mv "${i}" "${i%.wav}.ext"; done

See the Bash Guide for more details on parameter expansion.
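To see what the expansion in that loop does without renaming anything, here is a small sketch. The variable names stem and new_name are hypothetical, and .ext stands in for whatever extension is wanted, as in the answer above.

```shell
#!/bin/bash
i='11237_712312955_2012-01-04 18_31_03.wav'

# ${i%.wav} removes the shortest match of ".wav" from the end of the value.
stem=${i%.wav}
new_name="${stem}.ext"

echo "$stem"       # 11237_712312955_2012-01-04 18_31_03
echo "$new_name"   # 11237_712312955_2012-01-04 18_31_03.ext
```

No external process is started and the space in the name is never subject to word splitting, because the expansions are either in an assignment or inside double quotes.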

Solution 3:

You could try this.

awk '{print substr($0, index($0,$9))}'

For example, this is one line of output from ls -l:

-rw-r--r--. 1 root root 73834496 Dec 6 10:55 File with spaces 2

If you use a simple awk command like this

# awk '{print $9}'

It returns only

# File

If used with the full command

# awk '{print substr($0, index($0,$9))}'

I get the whole file name

File with spaces 2

Here substr(s, a, b) returns b characters from the string s, starting at position a. The parameter b is optional; without it, everything up to the end of s is returned.

For example, if the second field is addr:192.168.1.133 and you use substr as follows

# awk '{print substr($2,6)}'

You get the IP, i.e. 192.168.1.133. Note that 6 is the position of the first character to keep, counting from the a in addr as position 1.

So in the full command, substr is given $0 (the whole line) instead of $2, and index($0,$9) returns the position where the text of $9 first occurs in the line, so everything from column 9 onward is printed. You can change that to index($0,$8) and see that the output changes to

# 10:55 File with spaces 2

index(IN, FIND): This searches the string IN for the first occurrence of the string FIND, and returns the position in characters where that occurrence begins in the string IN.
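The two functions can be tried on the sample lines from this answer without running ls at all. This is only a sketch: the variable names line, name, ip and clash are made up for the demo, and the last line is a constructed case (a file called root) to show a limitation that follows from index returning the first occurrence.

```shell
#!/bin/bash
line='-rw-r--r--. 1 root root 73834496 Dec 6 10:55 File with spaces 2'

# Everything from the first occurrence of field 9 to the end of the line.
name=$(printf '%s\n' "$line" | awk '{print substr($0, index($0, $9))}')
echo "$name"    # File with spaces 2

# Skip the five characters of "addr:" and keep the rest of field 2.
ip=$(printf '%s\n' 'inet addr:192.168.1.133' | awk '{print substr($2, 6)}')
echo "$ip"      # 192.168.1.133

# Limitation: index() finds the FIRST match, so a file named "root"
# matches the owner column instead of the name column.
clash=$(printf '%s\n' '-rw-r--r--. 1 root root 0 Dec 6 10:55 root' |
  awk '{print substr($0, index($0, $9))}')
echo "$clash"   # root root 0 Dec 6 10:55 root
```

The last case is one more reason to prefer a glob over ls output when the goal is just to get file names.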

I hope it helps. Moreover, if you assign this value to a variable in a script, you need to enclose the variable in double quotes wherever you use it. Otherwise you will get errors when you do other operations on the extracted file name.
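A quick way to see why the quotes matter is to count how many arguments a command receives. The helper count_args below is hypothetical and exists only for this demonstration; the sketch assumes bash with the default IFS.

```shell
#!/bin/bash
FILENAME='11237_712312955_2012-01-04 18_31_03'

# Prints the number of arguments it was called with.
count_args() { echo "$#"; }

unquoted=$(count_args $FILENAME)     # split at the space into two words
quoted=$(count_args "$FILENAME")     # passed as a single word

echo "unquoted: $unquoted"   # unquoted: 2
echo "quoted: $quoted"       # quoted: 1
```

Any command that takes the file name as an argument, such as mv or echo, sees the same two-versus-one difference.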