Using sed to mass rename files
Objective
Change these filenames:
- F00001-0708-RG-biasliuyda
- F00001-0708-CS-akgdlaul
- F00001-0708-VF-hioulgigl
to these filenames:
- F0001-0708-RG-biasliuyda
- F0001-0708-CS-akgdlaul
- F0001-0708-VF-hioulgigl
Shell Code
To test:
ls F00001-0708-*|sed 's/\(.\).\(.*\)/mv & \1\2/'
To perform:
ls F00001-0708-*|sed 's/\(.\).\(.*\)/mv & \1\2/' | sh
My Question
I don't understand the sed code. I understand what the substitution command
$ sed 's/something/mv'
means. And I understand regular expressions somewhat. But I don't understand what's happening here:
\(.\).\(.*\)
or here:
& \1\2/
The former, to me, just looks like it means: "a single character, followed by a single character, followed by any length sequence of a single character"--but surely there's more to it than that. As far as the latter part:
& \1\2/
I have no idea.
Solution 1:
First, I should say that the easiest way to do this is to use the prename or rename commands.
On Ubuntu, OSX (Homebrew package rename, MacPorts package p5-file-rename), or other systems with perl rename (prename):
rename s/0000/000/ F0000*
or on systems with rename from util-linux-ng, such as RHEL:
rename 0000 000 F0000*
That's a lot more understandable than the equivalent sed command.
But as for understanding the sed command, the sed manpage is helpful. If you run man sed and search for & (using the / command to search), you'll find it's a special character in s/foo/bar/ replacements.
s/regexp/replacement/
Attempt to match regexp against the pattern space. If successful,
replace that portion matched with replacement. The replacement
may contain the special character & to refer to that portion of
the pattern space which matched, and the special escapes \1
through \9 to refer to the corresponding matching
sub-expressions in the regexp.
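Therefore, \(.\) matches the first character, which can be referenced by \1. Then . matches the next character, which is always 0 in these filenames; it is not inside a group, so it is not captured. Then \(.*\) matches the rest of the filename, which can be referenced by \2.
The replacement string puts it all together using & (the original filename) and \1\2, which is every part of the filename except the 2nd character, which was a 0. Each line of ls output therefore becomes a command of the form mv oldname newname, and the final | sh executes those commands.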
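To see the pieces described above in action, here is a minimal sketch that feeds a single filename from the question through the same expression. It uses printf in place of ls so that nothing on disk is involved; the variable names name and cmd are only for illustration.

```shell
# One sample filename from the question; no real file is needed.
name='F00001-0708-RG-biasliuyda'

# \(.\)  -> \1 = "F"                         (first character)
# .      -> "0"                              (second character, matched but not captured)
# \(.*\) -> \2 = "0001-0708-RG-biasliuyda"   (everything else)
# &      -> the whole matched text, i.e. the original filename
cmd=$(printf '%s\n' "$name" | sed 's/\(.\).\(.*\)/mv & \1\2/')

# Show the command that "| sh" would have executed.
printf '%s\n' "$cmd"
```

This prints mv F00001-0708-RG-biasliuyda F0001-0708-RG-biasliuyda, which is exactly the kind of line the "To test" pipeline in the question shows before it is piped to sh.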
This is a pretty cryptic way to do this, IMHO. If for some reason the rename command was not available and you wanted to use sed to do the rename (or perhaps you were doing something too complex for rename?), being more explicit in your regex would make it much more readable. Perhaps something like:
ls F00001-0708-*|sed 's/F0000\(.*\)/mv & F000\1/' | sh
Being able to see what's actually changing in the s/search/replacement/ makes it much more readable. Also it won't keep sucking characters out of your filename if you accidentally run it twice or something.
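As a rough illustration of the "run it twice" point, the sketch below applies each expression to a name two times in a row. It leaves out the mv & part so that only the computed new name is shown, and it works on a string rather than on real files; the variable names are made up for the example.

```shell
name='F00001-0708-RG-biasliuyda'

# Positional version: drops the second character, whatever it is.
pos_once=$(printf '%s\n' "$name" | sed 's/\(.\).\(.*\)/\1\2/')
pos_twice=$(printf '%s\n' "$pos_once" | sed 's/\(.\).\(.*\)/\1\2/')

# Explicit version: only rewrites names that still contain F0000.
exp_once=$(printf '%s\n' "$name" | sed 's/F0000\(.*\)/F000\1/')
exp_twice=$(printf '%s\n' "$exp_once" | sed 's/F0000\(.*\)/F000\1/')

printf 'positional: %s -> %s\n' "$pos_once" "$pos_twice"
printf 'explicit:   %s -> %s\n' "$exp_once" "$exp_twice"
```

The positional expression turns F0001-... into F001-... on the second pass, while the explicit one leaves F0001-... alone because it no longer contains F0000.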
Solution 2:
You've had your sed explanation. You can also do this with just the shell (bash), with no need for external commands:
for file in F0000*
do
    # echo makes this a dry run; remove it to actually rename
    echo mv "$file" "${file/#F0000/F000}"
    # ${file/#F0000/F000}: the # anchors the match to the start of the string,
    # so only a leading F0000 is replaced with F000
done
Solution 3:
I wrote a small post with examples on batch renaming using sed a couple of years ago:
http://www.guyrutenberg.com/2009/01/12/batch-renaming-using-sed/
For example:
for i in *; do
    # quote "$i" so names containing spaces or glob characters survive intact
    mv "$i" "$(printf '%s\n' "$i" | sed 's/regex/replace_text/')"
done
If the regex contains groups (e.g. \(subregex\)) then you can use them in the replacement text as \1, \2, etc.
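As a sketch of how groups can be used in such a loop, the example below swaps the first two fields of the filenames from the question. The swap itself is a hypothetical rename chosen only to show \1 and \2, not something the question asks for. It runs in a throwaway directory created with mktemp, and the names tmpdir, new and listing are illustrative.

```shell
# Work in a throwaway directory so no real files are affected.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch F00001-0708-RG-biasliuyda F00001-0708-CS-akgdlaul

for i in F*; do
    # \1 = the F-number, \2 = the digit field after it; swap their order.
    new=$(printf '%s\n' "$i" | sed 's/^\(F[0-9]*\)-\([0-9]*\)-/\2-\1-/')
    # Skip names the expression did not change.
    [ "$new" = "$i" ] || mv -- "$i" "$new"
done

# Record the result, then clean up the throwaway directory.
listing=$(ls)
printf '%s\n' "$listing"
cd / && rm -rf "$tmpdir"
```

Comparing the new name with the old one before calling mv avoids a pointless (and on some systems noisy) attempt to rename a file onto itself when the regex does not match.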
Solution 4:
The easiest way would be:
for i in F00001*; do mv "$i" "${i/F00001/F0001}"; done
or, portably,
for i in F00001*; do mv "$i" "F0001${i#F00001}"; done
This replaces the F00001 prefix in the filenames with F0001.
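To try the portable form safely, a sketch like the following can be run in a scratch directory first. It assumes mktemp -d is available and adds a guard for the case where no file matches the glob, which the one-liner above does not have; tmpdir and renamed are illustrative names.

```shell
# Scratch directory with copies of the filenames from the question.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch F00001-0708-RG-biasliuyda F00001-0708-CS-akgdlaul F00001-0708-VF-hioulgigl

for i in F00001*; do
    # If nothing matches, the glob stays literal; skip it.
    [ -e "$i" ] || continue
    # ${i#F00001} is the name with the leading F00001 removed.
    mv -- "$i" "F0001${i#F00001}"
done

# Record the result, then clean up the scratch directory.
renamed=$(ls)
printf '%s\n' "$renamed"
cd / && rm -rf "$tmpdir"
```

The ${i#F00001} expansion is part of POSIX sh, which is why this variant works in shells that do not support the ${i/F00001/F0001} substitution syntax.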
Credit to mahesh here: http://www.debian-administration.org/articles/150