Better way to rename files based on multiple patterns
Two answers: using perl rename or using pure bash
As some people dislike perl, I also wrote a bash-only version.
Renaming files by using the rename command
Introduction
Yes, this is a typical job for the rename command, which was designed precisely for this:
man rename | sed -ne '/example/,/^[^ ]/p'
For example, to rename all files matching "*.bak" to strip the
extension, you might say
rename 's/\.bak$//' *.bak
To translate uppercase names to lower, you'd use
rename 'y/A-Z/a-z/' *
More targeted examples
Simply drop all spaces and square brackets:
rename 's/[ \[\]]*//g;' *.ext
Rename all .jpg files by numbering them from 1:
rename 's/^.*$/sprintf "IMG_%05d.JPG",++$./e' *.jpg
Demo:
touch {a..e}.jpg
ls -ltr
total 0
-rw-r--r-- 1 user user 0 sep 6 16:35 e.jpg
-rw-r--r-- 1 user user 0 sep 6 16:35 d.jpg
-rw-r--r-- 1 user user 0 sep 6 16:35 c.jpg
-rw-r--r-- 1 user user 0 sep 6 16:35 b.jpg
-rw-r--r-- 1 user user 0 sep 6 16:35 a.jpg
rename 's/^.*$/sprintf "IMG_%05d.JPG",++$./e' *.jpg
ls -ltr
total 0
-rw-r--r-- 1 user user 0 sep 6 16:35 IMG_00005.JPG
-rw-r--r-- 1 user user 0 sep 6 16:35 IMG_00004.JPG
-rw-r--r-- 1 user user 0 sep 6 16:35 IMG_00003.JPG
-rw-r--r-- 1 user user 0 sep 6 16:35 IMG_00002.JPG
-rw-r--r-- 1 user user 0 sep 6 16:35 IMG_00001.JPG
Full syntax matching the SO question, in a safe way
There is a robust and safe way to do this using the rename utility.
As it is a common perl tool, we have to use perl syntax:
rename 'my $o=$_;
s/[ \[\]]+/-/g;
s/-+/-/g;
s/^-//g;
s/-(\..*|)$/$1/g;
s/(.*[^\d])(|-(\d+))(\.[a-z0-9]{2,6})$/
my $i=$3;
$i=0 unless $i;
sprintf("%s-%d%s", $1, $i+1, $4)
/eg while
$o ne $_ &&
-f $_;
' *
Testing rule:
touch '[ www.crap.com ] file.name.ext' 'www.crap.com - file.name.ext'
ls -1
[ www.crap.com ] file.name.ext
www.crap.com - file.name.ext
rename 'my $o=$_; ...
...
...' *
ls -1
www.crap.com-file.name-1.ext
www.crap.com-file.name.ext
touch '[ www.crap.com ] file.name.ext' 'www.crap.com - file.name.ext'
ls -1
www.crap.com-file.name-1.ext
[ www.crap.com ] file.name.ext
www.crap.com - file.name.ext
www.crap.com-file.name.ext
rename 'my $o=$_; ...
...
...' *
ls -1
www.crap.com-file.name-1.ext
www.crap.com-file.name-2.ext
www.crap.com-file.name-3.ext
www.crap.com-file.name.ext
... and so on...
... and it's safe as long as you don't use the -f flag with the rename command: files won't be overwritten and you will get an error message if something goes wrong.
Renaming files by using bash and so-called bashisms
I prefer doing this with a dedicated utility, but it can also be done in pure bash (that is, without any fork).
Apart from the final mv, no binary other than bash is used (no sed, awk, tr or other):
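#!/bin/bash
# Rename each file given as argument to a cleaned-up name, never clobbering.
for file; do
    # Replace spaces and square brackets with dots.
    newname=${file//[ \]\[]/.}
    # Strip leading dots.
    while [ "$newname" != "${newname#.}" ]; do
        newname=${newname#.}
    done
    # Collapse any pair of '.' or '-' into a single '-'.
    while [ "$newname" != "${newname//[.-][.-]/-}" ]; do
        newname=${newname//[.-][.-]/-}
    done
    if [ "$file" != "$newname" ]; then
        if [ -f "$newname" ]; then
            # Name already taken: split into stem, counter and extension.
            ext=${newname##*.}
            basename=${newname%.$ext}
            partname=${basename%%-[0-9]}    # only a single-digit counter is recognized
            count=${basename#${partname}-}
            [ "$partname" = "$count" ] && count=0
            # Increment the counter until the name is free.
            while printf -v newname "%s-%d.%s" "$partname" "$((++count))" "$ext" &&
                  [ -f "$newname" ]; do
                :
            done
        fi
        mv -- "$file" "$newname"
    fi
done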
To be run with files as arguments, for example:
/path/to/my/script.sh \[*
- Replace spaces and square brackets with dots.
- Replace sequences of ".-", "-.", "--" or ".." with a single "-".
- If the filename doesn't differ, there is nothing to do.
- Test whether a file already exists with newname...
- split filename, counter and extension, to build an indexed newname
- loop while a file exists with newname
- Finally, rename the file.
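The name-cleaning steps above can be tried without renaming anything. The following is a minimal sketch, assuming bash; clean_name is a hypothetical helper that is not part of the script above, and it only prints the name that the first steps (replace, strip, collapse) would produce.

```shell
#!/bin/bash
# clean_name NAME
# Hypothetical dry-run helper: print the cleaned name, rename nothing.
clean_name()
{
    local name=$1
    # Replace spaces and square brackets with dots.
    name=${name//[ \]\[]/.}
    # Strip leading dots.
    while [ "$name" != "${name#.}" ]; do
        name=${name#.}
    done
    # Collapse any pair of '.' or '-' into a single '-'.
    while [ "$name" != "${name//[.-][.-]/-}" ]; do
        name=${name//[.-][.-]/-}
    done
    printf '%s\n' "$name"
}

clean_name '[ www.crap.com ] file.name.ext'
clean_name 'www.crap.com - file.name.ext'
```

Both calls print www.crap.com-file.name.ext, which is why the script needs its counter logic to tell the two files apart.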
Take advantage of the following classical pattern:
job_select /path/to/directory | job_strategy | job_process
where job_select is responsible for selecting the objects of your job, job_strategy prepares a processing plan for these objects and job_process finally executes the plan.
This assumes that filenames contain neither a vertical bar | nor a newline character.
The job_select function
# job_select PATH
# Produce the list of files to process
job_select()
{
    # The brackets are escaped so that they match literally
    find "$1" -name 'www.*.com - *' -o -name '\[*\] *'
}
The find command can examine all the properties of a file maintained by the file system, such as its access, modification and status-change times. It is also possible to control how the filesystem is explored, by telling find not to descend into mounted filesystems or how many levels of recursion are allowed. It is common to append pipes to the find command to perform more complicated selections based on the filename.
Avoid the common pitfall of including the contents of hidden directories in the output of the job_select function. For instance, the directories CVS, .svn, .svk and .git are used by the corresponding source control management tools, and it is almost always wrong to include their contents in the output of the job_select function. By inadvertently batch processing these files, one can easily make the affected working copy unusable.
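One way to keep such directories out of the selection is the -prune action of find. The following is a sketch only: job_select_pruned is a hypothetical name, the list of directories is taken from the paragraph above, and the bracket pattern is written as '\[*\] *' on the assumption that names like '[ www.crap.com ] file.name.ext' should be matched.

```shell
# job_select_pruned PATH (hypothetical variant of job_select)
# Produce the list of files to process, skipping version-control directories.
job_select_pruned()
{
    find "$1" \
        \( -type d \( -name CVS -o -name .svn -o -name .svk -o -name .git \) -prune \) \
        -o \( -type f \( -name 'www.*.com - *' -o -name '\[*\] *' \) -print \)
}

# Possible use in the pipeline:
# job_select_pruned /path/to/directory | job_strategy | job_process
```

Because the expression contains an explicit -print, the pruned directories themselves are not printed, and nothing below them is visited.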
The job_strategy function
# job_strategy
# Prepare a plan for renaming files
job_strategy()
{
    sed -e '
        h
        s@/www\..*\.com - *@/@
        s@/\[[^]]*] *@/@
        x
        G
        s/\n/|/
    '
}
This command reads the output of job_select and makes a plan for our renaming job. The plan is represented by text lines with two fields separated by the character |, the first field being the old name of the file and the second its newly computed name. It looks like this:
[ www.crap.com ] file.name.1.ext|file.name.1.ext
www.crap.com - file.name.2.ext|file.name.2.ext
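To see what such a plan looks like without touching any file, the strategy stage can be fed a couple of names by hand. This is a sketch under assumptions: make_plan is a hypothetical stand-in for job_strategy, the bracket pattern is written here as \[[^]]*] so that it strips a leading "[ ... ] " group, and the sample names carry a ./ prefix as find would print them.

```shell
# make_plan (hypothetical stand-in for job_strategy)
# Read file names on stdin, write "oldname|newname" lines on stdout.
make_plan()
{
    sed -e '
        h
        s@/www\..*\.com - *@/@
        s@/\[[^]]*] *@/@
        x
        G
        s/\n/|/
    '
}

printf '%s\n' \
    './[ www.crap.com ] file.name.1.ext' \
    './www.crap.com - file.name.2.ext' |
    make_plan
```

Each output line holds the untouched old name, a vertical bar, and the new name with the unwanted prefix removed, for instance ./www.crap.com - file.name.2.ext|./file.name.2.ext.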
The particular program used to produce the plan is essentially irrelevant, but it is common to use sed, as in the example, or awk or perl. Let us walk through the sed script used here:
h Replace the contents of the hold space with the contents of the pattern space.
… Edit the contents of the pattern space.
x Swap the contents of the pattern and hold spaces.
G Append a newline character followed by the contents of the hold space to the pattern space.
s/\n/|/ Replace the newline character in the pattern space by a vertical bar.
It can be easier to use several filters to prepare the plan. Another common case is the use of the stat command to add creation times to file names.
The job_process function
# job_process
# Rename files according to a plan
job_process()
{
    local oldname
    local newname
    # -r keeps backslashes in file names intact
    while IFS='|' read -r oldname newname; do
        mv "$oldname" "$newname"
    done
}
The input field separator IFS is adjusted to let the function read the output of job_strategy. Declaring oldname and newname as local is useful in large programs but can be omitted in very simple scripts. The job_process function can be adjusted to avoid overwriting existing files and to report the problematic items.
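One possible adjustment is sketched below: it skips any plan line whose target already exists and reports it on standard error. job_process_safe is a hypothetical name, and the check-then-move sequence is not atomic, so this only suits directories that nothing else is writing to at the same time.

```shell
# job_process_safe (hypothetical variant of job_process)
# Rename files according to a plan, never overwriting an existing file.
job_process_safe()
{
    local oldname
    local newname
    while IFS='|' read -r oldname newname; do
        if [ -e "$newname" ]; then
            # Report the problematic item and leave the old file in place.
            printf 'skipped, target exists: %s -> %s\n' "$oldname" "$newname" >&2
        else
            mv "$oldname" "$newname"
        fi
    done
}

# Possible use in the pipeline:
# job_select /path/to/directory | job_strategy | job_process_safe
```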
About data structures in shell programs
Note the use of pipes to transfer data from one stage to the next: beginners often rely on variables to represent such information, but this turns out to be a clumsy choice. Instead, it is preferable to represent data as tabular files or as tabular data streams moving from one process to the next. In this form, data can easily be processed by powerful tools like sed, awk, join, paste and sort, to cite only the most common ones.
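As a small illustration of this style, a plan in the oldname|newname format can be checked for colliding target names with standard filters before anything is renamed. plan_collisions is a hypothetical helper, not part of the pipeline above.

```shell
# plan_collisions (hypothetical helper)
# Read "oldname|newname" lines on stdin and print each new name
# that is the target of more than one rename.
plan_collisions()
{
    cut -d'|' -f2 | sort | uniq -d
}

printf '%s\n' 'a 1.ext|a.ext' 'a 2.ext|a.ext' 'b 1.ext|b.ext' | plan_collisions
```

Here the only name printed is a.ext, since two plan lines would rename a file to it. No shell variable holds the plan at any point; the data stays a stream of text lines.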