How to rename file names to avoid conflict in Windows or Mac?

How can I batch rename files so that their names do not include characters that clash with other file systems? For instance:

Screenshot 2015-09-07-25:10:10

Note that the colons are the issue in this file name. Windows and Mac will not accept them.

These files could be renamed to

Screenshot 2015-09-07-25--10--10

I have to move a large number of files from Ubuntu to another OS. I copied them to an NTFS drive using rsync, but some files were lost in the process. I also copied them to an ext4 drive.

The following characters are reserved:

< (less than)
> (greater than)
: (colon)
" (double quote)
/ (forward slash)
\ (backslash)
| (vertical bar or pipe)
? (question mark)
* (asterisk)

Another issue is that Windows is not case-sensitive when it comes to file names (and neither are most OS X systems).


You could do something like:

rename 's/[<>:"\\|?*]/_/g' /path/to/file

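This replaces each of these characters with a _. Note that you need not replace /: it is invalid in file names on both systems, and on Unix it is the path separator, so it cannot occur inside a name. To extend this to a directory and all its contents, run rename from within each file's own directory (-execdir). That way only the name itself is changed, never the parent path, which would otherwise break the rename when a parent directory also contains one of these characters:

find /path/to/directory -depth -execdir rename 's/[<>:"\\|?*]/_/g' {} +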

To retain uniqueness, you could prepend a random number to the name. Note that in the pattern below both / (which would otherwise mark the end of the pattern) and \ are escaped:

$ rename -n 's/[<>:"\/\\|?*]/_/g && s/^/int(rand(10000))/e' a\\b
a\b renamed as 8714a_b
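
For readers who prefer Python over the Perl-based rename, the same character substitution can be sketched as below. This is only an illustration of the pattern, not part of the commands above; the names RESERVED and sanitize are made up for this example, and it works on a bare file name rather than renaming anything on disk.

```python
import re

# Characters Windows refuses in file names. "/" is left out because it
# cannot occur inside a single file name on Unix anyway.
# (RESERVED and sanitize are hypothetical names used for illustration.)
RESERVED = re.compile(r'[<>:"\\|?*]')


def sanitize(name, replacement="_"):
    """Return name with every reserved character replaced."""
    return RESERVED.sub(replacement, name)


for original in ["Screenshot 2015-09-07-25:10:10", "a\\b", "plain.txt"]:
    print(repr(original), "->", repr(sanitize(original)))
```

Passing replacement="--" would give the double-dash style shown in the question.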

A more complete solution should, at least:

  1. Convert all characters to the same case
  2. Use a sane counting system

That's to say, foo.mp3 should not become foo.mp3.1, but foo.1.mp3, since Windows is more reliant on extensions.
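
The two requirements can be sketched in a few lines of Python. This is a minimal illustration under my own assumptions, not the script that follows: the function name unique_names is made up, the counter is placed before the extension with a dot as in foo.1.mp3, and it only computes names without touching any file.

```python
import os


def unique_names(names):
    """Map each name to a lower-cased name that is unique ignoring case.

    A clashing name gets a counter before its extension, e.g.
    foo.mp3 and Foo.mp3 become foo.mp3 and foo.1.mp3.
    (unique_names is a hypothetical helper for illustration.)
    """
    taken = set()
    result = {}
    for name in names:
        candidate = name.lower()
        stem, ext = os.path.splitext(candidate)
        n = 0
        # Keep counting until the candidate is not used yet.
        while candidate in taken:
            n += 1
            candidate = f"{stem}.{n}{ext}"
        taken.add(candidate)
        result[name] = candidate
    return result


print(unique_names(["foo.mp3", "Foo.mp3", "FOO.mp3", "bar"]))
```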

With that in mind, I wrote the following script. I tried to be non-destructive, by using a prefix path into which I can copy the renamed files, instead of modifying the original.

#! /bin/bash

# The backslash is doubled because tr treats a single \ as an escape character
windows_chars='<>:"\\|?*'
prefix="windows/"

# Find the number of files/directories which have this name as a prefix
find_num_files ()
(
    if [[ -e $prefix$1$2 ]]
    then
        shopt -s nullglob
        files=( "$prefix$1-"*"$2" )
        echo ${#files[@]}
    fi
)

# From http://www.shell-fu.org/lister.php?id=542
# Joins strings with a separator. Separator not present for
# edge case of single string.
str_join ()
(
    IFS=${1:?"Missing separator"}
    shift
    printf "%s" "$*"
)

for i
do
    # convert to lower case, then replace special chars with _
    new_name=$(tr "$windows_chars" _ <<<"${i,,}")

    # if a directory, make it, instead of copying contents
    if [[ -d $i ]]
    then
        mkdir -p "$prefix$new_name"
        echo mkdir -p "$prefix$new_name"
    else
        # get filename without extension
        name_wo_ext=${new_name%.*}
        # get extension
        # The trick is to make sure that, for:
        # "a.b.c", name_wo_ext is "a.b" and ext is ".c"
        # "abc", name_wo_ext is "abc" and ext is empty
        # Then, we can join the strings without worrying about the
        # . before an extension
        # (quoted, so that the name is matched literally and not as a glob)
        ext=${new_name#"$name_wo_ext"}
        count=$(find_num_files "$name_wo_ext" "$ext")
        name_wo_ext=$(str_join - "$name_wo_ext" $count)
        cp "$i" "$prefix$name_wo_ext$ext"
        echo cp "$i" "$prefix$name_wo_ext$ext"
    fi
done

In action:

$ tree a:b
a:b
├── b:c
│   ├── a:d
│   ├── A:D
│   ├── a:d.b
│   └── a:D.b
├── B:c
└── B"c
    └── a<d.b

3 directories, 5 files
$ find a:b -exec ./rename-windows.sh {} +
mkdir -p windows/a_b
mkdir -p windows/a_b/b_c
mkdir -p windows/a_b/b_c
cp a:b/B"c/a<d.b windows/a_b/b_c/a_d.b
mkdir -p windows/a_b/b_c
cp a:b/b:c/a:D.b windows/a_b/b_c/a_d-0.b
cp a:b/b:c/A:D windows/a_b/b_c/a_d
cp a:b/b:c/a:d windows/a_b/b_c/a_d-1
cp a:b/b:c/a:d.b windows/a_b/b_c/a_d-1.b
$ tree windows/
windows/
└── a_b
    └── b_c
        ├── a_d
        ├── a_d-0.b
        ├── a_d-1
        ├── a_d-1.b
        └── a_d.b

2 directories, 5 files

The script is available in my GitHub repo.


Recursively replace a list of strings or characters in filenames by other strings or characters

The script below can be used to replace a list of strings or characters, possibly occurring in a file's name, by an arbitrary replacement per string. Since the script only renames the file itself (not the path), there is no risk of messing with directories.

The replacements are defined in the list chars (see further below). It is possible to give each string its own replacement, so that you can reverse the renaming if you ever want to (assuming each replacement is a unique string). If you would like to replace all problematic strings with an underscore, simply define the list like:

chars = [
    ("<", "_"),
    (">", "_"),
    (":", "_"),
    ('"', "_"),
    ("/", "_"),
    ("\\", "_"),
    ("|", "_"),
    ("?", "_"),
    ("*", "_"),
    ]
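
If reversibility matters, the list could instead use a distinct token per character. The sketch below is only an illustration: the tokens and the helper names encode and decode are made up, and the round trip is reliable only if none of the tokens can be confused with text that already occurs in the file names.

```python
# Hypothetical tokens, chosen only for illustration. They must not
# already occur in (or overlap with) the file names being processed.
# "/" is omitted because it cannot occur inside a file name on Unix.
chars = [
    ("<", "_lt_"),
    (">", "_gt_"),
    (":", "_col_"),
    ('"', "_dq_"),
    ("\\", "_bsl_"),
    ("|", "_pipe_"),
    ("?", "_qm_"),
    ("*", "_ast_"),
]


def encode(name):
    """Replace each problematic character with its own token."""
    for old, new in chars:
        name = name.replace(old, new)
    return name


def decode(name):
    """Reverse encode() by turning each token back into its character."""
    for old, new in chars:
        name = name.replace(new, old)
    return name


original = "Screenshot 2015-09-07-25:10:10"
print(encode(original))
print(decode(encode(original)) == original)
```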

Dupes

To prevent duplicate names, the script first creates the "new" name. It then checks whether a file with that name already exists in the same directory. If so, it prefixes the name with dupe_1_, dupe_2_, and so on, until it finds an "available" new name for the file.
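
The naming rule can be shown in isolation with a small sketch. It mirrors the loop in the script below but, as an assumption made for the sake of a self-contained example, checks against a plain set of names instead of the file system; available_name is a made-up helper name.

```python
def available_name(newname, existing):
    """Return newname, or the first free dupe_<n>_<newname> variant.

    existing is a set of names already present in the directory.
    (available_name is a hypothetical helper for illustration.)
    """
    candidate = newname
    n = 0
    while candidate in existing:
        n += 1
        candidate = "dupe_" + str(n) + "_" + newname
    return candidate


print(available_name("a_b", {"a_b", "dupe_1_a_b"}))
```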

The script

#!/usr/bin/env python3
import os
import shutil
import sys

directory = sys.argv[1]

# --- set replacement below in the format ("<string>", "<replacement>") as below
chars = [
    ("<", "_"),
    (">", "_"),
    (":", "_"),
    ('"', "_"),
    ("/", "_"),
    ("\\", "_"),
    ("|", "_"),
    ("?", "_"),
    ("*", "_"),
    ]
# ---

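for root, dirs, files in os.walk(directory):
    for file in files:
        # apply every replacement to the file name only (not to the path)
        newfile = file
        for old, new in chars:
            newfile = newfile.replace(old, new)
        if newfile != file:
            # if the new name is taken, prefix dupe_1_, dupe_2_, ... until it is free
            tempname = newfile
            n = 0
            while os.path.exists(os.path.join(root, newfile)):
                n += 1
                newfile = "dupe_" + str(n) + "_" + tempname
            shutil.move(os.path.join(root, file), os.path.join(root, newfile))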

How to use

  1. Copy the script into an empty file, save it as rename_chars.py.
  2. Edit the replacement list if you want. As it is, the script replaces all occurrences of problematic characters with an underscore, but the choice is yours.
  3. Test-run it on a directory with the command:

    python3 /path/to/rename_chars.py <directory_to_rename>
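
Since the script renames files in place, it may be worth previewing the changes first. The following is a hedged sketch of such a dry run, not part of the script above: clean_name and plan_renames are made-up names, and for brevity it does not apply the dupe_ logic, so two files that map to the same new name both show up with that name.

```python
import os
import sys

# Same all-underscore replacements as in the script ("/" omitted,
# since it cannot occur inside a file name on Unix).
chars = [("<", "_"), (">", "_"), (":", "_"), ('"', "_"),
         ("\\", "_"), ("|", "_"), ("?", "_"), ("*", "_")]


def clean_name(name):
    """Apply every replacement to a single file name."""
    for old, new in chars:
        name = name.replace(old, new)
    return name


def plan_renames(directory):
    """Return (old_path, new_path) pairs without touching any file."""
    plan = []
    for root, dirs, files in os.walk(directory):
        for name in files:
            new_name = clean_name(name)
            if new_name != name:
                plan.append((os.path.join(root, name),
                             os.path.join(root, new_name)))
    return plan


# Only print a plan when a directory is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    for old, new in plan_renames(sys.argv[1]):
        print(old, "->", new)
```

Run it with a directory as its only argument; it prints one line per file that would be renamed and changes nothing.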
    

Note

Note that in a line such as:

("\\", "_bsl_"),

the backslash is written twice because, in Python, a backslash inside a string literal needs to be escaped by another backslash.