How does the Windows RENAME command interpret wildcards?

How does the Windows RENAME (REN) command interpret wildcards?

The built in HELP facility is of no help - it doesn't address wildcards at all.

The Microsoft technet XP online help isn't much better. Here is all it has to say regarding wildcards:

"You can use wildcards (* and ?) in either file name parameter. If you use wildcards in filename2, the characters represented by the wildcards will be identical to the corresponding characters in filename1."

Not much help - there are many ways that statement can be interpretted.

I've managed to successfully use wildcards in the filename2 parameter on some occasions, but it has always been trial and error. I haven't been able to anticipate what works and what doesn't. Frequently I've had to resort to writing a small batch script with a FOR loop that parses each name so that I can build each new name as needed. Not very convenient.

If I knew the rules for how wildcards are processed then I figure I could use the RENAME command more effectively without having to resort to batch as often. Of course knowing the rules would also benefit batch development.

(Yes - this is a case where I am posting a paired question and answer. I got tired of not knowing the rules and decided to experiment on my own. I figure many others may be interested in what I discovered)


These rules were discovered after extensive testing on a Vista machine. No tests were done with unicode in file names.

RENAME requires 2 parameters - a sourceMask, followed by a targetMask. Both the sourceMask and targetMask can contain * and/or ? wildcards. The behavior of the wildcards changes slightly between source and target masks.

Note - REN can be used to rename a folder, but wildcards are not allowed in either the sourceMask or targetMask when renaming a folder. If the sourceMask matches at least one file, then the file(s) will be renamed and folders will be ignored. If the sourceMask matches only folders and not files, then a syntax error is generated if wildcards appear in source or target. If the sourceMask does not match anything, then a "file not found" error results.

Also, when renaming files, wildcards are only allowed in the file name portion of the sourceMask. Wildcards are not allowed in the path leading up to the file name.

sourceMask

The sourceMask works as a filter to determine which files are renamed. The wildcards work here the same as with any other command that filters file names.

  • ? - Matches any 0 or 1 character except . This wildcard is greedy - it always consumes the next character if it is not a . However it will match nothing without failure if at name end or if the next character is a .

  • * - Matches any 0 or more characters including . (with one exception below). This wildcard is not greedy. It will match as little or as much as is needed to enable subsequent characters to match.

All non-wildcard characters must match themselves, with a few special case exceptions.

  • . - Matches itself or it can match the end of name (nothing) if no more characters remain. (Note - a valid Windows name cannot end with .)

  • {space} - Matches itself or it can match the end of name (nothing) if no more characters remain. (Note - a valid Windows name cannot end with {space})

  • *. at the end - Matches any 0 or more characters except . The terminating . can actually be any combination of . and {space} as long as the very last character in the mask is . This is the one and only exception where * does not simply match any set of characters.

The above rules are not that complex. But there is one more very important rule that makes the situation confusing: The sourceMask is compared against both the long name and the short 8.3 name (if it exists). This last rule can make interpretation of the results very tricky, because it is not always obvious when the mask is matching via the short name.

It is possible to use RegEdit to disable the generation of short 8.3 names on NTFS volumes, at which point interpretation of file mask results is much more straight forward. Any short names that were generated before disabling short names will remain.

targetMask

Note - I haven't done any rigorous testing, but it appears these same rules also work for the target name of the COPY commmand

The targetMask specifies the new name. It is always applied to the full long name; The targetMask is never applied to the short 8.3 name, even if the sourceMask matched the short 8.3 name.

The presence or absence of wildcards in the sourceMask has no impact on how wildcards are processed in the targetMask.

In the following discussion - c represents any character that is not *, ?, or .

The targetMask is processed against the source name strictly from left to right with no back-tracking.

  • c - Advances the position within the source name only if the source character is not ., and always appends c to the target name. (Replaces the character that was in source with c, but never replaces .)

  • ? - Matches the next character from the source long name and appends it to the target name as long as the source character is not . If the next character is . or if at the end of the source name then no character is added to the result and the current position within the source name is unchanged.

  • * at end of targetMask - Appends all remaining characters from source to the target. If already at the end of source, then does nothing.

  • *c - Matches all source characters from current position through the last occurance of c (case sensitive greedy match) and appends the matched set of characters to the target name. If c is not found, then all remaining characters from source are appended, followed by c This is the only situation I am aware of where Windows file pattern matching is case sensitive.

  • *. - Matches all source characters from current position through the last occurance of . (greedy match) and appends the matched set of characters to the target name. If . is not found, then all remaining characters from source are appended, followed by .

  • *? - Appends all remaining characters from source to the target. If already at end of source then does nothing.

  • . without * in front - Advances the position in source through the first occurance of . without copying any characters, and appends . to the target name. If . is not found in the source, then advances to the end of source and appends . to the target name.

After the targetMask has been exhausted, any trailing . and {space} are trimmed off the end of the resulting target name because Windows file names cannot end with . or {space}

Some practical examples

Substitute a character in the 1st and 3rd positions prior to any extension (adds a 2nd or 3rd character if it doesn't exist yet)

ren  *  A?Z*
  1        -> AZ
  12       -> A2Z
  1.txt    -> AZ.txt
  12.txt   -> A2Z.txt
  123      -> A2Z
  123.txt  -> A2Z.txt
  1234     -> A2Z4
  1234.txt -> A2Z4.txt

Change the (final) extension of every file

ren  *  *.txt
  a     -> a.txt
  b.dat -> b.txt
  c.x.y -> c.x.txt

Append an extension to every file

ren  *  *?.bak
  a     -> a.bak
  b.dat -> b.dat.bak
  c.x.y -> c.x.y.bak

Remove any extra extension after the initial extension. Note that adequate ? must be used to preserve the full existing name and initial extension.

ren  *  ?????.?????
  a     -> a
  a.b   -> a.b
  a.b.c -> a.b
  part1.part2.part3    -> part1.part2
  123456.123456.123456 -> 12345.12345   (note truncated name and extension because not enough `?` were used)

Same as above, but filter out files with initial name and/or extension longer than 5 chars so that they are not truncated. (Obviously could add an additional ? on either end of targetMask to preserve names and extensions up to 6 chars long)

ren  ?????.?????.*  ?????.?????
  a      ->  a
  a.b    ->  a.b
  a.b.c  ->  a.b
  part1.part2.part3  ->  part1.part2
  123456.123456.123456  (Not renamed because doesn't match sourceMask)

Change characters after last _ in name and attempt to preserve extension. (Doesn't work properly if _ appears in extension)

ren  *_*  *_NEW.*
  abcd_12345.txt  ->  abcd_NEW.txt
  abc_newt_1.dat  ->  abc_newt_NEW.dat
  abcdef.jpg          (Not renamed because doesn't match sourceMask)
  abcd_123.a_b    ->  abcd_123.a_NEW  (not desired, but no simple RENAME form will work in this case)

Any name can be broken up into components that are delimited by . Characters may only be appended to or deleted from the end of each component. Characters cannot be deleted from or added to the beginning or middle of a component while preserving the remainder with wildcards. Substitutions are allowed anywhere.

ren  ??????.??????.??????  ?x.????999.*rForTheCourse
  part1.part2            ->  px.part999.rForTheCourse
  part1.part2.part3      ->  px.part999.parForTheCourse
  part1.part2.part3.part4   (Not renamed because doesn't match sourceMask)
  a.b.c                  ->  ax.b999.crForTheCourse
  a.b.CarPart3BEER       ->  ax.b999.CarParForTheCourse

If short names are enabled, then a sourceMask with at least 8 ? for the name and at least 3 ? for the extension will match all files because it will always match the short 8.3 name.

ren ????????.???  ?x.????999.*rForTheCourse
  part1.part2.part3.part4  ->  px.part999.part3.parForTheCourse

Useful quirk/bug? for deleting name prefixes

This SuperUser post describes how a set of forward slashes (/) can be used to delete leading characters (except .) from a file name. One slash is required for each character to be deleted. I've confirmed the behavior on a Windows 10 machine.

ren "abc-*.txt" "////*.txt"
  abc-123.txt        --> 123.txt
  abc-HelloWorld.txt --> HelloWorld.txt

Unfortunately leading / cannot remove . in a name. So the technique cannot be used to remove a prefix that contains .. For example:

ren "abc.xyz.*.txt" "////////*.txt"
  abc.xyz.123.txt        --> .xyz.123.txt
  abc.xyz.HelloWorld.txt --> .xyz.HelloWorld.txt

This technique only works if both the source and target masks are enclosed in double quotes. All of the following forms without the requisite quotes fail with this error: The syntax of the command is incorrect

REM - All of these forms fail with a syntax error.
ren abc-*.txt "////*.txt"
ren "abc-*.txt" ////*.txt
ren abc-*.txt ////*.txt

The / cannot be used to remove any characters in the middle or end of a file name. It can only remove leading (prefix) characters. Also note this technique does not work with folder names.

Technically the / is not functioning as a wildcard. Rather it is doing a simple character substitution following the c target mask rule. But then after the substitution, the REN command recognizes that / is not valid in a file name, and strips the leading / slashes from the name. REN gives a syntax error if it detects / in the middle of a target name.


Possible RENAME bug - a single command may rename the same file twice!

Starting in an empty test folder:

C:\test>copy nul 123456789.123
        1 file(s) copied.

C:\test>dir /x
 Volume in drive C is OS
 Volume Serial Number is EE2C-5A11

 Directory of C:\test

09/15/2012  07:42 PM    <DIR>                       .
09/15/2012  07:42 PM    <DIR>                       ..
09/15/2012  07:42 PM                 0 123456~1.123 123456789.123
               1 File(s)              0 bytes
               2 Dir(s)  327,237,562,368 bytes free

C:\test>ren *1* 2*3.?x

C:\test>dir /x
 Volume in drive C is OS
 Volume Serial Number is EE2C-5A11

 Directory of C:\test

09/15/2012  07:42 PM    <DIR>                       .
09/15/2012  07:42 PM    <DIR>                       ..
09/15/2012  07:42 PM                 0 223456~1.XX  223456789.123.xx
               1 File(s)              0 bytes
               2 Dir(s)  327,237,562,368 bytes free

REM Expected result = 223456789.123.x

I believe the sourceMask *1* first matches the long file name, and the file is renamed to the expected result of 223456789.123.x. RENAME then continues to look for more files to process and finds the newly named file via the new short name of 223456~1.X. The file is then renamed again giving the final result of 223456789.123.xx.

If I disable 8.3 name generation then the RENAME gives the expected result.

I haven't fully worked out all of the trigger conditions that must exist to induce this odd behavior. I was concerned that it might be possible to create a never ending recursive RENAME, but I was never able to induce one.

I believe all of the following must be true to induce the bug. Every bugged case I saw had the following conditions, but not all cases that met the following conditions were bugged.

  • Short 8.3 names must be enabled
  • The sourceMask must match the original long name.
  • The initial rename must generate a short name that also matches the sourceMask
  • The initial renamed short name must sort later than the original short name (if it existed?)

Similar to exebook, here's a C# implementation to get the target filename from a sourcefile.

I found 1 small error in dbenham's examples:

 ren  *_*  *_NEW.*
   abc_newt_1.dat  ->  abc_newt_NEW.txt (should be: abd_newt_NEW.dat)

Here's the code:

    /// <summary>
    /// Returns a filename based on the sourcefile and the targetMask, as used in the second argument in rename/copy operations.
    /// targetMask may contain wildcards (* and ?).
    /// 
    /// This follows the rules of: http://superuser.com/questions/475874/how-does-the-windows-rename-command-interpret-wildcards
    /// </summary>
    /// <param name="sourcefile">filename to change to target without wildcards</param>
    /// <param name="targetMask">mask with wildcards</param>
    /// <returns>a valid target filename given sourcefile and targetMask</returns>
    public static string GetTargetFileName(string sourcefile, string targetMask)
    {
        if (string.IsNullOrEmpty(sourcefile))
            throw new ArgumentNullException("sourcefile");

        if (string.IsNullOrEmpty(targetMask))
            throw new ArgumentNullException("targetMask");

        if (sourcefile.Contains('*') || sourcefile.Contains('?'))
            throw new ArgumentException("sourcefile cannot contain wildcards");

        // no wildcards: return complete mask as file
        if (!targetMask.Contains('*') && !targetMask.Contains('?'))
            return targetMask;

        var maskReader = new StringReader(targetMask);
        var sourceReader = new StringReader(sourcefile);
        var targetBuilder = new StringBuilder();


        while (maskReader.Peek() != -1)
        {

            int current = maskReader.Read();
            int sourcePeek = sourceReader.Peek();
            switch (current)
            {
                case '*':
                    int next = maskReader.Read();
                    switch (next)
                    {
                        case -1:
                        case '?':
                            // Append all remaining characters from sourcefile
                            targetBuilder.Append(sourceReader.ReadToEnd());
                            break;
                        default:
                            // Read source until the last occurrance of 'next'.
                            // We cannot seek in the StringReader, so we will create a new StringReader if needed
                            string sourceTail = sourceReader.ReadToEnd();
                            int lastIndexOf = sourceTail.LastIndexOf((char) next);
                            // If not found, append everything and the 'next' char
                            if (lastIndexOf == -1)
                            {
                                targetBuilder.Append(sourceTail);
                                targetBuilder.Append((char) next);

                            }
                            else
                            {
                                string toAppend = sourceTail.Substring(0, lastIndexOf + 1);
                                string rest = sourceTail.Substring(lastIndexOf + 1);
                                sourceReader.Dispose();
                                // go on with the rest...
                                sourceReader = new StringReader(rest);
                                targetBuilder.Append(toAppend);
                            }
                            break;
                    }

                    break;
                case '?':
                    if (sourcePeek != -1 && sourcePeek != '.')
                    {
                        targetBuilder.Append((char)sourceReader.Read());
                    }
                    break;
                case '.':
                    // eat all characters until the dot is found
                    while (sourcePeek != -1 && sourcePeek != '.')
                    {
                        sourceReader.Read();
                        sourcePeek = sourceReader.Peek();
                    }

                    targetBuilder.Append('.');
                    // need to eat the . when we peeked it
                    if (sourcePeek == '.')
                        sourceReader.Read();

                    break;
                default:
                    if (sourcePeek != '.') sourceReader.Read(); // also consume the source's char if not .
                    targetBuilder.Append((char)current);
                    break;
            }

        }

        sourceReader.Dispose();
        maskReader.Dispose();
        return targetBuilder.ToString().TrimEnd('.', ' ');
    }

And here's an NUnit test method to test the examples:

    [Test]
    public void TestGetTargetFileName()
    {
        string targetMask = "?????.?????";
        Assert.AreEqual("a", FileUtil.GetTargetFileName("a", targetMask));
        Assert.AreEqual("a.b", FileUtil.GetTargetFileName("a.b", targetMask));
        Assert.AreEqual("a.b", FileUtil.GetTargetFileName("a.b.c", targetMask));
        Assert.AreEqual("part1.part2", FileUtil.GetTargetFileName("part1.part2.part3", targetMask));
        Assert.AreEqual("12345.12345", FileUtil.GetTargetFileName("123456.123456.123456", targetMask));

        targetMask = "A?Z*";
        Assert.AreEqual("AZ", FileUtil.GetTargetFileName("1", targetMask));
        Assert.AreEqual("A2Z", FileUtil.GetTargetFileName("12", targetMask));
        Assert.AreEqual("AZ.txt", FileUtil.GetTargetFileName("1.txt", targetMask));
        Assert.AreEqual("A2Z.txt", FileUtil.GetTargetFileName("12.txt", targetMask));
        Assert.AreEqual("A2Z", FileUtil.GetTargetFileName("123", targetMask));
        Assert.AreEqual("A2Z.txt", FileUtil.GetTargetFileName("123.txt", targetMask));
        Assert.AreEqual("A2Z4", FileUtil.GetTargetFileName("1234", targetMask));
        Assert.AreEqual("A2Z4.txt", FileUtil.GetTargetFileName("1234.txt", targetMask));

        targetMask = "*.txt";
        Assert.AreEqual("a.txt", FileUtil.GetTargetFileName("a", targetMask));
        Assert.AreEqual("b.txt", FileUtil.GetTargetFileName("b.dat", targetMask));
        Assert.AreEqual("c.x.txt", FileUtil.GetTargetFileName("c.x.y", targetMask));

        targetMask = "*?.bak";
        Assert.AreEqual("a.bak", FileUtil.GetTargetFileName("a", targetMask));
        Assert.AreEqual("b.dat.bak", FileUtil.GetTargetFileName("b.dat", targetMask));
        Assert.AreEqual("c.x.y.bak", FileUtil.GetTargetFileName("c.x.y", targetMask));

        targetMask = "*_NEW.*";
        Assert.AreEqual("abcd_NEW.txt", FileUtil.GetTargetFileName("abcd_12345.txt", targetMask));
        Assert.AreEqual("abc_newt_NEW.dat", FileUtil.GetTargetFileName("abc_newt_1.dat", targetMask));
        Assert.AreEqual("abcd_123.a_NEW", FileUtil.GetTargetFileName("abcd_123.a_b", targetMask));

        targetMask = "?x.????999.*rForTheCourse";

        Assert.AreEqual("px.part999.rForTheCourse", FileUtil.GetTargetFileName("part1.part2", targetMask));
        Assert.AreEqual("px.part999.parForTheCourse", FileUtil.GetTargetFileName("part1.part2.part3", targetMask));
        Assert.AreEqual("ax.b999.crForTheCourse", FileUtil.GetTargetFileName("a.b.c", targetMask));
        Assert.AreEqual("ax.b999.CarParForTheCourse", FileUtil.GetTargetFileName("a.b.CarPart3BEER", targetMask));

    }