How to sort by file name the same way Windows Explorer does?
Solution 1:
Here is some very short code (just the $ToNatural
script block) that does the trick with a regular expression and a match evaluator in order to pad the numbers with spaces. Then we sort the input with padded numbers as usual and actually get natural order as a result.
$ToNatural = { [regex]::Replace($_, '\d+', { $args[0].Value.PadLeft(20) }) }
'----- test 1 ASCIIbetical order'
Get-Content list.txt | Sort-Object
'----- test 2 input with padded numbers'
Get-Content list.txt | %{ . $ToNatural }
'----- test 3 Natural order: sorted with padded numbers'
Get-Content list.txt | Sort-Object $ToNatural
Output:
----- test 1 ASCIIbetical order
1.txt
10.txt
3.txt
a10b1.txt
a1b1.txt
a2b1.txt
a2b11.txt
a2b2.txt
b1.txt
b10.txt
b2.txt
----- test 2 input with padded numbers
1.txt
10.txt
3.txt
a 10b 1.txt
a 1b 1.txt
a 2b 1.txt
a 2b 11.txt
a 2b 2.txt
b 1.txt
b 10.txt
b 2.txt
----- test 3 Natural order: sorted with padded numbers
1.txt
3.txt
10.txt
a1b1.txt
a2b1.txt
a2b2.txt
a2b11.txt
a10b1.txt
b1.txt
b2.txt
b10.txt
And finally we use this one-liner to sort files by names in natural order:
Get-ChildItem | Sort-Object { [regex]::Replace($_.Name, '\d+', { $args[0].Value.PadLeft(20) }) }
Output:
Directory: C:\TEMP\_110325_063356
Mode LastWriteTime Length Name
---- ------------- ------ ----
-a--- 2011-03-25 06:34 8 1.txt
-a--- 2011-03-25 06:34 8 3.txt
-a--- 2011-03-25 06:34 8 10.txt
-a--- 2011-03-25 06:34 8 a1b1.txt
-a--- 2011-03-25 06:34 8 a2b1.txt
-a--- 2011-03-25 06:34 8 a2b2.txt
-a--- 2011-03-25 06:34 8 a2b11.txt
-a--- 2011-03-25 06:34 8 a10b1.txt
-a--- 2011-03-25 06:34 8 b1.txt
-a--- 2011-03-25 06:34 8 b2.txt
-a--- 2011-03-25 06:34 8 b10.txt
-a--- 2011-03-25 04:54 99 list.txt
-a--- 2011-03-25 06:05 346 sort-natural.ps1
-a--- 2011-03-25 06:35 96 test.ps1
Solution 2:
Allow me to copy and paste my answer from another question.
Powershell Sort-Object Name with numbers doesn't properly
Windows explorer is using a legacy API from shlwapi.dll which called StrCmpLogicalW
, that's the reason seeing different sorting results.
I don't want to pad zeros, so wrote a script.
https://github.com/LarrysGIT/Powershell-Natural-sort
Since I am not a C# expert, pull requests are appreciated if it's not tidy.
Find following PowerShell script, it uses the same API.
function Sort-Naturally
{
PARAM(
[System.Collections.ArrayList]$Array,
[switch]$Descending
)
Add-Type -TypeDefinition @'
using System;
using System.Collections;
using System.Collections.Generic;
using System.Runtime.InteropServices;
namespace NaturalSort {
public static class NaturalSort
{
[DllImport("shlwapi.dll", CharSet = CharSet.Unicode)]
public static extern int StrCmpLogicalW(string psz1, string psz2);
public static System.Collections.ArrayList Sort(System.Collections.ArrayList foo)
{
foo.Sort(new NaturalStringComparer());
return foo;
}
}
public class NaturalStringComparer : IComparer
{
public int Compare(object x, object y)
{
return NaturalSort.StrCmpLogicalW(x.ToString(), y.ToString());
}
}
}
'@
$Array.Sort((New-Object NaturalSort.NaturalStringComparer))
if($Descending)
{
$Array.Reverse()
}
return $Array
}
Find test results below.
PS> # Natural sort
PS> . .\NaturalSort.ps1
PS> Sort-Naturally -Array @('2', '1', '11')
1
2
11
PS> # If regular sort is being used
PS> @('2', '1', '11') | Sort-Object
1
11
2
PS> # Not good
PS> $t = ls .\testfiles\*.txt
PS> $t | Sort-Object
1.txt
10.txt
2.txt
PS> # Good
PS> Sort-Naturally -Array $t
1.txt
2.txt
10.txt
Solution 3:
Translation from python to PowerShell works pretty well:
function sort-dir {
param($dir)
$toarray = {
@($_.BaseName -split '(\d+)' | ?{$_} |
% { if ([int]::TryParse($_,[ref]$null)) { [int]$_ } else { $_ } })
}
gci $dir | sort -Property $toarray
}
#try it
mkdir $env:TEMP\mytestsodir
1..10 + 100..105 | % { '' | Set-Content $env:TEMP\mytestsodir\$_.txt }
sort-dir $env:TEMP\mytestsodir
Remove-Item $env:TEMP\mytestsodir -Recurse
You can do it even better when you use Proxy function approach. You add -natur
parameter to Sort-Object
and you have pretty beautiful solution.
Update: First I was quite surprised that PowerShell handles comparing arrays in this way. After I tried to create test files ("a0", "a100", "a2") + 1..10 + 100..105 | % { '' | Set-Content $env:TEMP\mytestsodir\$_.txt }
, it turned out that it doesn't work. So, I think there is no elegant solution like, because PowerShell is static under the covers, whereas python is dynamic.
Solution 4:
I prefer @Larry Song's answer because it sorts exactly the way Windows Explorer does. I tried to simplify it a little to make it less intrusive.
Add-Type -TypeDefinition @"
using System.Runtime.InteropServices;
public static class NaturalSort
{
[DllImport("Shlwapi.dll", CharSet = CharSet.Unicode)]
private static extern int StrCmpLogicalW(string psz1, string psz2);
public static string[] Sort(string[] array)
{
System.Array.Sort(array, (psz1, psz2) => StrCmpLogicalW(psz1, psz2));
return array;
}
}
"@
Then you can use it like:
$array = ('1.jpg', '10.jpg', '2.jpg')
[NaturalSort]::Sort($array)
which outputs:
1.jpg
2.jpg
10.jpg