How to order list of files by file name with number?
I have a bunch of files in a directory that I am trying to get based on their type. Once I have them, I would like to order them by file name (there is a number in each name, and I would like to order by that number).
My files returned are:
file-1.txt
file-2.txt
...
file-10.txt
file-11.txt
...
file-20.txt
But the order I get them in looks more like this:
file-1.txt
file-10.txt
file-11.txt
...
file-2.txt
file-20.txt
Right now I am using Directory.GetFiles() and attempting to use the LINQ OrderBy method. However, I am failing pretty badly at working out what I need to do to order my list of files like the first list above.
Directory.GetFiles() seems to return an array of strings, so I am unable to get file properties such as filename or name.
Here is my code currently:
documentPages = Directory.GetFiles(documentPath, "*.txt").OrderBy(Function(p) p).ToList()
Would anyone have any ideas?
It sounds like you might be looking for a "NaturalSort" - the kind of display File Explorer uses to order filenames containing numerals. For this you need a custom comparer:
Imports System.Runtime.InteropServices

Partial Class NativeMethods
    <DllImport("shlwapi.dll", CharSet:=CharSet.Unicode)>
    Private Shared Function StrCmpLogicalW(s1 As String, s2 As String) As Int32
    End Function

    Friend Shared Function NaturalStringCompare(str1 As String, str2 As String) As Int32
        Return StrCmpLogicalW(str1, str2)
    End Function
End Class

Public Class NaturalStringComparer
    Implements IComparer(Of String)

    Public Function Compare(x As String, y As String) As Integer Implements IComparer(Of String).Compare
        Return NativeMethods.NaturalStringCompare(x, y)
    End Function
End Class
Use it to sort the results you get:
Dim myComparer As New NaturalStringComparer
' The OP's post only shows the file name without the path, so strip off the path:
' (won't affect the result, just the display)
Dim files = Directory.EnumerateFiles(path_name_here).
        Select(Function(s) Path.GetFileName(s)).ToList
Console.WriteLine("Before: {0}", String.Join(", ", files))
' sort the list using the Natural Comparer:
files.Sort(myComparer)
Console.WriteLine("After: {0}", String.Join(", ", files))
Results (one-lined to save space):
Before: file-1.txt, file-10.txt, file-11.txt, file-19.txt, file-2.txt, file-20.txt, file-3.txt, file-9.txt, file-99.txt
After: file-1.txt, file-2.txt, file-3.txt, file-9.txt, file-10.txt, file-11.txt, file-19.txt, file-20.txt, file-99.txt
One of the advantages of this is that it doesn't rely on a specific pattern or coding. It is more general-purpose and will handle more than one set of numbers in the text:
Game of Thrones\4 - A Feast For Crows\1 - Prologue.mp3
Game of Thrones\4 - A Feast For Crows\2 - The Prophet.mp3
...
Game of Thrones\4 - A Feast For Crows\10 - Brienne II.mp3
Game of Thrones\4 - A Feast For Crows\11 - Sansa.mp3
A Natural String Sort is so handy that it is something I personally don't mind polluting IntelliSense with by creating an extension:
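Imports System.Runtime.CompilerServices

' Extension methods in VB must be declared in a Module
Public Module NaturalSortExtensions
    ' List(Of String) version (sorts in place and returns the same list)
    <Extension>
    Public Function ToNaturalSort(l As List(Of String)) As List(Of String)
        l.Sort(New NaturalStringComparer())
        Return l
    End Function

    ' array version (sorts in place and returns the same array)
    <Extension>
    Public Function ToNaturalSort(a As String()) As String()
        Array.Sort(a, New NaturalStringComparer())
        Return a
    End Function
End Module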
Usage now is even easier:
Dim files = Directory.EnumerateFiles(your_path).
Select(Function(s) Path.GetFileName(s)).
ToList.
ToNaturalSort()
' or without the path stripping:
Dim files = Directory.EnumerateFiles(your_path).ToList.ToNaturalSort()
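Since the question uses LINQ's OrderBy, the same comparer can also be passed to the OrderBy overload that accepts an IComparer. The following is only a minimal sketch: the comparer is repeated in compact form so the block compiles on its own, the module name and the in-memory sample names are hypothetical stand-ins for the result of Directory.GetFiles(), and it assumes the code runs on Windows, because shlwapi.dll is a Windows library.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Runtime.InteropServices

' Compact copy of the comparer shown above, repeated so this sketch compiles alone.
Public Class NaturalStringComparer
    Implements IComparer(Of String)

    <DllImport("shlwapi.dll", CharSet:=CharSet.Unicode)>
    Private Shared Function StrCmpLogicalW(s1 As String, s2 As String) As Integer
    End Function

    Public Function Compare(x As String, y As String) As Integer Implements IComparer(Of String).Compare
        Return StrCmpLogicalW(x, y)
    End Function
End Class

' Hypothetical demo module, for illustration only.
Module NaturalOrderByDemo
    Sub Main()
        ' Sample names standing in for Directory.GetFiles(documentPath, "*.txt").
        Dim names As String() = {"file-10.txt", "file-2.txt", "file-1.txt", "file-20.txt"}

        ' OrderBy leaves the source array untouched and returns a new sorted list.
        Dim documentPages As List(Of String) =
            names.OrderBy(Function(p) p, New NaturalStringComparer()).ToList()

        Console.WriteLine(String.Join(", ", documentPages))
    End Sub
End Module
```

Unlike List.Sort and Array.Sort, which sort in place, this variant leaves the original collection unchanged, which may be preferable when the unsorted result is still needed.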
I'm assuming the file and .txt parts are mutable, and are just here as placeholders for file names and types that can vary.
I don't use regular expressions very often, so this may need some work yet, but it's definitely the direction you need to go:
Dim exp As String = "-([0-9]+)[.][^.]*$"
documentPages = Directory.GetFiles(documentPath, "*.txt").OrderBy(Function(p) Integer.Parse(Regex.Matches(p, exp)(0).Groups(1).Value)).ToList()
Looking again, I see I missed that you are filtering by *.txt files, which can help us narrow the expression:
Dim exp As String = "-([0-9]+)[.]txt$"
Another possible improvement, prompted by the other answer that includes test data, is to allow for whitespace between the - and the numerals:
Dim exp As String = "-[ ]*([0-9]+)[.]txt$"
It's further worth noting that the above will throw an exception if there are text files that don't follow the pattern, because there is no first match to read the number from. We can account for that if needed:
Dim exp As String = "-[ ]*([0-9]+)[.][^.]*$"
Dim docs = Directory.GetFiles(documentPath, "*.txt")
documentPages = docs.OrderBy(
    Function(p)
        Dim matches As MatchCollection = Regex.Matches(p, exp)
        If matches.Count = 0 OrElse matches(0).Groups.Count < 2 Then Return 0
        Return Integer.Parse(matches(0).Groups(1).Value)
    End Function).ToList()
You could also use Integer.MaxValue as your default option, depending on whether you want those to appear at the beginning or end of the list.
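As a variation on the approach above, here is a hedged sketch that uses Regex.Match with Integer.TryParse instead of Regex.Matches with Integer.Parse, so that names without a number, or with a number too large for an Integer, fall back to a default rather than throwing. The module name, the PageNumber helper, and the sample file names are hypothetical and only there for illustration.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Text.RegularExpressions

' Hypothetical demo module, for illustration only.
Module RegexOrderDemo
    ' Same pattern as above: "-", optional spaces, digits, then the extension.
    Private ReadOnly NumberPattern As New Regex("-[ ]*([0-9]+)[.][^.]*$")

    ' Hypothetical helper: returns the trailing number, or fallback when none is found.
    Function PageNumber(fileName As String, fallback As Integer) As Integer
        Dim m As Match = NumberPattern.Match(fileName)
        Dim value As Integer
        If m.Success AndAlso Integer.TryParse(m.Groups(1).Value, value) Then Return value
        Return fallback
    End Function

    Sub Main()
        ' In-memory sample names; in practice these would come from Directory.GetFiles.
        Dim docs As String() = {"file-10.txt", "notes.txt", "file-2.txt", "file- 1.txt"}

        ' Integer.MaxValue pushes names without a number to the end of the list.
        Dim documentPages As List(Of String) =
            docs.OrderBy(Function(p) PageNumber(p, Integer.MaxValue)).ToList()

        ' Prints: file- 1.txt, file-2.txt, file-10.txt, notes.txt
        Console.WriteLine(String.Join(", ", documentPages))
    End Sub
End Module
```

Passing 0 as the fallback instead would move the non-matching names to the front of the list.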