Sort filenames in directory in ascending order [duplicate]
I have a directory with jpgs and other files in it, the jpgs all have filenames with numbers in them. Some may have additional strings in the filename.
For example.
01.jpg
Or it could be
Picture 03.jpg
In Python I need a list of all the jpgs in ascending order. Here is the code snippet for this
import os
import numpy as np
myimages = [] #list of image filenames
dirFiles = os.listdir('.') #list of directory files
dirFiles.sort() #good initial sort but doesnt sort numerically very well
sorted(dirFiles) #sort numerically in ascending order
for files in dirFiles: #filter out all non jpgs
if '.jpg' in files:
myimages.append(files)
print len(myimages)
print myimages
What I get is this
['0.jpg', '1.jpg', '10.jpg', '11.jpg', '12.jpg', '13.jpg', '14.jpg',
'15.jpg', '16.jpg', '17.jpg', '18.jpg', '19.jpg', '2.jpg', '20.jpg',
'21.jpg', '22.jpg', '23.jpg', '24.jpg', '25.jpg', '26.jpg', '27.jpg',
'28.jpg', '29.jpg', '3.jpg', '30.jpg', '31.jpg', '32.jpg', '33.jpg',
'34.jpg', '35.jpg', '36.jpg', '37.jpg', '4.jpg', '5.jpg', '6.jpg',
'7.jpg', '8.jpg', '9.jpg']
Clearly it sorts blindly the most significant number first. I tried using sorted()
as you can see hoping that it would fix it but it makes no difference.
Assuming there's just one number in each file name:
>>> dirFiles = ['Picture 03.jpg', '02.jpg', '1.jpg']
>>> dirFiles.sort(key=lambda f: int(filter(str.isdigit, f)))
>>> dirFiles
['1.jpg', '02.jpg', 'Picture 03.jpg']
A version that also works in Python 3:
>>> dirFiles.sort(key=lambda f: int(re.sub('\D', '', f)))
there is a module natsort
. Just do pip install natsort
.
>>> import natsort
>>> ll = ['Picture 13.jpg', 'Picture 14.jpg', 'Picture 15.jpg','Picture 0.jpg', 'Picture 1.jpg', 'Picture 10.jpg', 'Picture 11.jpg', 'Picture 12.jpg', 'Picture 16.jpg', 'Picture 17.jpg', 'Picture 18.jpg', 'Picture 19.jpg', 'Picture 2.jpg', 'Picture 20.jpg', 'Picture 21.jpg', 'Picture 22.jpg', 'Picture 23.jpg', 'Picture 24.jpg', 'Picture 25.jpg', 'Picture 26.jpg', 'Picture 27.jpg', 'Picture 28.jpg', 'Picture 29.jpg', 'Picture 3.jpg', 'Picture 30.jpg', 'Picture 31.jpg', 'Picture 32.jpg', 'Picture 33.jpg', 'Picture 34.jpg', 'Picture 35.jpg', 'Picture 36.jpg', 'Picture 37.jpg']
>>> print(natsort.natsorted(ll,reverse=True))
['Picture 37.jpg', 'Picture 36.jpg', 'Picture 35.jpg', 'Picture 34.jpg', 'Picture 33.jpg', 'Picture 32.jpg', 'Picture 31.jpg', 'Picture 30.jpg', 'Picture 29.jpg', 'Picture 28.jpg', 'Picture 27.jpg', 'Picture 26.jpg', 'Picture 25.jpg', 'Picture 24.jpg', 'Picture 23.jpg', 'Picture 22.jpg', 'Picture 21.jpg', 'Picture 20.jpg', 'Picture 19.jpg', 'Picture 18.jpg', 'Picture 17.jpg', 'Picture 16.jpg', 'Picture 15.jpg', 'Picture 14.jpg', 'Picture 13.jpg', 'Picture 12.jpg', 'Picture 11.jpg', 'Picture 10.jpg', 'Picture 3.jpg', 'Picture 2.jpg', 'Picture 1.jpg', 'Picture 0.jpg']
I have a directory with jpgs and other files in it.
[...]
['0.jpg', '1.jpg', '10.jpg', '11.jpg', '12.jpg', '13.jpg', '14.jpg', '15.jpg', '16.jpg', '17.jpg', '18.jpg', '19.jpg', '2.jpg', '20.jpg', '21.jpg', '22.jpg', '23.jpg', '24.jpg', '25.jpg', '26.jpg', '27.jpg', '28.jpg', '29.jpg', '3.jpg', '30.jpg', '31.jpg', '32.jpg', '33.jpg', '34.jpg', '35.jpg', '36.jpg', '37.jpg', '4.jpg', '5.jpg', '6.jpg', '7.jpg', '8.jpg', '9.jpg'] Clearly it sorts blindly the most significant number first. I tried using sorted() as you can see hoping that it would fix it but it makes no difference
You can use splitext to get the part without the extension and convert it to an int for the sorting. If the list is named 'l' and the sorted list is named 'lsorted' you can use:
lsorted = sorted(imgs_list, key=lambda x: int(os.path.splitext(x)[0]))
"imgs_list" here is the list of images. If you have a directory of images, simply obtain a list of these images by :
l = os.listdir('/path/to/directory/of/images')
Explanation: os.path.splitext on '10.jpg' returns ['10','.jpg'] so taking the int() of element zero will give you want you want as long as the filenames without the extention only contain strings that can be converted to integers with int(). Otherwise you will run into an Error.