How can I iterate over files in a given directory?

I need to iterate through all .asm files inside a given directory and do some actions on them.

How can this be done in a efficient way?


Solution 1:

Python 3.6 version of the above answer, using os - assuming that you have the directory path as a str object in a variable called directory_in_str:

import os

directory = os.fsencode(directory_in_str)
    
for file in os.listdir(directory):
     filename = os.fsdecode(file)
     if filename.endswith(".asm") or filename.endswith(".py"): 
         # print(os.path.join(directory, filename))
         continue
     else:
         continue

Or recursively, using pathlib:

from pathlib import Path

pathlist = Path(directory_in_str).glob('**/*.asm')
for path in pathlist:
     # because path is object not string
     path_in_str = str(path)
     # print(path_in_str)
  • Use rglob to replace glob('**/*.asm') with rglob('*.asm')
    • This is like calling Path.glob() with '**/' added in front of the given relative pattern:
from pathlib import Path

pathlist = Path(directory_in_str).rglob('*.asm')
for path in pathlist:
     # because path is object not string
     path_in_str = str(path)
     # print(path_in_str)

Original answer:

import os

for filename in os.listdir("/path/to/dir/"):
    if filename.endswith(".asm") or filename.endswith(".py"): 
         # print(os.path.join(directory, filename))
        continue
    else:
        continue

Solution 2:

This will iterate over all descendant files, not just the immediate children of the directory:

import os

for subdir, dirs, files in os.walk(rootdir):
    for file in files:
        #print os.path.join(subdir, file)
        filepath = subdir + os.sep + file

        if filepath.endswith(".asm"):
            print (filepath)

Solution 3:

You can try using glob module:

import glob

for filepath in glob.iglob('my_dir/*.asm'):
    print(filepath)

and since Python 3.5 you can search subdirectories as well:

glob.glob('**/*.txt', recursive=True) # => ['2.txt', 'sub/3.txt']

From the docs:

The glob module finds all the pathnames matching a specified pattern according to the rules used by the Unix shell, although results are returned in arbitrary order. No tilde expansion is done, but *, ?, and character ranges expressed with [] will be correctly matched.

Solution 4:

Since Python 3.5, things are much easier with os.scandir() and 2-20x faster (source):

with os.scandir(path) as it:
    for entry in it:
        if entry.name.endswith(".asm") and entry.is_file():
            print(entry.name, entry.path)

Using scandir() instead of listdir() can significantly increase the performance of code that also needs file type or file attribute information, because os.DirEntry objects expose this information if the operating system provides it when scanning a directory. All os.DirEntry methods may perform a system call, but is_dir() and is_file() usually only require a system call for symbolic links; os.DirEntry.stat() always requires a system call on Unix but only requires one for symbolic links on Windows.