How do I create an incrementing filename in Python?

Solution 1:

I would iterate through sample[int].xml for example and grab the next available name that is not used by a file or directory.

import os

i = 0
while os.path.exists("sample%s.xml" % i):
    i += 1

fh = open("sample%s.xml" % i, "w")
....

That should give you sample0.xml initially, then sample1.xml, etc.

Note that the relative file notation by default relates to the file directory/folder you run the code from. Use absolute paths if necessary. Use os.getcwd() to read your current dir and os.chdir(path_to_dir) to set a new current dir.

Solution 2:

Sequentially checking each file name to find the next available one works fine with small numbers of files, but quickly becomes slower as the number of files increases.

Here is a version that finds the next available file name in log(n) time:

import os

def next_path(path_pattern):
    """
    Finds the next free path in an sequentially named list of files

    e.g. path_pattern = 'file-%s.txt':

    file-1.txt
    file-2.txt
    file-3.txt

    Runs in log(n) time where n is the number of existing files in sequence
    """
    i = 1

    # First do an exponential search
    while os.path.exists(path_pattern % i):
        i = i * 2

    # Result lies somewhere in the interval (i/2..i]
    # We call this interval (a..b] and narrow it down until a + 1 = b
    a, b = (i // 2, i)
    while a + 1 < b:
        c = (a + b) // 2 # interval midpoint
        a, b = (c, b) if os.path.exists(path_pattern % c) else (a, c)

    return path_pattern % b

To measure the speed improvement I wrote a small test function that creates 10,000 files:

for i in range(1,10000):
    with open(next_path('file-%s.foo'), 'w'):
        pass

And implemented the naive approach:

def next_path_naive(path_pattern):
    """
    Naive (slow) version of next_path
    """
    i = 1
    while os.path.exists(path_pattern % i):
        i += 1
    return path_pattern % i

And here are the results:

Fast version:

real    0m2.132s
user    0m0.773s
sys 0m1.312s

Naive version:

real    2m36.480s
user    1m12.671s
sys 1m22.425s

Finally, note that either approach is susceptible to race conditions if multiple actors are trying to create files in the sequence at the same time.

Solution 3:

def get_nonexistant_path(fname_path):
    """
    Get the path to a filename which does not exist by incrementing path.

    Examples
    --------
    >>> get_nonexistant_path('/etc/issue')
    '/etc/issue-1'
    >>> get_nonexistant_path('whatever/1337bla.py')
    'whatever/1337bla.py'
    """
    if not os.path.exists(fname_path):
        return fname_path
    filename, file_extension = os.path.splitext(fname_path)
    i = 1
    new_fname = "{}-{}{}".format(filename, i, file_extension)
    while os.path.exists(new_fname):
        i += 1
        new_fname = "{}-{}{}".format(filename, i, file_extension)
    return new_fname

Before you open the file, call

fname = get_nonexistant_path("sample.xml")

This will either give you 'sample.xml' or - if this alreay exists - 'sample-i.xml' where i is the lowest positive integer such that the file does not already exist.

I recommend using os.path.abspath("sample.xml"). If you have ~ as home directory, you might need to expand it first.

Please note that race conditions might occur with this simple code if you have multiple instances running at the same time. If this might be a problem, please check this question.

Solution 4:

Try setting a count variable, and then incrementing that variable nested inside the same loop you write your file in. Include the count loop inside the name of the file with an escape character, so every loop ticks +1 and so does the number in the file.

Some code from a project I just finished:

numberLoops = #some limit determined by the user
currentLoop = 1
while currentLoop < numberLoops:
    currentLoop = currentLoop + 1

    fileName = ("log%d_%d.txt" % (currentLoop, str(now())))

For reference:

from time import mktime, gmtime

def now(): 
   return mktime(gmtime()) 

which is probably irrelevant in your case but i was running multiple instances of this program and making tons of files. Hope this helps!

Solution 5:

Without storing state data in an extra file, a quicker solution to the ones presented here would be to do the following:

from glob import glob
import os

files = glob("somedir/sample*.xml")
files = files.sorted()
cur_num = int(os.path.basename(files[-1])[6:-4])
cur_num += 1
fh = open("somedir/sample%s.xml" % cur_num, 'w')
rs = [blockresult]
fh.writelines(rs)
fh.close()

This will also keep incrementing, even if some of the lower numbered files disappear.

The other solution here that I like (pointed out by Eiyrioü) is the idea of keeping a temporary file that contains your most recent number:

temp_fh = open('somedir/curr_num.txt', 'r')
curr_num = int(temp_fh.readline().strip())
curr_num += 1
fh = open("somedir/sample%s.xml" % cur_num, 'w')
rs = [blockresult]
fh.writelines(rs)
fh.close()