Python enumerate() tqdm progress-bar when reading a file?

I can't see the tqdm progress bar when I use this code to iterate my opened file:

    with open(file_path, 'r') as f:
        for i, line in enumerate(tqdm(f)):
            if i >= start and i <= end:
                print("line #: %s" % i)
                for i in tqdm(range(0, line_size, batch_size)):
                    # pause if a file named pause is found in the current dir
                    re_batch = {}
                    for j in range(batch_size):
                        re_batch[j] = re.search(line, last_span)

What's the right way to use tqdm here?


You're on the right track. You're using tqdm correctly, but avoid calling print inside the loop while the bar is active, because each printed line forces the bar to be redrawn on a new line. You'll also want to wrap only the outer for loop in tqdm, not the inner ones, like so:

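import re
from tqdm import tqdm

with open(file_path, 'r') as f:
    # wrap only the outer loop; enumerate goes outside tqdm
    for i, line in enumerate(tqdm(f)):
        if start <= i <= end:
            # separate name so the outer line index i is not shadowed
            for offset in range(0, line_size, batch_size):
                # pause if a file named pause is found in the current dir
                re_batch = {}
                for j in range(batch_size):
                    re_batch[j] = re.search(line, last_span)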
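
If you still want to log line numbers while the bar is running, tqdm provides tqdm.write, which prints a message without breaking the bar. A minimal sketch, assuming a small in-memory list stands in for the open file and that start and end are bounds like the ones in the question (the values here are made up):

```python
from tqdm import tqdm

lines = ["alpha\n", "beta\n", "gamma\n", "delta\n"]  # stand-in for the open file
start, end = 1, 2  # hypothetical bounds, as in the question

seen = []
for i, line in enumerate(tqdm(lines)):
    if start <= i <= end:
        # tqdm.write clears the bar, prints the message, then redraws the bar
        tqdm.write("line #: %s" % i)
        seen.append(i)
```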

Some notes on using enumerate and its usage in tqdm here.
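
The gist of those notes, as I understand them: put tqdm around the iterable and enumerate around tqdm. An enumerate object has no length, so tqdm(enumerate(x)) cannot infer a total, while enumerate(tqdm(x)) can whenever x itself has one. A small sketch with a list standing in for the data (a file object has no length either, so for files the total still has to be supplied separately):

```python
from tqdm import tqdm

data = ["a", "b", "c"]

inner = tqdm(data)             # tqdm sees the list, so it can call len()
outer = tqdm(enumerate(data))  # tqdm sees an enumerate object, which has no len()
inner_total, outer_total = inner.total, outer.total
inner.close()
outer.close()

# Preferred form: the bar wraps the iterable, enumerate wraps the bar
indexed = [(i, item) for i, item in enumerate(tqdm(data))]
```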


I ran into this as well. tqdm shows only a running count instead of a progress bar, because it has not been given the number of lines in the file object.

The for loop iterates over the file lazily, reading up to the next newline character each time, so the total number of lines is not known in advance.

To get the progress bar, you first need to scan the file and count the number of lines, then pass that count to tqdm as the total:

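from tqdm import tqdm

# first pass: count the lines (the with block closes the file afterwards)
with open('myfile.txt', 'r') as f:
    num_lines = sum(1 for _ in f)

# second pass: process the lines with a known total
with open('myfile.txt', 'r') as f:
    for line in tqdm(f, total=num_lines):
        print(line)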
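
On large files the counting pass can be made cheaper by reading the file in binary chunks and counting newline bytes instead of decoding every line. A minimal sketch, assuming "\n" or "\r\n" line endings; count_lines and its chunk_size parameter are hypothetical names of mine, and a final line without a trailing newline is counted too so the total matches what the for loop yields:

```python
import os
import tempfile

def count_lines(path, chunk_size=1024 * 1024):
    """Count lines by counting newline bytes in fixed-size binary chunks."""
    count = 0
    last_byte = b""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            count += chunk.count(b"\n")
            last_byte = chunk[-1:]
    # A non-empty file whose last line lacks a trailing newline has one more line
    if last_byte and last_byte != b"\n":
        count += 1
    return count

# Small self-check on a temporary file (three lines, the last one unterminated)
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"first\nsecond\nthird")
num_lines = count_lines(tmp.name)
os.remove(tmp.name)
```

The result can then be passed to tqdm as total=num_lines in the same way as the count above.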

I'm trying to do the same thing on a file containing all Wikipedia articles, so I don't want to count the total lines before I start processing. It's also a bz2-compressed file, so the length of each decompressed line overestimates the number of compressed bytes read in that iteration. Instead, I size the bar by the file's size on disk and update it from the file position:

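import bz2
from pathlib import Path
from tqdm import tqdm

# fin.tell() reports the position in the decompressed stream, so read the
# position of the underlying compressed file to match the size on disk
with tqdm(total=Path(filepath).stat().st_size) as pbar:
    with open(filepath, 'rb') as raw, bz2.open(raw) as fin:
        for line in fin:
            pbar.update(raw.tell() - pbar.n)

    # used this to figure out the attributes of the pbar instance
    # print(vars(pbar))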
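
For an uncompressed file the same byte-based idea is simpler, since in binary mode len(line) is exactly the number of bytes consumed. A sketch under that assumption; the temporary file only stands in for real input, and unit="B" with unit_scale=True only changes how tqdm formats the count:

```python
import os
import tempfile
from tqdm import tqdm

# Stand-in input file so the example is self-contained
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"one\ntwo\nthree\n")
path = tmp.name

total_bytes = os.path.getsize(path)
with tqdm(total=total_bytes, unit="B", unit_scale=True) as pbar:
    with open(path, "rb") as fin:  # binary mode: len(line) is a byte count
        for line in fin:
            pbar.update(len(line))
    bytes_seen = pbar.n
os.remove(path)
```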

Thank you Yohan Kuanke for your deleted answer. If moderators undelete it you can crib mine.


When reading a file with readlines(), the following can be used:

from tqdm import tqdm
with open(filename) as f:
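    sentences = tqdm(f.readlines(), unit='line')

readlines() returns a list, so tqdm can take the total from its length and draw a full bar, at the cost of loading the whole file into memory. Note that unit is only a display label: the bar counts lines here, so a label such as 'MB' would not convert the count to bytes.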