tqdm / tqdm

:zap: A Fast, Extensible Progress Bar for Python and CLI

Home Page: https://tqdm.github.io

what's the decent way to update `desc` and `postfix`

yantaozhao opened this issue · comments

Given the code below, taken from the joblib website:

Parallel(n_jobs=2)(delayed(sqrt)(i ** 2) for i in range(10))

What is a good way to update desc and postfix if I want to show the item currently being processed in the desc or postfix field?
Is there any way to avoid manually creating a new pbar object?

Parallel(n_jobs=2)(delayed(sqrt)(i ** 2) for i in tqdm(range(10)))

You may want to have a look at the bar_format argument of tqdm. It lets you completely customize the appearance of the progress bar. The changing variables can be inserted between { and }. Example:

from math import sqrt
from tqdm import tqdm
from joblib import Parallel, delayed

bar = "{desc}: Element number {n_fmt}... | {bar} | [{elapsed}<{remaining}, {rate_fmt}{postfix}]"

a = Parallel(n_jobs=2)(delayed(sqrt)(i ** 2) for i in tqdm(range(10), bar_format=bar, desc="Process A"))
# Process A: Element number 10... | ██████████████████████████████ | [00:00<00:00, 101.86s/it]

The variables are, per documentation: l_bar, bar, r_bar, n, n_fmt, total, total_fmt, percentage, elapsed, elapsed_s, ncols, nrows, desc, unit, rate, rate_fmt, rate_noinv, rate_noinv_fmt, rate_inv, rate_inv_fmt, postfix, unit_divisor, remaining, remaining_s, eta.

Per documentation, the default bar is '{l_bar}{bar}{r_bar}' with

l_bar='{desc}: {percentage:3.0f}%|'
r_bar='| {n_fmt}/{total_fmt} [{elapsed}<{remaining}, {rate_fmt}{postfix}]'

Hope this helps.
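To make the placeholder mechanism concrete, here is a minimal sketch that expands the same templates with plain str.format. It is an illustration only, not how tqdm renders internally, and every field value below is made up for the example (tqdm computes the real ones itself).

```python
# Illustration only: expand bar_format-style templates with plain str.format.
# The field values are invented for this example; tqdm fills in the real
# elapsed time, rate, bar characters and so on.
L_BAR = "{desc}: {percentage:3.0f}%|"
R_BAR = "| {n_fmt}/{total_fmt} [{elapsed}<{remaining}, {rate_fmt}{postfix}]"

fields = {
    "desc": "Process A",
    "percentage": 50.0,
    "bar": "#####     ",
    "n_fmt": "5",
    "total_fmt": "10",
    "elapsed": "00:05",
    "remaining": "00:05",
    "rate_fmt": "1.00it/s",
    "postfix": ", current value e",
}

# The default layout is l_bar + bar + r_bar.
default_line = (L_BAR + "{bar}" + R_BAR).format(**fields)

# A custom layout may use any subset of the fields, in any order.
custom_line = "{desc}: Element number {n_fmt}... | {bar} |".format(**fields)

print(default_line)
print(custom_line)
```

Fields that a template does not mention are simply ignored, which is why one dictionary can feed both layouts.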

Thanks @CopperEagle, but what I want is a dynamic desc and postfix that update in real time.

For example, on data list ['a', 'b', 'c', 'd', 'e'], in bar:
hello d: | ██████████████████████████████ | [00:00<00:00, 101.86s/it] current value d,

The strings hello d and current value d are generated on the fly, where d is the value currently being processed.

Oh, I see @yantaozhao

In this case, we can use tqdm.set_description and tqdm.set_postfix_str. Here is an example:

from time import sleep
from tqdm import tqdm
from joblib import Parallel, delayed

# Process a single argument
def do_stuff(arg):
    return len(arg)

# Process subroutine: update the bar, then hand the work to joblib
def process(arg, pbar):
    sleep(1)
    pbar.set_description(f"Processing {arg}")
    pbar.set_postfix_str(f"The almighty {arg} is here...")
    return delayed(do_stuff)(arg)

# ... elsewhere
pbar = tqdm(['a', 'b', 'c', 'd', 'e'])
data = Parallel(n_jobs=2)(process(i, pbar) for i in pbar)

# Processing e: 100%|█████████████████████| 5/5 [00:19<00:00,  3.84s/it, The almighty e is here...]

The notable difference is that the tqdm iterator can no longer be anonymous, as it needs to be passed to the process function and updated. The traditional for loop (without joblib) would be

pbar = tqdm([ 'a', 'b', 'c', 'd', 'e' ])
for i in pbar:
    process(i, pbar)

Cheers

Good idea, @CopperEagle.
Is there any further way to avoid manually creating a new pbar object?

What I want is something like the code below (not actually runnable):

data = ['a', 'b', 'c', 'd', 'e']
Parallel(n_jobs=2)(delayed(foo)(x) for x in tqdm(data, desc=lambda i: f'hello {data[i]}', postfix=lambda i: f'current value {data[i]}'))

where i is assumed to be the element index.

Sure @yantaozhao, we can make it a wrapper function. Here's how:

def updater(pbar):
    it = iter(pbar)  # named `it` to avoid shadowing the built-in iter
    class AnonymousPbar:
        def __iter__(self):
            return self
        def __next__(self):
            arg = next(it)
            pbar.set_description(f"Processing {arg}")
            pbar.set_postfix_str(f"The almighty {arg} is here...")
            return arg
    return AnonymousPbar()

## Then you can use this anywhere:
for i in updater(tqdm(range(10))):
    sleep(1)  # placeholder for the real work
# Processing 9: 100%|███████████████████| 10/10 [00:10<00:00,  1.00s/it, The almighty 9 is here...]

However, this may look a bit convoluted when used with a for loop.
Also, the implementation hard-codes the description string. To keep it as lean as tqdm itself, we can move the creation of the tqdm object into the wrapper function.

The following is a drop-in replacement for tqdm:

def progress(iterable, desc=None, postfix=None, **kwargs):
    pbar = tqdm(iterable, **kwargs)
    class AnonymousPbar:
        def __init__(self, proc):
            self.iter = proc.__iter__()
            self.pbar = proc
        def __iter__(self):
            return self
        def __next__(self):
            arg = self.iter.__next__()
            if desc is None: 
                desc_str = ""
            elif isinstance(desc, str): # compliance with tqdm
                desc_str = desc # want simple string? you get it.
            else:
                desc_str = desc(arg)
            if postfix is None: 
                postfix_str = ""
            elif isinstance(postfix, str): # compliance with tqdm
                postfix_str = postfix # want simple string? you get it.
            else:
                postfix_str = postfix(arg)
            self.pbar.set_description(desc_str)
            self.pbar.set_postfix_str(postfix_str)
            return arg
    return AnonymousPbar(pbar)

## Then, anywhere in the code:
for i in progress(range(10)):
    sleep(1)
# 100%|█████████████████████████████| 10/10 [00:10<00:00,  1.00s/it]


## progress accepts any keyword argument that tqdm does
for i in progress(range(10), desc=lambda arg: f"Process {arg}", ncols=80):
    sleep(1)
# Process 9: 100%|█████████████████████████████| 10/10 [00:10<00:00,  1.00s/it]


## progress can be used with joblib as expected
data = [6,3,8,2]
Parallel(n_jobs=2)(delayed(sqrt)(i ** 2) for i in progress(data, ncols=80, postfix=lambda arg:f"Process {arg}"))
# 100%|█████████████████████████████| 4/4 [00:00<00:00, 1126.59it/s, Process 2]
# [6.0, 3.0, 8.0, 2.0]

The length of the bars was edited for readability in both code snippets.

Any keyword argument accepted by tqdm is supported. As an added feature, desc and postfix can each be a function (for example a lambda) that takes the element currently being processed as its argument.
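For comparison, the same idea can also be written as a generator, which avoids the helper class. This is only a sketch under a few assumptions: the name labelled is hypothetical, the bar object is assumed to be iterable and to expose set_description and set_postfix_str (as a tqdm instance does), and the callables here receive both the index and the element, which is closer to the index-based lambdas asked about above. RecordingBar is a stand-in so that the snippet runs without tqdm installed.

```python
def labelled(pbar, desc=None, postfix=None):
    """Yield items from pbar, refreshing its labels before each one.

    pbar is assumed to be iterable and to provide set_description and
    set_postfix_str. desc and postfix may each be None, a plain string,
    or a callable taking (index, item).
    """
    def render(label, index, item):
        if label is None:
            return ""
        return label(index, item) if callable(label) else label

    for index, item in enumerate(pbar):
        pbar.set_description(render(desc, index, item))
        pbar.set_postfix_str(render(postfix, index, item))
        yield item


class RecordingBar:
    """Hypothetical stand-in for a progress bar; it only records label updates."""

    def __init__(self, iterable):
        self.iterable = iterable
        self.log = []

    def __iter__(self):
        return iter(self.iterable)

    def set_description(self, text):
        self.log.append(("desc", text))

    def set_postfix_str(self, text):
        self.log.append(("postfix", text))


data = ["a", "b", "c"]
bar = RecordingBar(data)
seen = list(
    labelled(
        bar,
        desc=lambda i, x: f"hello {x}",
        postfix=lambda i, x: f"current value {data[i]}",
    )
)
print(seen)
print(bar.log)
```

With tqdm installed, replacing RecordingBar(data) with tqdm(data) should give a live bar, and the generator can be passed to Parallel in the same way as progress above. Note that the labels change when an item is pulled from the iterator, not when its work finishes.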

smart wrapper function @CopperEagle