Display One-Line Progress Bar

Question

OscarIntellico opened this issue a year ago

Hi.

I'm using ploomber in a machine learning pipeline, in particular with a neural network built in PyTorch. The training, validation, and test loops print a progress bar with tqdm, since it gives a good estimate of the training time and can also show more granular information, such as the current validation loss.

To see the outputs, I added the papermill_params entry to my training task, with log_output: True.
While this lets me see what my task prints, the progress bar is not interactive: every time an update arrives, a new line is printed, as you can see in the image.

My desired behaviour would be a single updating line. I don't know whether this is possible to implement right now, since my understanding is that ploomber redirects the cell output to stdout. I also think problems may arise with parallel tasks that both use a progress bar, but it would be nice, at least for single-process pipelines, not to lose the interactivity of tqdm.
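
To make the symptom concrete, here is a simplified sketch of why a progress bar that redraws itself with a carriage return can end up as one line per update once its output is captured and logged. This is only an illustration under assumptions: `as_terminal` and `as_log` are hypothetical helpers written for this example, and they are not how papermill or ploomber actually process cell output.

```python
# Simplified illustration; NOT how papermill or ploomber handle output.
# as_terminal and as_log are hypothetical helpers made up for this sketch.


def as_terminal(chunks):
    # A terminal moves the cursor back to column 0 on "\r", so the next
    # write overwrites the current line. Simplification: every update here
    # has the same width, so we simply replace the whole line.
    line = ""
    for chunk in chunks:
        if chunk.startswith("\r"):
            line = chunk[1:]
        else:
            line += chunk
    return [line]


def as_log(chunks):
    # A line-oriented log emits every captured chunk as its own record,
    # so the carriage return no longer overwrites anything.
    return [chunk.lstrip("\r") for chunk in chunks]


# Three progress updates, each starting with "\r" as progress bars usually do.
updates = [f"\r{pct:3d}%" for pct in (0, 50, 100)]

print(as_terminal(updates))  # ['100%']
print(as_log(updates))       # ['  0%', ' 50%', '100%']
```

The first call shows what an interactive terminal would leave on screen (one line), while the second shows the one-record-per-update behaviour described above.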

Eduardo Blancas · Answer 1 · Tue Mar 14 2023 23:36:13 GMT+0800 (China Standard Time)

Hi @OscarIntellico - thanks for your feedback!

I don't think this will be possible with the current default settings (ploomber uses papermill for executing notebooks by default); however, it might be possible with ploomber-engine (our papermill replacement).

It's possible to use ploomber with ploomber-engine, so if we can get this working, you could switch from papermill to ploomber-engine in your ploomber pipeline.

So I think we have two items here:

1) Check if it's possible to correctly update a progress bar coming from a notebook executed via papermill in a ploomber pipeline
2) If 1) doesn't work, check if we can get this working from ploomber-engine

Please check this out @mehtamohit013

Eduardo Blancas · Answer 2 · Tue Mar 21 2023 06:49:01 GMT+0800 (China Standard Time)

hey @OscarIntellico, we added support for notebook progress bars to ploomber-engine, can you give it a try and let us know if this is what you're expecting?

context: currently, ploomber executes notebooks with papermill, but you can also switch to use ploomber-engine, which is our drop-in replacement for papermill. If this solves your issue, we can integrate this into ploomber.

to test it:

pip uninstall ploomber-engine

pip install git+https://github.com/ploomber/ploomber-engine

then execute a notebook that displays an inline progress bar:

ploomber-engine input.ipynb out.ipynb --log-output
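
If you don't have a suitable notebook at hand, the following sketch writes a minimal `input.ipynb` with a single cell that runs a short tqdm loop. It uses only the standard library and the notebook JSON layout (nbformat 4). Assumptions: a Jupyter kernel named `python3` is installed, tqdm is available in that environment, and the helper names `build_notebook` and `write_notebook` are made up for this example.

```python
import json
from pathlib import Path

# Source of the single code cell: a short loop wrapped in a tqdm progress bar.
CELL_SOURCE = """\
from tqdm import trange
import time

for i in trange(10):
    time.sleep(1)
"""


def build_notebook(source=CELL_SOURCE):
    # Minimal nbformat 4 structure with one unexecuted code cell.
    return {
        "cells": [
            {
                "cell_type": "code",
                "execution_count": None,
                "metadata": {},
                "outputs": [],
                "source": source.splitlines(keepends=True),
            }
        ],
        "metadata": {
            # Assumption: a kernel registered under the name "python3" exists.
            "kernelspec": {
                "display_name": "Python 3",
                "language": "python",
                "name": "python3",
            }
        },
        "nbformat": 4,
        "nbformat_minor": 4,
    }


def write_notebook(path="input.ipynb"):
    # Serialize the notebook dictionary as JSON at the given path.
    path = Path(path)
    path.write_text(json.dumps(build_notebook(), indent=1), encoding="utf-8")
    return path


if __name__ == "__main__":
    print(write_notebook())
```

Running the script creates `input.ipynb` in the current directory, which can then be passed to the command above.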

Oscar Pindaro · Answer 3 · Wed Mar 22 2023 18:53:12 GMT+0800 (China Standard Time)

Hi @edublancas, thank you for your very fast implementation.
I tried two different notebooks with sleep() inside the for loop (the snippets I used are below).

from tqdm import tqdm, trange
import time

for i in trange(10):
    time.sleep(1)

from tqdm import tqdm, trange
import time

for i in trange(5):
    for i in trange(3, leave=False):
        time.sleep(1)

The output is what I expected.
When the script finishes, there is a graphical glitch, as you can see in the attachment.

The tqdm bar of my script ends up on top of ploomber's progress bar. While running, the two bars are on separate lines, which I would say is the intended behaviour.
I like that my bar is still on display, since it may contain some useful information about the loop.

Platform

Right now I'm using VSCode on a Windows machine. VSCode is remotely connected to an Ubuntu machine on which the code runs.

Eduardo Blancas · Answer 4 · Thu Mar 23 2023 12:30:39 GMT+0800 (China Standard Time)

excellent, thanks for your feedback.

sounds like this works as expected (besides minor graphical glitches)

I just made a new release, you can integrate it in ploomber like this:

# to get the latest version
pip install ploomber-engine --upgrade

Then in your pipeline.yaml:

  - source: fit.py
    name: fit
    product:
        nb: output/nb.html
    papermill_params:
        # this will tell ploomber to use ploomber-engine
        engine_name: embedded

I believe this should work but let me know if you encounter any issues!

we're working on a better integration of ploomber-engine into ploomber, so keep an eye on this: #1082