datapane / datapane

Build and share data reports in 100% Python

Home Page: https://datapane.com

dp.cells_to_blocks() misses all figures

mil-ad opened this issue

I'm super new to datapane and was trying to convert a notebook to HTML. I noticed that dp.cells_to_blocks() catches all text and dataframe cells but doesn't capture any outputs that include a matplotlib figure. Is this a bug?

Hello @mil-ad, thanks for raising this issue.

Can you share some example code that reproduces the issue please?

I think I know what's going on. I use the common trick of adding a ; when calling my plot function (i.e. plot_fn();) to avoid getting double plots, but that means the plot is shown by the function rather than as the cell output. It seems Datapane only captures cell outputs.

That sounds right, @mil-ad. Datapane uses the return value stored in the cell output for dp.cells_to_blocks(). Here's an example that highlights the issue, where the first plot returns the object but the second one doesn't.

[Screenshot 2023-02-21 at 09 46 19]
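To make the distinction concrete, here is a minimal sketch of the rule being described: a cell has an output value only when its last statement is a bare expression and the cell does not end with a semicolon. The helper name cell_has_output is hypothetical, and this is a simplified model for illustration, not the actual IPython or Datapane implementation (it also ignores the case where the expression evaluates to None).

```python
import ast


def cell_has_output(source: str) -> bool:
    """Simplified model (an assumption, not IPython's real code):
    a cell yields an output value only if its last statement is a bare
    expression and the source does not end with a semicolon."""
    stripped = source.rstrip()
    if not stripped or stripped.endswith(";"):
        return False
    tree = ast.parse(stripped)
    return bool(tree.body) and isinstance(tree.body[-1], ast.Expr)


print(cell_has_output("plot_fn()"))        # True: the figure would be the cell output
print(cell_has_output("plot_fn();"))       # False: the semicolon suppresses the output
print(cell_has_output("fig = plot_fn()"))  # False: an assignment is not an expression
```

Under this model, only the first form leaves an object for cells_to_blocks() to find.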

Datapane also picks up Datapane blocks in cells_to_blocks, so the last line in a cell could be the figure wrapped in dp.Plot(). This also outputs to the cell. Would this help?

That makes sense. So let's say I want to create plots in a loop:

for i in range(10):
    plot_fn(i)

What is the endorsed way to make sure the plots are both shown in the notebook (ideally just once) and captured by cells_to_blocks()?

That's an interesting problem. At the moment, cells_to_blocks() is designed to work with cell output objects.
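As a rough illustration of why a loop leaves nothing to capture, the toy model below runs a "cell" and returns the value of its last statement only if that statement is a bare expression. The names run_cell and the string-returning plot_fn are hypothetical stand-ins; this is a sketch of the idea under those assumptions, not how Jupyter or Datapane is actually implemented.

```python
import ast


def run_cell(source, namespace):
    """Toy model of a notebook cell: execute every statement and return the
    value of the last one only if it is a bare expression (otherwise None)."""
    tree = ast.parse(source)
    if tree.body and isinstance(tree.body[-1], ast.Expr):
        # Split off the trailing expression so its value can be returned.
        last = ast.Expression(tree.body.pop().value)
        exec(compile(tree, "<cell>", "exec"), namespace)
        return eval(compile(last, "<cell>", "eval"), namespace)
    exec(compile(tree, "<cell>", "exec"), namespace)
    return None


def plot_fn(n):
    # Hypothetical stand-in: returns a label instead of a real figure.
    return f"plot #{n}"


ns = {"plot_fn": plot_fn}

# A for loop is a statement, so the cell has no output value.
loop_output = run_cell("for i in range(3):\n    plot_fn(i)", ns)

# Collecting the results and ending the cell with an expression does produce one.
collect_output = run_cell("plots = [plot_fn(i) for i in range(3)]\nplots", ns)

print(loop_output)
print(collect_output)
```

The second cell is the shape to aim for: gather the per-iteration results into one object and make that object the last expression in the cell.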

I can suggest a workaround for now, whilst we consider how to support multiple outputs from a single cell! I've just written this working proof of concept:

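import datapane as dp
import matplotlib.pyplot as plt

def plot_fn(n):
    fig = plt.figure()
    plt.plot(range(10))
    plt.title(f"plot #{n}")
    plt.close()  # close the figure so the notebook doesn't also display it on its own
    return fig

# Wrap each figure in a Datapane block
my_plots = []
for i in range(10):
    plot = dp.Plot(plot_fn(i))
    my_plots.append(plot)

# Last expression in the cell, so it becomes the cell output
dp.Group(blocks=my_plots, columns=2)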

This should output in your notebook and get picked up by cells_to_blocks().

[Screenshot 2023-02-21 at 10 35 15]

Please let me know if that works in the meantime, and I can move this to a feature request w.r.t. supporting multiple outputs :)

Thanks Shahin! I'll give it a go.

I only have to do this for a few cells, but the moment I start using dp.Plot() explicitly I won't be able to use cells_to_blocks() anymore, right? Is there a way to extract a previous cell's output? For instance, I want to keep the markdown cells around and also add them to the blocks = [] list that I'm later going to pass to dp.App().

> I only have to do this for a few cells, but the moment I start using dp.Plot() explicitly I won't be able to use cells_to_blocks() anymore, right?

cells_to_blocks() picks up all Datapane objects, including dp.Plot(), and tries to convert supported objects (e.g. Plotly plots, Matplotlib plots, pandas DataFrames) to their corresponding Datapane objects. So if the last line in a cell is dp.Plot(...), this should work fine! The solution above will work with cells_to_blocks too.