functime-org / functime

Time-series machine learning at scale. Built with Polars for embarrassingly parallel feature extraction and forecasts on panel data.

Home Page: https://docs.functime.ai

Examples in preprocessing Jupyter notebook raise ShapeError

FangyangJz opened this issue

Describe the bug

ShapeError Traceback (most recent call last)
Cell In[9], line 3
1 transformer = detrend(freq="1mo", method="linear")
2 y_detrended = y.pipe(transformer).collect()
----> 3 figure = plot_forecasts(
4 y_true=y, y_pred=y_detrended.group_by(entity_col).tail(64), height=800, width=1000
5 )
6 figure.show(renderer="svg")

File c:\Users\fangy\miniconda3\envs\avalon311\Lib\site-packages\functime\plotting.py:192, in plot_forecasts(y_true, y_pred, n_series, seed, n_cols, last_n, **kwargs)
189 # Get most recent observations
190 entities = y_true.select(pl.col(entity_col).unique(maintain_order=True)).collect()
--> 192 entities_sample = entities.to_series().sample(n_series, seed=seed)
194 # Get most recent observations
195 y = (
196 y_true.filter(pl.col(entity_col).is_in(entities_sample))
197 .group_by(entity_col)
198 .tail(last_n)
199 .collect()
200 )

File c:\Users\fangy\miniconda3\envs\avalon311\Lib\site-packages\polars\series\utils.py:107, in call_expr.<locals>.wrapper(self, *args, **kwargs)
105 expr = getattr(expr, namespace)
106 f = getattr(expr, func.__name__)
...
1934 if background:
1935 return InProcessQuery(ldf.collect_concurrently())
-> 1937 return wrap_df(ldf.collect())

ShapeError: cannot take a larger sample than the total population when with_replacement=false
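
The message indicates that the requested sample size (n_series) exceeded the number of unique entities in y_true, since sampling without replacement cannot return more items than the population holds. The sketch below illustrates that failure mode and one possible guard, using the standard library's random.sample as a stand-in for the polars Series.sample call in the traceback. The helper name sample_entities and the entity list are hypothetical; this is not functime's code or its actual fix.

```python
import random


def sample_entities(entities, n_series, seed=None):
    """Sample up to n_series entity ids without replacement.

    Hypothetical helper for illustration; not part of functime.
    """
    # Clamp the request so it never exceeds the population size.
    k = min(n_series, len(entities))
    rng = random.Random(seed)
    return rng.sample(entities, k)


# Hypothetical panel with only three entities.
entities = ["A", "B", "C"]

# Unclamped sampling without replacement fails, analogous to the ShapeError.
try:
    random.sample(entities, 10)
except ValueError as exc:
    print(f"unclamped sampling fails: {exc}")

# Clamped sampling returns every entity when n_series is too large.
picked = sample_entities(entities, n_series=10, seed=1)
print(sorted(picked))  # ['A', 'B', 'C']
```

With a guard like this, asking for more series than exist simply returns all of them instead of raising.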

To Reproduce
Steps to reproduce the behavior:
Run the docs notebook preprocessing.ipynb.

Expected behavior
The notebook should run without errors.

Desktop (please complete the following information):

  • OS: Windows 11
  • Browser: Chrome
  • Version
  • functime 0.9.5
  • polars 0.20.16
  • plotly 5.19.0

The same issue on Mac M1:
ShapeError: cannot take a larger sample than the total population when "with_replacement=false"

Setting n_series=4:
figure = plot_panel(y=y, height=800, width=1000, n_series=4)
raises a different error:
TypeError: the truth value of a DataFrame is ambiguous

Ciao! Thanks for reporting. Will review this later this week! 😊

FYI: I am receiving the same ShapeError

Ciao! Sorry for the late reply. Turns out, this maps to #179 and @miroslaavi prepared a PR for this.

Ciao, we might have a fix online by tomorrow 😊

Ciao! Thanks to @miroslaavi, the issue is fixed. Will be in the next release 🚀