Unexpected behavior from train_test_split()

Question

pcgm-team opened this issue 7 months ago

Nelson Griffiths · Answer 1 · Tue Dec 05 2023 06:39:41 GMT+0800 (China Standard Time)

@pcgm-team Could you provide a reproducible example? I tried to reproduce it with the following but got the correct output:

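from functime.cross_validation import train_test_split
import polars as pl
import numpy as np

# Two series ("AAPL" and "MSFT"), each with timestamps 1 through 10
data = pl.DataFrame({
    "symbol": ["AAPL"] * 10 + ["MSFT"] * 10,
    "timestamp": np.tile(np.arange(1, 11), 2),
    "values": np.random.rand(20)
})

# Cast the time column to an unsigned 32-bit integer
data = data.with_columns(pl.col("timestamp").cast(pl.UInt32))

# Hold out the last 3 observations of each series as the test set
test_size = 3
y_train, y_test = train_test_split(test_size)(data)

# Smallest timestamp that ends up in the test set
y_test.collect().select("timestamp").min()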

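This gives 8, which is what I would expect: with timestamps 1 to 10 per symbol and test_size = 3, the test set should contain timestamps 8, 9 and 10.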

u32
8
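
For comparison, here is a minimal, dependency-free sketch of the split I would expect, assuming the intended behavior is "hold out the last `test_size` timestamps of each symbol". The helper `expected_split` is hypothetical and written only for this check; it is not functime's implementation. It might be useful for comparing against whatever output you are seeing:

```python
from collections import defaultdict


def expected_split(rows, test_size):
    """Hold out the last `test_size` timestamps of each symbol.

    `rows` is an iterable of (symbol, timestamp, value) tuples.
    Hypothetical reference helper, NOT functime's implementation.
    """
    by_symbol = defaultdict(list)
    for symbol, timestamp, value in rows:
        by_symbol[symbol].append((timestamp, value))

    train, test = [], []
    for symbol, series in by_symbol.items():
        series.sort(key=lambda pair: pair[0])  # order each series by time
        cut = max(len(series) - test_size, 0)  # index where the test set starts
        train += [(symbol, t, v) for t, v in series[:cut]]
        test += [(symbol, t, v) for t, v in series[cut:]]
    return train, test


# Same shape as the example above: two symbols, timestamps 1 through 10
rows = [(s, t, float(t)) for s in ("AAPL", "MSFT") for t in range(1, 11)]
train, test = expected_split(rows, test_size=3)

min_test_timestamp = min(t for _, t, _ in test)
print(min_test_timestamp)  # 8
```

If your data gives a different minimum test timestamp than a reference like this, sharing that data (or a small slice of it) would help narrow down the cause.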