AutoViML / Auto_TS

Automatically build ARIMA, SARIMAX, VAR, FB Prophet and XGBoost Models on Time Series data sets with a Single Line of Code. Created by Ram Seshadri. Collaborators welcome.

Error: 'could not convert string to float' on datetime column

emobs opened this issue

This error is thrown when fit is initiated: 'could not convert string to float', apparently on the Time column of my data. The column has the datetime data type and contains no missing or incorrect values. What can be done to fix this?

Thank you for the quick reply!

I had already done that, like this: data['Time'] = pd.to_datetime(data['Time'], format='%Y.%m.%d %H:%M:%S')
I then checked the data type of the 'Time' column after the conversion, and it was datetime64[ns].
However, the issue persists.
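
Since the 'Time' column already has a datetime dtype, one possibility (an assumption, not something confirmed in this thread) is that the string comes from a different column. The sketch below lists every column that is neither numeric nor datetime. The helper name `suspicious_columns` and the demo frame are hypothetical stand-ins for the real data.

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype, is_numeric_dtype


def suspicious_columns(df: pd.DataFrame) -> list[str]:
    """Return names of columns that are neither numeric nor datetime."""
    return [
        col for col in df.columns
        if not is_numeric_dtype(df[col]) and not is_datetime64_any_dtype(df[col])
    ]


# Hypothetical stand-in for the real data: 'Data2' was read as strings.
demo = pd.DataFrame({
    "Time": pd.to_datetime(
        ["2023.09.07 22:10:00", "2023.09.07 22:15:00"],
        format="%Y.%m.%d %H:%M:%S",
    ),
    "Data1": [1.06958, 1.06948],
    "Data2": ["1.06947", "n/a"],
})

print(suspicious_columns(demo))
```

Running the same helper on the real `data` frame right before `model.fit` would show whether any of the 21 columns still holds strings.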

Sure, thanks. Here's the dummy data: data.zip.
And this is the Python code initiating the model fit:

    import pandas as pd
    from auto_ts import auto_timeseries

    model = auto_timeseries(
        score_type='rmse',
        time_interval=Timeframe_frequency,
        model_type='best',
        verbose=2,
        forecast_period=2,
        non_seasonal_pdq=None,
        seasonality=True
    )

    # Parse the 'Time' column from strings into datetime64 values
    data['Time'] = pd.to_datetime(data['Time'], format='%Y.%m.%d %H:%M:%S')

    model.fit(
        traindata=data[:-2],  # Exclude the last 2 rows from training
        ts_column='Time',
        target=target_col
    )

Thanks for your support!

By the way, the CSV file is read with data = pd.read_csv(file_path, encoding='utf-16', delimiter=';') into a pandas DataFrame, which is then passed as the data argument to the function that creates the model and initiates the fit, as in the code above.
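
To narrow down which values trigger a string-to-float failure after a read like this, a sketch along the following lines may help. It reads a small in-memory file with the same encoding and delimiter, then reports the cells that `pd.to_numeric` cannot parse. The helper `unconvertible_cells` and the file contents (including the decimal comma) are hypothetical, made up only to illustrate the check.

```python
import io

import pandas as pd


def unconvertible_cells(df: pd.DataFrame, skip: tuple = ("Time",)) -> dict:
    """Map each column name to the raw values that cannot be parsed as numbers."""
    bad = {}
    for col in df.columns:
        if col in skip:
            continue
        as_number = pd.to_numeric(df[col], errors="coerce")
        # Failed conversions are NaN after coercion but were not missing before.
        mask = as_number.isna() & df[col].notna()
        if mask.any():
            bad[col] = df.loc[mask, col].tolist()
    return bad


# Hypothetical file contents, encoded like the real CSV (UTF-16, ';' separated).
raw = (
    "Time;Data1;Data2\n"
    "2023.09.07 22:10:00;1.06958;1.06947\n"
    "2023.09.07 22:15:00;1.06948;1,06949\n"
).encode("utf-16")

demo = pd.read_csv(io.BytesIO(raw), encoding="utf-16", delimiter=";")
print(unconvertible_cells(demo))
```

An empty dictionary for the real file would suggest that the feature columns are clean and the problem lies elsewhere.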

Any news on this issue yet? Were you able to reproduce the error and/or pinpoint the cause? If you need more details, please let me know. Thanks.

I just tried it: I downloaded the file myself from this topic, unzipped it, and opened it without problems. Shall I send you a copy by email?

Hello, I sent you 2 emails regarding this issue, but I'm not sure whether you read or even received them. Please let me know. Thanks in advance.

Hi and thanks for your reply!

Using this code:

file_path = os.path.join(data_path, 'python_input.csv')
data = pd.read_csv(file_path, encoding='utf-16', delimiter=';')
logging.info(f"Data shape: {data.shape}")
logging.info(f"Data sample:\n{data.head()}")  

I get this as the data shape and head after reading the file:

Data shape: (500, 21)
Data sample:
Time                   Data1    Data2  ...      Data3      Data4     Data5
2023.09.07 22:10:00  1.06958  1.06947  ...  42.835099  18.564245  1.071799
2023.09.07 22:15:00  1.06948  1.06949  ...  35.744064  18.600451  1.071745
2023.09.07 22:20:00  1.06948  1.06953  ...  42.293215  20.935565  1.071687
2023.09.07 22:25:00  1.06954  1.06958  ...  41.152948  23.629927  1.071650
2023.09.07 22:30:00  1.06958  1.06954  ...  43.177273  22.100989  1.071612

Looks good to me. What do you think?
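
The logged sample elides 15 of the 21 columns behind "...", so a problem column could be hidden from view. A minimal sketch for logging every column and its dtype is shown below. The helper name `full_report` and the demo frame are hypothetical. It relies on `to_string()`, which renders all columns instead of truncating them.

```python
import pandas as pd


def full_report(df: pd.DataFrame, rows: int = 5) -> str:
    """Return all column dtypes and the first rows as text, with nothing elided."""
    dtype_text = df.dtypes.to_string()
    head_text = df.head(rows).to_string()
    return f"dtypes:\n{dtype_text}\n\nhead:\n{head_text}"


# Hypothetical 21-column frame shaped like the one in this thread.
demo = pd.DataFrame({f"Data{i}": [1.0, 2.0] for i in range(1, 21)})
demo.insert(0, "Time", pd.to_datetime(["2023-09-07 22:10", "2023-09-07 22:15"]))

report = full_report(demo)
print(report)
```

In the original script, the result could be passed to the existing logger, for example `logging.info(full_report(data))`.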