yzhao062 / pyod

A Comprehensive and Scalable Python Library for Outlier Detection (Anomaly Detection)

Home Page: http://pyod.readthedocs.io

False positive warning when manipulating pandas dataframes

Paroag opened this issue

scikit-learn added support for pandas DataFrames through its set_output API. My project has sklearn pipelines that use pyod models. When fitting/predicting, the following warning is triggered:

UserWarning: X has feature names, but IsolationForest was fitted without feature names

The IForest.fit method does not pass the pandas DataFrame itself to the underlying IsolationForest; it passes the corresponding NumPy array. The line X = check_array(X) performs that conversion, so the underlying estimator is fitted without feature names.

Here is a reproducible example:

import pandas as pd
from pyod.models.iforest import IForest


data = pd.DataFrame({
    "col1": [1, 2, 3, 4],
    "col2": [1, 2, 3, 4]
})

forest = IForest()
forest.fit(data)
forest.predict_proba(data)

Any ideas on how to address this issue?
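In the meantime, one possible stopgap is to silence only this specific message around the affected calls. The sketch below uses only the standard library. call_quietly and fake_predict are hypothetical names introduced for illustration, and fake_predict merely stands in for a call such as forest.predict_proba(data). This hides the warning rather than fixing its cause.

```python
import warnings

# Regex matched against the start of the warning text quoted above.
FEATURE_NAME_WARNING = (
    r"X has feature names, but \w+ was fitted without feature names"
)


def call_quietly(func, *args, **kwargs):
    """Hypothetical helper: run func with only the feature-name warning ignored."""
    with warnings.catch_warnings():
        warnings.filterwarnings(
            "ignore", message=FEATURE_NAME_WARNING, category=UserWarning
        )
        return func(*args, **kwargs)


def fake_predict():
    # Stand-in for forest.predict_proba(data); emits the same message text.
    warnings.warn(
        "X has feature names, but IsolationForest was fitted without feature names",
        UserWarning,
    )
    # An unrelated warning that should still get through.
    warnings.warn("some other warning", UserWarning)
    return "done"


# Record whatever warnings survive the filter so the effect is visible.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = call_quietly(fake_predict)

remaining = [str(w.message) for w in caught]
print(result, remaining)
```

Scoping the filter to a context manager keeps other UserWarnings visible, which matters if the same pipeline can produce a genuine feature-name mismatch elsewhere.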