How will the effectiveness of the model be evaluated, and does the library provide the appropriate methodology?
yuchiu503 opened this issue
Hello, I am learning data mining and using ABOD for anomaly detection. After building the model, I do not know whether its performance and accuracy are good, so I need a way to evaluate it. Could you help me?
In addition, I see that your example (abod_example.py) uses the ROC score and the Precision@rank n score, but my data set has no y values (labels), so these two metrics cannot be used.
import numpy as np
import pandas as pd
from pyod.models.abod import ABOD
from sklearn.model_selection import train_test_split

# Load the data and drop rows with missing values
df = pd.read_csv(
    r"D:\WorkSpace\apple_quality.csv"
)
df.dropna(axis=0, inplace=True)

# Keep only the numeric columns
df_num = df.select_dtypes(include=np.number)
X_train, X_test = train_test_split(df_num, test_size=0.2, random_state=42)

# Fit ABOD on the training split (no labels are used)
model = ABOD().fit(X_train)
decision_scores = model.decision_scores_  # outlier scores of the training data
test_scores = model.decision_function(X_test)  # outlier scores of unseen data
Hi,
If your dataset doesn't provide any labels, you cannot use the classic metrics derived from a confusion matrix. I suggest using, for example, the silhouette score provided by the sklearn library, or any of the other metrics described here