14 outlier detection and handling
Amsamms opened this issue · comments
First, I would like to thank you for your great effort on this repo.
Regarding outliers, I suggest handling them in two stages: univariate, as you did using the z-score, and multivariate, using any suitable technique, such as PCA.
To clarify my point, imagine a dataset of males with three columns: age, weight, and height.
Univariate filtering will limit each column to a range, for instance age from 5 to 65 years, weight from 20 kg to 200 kg, and height from 0.6 m to 2 m.
But a single instance with an age of, say, 7 years, a weight of 150 kg, and a height of 0.7 m is highly unlikely, yet it can't be removed by the z-score alone, because each value lies within its own column's range. It needs multivariate analysis to be detected.
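A rough sketch of the idea, on synthetic data only: the linear age/weight/height relationships, the noise levels, and the variable names below are assumptions made up for illustration, not taken from this repo. It swaps in the Mahalanobis distance as the multivariate check (a close relative of PCA-based scoring, since it measures distance after accounting for the correlations between columns) and compares it with a per-column z-score test on the 7-year-old, 150 kg, 0.7 m instance.

```python
import numpy as np

# Synthetic data (assumption): weight and height both grow with age, plus noise.
rng = np.random.default_rng(0)
n = 500
age = rng.uniform(5, 65, n)                                # years
weight = 20 + 2.5 * (age - 5) + rng.normal(0, 10, n)       # kg
height = 0.6 + 0.02 * (age - 5) + rng.normal(0, 0.05, n)   # m

# Append the suspicious instance from the example as the last row.
suspect = np.array([7.0, 150.0, 0.7])
X = np.vstack([np.column_stack([age, weight, height]), suspect])

# Stage 1, univariate: flag a row if any column has |z| > 3.
z = (X - X.mean(axis=0)) / X.std(axis=0)
univariate_flag = (np.abs(z) > 3).any(axis=1)

# Stage 2, multivariate: squared Mahalanobis distance of each row to the mean.
centered = X - X.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
d2 = np.einsum("ij,jk,ik->i", centered, cov_inv, centered)

# 16.27 is approximately the 99.9% quantile of a chi-square distribution
# with 3 degrees of freedom (one per column); the cutoff is a judgment call.
multivariate_flag = d2 > 16.27

print("suspect flagged by z-score:    ", bool(univariate_flag[-1]))
print("suspect flagged by Mahalanobis:", bool(multivariate_flag[-1]))
```

On this synthetic data the suspect row passes the z-score stage, because each of its values is ordinary on its own, while the Mahalanobis stage flags it because the combination contradicts the correlation between age and weight. On real data the covariance estimate is itself distorted by outliers, so a robust estimator may be a better choice.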
I hope my point is clear, and thanks.
Thanks, I appreciate your note. I added the two sections, univariate and multivariate.