yryrgogo / data_mining

data_analysis

The code in this directory is not organized yet. Please be patient while it is being prepared.

For Data Mining

0. EDA

Visualize

Feature Importance (gain, split) / Feature Permutation / Partial Dependence Plot

correlation

Adversarial Distribution

1. Preprocessing

Outlier

-Construction

Impute

-Impute by BaseAggregation
-Impute by Regression(LGBM)
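
As a rough illustration of imputing by a base aggregation, the sketch below fills missing values with the mean of the row's group. It is only one plausible reading of the item above, not the repository's code: the function name `impute_by_group_mean`, the column names, and the global-mean fallback are all assumptions made for this example.

```python
from collections import defaultdict
from statistics import mean


def impute_by_group_mean(rows, group_key, target_key):
    """Fill missing (None) values of target_key with the mean of the row's group."""
    # Collect the observed (non-missing) values per group.
    observed = defaultdict(list)
    for row in rows:
        if row[target_key] is not None:
            observed[row[group_key]].append(row[target_key])

    # Assumption: groups with no observed value fall back to the global mean.
    all_values = [v for values in observed.values() for v in values]
    global_mean = mean(all_values) if all_values else None
    group_means = {group: mean(values) for group, values in observed.items()}

    # Return new dicts so the input rows are left untouched.
    return [
        {**row, target_key: group_means.get(row[group_key], global_mean)}
        if row[target_key] is None else dict(row)
        for row in rows
    ]


# Hypothetical toy data.
rows = [
    {"city": "A", "income": 10.0},
    {"city": "A", "income": 30.0},
    {"city": "A", "income": None},
    {"city": "B", "income": 50.0},
    {"city": "C", "income": None},
]
imputed = impute_by_group_mean(rows, "city", "income")
```

The regression variant would replace the group mean with the prediction of a model trained on the rows where the value is present.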

Replace

2. Feature Engineering

Categorical Encoding

-Likelihood Encoding(Target Encoding)
-Count Encoding
-One Hot Encoding
-Label Encoding
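
A minimal sketch of two of the encodings listed above, count encoding and a smoothed target (likelihood) encoding. The function names, the `smoothing` parameter, and the toy data are assumptions for illustration, not taken from the repository.

```python
from collections import Counter, defaultdict


def count_encode(values):
    """Replace each category with the number of times it occurs."""
    counts = Counter(values)
    return [counts[v] for v in values]


def target_encode(values, targets, smoothing=1.0):
    """Replace each category with a smoothed mean of the target.

    encoded = (sum of targets in category + smoothing * prior) / (n + smoothing),
    where prior is the global target mean. smoothing=0 gives the plain mean.
    """
    prior = sum(targets) / len(targets)
    counts = Counter(values)
    sums = defaultdict(float)
    for v, t in zip(values, targets):
        sums[v] += t
    return [(sums[v] + smoothing * prior) / (counts[v] + smoothing) for v in values]


# Hypothetical toy data: a categorical column and a binary target.
colors = ["red", "red", "blue", "green", "red"]
clicked = [1, 0, 1, 0, 1]

count_encoded = count_encode(colors)
target_encoded = target_encode(colors, clicked, smoothing=1.0)
```

Because target encoding uses the label, it is commonly computed out-of-fold on training data to limit leakage; this sketch skips that step for brevity.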

Base Aggregation

Dimensionality Reduction (Embedding)

-PCA
-LDA
-t-SNE
-UMAP

Clustering

-K-means
-EM Algorithm
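
For reference, K-means can be sketched in a few lines as Lloyd's algorithm: assign each point to its nearest centroid, then move each centroid to the mean of its members. This is a plain-Python illustration under stated assumptions (caller-supplied initial centroids, fixed iteration count), not the repository's implementation; in practice a library implementation would normally be used.

```python
import math


def kmeans(points, centroids, n_iter=10):
    """Lloyd's algorithm, starting from the given initial centroids."""
    centroids = list(centroids)
    labels = []
    for _ in range(n_iter):
        # Assignment step: index of the nearest centroid (Euclidean distance).
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


# Hypothetical toy data: two well-separated groups of 2-D points.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (9.0, 9.0), (9.0, 10.0), (10.0, 9.0)]
labels, centroids = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

The resulting cluster label (or the distance to each centroid) can then be used as an additional feature.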

3. Feature Selection

Feature Importance (LGBM)

Feature Permutation
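
The idea behind feature permutation can be sketched as follows: shuffle one column at a time and measure how much the model's error increases. The `model` below is a hypothetical stand-in for a fitted model (the repository mentions LGBM), and the metric, function names, and toy data are assumptions for this example.

```python
import random


def mse(model, X, y):
    """Mean squared error of model predictions on (X, y)."""
    return sum((model(row) - t) ** 2 for row, t in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=5, seed=0):
    """Mean increase in MSE when each feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mse(model, X, y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle column j only, leaving the other columns intact.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(model, X_perm, y) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Hypothetical stand-in for a fitted model: it uses feature 0 and ignores feature 1.
def model(row):
    return 2.0 * row[0]


X = [[float(i), float((i * 7) % 10)] for i in range(10)]
y = [2.0 * row[0] for row in X]
importances = permutation_importance(model, X, y)
```

A feature whose shuffling does not change the error (feature 1 here) is a candidate for removal; scoring on held-out data rather than training data is the usual choice.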

4. Feature Management

5. Various Machine Learning Algorithms

6. Explainer

SHAP

LIME (not yet run)

99. Important

These have the same mean & std. (image: dp9y2dauwaaorlq)
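
The referenced image is not reproduced here. Assuming its point is that different data can share the same summary statistics, a small hand-made illustration (the two lists below are invented for this example):

```python
import math
from statistics import mean, pstdev

# Two hypothetical samples with different shapes.
a = [-1.0, -1.0, 1.0, 1.0]                    # two clusters, nothing near zero
b = [-math.sqrt(2), 0.0, 0.0, math.sqrt(2)]   # mass at zero plus two tails

# (mean, population standard deviation) for each sample.
summary_a = (mean(a), pstdev(a))
summary_b = (mean(b), pstdev(b))

# The summaries agree up to floating-point rounding even though the values differ.
same_summary = (
    math.isclose(summary_a[0], summary_b[0], abs_tol=1e-12)
    and math.isclose(summary_a[1], summary_b[1])
)
```

This is why the EDA step plots the data instead of relying on mean and std alone.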

Languages

Jupyter Notebook 97.7%, Python 2.3%, HTML 0.0%, Shell 0.0%