Novel’s method: The original method can be found here.
name importance: Mainly based on AmbrosM's notebook, with additional information added from mygene.
corr importance: The top 3 features most correlated with each target (see the sketch after this list).
rf importance: The top 128 most important features of the random forest model.
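A minimal sketch of how the "corr importance" selection could be computed (the function and array names are illustrative, not from the original code; `X` is cells x features, `Y` is cells x targets):

```python
import numpy as np

def top_corr_features(X, Y, k=3):
    """For each target column, return the indices of the k input features
    with the highest absolute Pearson correlation."""
    Xc = (X - X.mean(0)) / (X.std(0) + 1e-8)   # z-score features
    Yc = (Y - Y.mean(0)) / (Y.std(0) + 1e-8)   # z-score targets
    corr = Xc.T @ Yc / len(X)                  # (features x targets) correlations
    # k most correlated feature indices per target, shape (n_targets, k)
    return np.argsort(-np.abs(corr), axis=0)[:k].T
```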
3. Models
| Method | CV |
| --- | --- |
| Stacking | 0.89677 |
| GMNN | 0.89596 |
| NN (online) | 0.89580 |
| CNN | 0.89530 |
| Kernel Ridge | 0.89326 |
| LGBM | 0.89270 |
| CatBoost | 0.89100 |
GMNN: Gated Map Neural Network. A NN that tries to do something like Transformers and RNNs, but without using feature vectors.
CNN: Inspired by the tmp method here, with multi-dimensional convolution kernels added in the style of ResNet.
NN (online): A NN model based on a public Kaggle notebook.
Kernel Ridge: Inspired by the best solution of last year's competition. Ray Tune was used to optimize the hyperparameters.
CatBoost: A MultiOutputCatboostRegressor class that supports early stopping to prevent overfitting, unlike sklearn.multioutput.MultiOutputRegressor.
LGBM: A MultiOutputLGBMRegressor that likewise supports early stopping, unlike sklearn.multioutput.MultiOutputRegressor (a sketch of this wrapper follows the list).
Stacking: KNN, CNN, ridge, RF, CatBoost and GMNN in the first layer; only CNN, CatBoost and GMNN in the second; and a simple MLP as the last layer. To avoid overfitting, I used a special CV strategy that combines k-fold splitting by donor with out-of-fold (OOF) predictions (see the second sketch below).
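A minimal sketch of what the multi-output wrapper with early stopping might look like, shown for LGBM (the class internals are an assumption based on the description above; the CatBoost version would be analogous):

```python
import numpy as np
import lightgbm as lgb

class MultiOutputLGBMRegressor:
    """One LGBMRegressor per target column, each with its own early stopping.
    sklearn.multioutput.MultiOutputRegressor cannot do this because it does
    not forward a per-target eval set to the underlying estimators."""

    def __init__(self, **lgbm_params):
        self.lgbm_params = lgbm_params
        self.models = []

    def fit(self, X, Y, X_val, Y_val, stopping_rounds=50):
        self.models = []
        for j in range(Y.shape[1]):                      # one model per target
            model = lgb.LGBMRegressor(**self.lgbm_params)
            model.fit(X, Y[:, j],
                      eval_set=[(X_val, Y_val[:, j])],
                      callbacks=[lgb.early_stopping(stopping_rounds)])
            self.models.append(model)
        return self

    def predict(self, X):
        return np.column_stack([m.predict(X) for m in self.models])
```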
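And a sketch of the donor-wise k-fold / OOF idea for the first stacking layer, assuming a `donor` array of per-cell donor IDs and base models passed as factories (all names here are illustrative):

```python
import numpy as np
from sklearn.model_selection import GroupKFold

def oof_stack_features(model_factories, X, Y, donor, n_splits=3):
    """Build out-of-fold predictions to use as inputs for the next layer.
    Splitting by donor keeps every cell of a donor in a single fold, so the
    OOF predictions never leak donor-specific information."""
    n_targets = Y.shape[1]
    oof = np.zeros((len(X), n_targets * len(model_factories)))
    for tr, va in GroupKFold(n_splits=n_splits).split(X, Y, groups=donor):
        for i, make_model in enumerate(model_factories):
            m = make_model().fit(X[tr], Y[tr])
            oof[va, i * n_targets:(i + 1) * n_targets] = m.predict(X[va])
    return oof  # feature matrix for the second layer / final MLP
```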
CV Results

| Fold | Model Ⅰ (valid 32606) | Model Ⅱ (valid 13176) | Model Ⅲ (valid 31800) |
| --- | --- | --- | --- |
| Fold 1 | 0.8989 | 0.8967 | 0.8947 |
| Fold 2 | 0.8995 | 0.8967 | 0.8951 |
| Fold 3 | 0.8985 | 0.8959 | 0.8949 |
| Fold Mean | 0.89897 | 0.89643 | 0.89490 |
| Model Mean | 0.89677 | - | - |
Ⅱ. Multi
1. Data preprocessing & Feature engineering
inputs:
- TF-IDF normalization
- np.log1p(data * 1e4)
- Tsvd -> 512
targets:
- Normalization -> mean = 0, std = 1
- Tsvd -> 1024
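A minimal sketch of this preprocessing, assuming a sparse `counts` matrix of raw inputs (cells x features) and a dense `targets` matrix; the exact TF-IDF formula is an assumption, the remaining steps follow the list above:

```python
import numpy as np
import scipy.sparse as sp
from sklearn.decomposition import TruncatedSVD

def preprocess_inputs(counts):
    """TF-IDF normalize, apply log1p(data * 1e4), reduce to 512 dims."""
    tf = counts.multiply(1.0 / counts.sum(axis=1))          # term frequency
    idf = counts.shape[0] / (1 + (counts > 0).sum(axis=0))  # inverse doc freq (assumed form)
    x = sp.csr_matrix(tf.multiply(idf))
    x.data = np.log1p(x.data * 1e4)                         # np.log1p(data * 1e4)
    return TruncatedSVD(n_components=512).fit_transform(x)

def preprocess_targets(targets):
    """Standardize to mean 0 / std 1, then reduce to 1024 dims."""
    z = (targets - targets.mean(axis=0)) / (targets.std(axis=0) + 1e-8)
    svd = TruncatedSVD(n_components=1024)
    return svd.fit_transform(z), svd   # keep svd.components_ for the model below
```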
2. Models
GMNN: Gated Map Neural Network. The model outputs a 1024-dim vector, which is multiplied (dot product) with the constant tsvd.components_ matrix to obtain the final prediction; correl_loss is then used to compute the loss, and the gradients are back-propagated.
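A sketch of this training step in PyTorch: the 1024-dim output is projected back to target space through the frozen `tsvd.components_` matrix, and a per-cell Pearson-correlation loss is minimized (the exact correl_loss definition is an assumption based on the description):

```python
import torch

def correl_loss(pred, true, eps=1e-8):
    """Negative mean per-row Pearson correlation between prediction and target."""
    p = pred - pred.mean(dim=1, keepdim=True)
    t = true - true.mean(dim=1, keepdim=True)
    corr = (p * t).sum(dim=1) / (p.norm(dim=1) * t.norm(dim=1) + eps)
    return -corr.mean()

def train_step(model, optimizer, x, y_true, components):
    """components: frozen tsvd.components_ as a (1024, n_targets) tensor."""
    out = model(x)                  # (batch, 1024) latent prediction
    y_pred = out @ components       # dot product back to target space
    loss = correl_loss(y_pred, y_true)
    optimizer.zero_grad()
    loss.backward()                 # back-propagate the gradients
    optimizer.step()
    return loss.item()
```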