ArfahUpade / FeatureSelector

A custom feature selection program with various algorithms and their pipelined combination in an easy and descriptive way.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

FeatureSelector

A custom feature selection program with various algorithms and their pipelined combination in an easy and descriptive way.

Read Path:
A function to read the path of the CSV file given

 >>> import FeatureSelector as fs
 >>> X,y = fs.read_path('D:\Internship\dataset.csv')
 >>> X,y
(       TSPAN6       FGR       CFH      GCLC    NIPAL3  ...    IFNGR1    SH2D2A  TNFRSF1B    ARNTL2      IBSP
0    3.593240  0.714444  3.119220  2.464426  1.291976  ...  4.462061  0.211963  3.097346  0.179946  0.024523
1    3.399946  0.654546  0.136111  2.003238  2.233912  ...  4.708507  0.100548  2.800072  0.076992  0.000000
2    3.469919  0.873279  0.815520  2.537203  1.894536  ...  3.753501  0.103287  2.079879  0.659337  0.084467
3    2.821973  0.648664  0.515149  2.874994  1.227373  ...  4.472364  1.024236  2.113983  0.131032  0.234907
4    2.113599  3.069690  1.088918  2.917765  1.872202  ...  5.070540  0.802297  5.009232  0.480860  4.219627
..        ...       ...       ...       ...       ...  ...       ...       ...       ...       ...       ...
495  2.035607  1.083142  1.969201  2.342382  2.938695  ...  4.184614  0.393621  3.387899  0.278416  1.163159
496  2.455688  2.287689  3.733934  2.194991  2.633475  ...  5.299762  2.237891  4.531835  1.525031  0.342196
497  3.077339  2.696516  3.127983  2.612161  2.994722  ...  5.178988  2.594080  4.550842  2.330173  0.931543
498  4.398039  2.100173  3.080552  3.052334  2.392759  ...  4.791044  0.457323  3.195818  0.842771  3.581545
499  2.129841  1.899062  0.582394  2.604003  2.158955  ...  4.414640  0.269112  2.584676  0.116136  1.627639

[500 rows x 198 columns], 0      0
1      0
2      1
3      0
4      1
      ..
495    1
496    0
497    1
498    0
499    0
Name: 0, Length: 500, dtype: int64)
>>> X,y = fs.read_path('D:\Internship\data.csv')
>>> X,y
(     mean_radius  mean_texture  mean_perimeter  mean_area  mean_smoothness
0          17.99         10.38          122.80     1001.0          0.11840
1          20.57         17.77          132.90     1326.0          0.08474
2          19.69         21.25          130.00     1203.0          0.10960
3          11.42         20.38           77.58      386.1          0.14250
4          20.29         14.34          135.10     1297.0          0.10030
..           ...           ...             ...        ...              ...
564        21.56         22.39          142.00     1479.0          0.11100
565        20.13         28.25          131.20     1261.0          0.09780
566        16.60         28.08          108.30      858.1          0.08455
567        20.60         29.33          140.10     1265.0          0.11780
568         7.76         24.54           47.92      181.0          0.05263

[569 rows x 5 columns], 0      0
1      0
2      0
3      0
4      0
      ..
564    0
565    0
566    0
567    0
568    1
Name: diagnosis, Length: 569, dtype: int64)
>>> X_feature_selected = fs.ForwardSelector(X,y)
>>> X_feature_selected
       HS3ST1   TSPOAP1      CROT     CRLF1  MAPK8IP2  PRICKLE3   SLC38A5
0    0.283871  1.118907  3.199585  0.602297  0.583563  2.728145  0.236678
1    0.778597  1.428139  1.861609  0.326153  0.860734  1.728958  0.138003
2    0.925642  0.928361  2.316242  0.061074  0.343178  2.808658  0.033121
3    2.460786  1.649740  2.610303  1.118434  1.612719  2.117057  0.101612
4    1.176990  0.769621  2.515964  0.164966  1.731133  2.934570  0.197540
..        ...       ...       ...       ...       ...       ...       ...
495  0.491308  0.722332  1.771241  0.525178  4.345678  2.425843  0.140930
496  0.749313  1.367277  2.886011  0.932400  2.602531  1.879879  1.344855
497  1.156488  1.327671  1.679746  0.672549  1.169465  2.129853  2.522259
498  0.622989  2.347412  4.090374  1.727858  2.867727  1.990572  0.845473
499  0.311338  2.481399  3.411503  0.381369  3.573363  2.402876  1.390478

[500 rows x 7 columns]
>>> X_feature_selected = fs.Pipelined(X,y)
>>> X_feature_selected
         CROT   SLC38A5
0    3.199585  0.236678
1    1.861609  0.138003
2    2.316242  0.033121
3    2.610303  0.101612
4    2.515964  0.197540
..        ...       ...
495  1.771241  0.140930
496  2.886011  1.344855
497  1.679746  2.522259
498  4.090374  0.845473
499  3.411503  1.390478

[500 rows x 2 columns]

About

A custom feature selection program with various algorithms and their pipelined combination in an easy and descriptive way.


Languages

Language:Python 100.0%