- Introduction
- Data
- Decision Tree
- Random Forest
- AdaBoost
- Linear Regression
This is a data analysis project based on data from summer research conducted in the countryside.
The data for this project were collected in the summer of 2016 during research in 3 villages in Sichuan Province. The 'data' folder contains 3 .csv files holding the research results. The features in the tables include 'id', 'sex', 'education', 'financial situation' and so on. ...
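As a rough illustration of the loading step, the sketch below reads every .csv file in a folder with pandas and stacks them into one table. The function name `load_survey_tables`, the added `source_file` column, and the use of pandas are assumptions made for this example; the actual file names and the loading code in the notebooks may differ.

```python
from pathlib import Path

import pandas as pd


def load_survey_tables(data_dir="data"):
    """Read every .csv file in data_dir and stack them into one DataFrame.

    Hypothetical helper: the real file names in 'data' are not listed here,
    so the files are discovered by extension instead of by name.
    """
    paths = sorted(Path(data_dir).glob("*.csv"))
    if not paths:
        raise FileNotFoundError(f"no .csv files found in {data_dir!r}")

    frames = []
    for path in paths:
        frame = pd.read_csv(path)
        # Remember which file (village table) each row came from.
        frame["source_file"] = path.name
        frames.append(frame)

    # Columns missing from one table are filled with NaN by concat.
    return pd.concat(frames, ignore_index=True)
```

Calling `load_survey_tables("data")` from the repository root would then return one DataFrame, which can be inspected with `.head()` or `.describe()` before any modelling.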
The file 'Decision_Tree.ipynb' implements the decision tree algorithm, specifically CART. Using numpy and sklearn, it covers data loading, model fitting, data visualization, and prediction on the test set. In DT3.0, the model's accuracy is above 75% on the training set and above 65% on the validation and test sets. The algorithm should therefore be improved in upcoming work.
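The fit, validate, and predict workflow described above can be sketched with sklearn's `DecisionTreeClassifier`, which is a CART-style tree. This is a minimal sketch, not the notebook's code: it uses synthetic data from `make_classification` as a stand-in for the survey tables, and the 60/20/20 split, `max_depth=4`, and the random seeds are assumptions chosen only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the survey features and labels (assumption).
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=4, random_state=0
)

# Split into 60% training, 20% validation, 20% test.
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.4, random_state=0
)
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, random_state=0
)

# Gini impurity is the usual CART splitting criterion for classification.
# Capping the depth is one common way to limit overfitting.
clf = DecisionTreeClassifier(criterion="gini", max_depth=4, random_state=0)
clf.fit(X_train, y_train)

# Accuracy on each split; a large train/validation gap hints at overfitting.
train_acc = clf.score(X_train, y_train)
val_acc = clf.score(X_val, y_val)
test_acc = clf.score(X_test, y_test)

# Predicted labels for the held-out test set.
y_pred = clf.predict(X_test)

print(f"train={train_acc:.3f} val={val_acc:.3f} test={test_acc:.3f}")
```

Comparing `train_acc` with `val_acc` is a quick way to see the kind of gap reported above; tuning parameters such as `max_depth` or `min_samples_leaf` on the validation set is one possible direction for narrowing it.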
The file 'Random_forest.ipynb' implements the random forest algorithm. ...
The file 'Adaboost.ipynb' implements the AdaBoost algorithm. ...
The file 'Linear_Regression.ipynb' implements the linear regression algorithm. ...