aryanrzn / Prediction-of-Family-Income-and-Expenditure

In this project, logistic regression, KNN, classification trees, random forests, and a neural network were used.

Prediction-of-Family-Income-and-Expenditure

In this project, I analyze the 2019 income and expenditure data of families living in three provinces: "Kerman", "Sistan and Baluchistan", and "Hormozgan". The "Statistics Center of Iran" plan to predict the cost and income of families living in urban areas of the country has been running since 1968. For this research, the data were collected by questionnaire from 2197 families and comprise 91 predictor variables.

I explain the predictor variables in detail and follow the usual data mining steps to analyze the data. The first step is choosing the target: classifying the families in the three provinces by income into two classes, high-income (the top three income deciles) and low-income (the bottom seven deciles), and then presenting a model that predicts the class of a new family. The next steps are data exploration and cleaning, followed by visualizing the variables and checking their information. Finally, I choose the best model by comparing the candidate models on the data.

The models used in this project are logistic regression, KNN, classification trees, random forests, and a neural network. Excel was used for cleaning the data, and the R programming language for summarizing, visualizing, and modeling.
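The target definition and one of the listed models can be sketched in a few lines of base R. This is only an illustration under stated assumptions, not the project's actual code: the data frame `families`, its columns `household_size`, `head_age`, and `income`, the 60/40 split, and the 0.5 probability cutoff are all hypothetical, and the data are simulated because the survey file is not shown here.

```r
# Minimal sketch with simulated data; all column names are hypothetical
# stand-ins for the 91 questionnaire variables.
set.seed(42)
n <- 2197  # number of families reported in the README

families <- data.frame(
  household_size = sample(1:8, n, replace = TRUE),
  head_age       = sample(20:80, n, replace = TRUE)
)
# Simulated income, loosely related to household size
families$income <- exp(rnorm(n, mean = 10 + 0.05 * families$household_size, sd = 0.6))

# Target: top three deciles = "high", bottom seven deciles = "low".
# The 70th percentile of income is the boundary between the two classes.
cutoff <- quantile(families$income, probs = 0.7)
families$income_class <- factor(
  ifelse(families$income > cutoff, "high", "low"),
  levels = c("low", "high")
)

# Hold out part of the data for validation (60/40 split is an assumption)
train_idx <- sample(seq_len(n), size = round(0.6 * n))
train <- families[train_idx, ]
valid <- families[-train_idx, ]

# Logistic regression on the predictors only; income itself is excluded
# because the target is derived from it.
fit <- glm(income_class ~ household_size + head_age,
           data = train, family = binomial)

# Predicted probability of the "high" class, thresholded at 0.5
prob_high <- predict(fit, newdata = valid, type = "response")
pred <- factor(ifelse(prob_high > 0.5, "high", "low"),
               levels = c("low", "high"))

# Confusion matrix and accuracy on the validation set
conf_mat <- table(actual = valid$income_class, predicted = pred)
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
print(conf_mat)
print(accuracy)
```

Because only about 30% of families fall in the high-income class, accuracy alone can look good for a model that mostly predicts "low", so the confusion matrix is worth inspecting when comparing the candidate models.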


Languages

Language:R 100.0%