We intend to predict the age, gender, and Big Five personality traits of Facebook users, given their profile picture, status updates, and page likes. The project works in two steps:
- Training the classifiers on the data set of 9500 users
- Predicting the age, gender and personality traits of new users.
Before profiling a new user, a couple of initial steps are required to set up the system and train the models.
Besides Python 3.x, the project uses Keras with TensorFlow. The following commands install the dependencies on a Linux machine:

    pip3 install -U scikit-learn
    pip3 install tensorflow
    sudo pip3 install keras
    sudo pip3 install h5py
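After installing the packages, it can be useful to confirm that they are importable before starting a long training run. The following is a minimal sketch, not part of the project itself; the helper name `missing_modules` is hypothetical, and the import names (for example `sklearn` for scikit-learn) are assumed from the usual packaging of these libraries.

```python
import importlib.util

# Import names assumed for the packages installed above
# (scikit-learn is imported as "sklearn").
REQUIRED_MODULES = ["sklearn", "tensorflow", "keras", "h5py"]


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

The check only locates the modules without importing them, so it runs quickly even when TensorFlow is installed.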
The project expects the input user data in the following folder hierarchy:
- LIWC – contains the LIWC.csv file
- Image – contains image files in the form <userid.jpg>
- Profile – contains the profile.csv file
- Relation – contains the relation.csv file
- Text – contains text files in the form <userid.txt>
- Wiki – contains NumPy wiki image files.
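Because both training and prediction depend on this layout, a quick check of the input folder can catch a misplaced directory early. The sketch below is illustrative only: the function name `missing_subfolders` is hypothetical, and treating all six subfolders as required is an assumption rather than something the project states.

```python
from pathlib import Path

# Subfolder names taken from the list above; treating every one of them
# as required is an assumption of this sketch.
EXPECTED_SUBFOLDERS = ["LIWC", "Image", "Profile", "Relation", "Text", "Wiki"]


def missing_subfolders(data_root):
    """Return the expected subfolders that do not exist under data_root."""
    root = Path(data_root)
    return [name for name in EXPECTED_SUBFOLDERS if not (root / name).is_dir()]
```

Calling `missing_subfolders("path/to/data")` returns an empty list when the hierarchy is complete, and otherwise the names of the folders that still need to be created.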
To train the models, run the train_model.py file. The script takes as input the path to the data folder, laid out as described above.

    python3 training\ model/train_model.py

Training can take anywhere from a few minutes to a couple of hours, depending on the machine configuration and the number of images on which the model is trained.
The project can be run using:
- The make_prediction.py file
- The tcss555 script file

The program takes two paths as input: the folder path for the new instances (laid out as explained above) and the folder path where the XML outputs are to be saved.

    ./tcss555 -i -o
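For readers who want to wrap the prediction step in their own script, the following sketch shows one way the two options could be parsed in Python. It is not the project's actual implementation: the function name `parse_args` and the attribute names `input_dir` and `output_dir` are hypothetical, and the reading of `-i` as the input folder and `-o` as the output folder is an assumption based on the description above.

```python
import argparse


def parse_args(argv=None):
    """Parse -i (assumed input folder) and -o (assumed output folder)."""
    parser = argparse.ArgumentParser(
        description="Hypothetical wrapper around the prediction step")
    parser.add_argument("-i", dest="input_dir", required=True,
                        help="folder containing the new user instances")
    parser.add_argument("-o", dest="output_dir", required=True,
                        help="folder where the XML outputs are saved")
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    print("Reading instances from:", args.input_dir)
    print("Writing XML outputs to:", args.output_dir)
```

Marking both options as required makes the script fail early with a usage message when either path is omitted, rather than partway through prediction.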
Built With
- PyCharm – The IDE used for development
- Keras – Neural networks API