In recommender systems, data representation techniques play a great role, as they have the power to entangle, hide, and reveal the explanatory factors embedded within datasets; hence, they influence the quality of recommendations. In Visual Art (VA) recommendation specifically, the complexity of the concepts embodied within paintings makes the task of capturing semantics by machines far from trivial. Prominent works in VA recommendation commonly use manually curated metadata to drive recommendations, while recent works aim to leverage visual features extracted with Deep Neural Networks (DNNs). However, such data representation approaches are resource-demanding and lack a direct interpretation, hindering user acceptance. To address these limitations, this work proposes an approach for personalised recommendation of visual art based on learning a latent semantic representation of paintings. This is done by training a Latent Dirichlet Allocation (LDA) model on textual descriptions of paintings. The trained LDA model successfully uncovers non-obvious semantic relationships between paintings while offering explainable recommendations. Experimental evaluations demonstrate that our method tends to perform better than exploiting visual features extracted with pre-trained Deep Neural Networks.
Dependencies:
- NumPy
- SciPy
- scikit-learn
- Pandas
- Matplotlib
- gensim
- spaCy
This code works on Python 3.5 or later.
The Painting_LDA model trained on the National Gallery dataset can be found in /resources/models/
If you want to train on your own corpus of painting descriptions:
Use the script text-cleaning.py to clean your corpus. It will save the pre-processed data in resources/datasets/preprocessed/
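A simplified, standard-library-only stand-in for the kind of cleaning text-cleaning.py performs (the actual script presumably uses spaCy, listed above, for tokenisation and lemmatisation; the stop-word set here is illustrative):

```python
import re

# Tiny illustrative stop-word list; the real script would use a full one.
STOP_WORDS = {"the", "a", "an", "of", "in", "on", "and", "is", "to", "with"}

def clean_description(text):
    """Return a list of lowercase content tokens for one painting description."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

clean_description("The painting depicts two angels over a river.")
# -> ['painting', 'depicts', 'two', 'angels', 'over', 'river']
```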
After cleaning your corpus, use the script lda-training.py to train the Painting-LDA model.
To generate recommendations, use the script query-lda.py.
It generates a list of recommendations that are most similar to a list of paintings liked or rated by a user.
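One plausible sketch of the ranking step behind such a query: each painting is represented by its LDA topic distribution, the user profile is the mean of the liked paintings' distributions, and candidates are ranked by cosine similarity to that profile. The painting names and vectors below are made up for illustration, not real model output:

```python
import numpy as np

def recommend(liked, candidates, top_n=2):
    """Rank candidate paintings by cosine similarity to the user's mean topic vector."""
    profile = np.mean([vec for _, vec in liked], axis=0)
    def cos(v):
        return float(np.dot(profile, v) / (np.linalg.norm(profile) * np.linalg.norm(v)))
    ranked = sorted(candidates, key=lambda item: cos(item[1]), reverse=True)
    return [name for name, _ in ranked[:top_n]]

# Illustrative topic distributions over 3 topics.
liked = [("Annunciation", np.array([0.8, 0.1, 0.1])),
         ("Madonna", np.array([0.7, 0.2, 0.1]))]
candidates = [("Battle Scene", np.array([0.1, 0.8, 0.1])),
              ("Nativity", np.array([0.75, 0.15, 0.1])),
              ("Seascape", np.array([0.2, 0.2, 0.6]))]

recommend(liked, candidates)  # -> ['Nativity', 'Seascape']
```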
If you use this work or method in your research, please cite the following publication: