sisinflab / Formal-MultiMod-Rec

Formalizing Multimedia Recommendation through Multimodal Deep Learning, accepted in ACM Transactions on Recommender Systems.

Formalizing Multimedia Recommendation through Multimodal Deep Learning

Official repository for the paper Formalizing Multimedia Recommendation through Multimodal Deep Learning, accepted in ACM Transactions on Recommender Systems.

Authors

* Work done while at Politecnico di Bari as a PhD student.

** Work done while at Politecnico di Bari before joining Amazon.

If you wish to cite our paper, here is a reference:

@article{DBLP:journals/corr/abs-2309-05273,
  author       = {Daniele Malitesta and
                  Giandomenico Cornacchia and
                  Claudio Pomo and
                  Felice Antonio Merra and
                  Tommaso {Di Noia} and
                  Eugenio {Di Sciascio}},
  title        = {Formalizing Multimedia Recommendation through Multimodal Deep Learning},
  journal      = {CoRR},
  volume       = {abs/2309.05273},
  year         = {2023}
}

Review

Paper Year Title
Ferracani et al. 2015 A System for Video Recommendation using Visual Saliency, Crowdsourced and Automatic Annotations
Jia et al. 2015 Multi-modal learning for video recommendation based on mobile application usage
Li et al. 2015 Video recommendation based on multi-modal information and multiple kernel
Nie et al. 2016 Quality models for venue recommendation in location-based social network
Chen et al. 2016 Context-aware Image Tweet Modelling and Recommendation
Han et al. 2017 Learning Fashion Compatibility with Bidirectional LSTMs
Oramas et al. 2017 A Deep Multimodal Approach for Cold-start Music Recommendation
Zhang et al. 2017 Hashtag Recommendation for Multimodal Microblog Using Co-Attention Network
Ying et al. 2018 Graph Convolutional Neural Networks for Web-Scale Recommender Systems
Wang et al. 2018 LRMM: Learning to Recommend with Missing Modalities
Liu et al. 2019 User Diverse Preference Modeling by Multimodal Attentive Metric Learning
Chen et al. 2019 Personalized Fashion Recommendation with Visual Explanations based on Multimodal Attention Network: Towards Visually Explainable Recommendation
Wei et al. 2019 MMGCN: Multi-modal Graph Convolution Network for Personalized Recommendation of Micro-video
Cheng et al. 2019 MMALFM: Explainable Recommendation by Leveraging Reviews and Images
Dong et al. 2019 Personalized Capsule Wardrobe Creation with Garment and User Modeling
Chen et al. 2019 POG: Personalized Outfit Generation for Fashion Recommendation at Alibaba iFashion
Yu et al. 2020 Vision-Language Recommendation via Attribute Augmented Multimodal Reinforcement Learning
Cui et al. 2020 MV-RNN: A Multi-View Recurrent Neural Network for Sequential Recommendation
Wei et al. 2020 Graph-Refined Convolutional Network for Multimedia Recommendation with Implicit Feedback
Sun et al. 2020 Multi-modal Knowledge Graphs for Recommender Systems
Chen et al. 2020 Neural Tensor Model for Learning Multi-Aspect Factors in Recommender Systems
Min et al. 2020 Food Recommendation: Framework, Existing Solutions, and Challenges
Shen et al. 2020 Enhancing Music Recommendation with Social Media Content: an Attentive Multimodal Autoencoder Approach
Yang et al. 2020 Learning to Match on Graph for Fashion Compatibility Modeling
Tao et al. 2020 MGAT: Multimodal Graph Attention Network for Recommendation
Yang et al. 2020 AMNN: Attention-Based Multimodal Neural Network Model for Hashtag Recommendation
Sang et al. 2021 Context-Dependent Propagating-Based Video Recommendation in Multimodal Heterogeneous Information Networks
Liu et al. 2021 Pre-training Graph Transformer with Multimodal Side Information for Recommendation
Zhang et al. 2021 Mining Latent Structures for Multimedia Recommendation
Vaswani et al. 2021 Multimodal Fusion Based Attentive Networks for Sequential Music Recommendation
Lei et al. 2021 Is the suggested food your desired?: Multi-modal recipe recommendation with demand-based knowledge graph
Wang et al. 2021 Market2Dish: Health-aware Food Recommendation
Zhan et al. 2022 A3-FKG: Attentive Attribute-Aware Fashion Knowledge Graph for Outfit Preference Prediction
Wu et al. 2022 MM-Rec: Visiolinguistic Model Empowered Multimodal News Recommendation
Yi et al. 2022 Multi-Modal Variational Graph Auto-Encoder for Recommendation Systems
Yi et al. 2022 Multi-modal Graph Contrastive Learning for Micro-video Recommendation
Liu et al. 2022 Multi-Modal Contrastive Pre-training for Recommendation
Mu et al. 2022 Learning Hybrid Behavior Patterns for Multimedia Recommendation
Chen et al. 2022 Breaking Isolation: Multimodal Graph Fusion for Multimedia Recommendation by Edge-wise Modulation
Yi et al. 2022 A Tale of Two Graphs: Freezing and Denoising Graph Structures for Multimodal Recommendation
Wang et al. 2023 DualGNN: Dual Graph Neural Network for Multimedia Recommendation
Wei et al. 2023 Multi-Modal Self-Supervised Learning for Recommendation
Zhou et al. 2023 Bootstrap Latent Representations for Multi-modal Recommendation

Benchmarking

First, install the required dependencies:

pip install -r requirements.txt
pip install -r requirements_torch_geometric.txt

If you want to retrain all models, run the following:

python -u start_experiments.py --config <dataset_name>

where <dataset_name> is the name of one of the datasets in our benchmarks.

If you only want to generate the results, run the following:

python -u start_experiments.py --config <dataset_name>_results

where <dataset_name> is, as above, the name of one of the datasets in our benchmarks.

Note that the results may differ slightly from those reported here and in the paper, depending on the machine on which you run the experiments.
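
When several datasets need to be processed in sequence, the two invocations above can be scripted. The following is a minimal sketch and not part of the repository: the helper names build_command and run_all are hypothetical, and the lowercase dataset names in the usage lines are assumptions about how the configuration files are named, so check them against the configuration files shipped with the code.

```python
import subprocess
import sys


def build_command(dataset_name, results_only=False):
    """Return the argv list for one start_experiments.py invocation."""
    config = f"{dataset_name}_results" if results_only else dataset_name
    # Mirrors: python -u start_experiments.py --config <dataset_name>[_results]
    return [sys.executable, "-u", "start_experiments.py", "--config", config]


def run_all(dataset_names, results_only=False, dry_run=True):
    """Run (or, with dry_run=True, only collect) the command for each dataset."""
    commands = [build_command(name, results_only) for name in dataset_names]
    if not dry_run:
        for command in commands:
            # check=True stops at the first failing experiment.
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    # The dataset names below are assumed; replace them with the real config names.
    for command in run_all(["office", "toys"], results_only=True, dry_run=True):
        print(" ".join(command))
```

With dry_run=True the script only prints the commands it would launch, which makes it easy to verify the configuration names before starting a long training run.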

Office (best results)

Models Recall@10 nDCG@10 EFD@10 Gini@10 APLT@10 iCov@10 Recall@20 nDCG@20 EFD@20 Gini@20 APLT@20 iCov@20
VBPR 0.0652 0.0419 0.1753 0.3634 0.2321 93.83% 0.1025 0.0533 0.1479 0.3960 0.2375 97.51%
MMGCN 0.0455 0.0300 0.1140 0.0128 0.0016 3.07% 0.0798 0.0405 0.1027 0.0231 0.0078 4.64%
GRCN 0.0393 0.0253 0.1215 0.4587 0.3438 99.01% 0.0667 0.0339 0.1051 0.4892 0.3469 99.79%
LATTICE 0.0664 0.0449 0.1827 0.2128 0.1752 87.86% 0.1029 0.0566 0.1513 0.2652 0.2039 95.90%
BM3 0.0701 0.0460 0.1837 0.1407 0.1427 77.13% 0.1081 0.0583 0.1550 0.1900 0.1715 91.55%
FREEDOM 0.0560 0.0365 0.1493 0.1922 0.1875 79.12% 0.0884 0.0469 0.1282 0.2439 0.2080 90.64%
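
To compare models programmatically, a table like the one above can be parsed into a dictionary. The sketch below is an illustration only: it copies the @10 columns of the Office table verbatim, and the helper names parse_rows and best_model are hypothetical, not part of the repository.

```python
HEADER = "Models Recall@10 nDCG@10 EFD@10 Gini@10 APLT@10 iCov@10"
ROWS = """\
VBPR 0.0652 0.0419 0.1753 0.3634 0.2321 93.83%
MMGCN 0.0455 0.0300 0.1140 0.0128 0.0016 3.07%
GRCN 0.0393 0.0253 0.1215 0.4587 0.3438 99.01%
LATTICE 0.0664 0.0449 0.1827 0.2128 0.1752 87.86%
BM3 0.0701 0.0460 0.1837 0.1407 0.1427 77.13%
FREEDOM 0.0560 0.0365 0.1493 0.1922 0.1875 79.12%
"""


def parse_rows(header, rows):
    """Map each model name to a {metric: value} dictionary."""
    metrics = header.split()[1:]
    table = {}
    for line in rows.strip().splitlines():
        model, *values = line.split()
        # iCov is reported as a percentage; convert it to a fraction.
        parsed = [
            float(v.rstrip("%")) / 100 if v.endswith("%") else float(v)
            for v in values
        ]
        table[model] = dict(zip(metrics, parsed))
    return table


def best_model(table, metric):
    """Return the model with the highest value for the given metric."""
    return max(table, key=lambda model: table[model][metric])


office = parse_rows(HEADER, ROWS)
print(best_model(office, "Recall@10"))  # BM3
print(best_model(office, "iCov@10"))    # GRCN
```

The same two helpers work for the other datasets and for the @20 columns once the corresponding header and rows are pasted in.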

Toys (best results)

Models Recall@10 nDCG@10 EFD@10 Gini@10 APLT@10 iCov@10 Recall@20 nDCG@20 EFD@20 Gini@20 APLT@20 iCov@20
VBPR 0.0710 0.0458 0.1948 0.2645 0.1064 84.90% 0.1006 0.0545 0.1527 0.3011 0.1180 92.82%
MMGCN 0.0256 0.0150 0.0648 0.0989 0.0961 37.87% 0.0426 0.0200 0.0570 0.1450 0.1058 52.51%
GRCN 0.0554 0.0354 0.1604 0.3954 0.2368 92.66% 0.0831 0.0436 0.1298 0.4329 0.2482 97.73%
LATTICE 0.0805 0.0512 0.2090 0.1656 0.0546 73.80% 0.1165 0.0617 0.1665 0.2026 0.0684 86.58%
BM3 0.0613 0.0393 0.1582 0.0776 0.0486 56.23% 0.0901 0.0478 0.1270 0.1154 0.0658 73.50%
FREEDOM 0.0870 0.0548 0.2284 0.1474 0.0756 62.09% 0.1249 0.0660 0.1820 0.2007 0.0951 78.42%

Beauty (best results)

Models Recall@10 nDCG@10 EFD@10 Gini@10 APLT@10 iCov@10 Recall@20 nDCG@20 EFD@20 Gini@20 APLT@20 iCov@20
VBPR 0.0760 0.0483 0.2119 0.2076 0.0833 83.06% 0.1102 0.0586 0.1700 0.2376 0.0915 91.41%
MMGCN 0.0496 0.0294 0.1300 0.0252 0.0282 13.75% 0.0772 0.0379 0.1105 0.0423 0.0345 21.37%
GRCN 0.0575 0.0370 0.1817 0.3823 0.2497 94.59% 0.0892 0.0466 0.1498 0.4178 0.2608 98.56%
LATTICE 0.0867 0.0544 0.2272 0.1153 0.0386 65.82% 0.1259 0.0661 0.1830 0.1558 0.0511 81.60%
BM3 0.0713 0.0443 0.1831 0.0245 0.0179 32.31% 0.1051 0.0545 0.1490 0.0414 0.0228 48.75%
FREEDOM 0.0864 0.0539 0.2279 0.0921 0.0486 55.89% 0.1286 0.0666 0.1868 0.1359 0.0653 72.96%

Sports (best results)

Models Recall@10 nDCG@10 EFD@10 Gini@10 APLT@10 iCov@10 Recall@20 nDCG@20 EFD@20 Gini@20 APLT@20 iCov@20
VBPR 0.0450 0.0281 0.1167 0.1501 0.0497 75.77% 0.0677 0.0349 0.0949 0.1722 0.0552 86.54%
MMGCN 0.0342 0.0207 0.0791 0.0095 0.0046 5.10% 0.0551 0.0269 0.0678 0.0168 0.0065 8.39%
GRCN 0.0330 0.0202 0.0885 0.3087 0.2190 91.28% 0.0523 0.0259 0.0746 0.3386 0.2273 97.09%
LATTICE 0.0610 0.0372 0.1465 0.0573 0.0129 48.44% 0.0898 0.0456 0.1185 0.0802 0.0185 64.90%
BM3 0.0548 0.0349 0.1372 0.0776 0.0283 59.13% 0.0825 0.0430 0.1118 0.1120 0.0385 76.75%
FREEDOM 0.0603 0.0375 0.1494 0.0621 0.0319 48.37% 0.0911 0.0465 0.1219 0.0926 0.0442 65.81%

Clothing (best results)

Models Recall@10 nDCG@10 EFD@10 Gini@10 APLT@10 iCov@10 Recall@20 nDCG@20 EFD@20 Gini@20 APLT@20 iCov@20
VBPR 0.0339 0.0181 0.0502 0.2437 0.0809 83.40% 0.0529 0.0229 0.0413 0.2791 0.0915 92.33%
MMGCN 0.0227 0.0119 0.0292 0.0136 0.0044 7.58% 0.0348 0.0150 0.0240 0.0236 0.0066 12.44%
GRCN 0.0319 0.0164 0.0481 0.3990 0.2358 93.37% 0.0496 0.0209 0.0397 0.4368 0.2459 97.77%
LATTICE 0.0502 0.0275 0.0738 0.1022 0.0134 58.49% 0.0744 0.0336 0.0589 0.1384 0.0207 76.20%
BM3 0.0418 0.0226 0.0596 0.1348 0.0319 72.88% 0.0633 0.0281 0.0486 0.1825 0.0449 88.65%
FREEDOM 0.0547 0.0294 0.0805 0.1509 0.0600 65.54% 0.0822 0.0363 0.0652 0.2078 0.0843 81.91%
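
For readers unfamiliar with the accuracy columns in the tables above, Recall@k and nDCG@k can be sketched for a single user as follows. This uses the common textbook definitions with binary relevance; whether the repository's evaluation follows exactly the same conventions is an assumption, and the function names and toy item identifiers are illustrative only.

```python
import math


def recall_at_k(ranked_items, relevant_items, k):
    """Fraction of the user's relevant items that appear in the top-k list."""
    if not relevant_items:
        return 0.0
    hits = sum(1 for item in ranked_items[:k] if item in relevant_items)
    return hits / len(relevant_items)


def ndcg_at_k(ranked_items, relevant_items, k):
    """DCG of the top-k list divided by the DCG of an ideal ranking (binary gains)."""
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(ranked_items[:k])
        if item in relevant_items
    )
    # The ideal ranking places all relevant items (at most k) at the top.
    ideal_hits = min(len(relevant_items), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# Toy example: one relevant item out of two is retrieved in the top 3.
ranked = ["i3", "i7", "i1", "i9"]
relevant = {"i1", "i5"}
print(recall_at_k(ranked, relevant, k=3))  # 0.5
print(ndcg_at_k(ranked, relevant, k=3))
```

Dataset-level values such as those in the tables are typically obtained by averaging these per-user scores over all test users.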
