This is the official repository of the paper "Knowledge Graph Datasets for Recommendation", accepted for publication at KaRS@RecSys2023.
This work covers the enrichment of two widely used recommendation datasets from the movie and book domains, MovieLens 25M and LibraryThing respectively. Specifically:
- we link the items in the LibraryThing (LT) and MovieLens 25M (ML25M) datasets with the entities available in three well-known knowledge graphs: Wikidata, DBpedia, and Freebase;
- starting from the linked item entities, we explore the Wikidata and DBpedia knowledge graphs up to two hops to collect the structured information connected to these resources, thus providing persistent, ready-to-use enriched datasets for reproducible experiments.
Inspired by advances in knowledge graph, Graph Convolutional Network, link prediction, and recommender systems research, these augmented datasets aim to meet the needs of cutting-edge work in these areas. Moreover, they pave the way for further research that investigates different recommendation modalities simultaneously.
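The hop-based exploration described above can be illustrated with a minimal in-memory sketch. It is not the procedure used to build the datasets: the function name `expand_hops`, the toy identifiers, and the choice to follow only subject-to-object edges are assumptions made for illustration.

```python
from collections import defaultdict


def expand_hops(triples, seeds, max_hops=2):
    """Group triples by the hop at which they are reached from the seed entities.

    Hop 1 holds triples whose subject is a seed; hop 2 holds triples whose
    subject is an object first seen at hop 1, and so on.
    """
    by_subject = defaultdict(list)
    for subject, predicate, obj in triples:
        by_subject[subject].append((subject, predicate, obj))

    collected = {}
    visited = set(seeds)
    frontier = set(seeds)
    for hop in range(1, max_hops + 1):
        hop_triples = []
        next_frontier = set()
        # Sort the frontier so the output order is deterministic.
        for entity in sorted(frontier):
            for triple in by_subject.get(entity, []):
                hop_triples.append(triple)
                obj = triple[2]
                if obj not in visited:
                    visited.add(obj)
                    next_frontier.add(obj)
        collected[hop] = hop_triples
        frontier = next_frontier
    return collected


# Toy triples with hypothetical identifiers (not taken from the datasets).
toy_triples = [
    ("movie_1", "director", "person_1"),
    ("movie_1", "genre", "genre_1"),
    ("person_1", "born_in", "city_1"),
    ("city_1", "country", "country_1"),
]
hops = expand_hops(toy_triples, seeds=["movie_1"], max_hops=2)
for hop, found in hops.items():
    print(hop, found)
```

With `max_hops=2`, the triple whose subject is `city_1` is not collected, because `city_1` is only reached at the second hop.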
All the resources are available here.
Please note that the resources cannot be hosted on GitHub due to GitHub size limits.
Our resources collect:
- links from Item IDs to URI resources on Wikidata, DBpedia and Freebase KGs for both movies and books
- RDF triples from the Wikidata and DBpedia KGs for both movies and books
The files are split into zip archives as follows:
MovieLens 25M
├── ml25m_linking.zip
│   └── ml25m_linking.tsv
└── ml25m_subgraphs.zip
    ├── ml25m_wikidata_1hop.tsv
    ├── ml25m_wikidata_2hop.tsv
    ├── ml25m_dbpedia_1hop.tsv
    └── ml25m_dbpedia_2hop.tsv
LibraryThing
├── lt_linking.zip
│   ├── lt_linking.tsv
│   ├── lt_wikidata_freebase_linking.tsv
│   └── lt_dbpedia_freebase_linking.tsv
└── lt_subgraphs.zip
    ├── lt_wikidata_1hop.tsv
    ├── lt_wikidata_2hop.tsv
    ├── lt_dbpedia_1hop.tsv
    └── lt_dbpedia_2hop.tsv
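Once downloaded, the archives listed above can be unpacked with Python's standard library. The following is a minimal sketch; the helper name `extract_archive` and the target directory in the usage comment are hypothetical.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, target_dir):
    """Extract every member of a zip archive and return the sorted member names."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
        return sorted(archive.namelist())


# Example usage (assumes the archive sits in the working directory;
# "data/ml25m" is a hypothetical output folder):
# extract_archive("ml25m_linking.zip", "data/ml25m")
```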
Here we provide a description of the contents of our collection.
File Name | Description |
MovieLens 25M | |
ml25m_linking.tsv | This file contains the links from items in the MovieLens 25M dataset to the Wikidata, DBpedia, and Freebase knowledge graphs. This is a tab-separated file containing the following fields:
|
ml25m_wikidata_1hop.tsv | This file contains the RDF triples gathered by exploring the Wikidata knowledge graph up to 1 hop, starting from the URI resources found in the item linking phase for the MovieLens 25M dataset. This is a tab-separated file containing the following fields:
|
ml25m_wikidata_2hop.tsv | This file contains the RDF triples gathered by exploring the Wikidata knowledge graph up to 2 hops, starting from the URI objects found in the 1-hop exploration for the MovieLens 25M dataset. This is a tab-separated file containing the following fields:
|
ml25m_dbpedia_1hop.tsv | This file contains the RDF triples gathered by exploring the DBpedia knowledge graph up to 1 hop, starting from the URI resources found in the item linking phase for the MovieLens 25M dataset. This is a tab-separated file containing the following fields:
|
ml25m_dbpedia_2hop.tsv | This file contains the RDF triples gathered by exploring the DBpedia knowledge graph up to 2 hops, starting from the URI objects found in the 1-hop exploration for the MovieLens 25M dataset. This is a tab-separated file containing the following fields:
|
LibraryThing | |
lt_linking.tsv | This file contains the links from items in the LibraryThing dataset to the Wikidata and DBpedia knowledge graphs. This is a tab-separated file containing the following fields:
|
lt_wikidata_freebase_linking.tsv | This file contains the links from items in the LibraryThing dataset to the Freebase knowledge graph, obtained via Wikidata. This is a tab-separated file containing the following fields:
|
lt_dbpedia_freebase_linking.tsv | This file contains the links from items in the LibraryThing dataset to the Freebase knowledge graph, obtained via DBpedia. This is a tab-separated file containing the following fields:
|
lt_wikidata_1hop.tsv | This file contains the RDF triples gathered by exploring the Wikidata knowledge graph up to 1 hop, starting from the URI resources found in the item linking phase for the LibraryThing dataset. This is a tab-separated file containing the following fields:
|
lt_wikidata_2hop.tsv | This file contains the RDF triples gathered by exploring the Wikidata knowledge graph up to 2 hops, starting from the URI objects found in the 1-hop exploration for the LibraryThing dataset. This is a tab-separated file containing the following fields:
|
lt_dbpedia_1hop.tsv | This file contains the RDF triples gathered by exploring the DBpedia knowledge graph up to 1 hop, starting from the URI resources found in the item linking phase for the LibraryThing dataset. This is a tab-separated file containing the following fields:
|
lt_dbpedia_2hop.tsv | This file contains the RDF triples gathered by exploring the DBpedia knowledge graph up to 2 hops, starting from the URI objects found in the 1-hop exploration for the LibraryThing dataset. This is a tab-separated file containing the following fields:
|
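Since all files are tab-separated, they can be streamed with Python's `csv` module. The sketch below is illustrative only: the helper names are hypothetical, and it assumes that the subgraph files hold one triple per row with the predicate in the second column and that rows contain no quoted fields. Check the actual column layout and whether a header row is present before relying on it.

```python
import csv
import io
from collections import Counter


def read_tsv_rows(handle, skip_header=False):
    """Yield each non-empty row of a tab-separated file as a tuple of strings."""
    # QUOTE_NONE keeps quote characters in literals as plain text.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    if skip_header:
        next(reader, None)
    for row in reader:
        if row:
            yield tuple(row)


def predicate_counts(rows, predicate_index=1):
    """Count how often each predicate occurs (column position is an assumption)."""
    return Counter(row[predicate_index] for row in rows if len(row) > predicate_index)


# In-memory sample with hypothetical values; for a real file, use e.g.
# open("ml25m_wikidata_1hop.tsv", encoding="utf-8", newline="") instead.
sample = io.StringIO("item_1\tdirector\tperson_1\nitem_1\tgenre\tgenre_1\nitem_2\tdirector\tperson_2\n")
counts = predicate_counts(read_tsv_rows(sample))
print(counts.most_common())
```

Streaming rows through a generator avoids loading a whole subgraph file into memory, which may matter for the larger 2-hop files.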
The table below shows the statistics of the collected resources, categorized by dataset and data source.
We welcome any contribution that could improve our datasets. Please contact us by email.
This work was developed by
- Vincenzo Paparella* (vincenzo.paparella@poliba.it)
- Alberto Carlo Maria Mancino* (alberto.mancino@poliba.it)
- Antonio Ferrara (antonio.ferrara@poliba.it)
- Claudio Pomo (claudio.pomo@poliba.it)
- Vito Walter Anelli (vitowalter.anelli@poliba.it)
- Tommaso Di Noia (tommaso.dinoia@poliba.it)
* Corresponding authors
This work is released under the Apache 2.0 License.
Our datasets are constructed thanks to
- MovieLens dataset (https://grouplens.org/)
- LibraryThing dataset (https://cseweb.ucsd.edu/~jmcauley/datasets.html#social_data)
- The Internet Movie Database (https://www.imdb.com/)
- LibraryThing website (https://www.librarything.com/)
- Wikidata (https://www.wikidata.org/wiki/Wikidata:Main_Page)
- DBpedia (https://www.dbpedia.org/)
- Freebase (https://developers.google.com/freebase)