Florian Boudin's repositories
ake-datasets
Large, curated set of benchmark datasets for evaluating automatic keyphrase extraction algorithms.
ir-using-kg
Keyphrase Generation for Scientific Document Retrieval
hulth-2003-pre
Preprocessed Inspec keyphrase extraction benchmark dataset
semeval-2010-pre
Preprocessed SemEval-2010 benchmark dataset for keyphrase extraction
duc-2001-pre
Preprocessed DUC 2001 keyphrase extraction benchmark dataset
krapivin-2009-pre
Preprocessed Krapivin keyphrase extraction benchmark dataset
redefining-absent-keyphrases
Code and dataset for the paper "Redefining Absent Keyphrases and their Effect on Retrieval Effectiveness"
marujo-2012-pre
Preprocessed Marujo keyphrase extraction benchmark dataset
cross-language_IR
A two-hour course on cross-lingual information retrieval
boudinfl.github.io
Website
wikinews-2013-pre
Preprocessed Wikinews keyphrase benchmark dataset
corenlp_parser
Minimal CoreNLP XML Parser in Python
gh-pages-minima-starter
A minimal example for running GitHub Pages with the minima theme.
s2orc-doc2json
Parsers for scientific papers (PDF2JSON, TEX2JSON, JATS2JSON)
witten-1999-pre
Preprocessed CSTR keyphrase extraction dataset