c00kiemon5ter / LuceneEval

A library and application built on top of Lucene, using trec_eval to evaluate queries for the CACM collection

#LuceneEval
A library and application built on top of Lucene. It provides convenience methods to load the CACM collection and its related files, run the collection's queries, and evaluate the results using trec_eval.

##Configuration
Currently there is no external configuration; to change a setting you need to modify the code.
The file LuceneEval.java holds the configuration variables.

Default configuration is:

    DATAFILE = "data/cacm/cacm.all";
    CACM_XML = "data/results/cacm.all.xml";
    QUERYFILE = "data/cacm/query.text";
    STOPWORDLIST = "data/cacm/common_words";
    CACM_QRELS_FILE = "data/cacm/qrels.text";
    TREC_QRELS_FILE = "data/results/trec_qrels";
    TREC_SEARCHRESULTS_FILE = "data/results/trec_searchresults";
    TREC_RESULTS_FILE = "data/results/trec_results";
    RESULTS_LIMIT = 20;
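Since settings live in the source, changing one means editing a constant and recompiling. The following is only a sketch of what such an edit might look like: the class name `LuceneEvalConfigSketch` and the `static final` modifiers are assumptions, and the actual declarations in LuceneEval.java may differ.

```java
// Hypothetical sketch; not the actual contents of LuceneEval.java.
final class LuceneEvalConfigSketch {
    // Path to the query file, unchanged from the documented default.
    static final String QUERYFILE = "data/cacm/query.text";

    // Where the TREC-formatted search results are written (documented default).
    static final String TREC_SEARCHRESULTS_FILE = "data/results/trec_searchresults";

    // Documented default is 20; raised to 50 here to illustrate an edit.
    static final int RESULTS_LIMIT = 50;

    private LuceneEvalConfigSketch() {
        // Constants holder, not meant to be instantiated.
    }
}
```

After an edit like this, the project has to be rebuilt for the new value to take effect.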

##Dependencies
* Lucene - provides a Java-based indexing and search implementation, as well as spellchecking, hit highlighting and advanced analysis/tokenization capabilities
* SimpleXML - high performance XML serialization and configuration framework for Java
* trec_eval - the standard tool used for evaluating an ad hoc retrieval run, given the results file and a standard set of judged results
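For orientation, trec_eval reads two whitespace-separated text files: a results file with lines of the form `query_id Q0 doc_id rank score run_tag`, and a qrels file with lines of the form `query_id iteration doc_id relevance`. The sketch below shows one way such lines could be produced in Java. The class `TrecFormatSketch`, its method names, and the four-decimal score format are illustrative assumptions, not LuceneEval's actual code.

```java
import java.util.Locale;

// Hypothetical helper; names and formatting choices are assumptions.
final class TrecFormatSketch {
    private TrecFormatSketch() {
    }

    // One line of a trec_eval results file:
    // query_id Q0 doc_id rank score run_tag
    static String resultLine(String queryId, String docId, int rank,
                             double score, String runTag) {
        return String.format(Locale.ROOT, "%s Q0 %s %d %.4f %s",
                queryId, docId, rank, score, runTag);
    }

    // One line of a trec_eval qrels file:
    // query_id iteration doc_id relevance (iteration is conventionally 0)
    static String qrelLine(String queryId, String docId, int relevance) {
        return String.format(Locale.ROOT, "%s 0 %s %d",
                queryId, docId, relevance);
    }
}
```

With files in these formats, trec_eval is typically run as `trec_eval <qrels_file> <results_file>`; with the default paths above that would presumably be `trec_eval data/results/trec_qrels data/results/trec_searchresults`.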

##License
LuceneEval by Ivan Kanakarakis is licensed under the GNU GPLv3 license.
See COPYING for details.

License:GNU Lesser General Public License v3.0


Languages

Language:Java 100.0%