skondo / evaluation_measures

Framework that implements evaluation measures (e.g. nDCG, ERR) for IR systems.

About

evaluation_measures is a framework that implements evaluation measures for IR systems. The following measures are implemented:

  • MRR (Mean Reciprocal Rank)

E. M. Voorhees. 1999. The TREC-8 Question Answering Track Report. In Proceedings of the 8th Text Retrieval Conference (TREC-8), pp. 77–82.

  • DCG (Discounted cumulative gain) and nDCG (Normalized Discounted cumulative gain)

Kalervo Järvelin, Jaana Kekäläinen: Cumulated gain-based evaluation of IR techniques. ACM Transactions on Information Systems 20(4), 422–446 (2002)

  • ERR (Expected Reciprocal Rank for Graded Relevance)

Olivier Chapelle, Donald Metzler, Ya Zhang, and Pierre Grinspan. 2009. Expected reciprocal rank for graded relevance. In Proceedings of the 18th ACM conference on Information and knowledge management (CIKM '09). ACM, New York, NY, USA, 621-630. DOI=10.1145/1645953.1646033 http://doi.acm.org/10.1145/1645953.1646033

  • session nDCG

K. Järvelin, S. L. Price, L. M. L. Delcambre, and M. L. Nielsen. Discounted cumulated gain based evaluation of multiple-query IR sessions. In ECIR, pages 4–15, 2008.

  • session ERR

Our original method.

  • q-measure

Tetsuya Sakai. 2004. Ranking the NTCIR systems based on multigrade relevance. In Proceedings of the 2004 international conference on Asian Information Retrieval Technology (AIRS'04), Sung Hyon Myaeng, Ming Zhou, Kam-Fai Wong, and Hong-Jiang Zhang (Eds.). Springer-Verlag, Berlin, Heidelberg, 251-262. DOI=10.1007/978-3-540-31871-2_22 http://dx.doi.org/10.1007/978-3-540-31871-2_22

  • Risk-sensitive measure

L. Wang, P. N. Bennett, and K. Collins-Thompson. Robust Ranking Models via Risk-Sensitive Optimization. In Proc. of SIGIR 2012. See also the TREC Web Track 2013: http://research.microsoft.com/en-us/projects/trec-web-2013/
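To make the first three measures in the list concrete, here is a minimal Python sketch of MRR, DCG/nDCG, and ERR. It is not taken from this repository: the function names (`mrr`, `dcg`, `ndcg`, `err`) and their signatures are hypothetical, and the sketch assumes the commonly used DCG variant with linear gain and a log2(rank + 1) discount, plus the ERR stopping probability (2^g - 1) / 2^g_max from Chapelle et al. The framework's own implementation may use different gain or discount functions.

```python
import math
from typing import Optional, Sequence


def mrr(first_relevant_ranks: Sequence[int]) -> float:
    """Mean of 1/rank of the first relevant result, averaged over queries.

    A rank of 0 is this sketch's convention for "no relevant result found".
    """
    if not first_relevant_ranks:
        return 0.0
    total = sum(1.0 / r for r in first_relevant_ranks if r > 0)
    return total / len(first_relevant_ranks)


def dcg(gains: Sequence[float], k: Optional[int] = None) -> float:
    """DCG with linear gain and a log2(rank + 1) discount (assumed variant)."""
    top = gains[:k] if k is not None else gains
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(top, start=1))


def ndcg(gains: Sequence[float], k: Optional[int] = None,
         ideal_gains: Optional[Sequence[float]] = None) -> float:
    """DCG divided by the DCG of the ideal (descending) ordering.

    If ideal_gains is omitted, the ideal ranking is built from the
    retrieved list itself, which ignores unretrieved relevant documents.
    """
    pool = gains if ideal_gains is None else ideal_gains
    ideal = dcg(sorted(pool, reverse=True), k)
    return dcg(gains, k) / ideal if ideal > 0 else 0.0


def err(grades: Sequence[int], max_grade: int) -> float:
    """Expected Reciprocal Rank under the cascade user model."""
    score = 0.0
    p_continue = 1.0  # probability that the user reaches the current rank
    for rank, g in enumerate(grades, start=1):
        p_stop = (2 ** g - 1) / (2 ** max_grade)  # chance this result satisfies
        score += p_continue * p_stop / rank
        p_continue *= 1.0 - p_stop
    return score
```

Under these assumptions, `mrr([1, 2, 0])` evaluates to 0.5, `ndcg([3, 2, 1, 0])` to 1.0 (the list is already in ideal order), and `err([0, 1], max_grade=1)` to 0.25.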

==================

License

evaluation_measures is BSD 2-Clause licensed.


Languages

Language:Python 100.0%