michelecafagna26 / cider

Pythonic wrappers for the CIDEr/CIDEr-D evaluation metrics. Provides CIDEr as well as CIDEr-D (CIDEr Defended), which is more robust to gaming effects. The original PTBTokenizer can also be replaced with a spaCy tokenizer (no Java dependency, but slower).

Consensus-based Image Description Evaluation metric (CIDEr Code) for Image Captioning.

Evaluation code for the CIDEr metric. Provides CIDEr as well as CIDEr-D (CIDEr Defended), which is more robust to gaming effects.
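As a rough illustration of what "robust to gaming" refers to: CIDEr-D, as described in the CIDEr paper, limits the benefit of repeating high-scoring n-grams by clipping counts, and penalizes captions whose length differs from the reference with a Gaussian factor. The sketch below is not this package's code. The function names are hypothetical, it works on raw counts rather than TF-IDF weights, and sigma = 6 is assumed from the paper.

```python
import math
from collections import Counter

def clip_counts(candidate_counts, reference_counts):
    """Cap each candidate n-gram count at its count in the reference."""
    return {
        gram: min(count, reference_counts.get(gram, 0))
        for gram, count in candidate_counts.items()
    }

def length_penalty(candidate_len, reference_len, sigma=6.0):
    """Gaussian factor: 1.0 for equal lengths, smaller as lengths diverge."""
    delta = candidate_len - reference_len
    return math.exp(-(delta ** 2) / (2 * sigma ** 2))

# A "gamed" caption that simply repeats a word found in the reference.
candidate = "dog dog dog dog".split()
reference = "a dog".split()

clipped = clip_counts(Counter(candidate), Counter(reference))
penalty = length_penalty(len(candidate), len(reference))
```

Here the four repetitions of "dog" are credited only once after clipping, and the length mismatch pulls the penalty factor below 1.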

System requirements for testing and PTBTokenizer

  • Python 3.6
  • Java 1.8.0

Installation

Clone this repository, then:

pip install .

python3 -m spacy download en_core_web_sm

Or install it directly into your environment from GitHub:

pip install git+https://github.com/michelecafagna26/cider.git#egg=cidereval

Quick usage (PTBTokenizer by default)

from cidereval import cider, ciderD

# refs and preds are lists of strings, the method will re-format them for you

cider_scores = cider(predictions=preds, references=refs)
# cider_scores is a dict-like object with "avg_score" and "scores"

By default, the IDF weights are the precomputed coco-val ones. To compute the IDF weights from your own references instead, call cider(predictions=preds, references=refs, df='corpus').
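To make the corpus option more concrete, here is a minimal sketch of how document frequencies and IDF weights can be derived from a set of references. It is an illustration only, not the package's implementation: the helper names and the one-list-of-captions-per-image layout are assumptions, and plain whitespace splitting stands in for a real tokenizer.

```python
import math
from collections import Counter

def ngrams(tokens, max_n=4):
    """Return the set of all 1..max_n-grams in a token list."""
    return {
        tuple(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    }

def corpus_idf(references, max_n=4):
    """references: one list of caption strings per image (assumed layout)."""
    doc_freq = Counter()
    for captions in references:
        seen = set()
        for caption in captions:
            # Whitespace splitting is a stand-in for real tokenization.
            seen |= ngrams(caption.lower().split(), max_n)
        doc_freq.update(seen)  # each n-gram counts at most once per image
    num_images = len(references)
    return {gram: math.log(num_images / df) for gram, df in doc_freq.items()}

refs = [
    ["a dog runs on the beach", "a dog on the sand"],
    ["a cat sleeps on the sofa"],
]
idf = corpus_idf(refs)
```

N-grams that occur in every image's references (such as "a" or "on the") get an IDF of zero, while rarer ones (such as "dog") get a positive weight. With very few references most weights collapse this way, which is one reason precomputed weights from a larger corpus can be preferable.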

Important Note

In this implementation, we provide an alternative to the PTBTokenizer in order to remove the Java dependency. The new tokenizer is based on spaCy (SimpleTokenizer.py).

However, we suggest using the original PTBTokenizer: the two tokenizers do not produce exactly the same tokens, and the PTBTokenizer is also about 3x faster than the spaCy tokenizer.

For details on performance, see this Important Note. For more details on the implementation, see the original readme.

Acknowledgments

  • MS COCO Caption Evaluation Team
  • Ramakrishna Vedantam (author of the original repo)

License: Other

