michelecafagna26's repositories
faster-rcnn-bottom-up-py
Extract features and bounding boxes using the original Bottom-up Attention Faster-RCNN in a few lines of Python code
cider
Pythonic wrappers for Cider/CiderD evaluation metrics. Provides CIDEr as well as CIDEr-D (CIDEr Defended), which is more robust to gaming effects. We also add the option to replace the original PTBTokenizer with the spaCy tokenizer (no Java dependency, but slower).
vinvl-visualbackbone
Original VinVL visual backbone with simplified APIs to easily extract features, boxes, and object detections in a few lines of Python code.
HL-dataset
[INLG2023] The High-Level (HL) dataset is a Vision and Language (V&L) resource aligning object-centric descriptions from COCO with high-level descriptions crowdsourced along 3 axes: scene, action, rationale.
bibidy
A simple command line tool for basic manipulations of bib files.
compress-fasttext
Tools for shrinking fastText models (in gensim format)
ELFVisionModule
ELF Vision Module demo
michelecafagna26
GitHub profile page
Pascal_Sentence_Dataset-downloader
Utility to download the Pascal Sentence Dataset
text_completion_api
A simple text completion API using GPT-J-6B
vl-ablation
Targeted semantic multimodal input ablation. Official implementation of the ablation method introduced in the paper: "What Vision-Language Models 'See' when they See Scenes"
vl-shap-demo
Gradio demo showcasing VL-SHAP. You can generate visually informed explanations of textual outputs for VL models