AImageLab's repositories

mammoth

An Extendible (General) Continual Learning Framework based on PyTorch - official codebase of Dark Experience for General Continual Learning

Language: Python | License: MIT | 656 13 44

dress-code

Dress Code: High-Resolution Multi-Category Virtual Try-On. ECCV 2022

Language: Python | License: NOASSERTION | 604 17 35

LLaVA-MORE

LLaVA-MORE: A Comparative Study of LLMs and Visual Backbones for Enhanced Visual Instruction Tuning

Language: Python | License: Apache-2.0 | 125 6 9

pacscore

[CVPR 2023] Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation

Language: Python | 61 5 6

mil4wsi

DAS-MIL: Distilling Across Scales for MIL Classification of Histological WSIs

Language: Python | License: MIT | 56 6 11

awesome-human-visual-attention

This repository contains a curated list of research papers and resources on saliency and scanpath prediction, human attention, and human visual search.

49 3 0

CoDE

[ECCV'24] Contrasting Deepfakes Diffusion via Contrastive Learning and Global-Local Similarities

Language: Python | License: MIT | 34 3 3

ReflectiVA

[CVPR 2025] Augmenting Multimodal LLMs with Self-Reflective Tokens for Knowledge-based Visual Question Answering

Language: Python | License: Apache-2.0 | 21 4 1

DiCO

[BMVC 2024 Oral ✨] Revisiting Image Captioning Training Paradigm via Direct CLIP-based Optimization

Language: Python | 17 2 0

MaPeT

Learning to Mask and Permute Visual Tokens for Vision Transformer Pre-Training

Language: Python | 16 5 2

ReT

[CVPR 2025] Recurrence-Enhanced Vision-and-Language Transformers for Robust Multimodal Document Retrieval

Language: Python | License: Apache-2.0 | 14 0 0

HySAC

Hyperbolic Safety-Aware Vision-Language Models. CVPR 2025

Language: Python | 12 0 0

pin

Language: Jupyter Notebook | License: NOASSERTION | 11 1 1

awesome-captioning-evaluation

Image Captioning Evaluation in the Age of Multimodal LLMs: Challenges and Future Perspectives

6 0 0

COGT

[ICLR'25] Causal Graphical Models for Vision-Language Compositional Understanding

Language: Python | 6 0 0

LAM

The Ludovico Antonio Muratori (LAM) dataset is the largest line-level HTR dataset to date and contains 25,823 lines from Italian ancient manuscripts edited by a single author over 60 years. The dataset comes in two configurations: a basic splitting and a date-based splitting which takes into account the age of the author. The first setting is intended to study HTR on ancient documents in Italian, while the second focuses on the ability of HTR systems to recognize text written by the same writer in time periods for which training data are not available.

5 2 0

AImageLab's repositories

mammoth

dress-code

LLaVA-MORE

VATr

pacscore

mil4wsi

awesome-human-visual-attention

CoDE

HWD

ReflectiVA

DiCO

MaPeT

ReT

HySAC

pin

awesome-captioning-evaluation

COGT

LAM

font_square

PASTA

fed-mammoth

DitHub

aimagelab.github.io

coldfront

cvcs2025

HEaD

itserr-wp8-latin-embeddings

open-webui

pipelines

sva2021