Weijian Xu's starred repositories

batch-face

Batch Face Detection and Alignment for Modern Research

Language: Python | License: MIT | Stargazers: 53 | Issues: 0

iglue

[ICML 2022] Code and data for our paper "IGLUE: A Benchmark for Transfer Learning across Modalities, Tasks, and Languages"

Language: Shell | License: MIT | Stargazers: 48 | Issues: 0

coco-cn

Enriching MS-COCO with Chinese sentences and tags for cross-lingual multimedia tasks

Language: OpenEdge ABL | License: MIT | Stargazers: 169 | Issues: 0
Language: Python | License: MIT | Stargazers: 81 | Issues: 0

MetaCLIP

ICLR 2024 Spotlight: curation/training code, metadata, distribution and pre-trained models for MetaCLIP; CVPR 2024: MoDE: CLIP Data Experts via Clustering

Language: Python | License: NOASSERTION | Stargazers: 1065 | Issues: 0

clair

CLAIR: A (surprisingly) simple semantic text metric with large language models.

Language: Python | License: NOASSERTION | Stargazers: 11 | Issues: 0
Language: Python | License: Apache-2.0 | Stargazers: 111 | Issues: 0

Grounded-Segment-Anything

Grounded-SAM: Marrying Grounding-DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 13727 | Issues: 0

refer

Referring Expression Datasets API

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 408 | Issues: 0

clipscore

CLIPScore EMNLP code

Language: Python | License: MIT | Stargazers: 167 | Issues: 0

FactualSceneGraph

The FACTUAL benchmark dataset and the pre-trained textual scene graph parser trained on it.

Language: Python | Stargazers: 85 | Issues: 0

FuseCap

FuseCap: Large Language Model for Visual Data Fusion in Enriched Caption Generation

Language: Python | License: MIT | Stargazers: 44 | Issues: 0

label-studio

Label Studio is a multi-type data labeling and annotation tool with standardized output format

Language: JavaScript | License: Apache-2.0 | Stargazers: 16817 | Issues: 0

cvat

Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.

Language: TypeScript | License: MIT | Stargazers: 11560 | Issues: 0

awesome-chatgpt-prompts

This repo is a curated collection of prompts for using ChatGPT more effectively.

Language: HTML | License: CC0-1.0 | Stargazers: 105074 | Issues: 0

BARTScore

BARTScore: Evaluating Generated Text as Text Generation

Language: Python | License: Apache-2.0 | Stargazers: 300 | Issues: 0

captionGAN

Source code for the paper "Speaking the Same Language: Matching Machine to Human Captions by Adversarial Training"

Language: Python | License: MIT | Stargazers: 64 | Issues: 0

OFA

Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

Language: Python | License: Apache-2.0 | Stargazers: 2340 | Issues: 0
Language: Python | Stargazers: 46 | Issues: 0

image-paragraph-captioning

[EMNLP 2018] Training for Diversity in Image Paragraph Captioning

Language: Python | Stargazers: 90 | Issues: 0

video-paragraph

Code for the paper "Towards Diverse Paragraph Captioning for Untrimmed Videos" (CVPR 2021).

Language: Python | License: MIT | Stargazers: 64 | Issues: 0
Language: Python | Stargazers: 7 | Issues: 0

GRiT

GRiT: A Generative Region-to-text Transformer for Object Understanding (https://arxiv.org/abs/2212.00280)

Language: Python | License: MIT | Stargazers: 278 | Issues: 0

segment-anything

The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 44590 | Issues: 0

viper

Code for the paper "ViperGPT: Visual Inference via Python Execution for Reasoning"

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 1620 | Issues: 0

self-critical.pytorch

Unofficial PyTorch implementation of Self-critical Sequence Training for Image Captioning, and others.

Language: Python | License: MIT | Stargazers: 987 | Issues: 0

diht

Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training

Language: Python | License: NOASSERTION | Stargazers: 117 | Issues: 0

infinibatch

Efficient, check-pointed data loading for deep learning with massive data sets.

Language: Python | License: MIT | Stargazers: 188 | Issues: 0

Stable-Pix2Seq

A full-fledged version of Pix2Seq

Language: Python | License: Apache-2.0 | Stargazers: 233 | Issues: 0

Pretrained-Pix2Seq

Replication of Pix2Seq with Pretrained Model

Language: Python | Stargazers: 59 | Issues: 0