Xin Cai (TotalVariation)

Company: AIRC, Ulster University

Location: Belfast, United Kingdom

Home Page: https://totalvariation.github.io/

Twitter: @XinCai92

Xin Cai's starred repositories

ml-engineering

Machine Learning Engineering Open Book

Language: Python | License: CC-BY-SA-4.0 | Stargazers: 12051 | Issues: 116 | Issues: 30

Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Language: Python | License: MIT | Stargazers: 7947 | Issues: 76 | Issues: 226

multimodal

TorchMultimodal is a PyTorch library for training state-of-the-art multimodal multi-task models at scale.

Language: Python | License: BSD-3-Clause | Stargazers: 1500 | Issues: 21 | Issues: 40

multimodal-maestro

Streamline the fine-tuning process for multimodal models: PaliGemma, Florence-2, and Qwen2-VL.

Language: Python | License: Apache-2.0 | Stargazers: 1308 | Issues: 18 | Issues: 13

CVinW_Readings

A collection of papers on the topic of "Computer Vision in the Wild (CVinW)"

Awesome-CLIP

Awesome list for research on CLIP (Contrastive Language-Image Pre-Training).

Awesome-TimeSeries-SpatioTemporal-LM-LLM

A professional list on Large (Language) Models and Foundation Models (LLM, LM, FM) for Time Series, Spatiotemporal, and Event Data.

Awesome-Segment-Anything

This repository is for the first comprehensive survey on Meta AI's Segment Anything Model (SAM).

groundingLMM

[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.

RegionCLIP

[CVPR 2022] Official code for "RegionCLIP: Region-based Language-Image Pretraining"

Language: Python | License: Apache-2.0 | Stargazers: 727 | Issues: 10 | Issues: 102

Segment-Any-Point-Cloud

[NeurIPS'23 Spotlight] Segment Any Point Cloud Sequences by Distilling Vision Foundation Models

GeoChat

[CVPR 2024 🔥] GeoChat, the first grounded Large Vision Language Model for Remote Sensing

Awesome-SSL4TS

A professionally curated list of awesome resources (paper, code, data, etc.) on Self-Supervised Learning for Time Series (SSL4TS).

MQ-Det

Official PyTorch implementation of "Multi-modal Queried Object Detection in the Wild" (accepted by NeurIPS 2023)

Language: Python | License: Apache-2.0 | Stargazers: 276 | Issues: 2 | Issues: 61
Language: Python | License: Apache-2.0 | Stargazers: 269 | Issues: 9 | Issues: 18

OV-DETR

[Under preparation] Code repo for "Open-Vocabulary DETR with Conditional Matching" (ECCV 2022)

CAE

This is a PyTorch implementation of "Context AutoEncoder for Self-Supervised Representation Learning"

COMM

PyTorch code for the paper "From CLIP to DINO: Visual Encoders Shout in Multi-modal Large Language Models"

Segment-Anything-CLIP

Connecting segment-anything's output masks with the CLIP model; Awesome-Segment-Anything-Works

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 182 | Issues: 4 | Issues: 3

CORA

A DETR-style framework for open-vocabulary detection (OVD). CVPR 2023

Language: Python | License: Apache-2.0 | Stargazers: 180 | Issues: 7 | Issues: 30

SAMFeat

The official implementation of "Segment Anything Model is a Good Teacher for Local Feature Learning".

Language: Python | License: MIT | Stargazers: 106 | Issues: 5 | Issues: 4

SegCLIP

PyTorch implementation of ICML 2023 paper "SegCLIP: Patch Aggregation with Learnable Centers for Open-Vocabulary Semantic Segmentation"

Awesome-Unsupervised-Object-Localization

Curated list of awesome works on unsupervised object localization in 2D images.

MaskCLIP

Code Release for MaskCLIP (ICML 2023)

Language: Python | License: NOASSERTION | Stargazers: 58 | Issues: 3 | Issues: 7

betrayed-by-captions

(ICCV 2023) Betrayed by Captions: Joint Caption Grounding and Generation for Open Vocabulary Instance Segmentation

Language: Jupyter Notebook | Stargazers: 45 | Issues: 7 | Issues: 8

CounTX

Includes FSC-147-D and the code for training and testing the CounTX model from the paper "Open-world Text-specified Object Counting".

Language: Jupyter Notebook | License: MIT | Stargazers: 35 | Issues: 2 | Issues: 9

minimal-sqvae

A minimal PyTorch implementation of Stochastically Quantized Variational AutoEncoder (SQ-VAE) by Sony

Language: Python | License: MIT | Stargazers: 29 | Issues: 2 | Issues: 0

MILA

Memory-Based Instance-Level Adaptation for Cross-Domain Object Detection

Language: Jupyter Notebook | Stargazers: 10 | Issues: 2 | Issues: 0
Language: HTML | License: MIT | Stargazers: 1 | Issues: 1 | Issues: 0