Sanctuary (Yutong-Zhou-cv)

Location: zhou@i.ci.ritsumei.ac.jp

Home Page: https://elizazhou96.github.io/

Sanctuary's starred repositories

Awesome-Scientific-Language-Models

A Curated List of Language Models in Scientific Domains

License: MIT | Stargazers: 288 | Issues: 0

SEED-Bench

(CVPR 2024) A benchmark for evaluating Multimodal LLMs using multiple-choice questions.

Language: Python | License: NOASSERTION | Stargazers: 261 | Issues: 0

FoE-ICLR2024

The implementation of FoE for ICLR 2024

Language: Python | Stargazers: 2 | Issues: 0

clip_dinoiser

Official implementation of 'CLIP-DINOiser: Teaching CLIP a few DINO tricks' paper.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 141 | Issues: 0

rscir

Official PyTorch implementation and benchmark dataset for IGARSS 2024 ORAL paper: "Composed Image Retrieval for Remote Sensing"

Language: Python | License: Apache-2.0 | Stargazers: 44 | Issues: 0

Recommendations-Diffusion-Text-Image

A paper collection of recent diffusion models for text-image generation tasks, e.g., visual text generation, font generation, text removal, text image super-resolution, text editing, handwritten generation, scene text recognition, and scene text detection.

Stargazers: 146 | Issues: 0

SEED-X

Multimodal Models in Real World

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 277 | Issues: 0

Vlogger

[CVPR2024] Make Your Dream A Vlog

Language: Python | License: Apache-2.0 | Stargazers: 381 | Issues: 0

bioclip

This is the repository for the BioCLIP model and the TreeOfLife-10M dataset [CVPR'24 Oral].

Language: Python | License: NOASSERTION | Stargazers: 72 | Issues: 0

UDR-Mixer

Towards Ultra-High-Definition Image Deraining: A Benchmark and An Efficient Method

Language: Python | Stargazers: 18 | Issues: 0

MathBench

[ACL 2024 Findings] MathBench: A Comprehensive Multi-Level Difficulty Mathematics Evaluation Dataset

License: Apache-2.0 | Stargazers: 61 | Issues: 0

pero-pretraining

OCR self-supervised pretraining for paper Kišš et al.: Self-supervised pretraining for text recognition.

Language: Python | License: BSD-2-Clause | Stargazers: 4 | Issues: 0

OOTDiffusion

Official implementation of OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on

Language: Python | License: NOASSERTION | Stargazers: 4937 | Issues: 0

MM-VUFM4DS

A systematic survey of multi-modal and multi-task visual understanding foundation models for driving scenarios

Language: Python | Stargazers: 28 | Issues: 0

FG-2024-Papers

FG 2024 Papers: Explore a comprehensive collection of research papers presented at one of the premier conferences on automatic face and gesture recognition. Seamlessly integrate code implementations for better understanding. ⭐ Experience the cutting edge of progress in facial analysis, gesture recognition, and biometrics with this repository!

License: MIT | Stargazers: 6 | Issues: 0

Grounding-DINO-1.5-API

API for Grounding DINO 1.5: IDEA Research's Most Capable Open-World Object Detection Model Series

Language: Python | License: Apache-2.0 | Stargazers: 486 | Issues: 0

GroundingDINO

Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"

Language: Python | License: Apache-2.0 | Stargazers: 5309 | Issues: 0

Awesome-Text-to-Image

(ෆ`꒳´ෆ) A Survey on Text-to-Image Generation/Synthesis.

License: MIT | Stargazers: 1929 | Issues: 0

Grounded-Segment-Anything

Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 13869 | Issues: 0

EdgeSAM

Official PyTorch implementation of "EdgeSAM: Prompt-In-the-Loop Distillation for On-Device Deployment of SAM"

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 714 | Issues: 0

agricultural_textual_classification_ChatGPT

Using ChatGPT to classify textual topics/categories.

Language: Jupyter Notebook | Stargazers: 37 | Issues: 0

tokenize-anything

Tokenize Anything via Prompting

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 456 | Issues: 0

TPD

This is the official repository for the paper "Texture-Preserving Diffusion Models for High-Fidelity Virtual Try-On". CVPR 2024

Language: Python | Stargazers: 37 | Issues: 0

flatten

PyTorch Implementation of FLATTEN: optical FLow-guided ATTENtion for consistent text-to-video editing (ICLR 2024)

Language: Python | License: Apache-2.0 | Stargazers: 155 | Issues: 0

RAT

Implementation of "RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation".

Language: Python | Stargazers: 126 | Issues: 0

RPG-DiffusionMaster

[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)

Language: Jupyter Notebook | Stargazers: 1562 | Issues: 0

vicreg

VICReg official code base

Language: Python | License: MIT | Stargazers: 497 | Issues: 0

BLINK_Benchmark

This repo contains evaluation code for the paper "BLINK: Multimodal Large Language Models Can See but Not Perceive". https://arxiv.org/abs/2404.12390

Language: Python | License: Apache-2.0 | Stargazers: 67 | Issues: 0

LLaMA-Factory

Unify Efficient Fine-Tuning of 100+ LLMs

Language: Python | License: Apache-2.0 | Stargazers: 23959 | Issues: 0