dolortaste

dolortaste's starred repositories

fanqiang

Bypassing the firewall: circumventing internet censorship

Language: Kotlin, Stargazers: 37692, Issues: 0

ControlNet

Let us control diffusion models!

Language: Python, License: Apache-2.0, Stargazers: 29142, Issues: 0

pytorch-widedeep

A flexible package for multimodal deep learning that combines tabular data with text and images using Wide and Deep models in PyTorch

Language: Python, License: Apache-2.0, Stargazers: 1265, Issues: 0

ASL

Official PyTorch implementation of the paper "Asymmetric Loss For Multi-Label Classification" (ICCV 2021)

Language: Python, License: MIT, Stargazers: 711, Issues: 0

QD-DETR

Official PyTorch repository for "QD-DETR: Query-Dependent Video Representation for Moment Retrieval and Highlight Detection" (CVPR 2023)

Language: Python, License: NOASSERTION, Stargazers: 185, Issues: 0

MMSum_model

[CVPR 2024] MMSum: A Dataset for Multimodal Summarization and Thumbnail Generation of Videos

Language: Python, Stargazers: 28, Issues: 0

Visionary-Vids

Multi-modal transformer approach for natural language query based joint video summarization and highlight detection

Language: Jupyter Notebook, License: NOASSERTION, Stargazers: 11, Issues: 0

HiREST

Hierarchical Video-Moment Retrieval and Step-Captioning (CVPR 2023)

Language: Python, License: MIT, Stargazers: 87, Issues: 0

MultiTaskModel

Multi-task model for ESMM and MMoE

Language: Python, Stargazers: 118, Issues: 0

roboflow-100-benchmark

Code for replicating Roboflow 100 benchmark results and programmatically downloading benchmark datasets

Language: Jupyter Notebook, License: MIT, Stargazers: 235, Issues: 0

GLIP

Grounded Language-Image Pre-training

Language: Python, License: MIT, Stargazers: 2086, Issues: 0

CVinW_Readings

A collection of papers on the topic of "Computer Vision in the Wild (CVinW)"

Stargazers: 1093, Issues: 0
Language: Python, License: Apache-2.0, Stargazers: 77, Issues: 0

paper-reading-note

Reading papers with Mu Li

License: Apache-2.0, Stargazers: 92, Issues: 0

awesome-multimodal-ml

Reading list for research topics in multimodal machine learning

License: MIT, Stargazers: 5694, Issues: 0
Language: Python, License: Apache-2.0, Stargazers: 5, Issues: 0

paper-reading

Paragraph-by-paragraph close reading of classic and new deep learning papers

License: Apache-2.0, Stargazers: 25083, Issues: 0

GLIGEN

Open-Set Grounded Text-to-Image Generation

Language: Python, License: MIT, Stargazers: 1922, Issues: 0

SimREC

A lightweight codebase for referring expression comprehension and segmentation

Language: Python, License: Apache-2.0, Stargazers: 49, Issues: 0

FIBER

Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone

Language: Python, License: MIT, Stargazers: 126, Issues: 0

Grounded-Segment-Anything

Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything

Language: Jupyter Notebook, License: Apache-2.0, Stargazers: 14249, Issues: 0

awesome-detection-transformer

A collection of papers on transformers for detection and segmentation. Awesome Detection Transformer for Computer Vision (CV)

Stargazers: 1224, Issues: 0

OpenSeeD

[ICCV 2023] Official implementation of the paper "A Simple Framework for Open-Vocabulary Segmentation and Detection"

Language: Python, License: Apache-2.0, Stargazers: 618, Issues: 0

DINO

[ICLR 2023] Official implementation of the paper "DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection"

Language: Python, License: Apache-2.0, Stargazers: 2096, Issues: 0

RefTR

Official implementation of the paper "Referring Transformer: A One-step Approach to Multi-task Visual Grounding" (NeurIPS 2021)

Language: Python, License: MIT, Stargazers: 64, Issues: 0

DQ-DETR

[AAAI 2023] DQ-DETR: Dual Query Detection Transformer for Phrase Extraction and Grounding

Stargazers: 52, Issues: 0

UNINEXT

[CVPR'23] Universal Instance Perception as Object Discovery and Retrieval

Language: Python, License: MIT, Stargazers: 1472, Issues: 0

Track-Anything

Track-Anything is a flexible and interactive tool for video object tracking and segmentation, based on Segment Anything, XMem, and E2FGVI.

Language: Python, License: MIT, Stargazers: 6292, Issues: 0

Deformable-DETR

Deformable DETR: Deformable Transformers for End-to-End Object Detection.

Language: Python, License: Apache-2.0, Stargazers: 3047, Issues: 0

OFA

Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

Language: Python, License: Apache-2.0, Stargazers: 2376, Issues: 0