yangmin09 (feymanpriv)

Company: BUPT

Location: Beijing

yangmin09's starred repositories

awesome-chatgpt-prompts-zh

A Chinese-language guide to prompting ChatGPT. Usage guides for a variety of scenarios. Learn how to make it do what you ask.

segment-anything

The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 45103 | Issues: 299 | Issues: 650
Language: Python | License: NOASSERTION | Stargazers: 34456 | Issues: 309 | Issues: 348

MiniGPT-4

Open-source code for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)

Language: Python | License: BSD-3-Clause | Stargazers: 25096 | Issues: 219 | Issues: 449

paper-reading

Paragraph-by-paragraph close readings of classic and recent deep learning papers

License: Apache-2.0 | Stargazers: 24774 | Issues: 701 | Issues: 0

unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Language: Python | License: MIT | Stargazers: 18962 | Issues: 296 | Issues: 1316

Grounded-Segment-Anything

Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 13953 | Issues: 114 | Issues: 369
Language: Python | License: Apache-2.0 | Stargazers: 9017 | Issues: 121 | Issues: 98

BLIP

PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

Language: Jupyter Notebook | License: BSD-3-Clause | Stargazers: 4412 | Issues: 34 | Issues: 188

OpenGpt

Create your own ChatGPT App in seconds.

Language: TypeScript | License: GPL-3.0 | Stargazers: 3950 | Issues: 34 | Issues: 49

open_flamingo

An open-source framework for training large multimodal models.

Language: Python | License: MIT | Stargazers: 3527 | Issues: 47 | Issues: 170

EVA

EVA Series: Visual Representation Fantasies from BAAI

Language: Python | License: MIT | Stargazers: 2049 | Issues: 31 | Issues: 150

visual-openllm

Something like visual-chatgpt; an open-source version of ERNIE Bot (文心一言)

flamingo-pytorch

Implementation of 🦩 Flamingo, state-of-the-art few-shot visual question answering attention net out of DeepMind, in PyTorch

Language: Python | License: MIT | Stargazers: 1159 | Issues: 21 | Issues: 13

unit-minions

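"Boosting AI R&D Efficiency: Train Your Own LoRA", covering LoRA training for Llama (Alpaca LoRA) and ChatGLM (ChatGLM Tuning) models. Training tasks: user story generation, test code generation, code-assist generation, text-to-SQL, text-to-code……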

Language: Jupyter Notebook | Stargazers: 1003 | Issues: 20 | Issues: 12

VideoX

VideoX: a collection of video cross-modal models

Language: Python | License: NOASSERTION | Stargazers: 939 | Issues: 22 | Issues: 109

SAM-Adapter-PyTorch

Adapting Meta AI's Segment Anything to Downstream Tasks with Adapters and Prompts

Language: Python | License: MIT | Stargazers: 836 | Issues: 10 | Issues: 74

Image2Paragraph

[A toolbox for fun.] Transform Image into Unique Paragraph with ChatGPT, BLIP2, OFA, GRIT, Segment Anything, ControlNet.

Language: Python | License: Apache-2.0 | Stargazers: 767 | Issues: 11 | Issues: 28

Deep-Metric-Learning-Baselines

PyTorch Implementation for Deep Metric Learning Pipelines

Language: Python | License: Apache-2.0 | Stargazers: 572 | Issues: 17 | Issues: 24

VLog

Transform Video as a Document with ChatGPT, CLIP, BLIP2, GRIT, Whisper, LangChain.

Language: Python | License: MIT | Stargazers: 503 | Issues: 6 | Issues: 10

UniFormerV2

[ICCV2023] UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer

Language: Python | License: Apache-2.0 | Stargazers: 274 | Issues: 7 | Issues: 75

ovr-cnn

A new framework for open-vocabulary object detection, based on maskrcnn-benchmark

Language: Python | License: MIT | Stargazers: 214 | Issues: 5 | Issues: 28

Cap4Video

【CVPR'2023 Highlight & TPAMI】Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?

Language: Python | License: MIT | Stargazers: 214 | Issues: 9 | Issues: 28

unicom

[ICLR 2023] Unicom: Universal and Compact Representation Learning for Image Retrieval

Text4Vis

【AAAI'2023 & IJCV】Transferring Vision-Language Models for Visual Recognition: A Classifier Perspective

Language: Python | License: MIT | Stargazers: 198 | Issues: 6 | Issues: 23

BEVT

PyTorch implementation of BEVT (CVPR 2022) https://arxiv.org/abs/2112.01529

Language: Python | License: Apache-2.0 | Stargazers: 151 | Issues: 7 | Issues: 10

BIKE

【CVPR'2023】Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models

Language: Python | License: MIT | Stargazers: 150 | Issues: 12 | Issues: 20

TubeViT

An unofficial implementation of TubeViT in "Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning"

Language: Python | License: MIT | Stargazers: 76 | Issues: 10 | Issues: 13

DeepLogo2

A brand logo detection system by DETR

Language: Python | License: MIT | Stargazers: 50 | Issues: 3 | Issues: 8