Beast code in Giters

蓋瑞王's repositories

CogVideo

Text-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)

Apache-2.0 | 1 0 0

doctr

docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.

Language: Python | Apache-2.0 | 1 0 0

segment-anything-2

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Language: Jupyter Notebook | Apache-2.0 | 0 0 0

UniTalker

Language: Python | Apache-2.0 | 0 0 0

AI-Scientist

The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬

Language: Jupyter Notebook | Apache-2.0 | 0 0 0

airllm

AirLLM 70B inference with single 4GB GPU

Apache-2.0 | 0 0 0

axlearn

An Extensible Deep Learning Library

Language: Python | Apache-2.0 | 0 0 0

cvat

Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.

MIT | 0 0 0

Deep-Live-Cam

Real-time face swap and one-click video deepfake with only a single image

Language: Python | AGPL-3.0 | 0 0 0

diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.

Language: Python | Apache-2.0 | 0 0 0

facefusion

Next generation face swapper and enhancer

Language: Python | NOASSERTION | 0 0 0

FLAME-Universe

Summary of publicly available resources such as code, datasets, and scientific papers for the FLAME 3D head model

0 0 0

FruitNeRF

[IROS24] Official Code for "FruitNeRF: A Unified Neural Radiance Field based Fruit Counting Framework" - Integrated into Nerfstudio

0 0 0

GenerativePhotomontage

0 0 0

insightface

State-of-the-art 2D and 3D Face Analysis Project

0 0 0

LongWriter

LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs

Apache-2.0 | 0 0 0

Medical-SAM2

Medical SAM 2: Segment Medical Images As Video Via Segment Anything Model 2

Apache-2.0 | 0 0 0

MindSearch

An LLM-based Multi-agent Framework of Web Search Engine similar to Perplexity.ai Pro and SearchGPT

Language: Python | Apache-2.0 | 0 0 0

mPLUG-Owl

mPLUG-Owl: The Powerful Multi-modal Large Language Model Family

MIT | 0 0 0

notebooks

Examples and tutorials on using SOTA computer vision models and techniques. Learn everything from old-school ResNet, through YOLO and object-detection transformers like DETR, to the latest models like Grounding DINO and SAM.

Language: Jupyter Notebook | 0 0 0

openchat

OpenChat: Advancing Open-source Language Models with Imperfect Data

Language: Python | Apache-2.0 | 0 1 0

OpenResearcher

Language: HTML | Apache-2.0 | 0 0 0

ovavss

Official Implementation of "Open-Vocabulary Audio-Visual Semantic Segmentation" [ACM MM 2024 Oral].

Language: Python | 0 0 0

PeriodWave

The official implementation of PeriodWave and PeriodWave-Turbo

MIT | 0 0 0

PPOCRLabel

PPOCRLabelv2 is a semi-automatic graphical annotation tool for the OCR field, with a built-in PP-OCR model to automatically detect and re-recognize data.

0 0 0

pytorch3d

PyTorch3D is FAIR's library of reusable components for deep learning with 3D data

Language: Python | NOASSERTION | 0 0 0

RAGFoundry

Framework for specializing LLMs for retrieval-augmented-generation tasks using fine-tuning.

Apache-2.0 | 0 0 0

sprite-decompose

Fast Sprite Decomposition from Animated Graphics [ECCV2024]

MIT | 0 0 0

ultralytics

NEW - YOLOv8 🚀 in PyTorch > ONNX > OpenVINO > CoreML > TFLite

AGPL-3.0 | 0 0 0

WavTokenizer

SOTA discrete acoustic codec models with 40 tokens per second for audio language modeling

MIT | 0 0 0

gary109
