ThanhPham1987

PeterPham's repositories

AlignGPT

Official repo for "AlignGPT: Multi-modal Large Language Models with Adaptive Alignment Capability"

100

awesome-llm-apps

Collection of awesome LLM apps with RAG using OpenAI, Anthropic, Gemini and opensource models.

Language: Python | License: CC0-1.0 | 100

Awesome-Mamba-Collection

A curated collection of papers, tutorials, videos, and other valuable resources related to Mamba.

License: MIT | 100

CV-TRY-ON-DM-VTON

👗 DM-VTON: Distilled Mobile Real-time Virtual Try-On

Language: Python | License: NOASSERTION | 100

Grounding-DINO-1.5-API

API for Grounding DINO 1.5: IDEA Research's Most Capable Open-World Object Detection Model Series

Language: Python | License: Apache-2.0 | 100

InstaDrag

Experiencing lightning fast (~1s) and accurate drag-based image editing

100

kaldi

kaldi-asr/kaldi is the official location of the Kaldi project.

Language: Shell | License: NOASSERTION | 100

langroid

Harness LLMs with Multi-Agent Programming

Language: Python | License: MIT | 100

LLMs-from-scratch

Implementing a ChatGPT-like LLM in PyTorch from scratch, step by step

Language: Jupyter Notebook | License: NOASSERTION | 100

MiniCPM-V

MiniCPM-Llama3-V 2.5: A GPT-4V Level MLLM on Your Phone

License: Apache-2.0 | 100

MixEval

The official evaluation suite and dynamic data release for MixEval.

Language: Python | 100

MultiMed

Multilingual Multitask Multipurpose Medical Speech Recognition

Language: Python | 100

MultimodalOCR

On the Hidden Mystery of OCR in Large Multimodal Models (OCRBench)

Language: Python | 100

ReVideo

100

short-transformers

Prune transformer layers

License: MIT | 100

triton

Development repository for the Triton language and compiler

Language: C++ | License: MIT | 100

UMOE-Scaling-Unified-Multimodal-LLMs

The code for "Uni-MoE: Scaling Unified Multimodal Models with Mixture of Experts"

Language: Python | 100

ComfyUI-OOTDiffusion

ComfyUI custom node that simply integrates the OOTDiffusion.

License: NOASSERTION | 000

ComfyUI-YoloWorld-EfficientSAM

Unofficial implementation of YOLO-World + EfficientSAM for ComfyUI

License: GPL-3.0 | 000

DAMO-ConvAI

DAMO-ConvAI: The official repository which contains the codebase for Alibaba DAMO Conversational AI.

License: MIT | 000

groundingLMM

[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.

000

LookOnceToHear

A novel human-interaction method for real-time speech extraction on headphones.

License: NOASSERTION | 000

masa

Official Implementation of CVPR24 paper: Matching Anything by Segmenting Anything

000

Moore-AnimateAnyone

Character Animation (AnimateAnyone, Face Reenactment)

License: Apache-2.0 | 000

MotionBooth

The official implementation of the research paper "MotionBooth: Motion-Aware Customized Text-to-Video Generation"

000

pytorch-template

PyTorch deep learning projects made easy.

License: MIT | 000

SVD_Xtend

Stable Video Diffusion Training Code and Extensions.

000

TokenHMR

[CVPR 2024] TokenHMR: Advancing Human Mesh Recovery with a Tokenized Pose Representation

License: NOASSERTION | 000

tryondiffusion

TryOnDiffusion: A Tale of Two UNets Implementation

000

WeightWatcher

The WeightWatcher tool for predicting the accuracy of Deep Neural Networks

License: Apache-2.0 | 000