PeterPham's repositories

AlignGPT

Official repo for "AlignGPT: Multi-modal Large Language Models with Adaptive Alignment Capability"

Stargazers: 1 | Issues: 0

awesome-llm-apps

Collection of awesome LLM apps with RAG using OpenAI, Anthropic, Gemini, and open-source models.

Language: Python | License: CC0-1.0 | Stargazers: 1 | Issues: 0

Awesome-Mamba-Collection

A curated collection of papers, tutorials, videos, and other valuable resources related to Mamba.

License: MIT | Stargazers: 1 | Issues: 0

CV-TRY-ON-DM-VTON

👗 DM-VTON: Distilled Mobile Real-time Virtual Try-On

Language: Python | License: NOASSERTION | Stargazers: 1 | Issues: 0

Grounding-DINO-1.5-API

API for Grounding DINO 1.5: IDEA Research's Most Capable Open-World Object Detection Model Series

Language: Python | License: Apache-2.0 | Stargazers: 1 | Issues: 0

InstaDrag

Lightning-fast (~1s) and accurate drag-based image editing

Stargazers: 1 | Issues: 0

kaldi

kaldi-asr/kaldi is the official location of the Kaldi project.

Language: Shell | License: NOASSERTION | Stargazers: 1 | Issues: 0

langroid

Harness LLMs with Multi-Agent Programming

Language: Python | License: MIT | Stargazers: 1 | Issues: 0

LLMs-from-scratch

Implementing a ChatGPT-like LLM in PyTorch from scratch, step by step

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 1 | Issues: 0

MiniCPM-V

MiniCPM-Llama3-V 2.5: A GPT-4V Level MLLM on Your Phone

License: Apache-2.0 | Stargazers: 1 | Issues: 0

MixEval

The official evaluation suite and dynamic data release for MixEval.

Language: Python | Stargazers: 1 | Issues: 0

MultiMed

Multilingual Multitask Multipurpose Medical Speech Recognition

Language: Python | Stargazers: 1 | Issues: 0

MultimodalOCR

On the Hidden Mystery of OCR in Large Multimodal Models (OCRBench)

Language: Python | Stargazers: 1 | Issues: 0

short-transformers

Prune transformer layers

License: MIT | Stargazers: 1 | Issues: 0

triton

Development repository for the Triton language and compiler

Language: C++ | License: MIT | Stargazers: 1 | Issues: 0

UMOE-Scaling-Unified-Multimodal-LLMs

Code for "Uni-MoE: Scaling Unified Multimodal Models with Mixture of Experts"

Language: Python | Stargazers: 1 | Issues: 0

ComfyUI-OOTDiffusion

A ComfyUI custom node that integrates OOTDiffusion.

License: NOASSERTION | Stargazers: 0 | Issues: 0

ComfyUI-YoloWorld-EfficientSAM

Unofficial implementation of YOLO-World + EfficientSAM for ComfyUI

License: GPL-3.0 | Stargazers: 0 | Issues: 0

DAMO-ConvAI

DAMO-ConvAI: The official repository which contains the codebase for Alibaba DAMO Conversational AI.

License: MIT | Stargazers: 0 | Issues: 0

groundingLMM

[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.

Stargazers: 0 | Issues: 0

LookOnceToHear

A novel human-interaction method for real-time speech extraction on headphones.

License: NOASSERTION | Stargazers: 0 | Issues: 0

masa

Official implementation of the CVPR 2024 paper: Matching Anything by Segmenting Anything

Stargazers: 0 | Issues: 0

Moore-AnimateAnyone

Character Animation (AnimateAnyone, Face Reenactment)

License: Apache-2.0 | Stargazers: 0 | Issues: 0

MotionBooth

The official implementation of the research paper "MotionBooth: Motion-Aware Customized Text-to-Video Generation"

Stargazers: 0 | Issues: 0

pytorch-template

PyTorch deep learning projects made easy.

License: MIT | Stargazers: 0 | Issues: 0

SVD_Xtend

Stable Video Diffusion Training Code and Extensions.

Stargazers: 0 | Issues: 0

TokenHMR

[CVPR 2024] TokenHMR: Advancing Human Mesh Recovery with a Tokenized Pose Representation

License: NOASSERTION | Stargazers: 0 | Issues: 0

tryondiffusion

TryOnDiffusion: A Tale of Two UNets Implementation

Stargazers: 0 | Issues: 0

WeightWatcher

The WeightWatcher tool for predicting the accuracy of Deep Neural Networks

License: Apache-2.0 | Stargazers: 0 | Issues: 0