Long Chen (longcw)

Company: Tsinghua University

Location: Beijing, China

Home Page: https://longcw.github.io

Long Chen's starred repositories

Real3DPortrait

Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis; ICLR 2024 Spotlight; Official code

Language: Python | License: MIT | Stargazers: 716 | Issues: 0

ChatTTS

ChatTTS is a generative speech model for daily dialogue.

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 16322 | Issues: 0

unstructured

Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.

Language: HTML | License: Apache-2.0 | Stargazers: 7031 | Issues: 0

FlagEmbedding

Retrieval and Retrieval-augmented LLMs

Language: Python | License: MIT | Stargazers: 5448 | Issues: 0

GFPGAN-onnxruntime-demo

This is the onnxruntime inference code for GFP-GAN: Towards Real-World Blind Face Restoration with Generative Facial Prior (CVPR 2021). Official code: https://github.com/TencentARC/GFPGAN

Language: Python | Stargazers: 105 | Issues: 0

Face-Restoration-TensorRT

A simple face restoration TensorRT deployment solution.

Language: C++ | Stargazers: 69 | Issues: 0

anything-llm

The all-in-one Desktop & Docker AI application with full RAG and AI Agent capabilities.

Language: JavaScript | License: MIT | Stargazers: 15373 | Issues: 0

video-retalking

[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild

Language: Python | License: Apache-2.0 | Stargazers: 5856 | Issues: 0

promptfoo

Test your prompts, models, and RAGs. Catch regressions and improve prompt quality. LLM evals for OpenAI, Azure, Anthropic, Gemini, Mistral, Llama, Bedrock, Ollama, and other local & private models with CI/CD integration.

Language: TypeScript | License: MIT | Stargazers: 3156 | Issues: 0

dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.

Language: TypeScript | License: NOASSERTION | Stargazers: 32184 | Issues: 0

Wav2Lip

This repository contains the codes of "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For HD commercial model, please try out Sync Labs

Language: Python | Stargazers: 9517 | Issues: 0

MiniCPM

MiniCPM-2B: An end-side LLM outperforming Llama2-13B.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 4102 | Issues: 0

MiniCPM-V

MiniCPM-Llama3-V 2.5: A GPT-4V Level Multimodal LLM on Your Phone

Language: Python | License: Apache-2.0 | Stargazers: 4469 | Issues: 0

search_with_lepton

Building a quick conversation-based search demo with Lepton AI.

Language: TypeScript | License: Apache-2.0 | Stargazers: 7187 | Issues: 0

Language: Python | License: NOASSERTION | Stargazers: 34531 | Issues: 0

TokenFlow

Official Pytorch Implementation for "TokenFlow: Consistent Diffusion Features for Consistent Video Editing" presenting "TokenFlow" (ICLR 2024)

Language: Python | License: MIT | Stargazers: 1485 | Issues: 0

AnyDoor

Official implementations for paper: Anydoor: zero-shot object-level image customization

Language: Python | License: MIT | Stargazers: 3766 | Issues: 0

Osprey

[CVPR2024] The code for "Osprey: Pixel Understanding with Visual Instruction Tuning"

Language: Python | License: Apache-2.0 | Stargazers: 701 | Issues: 0

gaussian-grouping

Gaussian Grouping for open-world Anything reconstruction, segmentation and editing.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 470 | Issues: 0

GroundingDINO

Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"

Language: Python | License: Apache-2.0 | Stargazers: 5267 | Issues: 0

LLaVA-Plus-Codebase

LLaVA-Plus: Large Language and Vision Assistants that Plug and Learn to Use Skills

Language: Python | License: Apache-2.0 | Stargazers: 643 | Issues: 0

open_clip

An open source implementation of CLIP.

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 8790 | Issues: 0

MVDream

Multi-view Diffusion for 3D Generation

Language: Python | License: MIT | Stargazers: 681 | Issues: 0

MVDream-threestudio

3D generation code for MVDream

Language: Python | License: Apache-2.0 | Stargazers: 449 | Issues: 0

IP-Adapter

The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with image prompt.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 4219 | Issues: 0

T2I-Adapter

T2I-Adapter

Language: Python | Stargazers: 3235 | Issues: 0

img2dataset

Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.

Language: Python | License: MIT | Stargazers: 3352 | Issues: 0

EditAnything

Edit anything in images powered by segment-anything, ControlNet, StableDiffusion, etc. (ACM MM)

Language: Python | License: Apache-2.0 | Stargazers: 3183 | Issues: 0

multimodal-garment-designer

This is the official repository for the paper "Multimodal Garment Designer: Human-Centric Latent Diffusion Models for Fashion Image Editing". ICCV 2023

Language: Python | License: NOASSERTION | Stargazers: 377 | Issues: 0

CVinW_Readings

A collection of papers on the topic of "Computer Vision in the Wild (CVinW)"

Stargazers: 1043 | Issues: 0