EnigmaHong's repositories
cookbook
Examples and guides for using the Gemini API.
InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4V. A commercially usable open-source multimodal chat model with performance approaching GPT-4V.
ragflow
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
MiniCPM-V
MiniCPM-Llama3-V 2.5: A GPT-4V Level Multimodal LLM on Your Phone
surya
OCR, layout analysis, reading order, line detection in 90+ languages
manga-image-translator
Translate manga/images: one-click translation of text in all kinds of images. https://cotrans.touhou.ai/
Open-AnimateAnyone
Unofficial Implementation of Animate Anyone
OOTDiffusion
Official implementation of OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on
SoraWebui
SoraWebui is an open-source Sora web client, enabling users to easily create videos from text with OpenAI's Sora model.
doctr
docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.
AnimateDiff
Official implementation of AnimateDiff.
ProPainter
[ICCV 2023] ProPainter: Improving Propagation and Transformer for Video Inpainting
XAgent
An Autonomous LLM Agent for Complex Task Solving
lama-cleaner
Image inpainting tool powered by SOTA AI models. Remove any unwanted objects, defects, or people from your pictures, or erase and replace (powered by Stable Diffusion) anything in your pictures.
facechain
FaceChain is a deep-learning toolchain for generating your Digital-Twin.
Qwen-7B
The official repo of Qwen-7B (通义千问-7B) chat & pretrained large language model proposed by Alibaba Cloud.
ChatRWKV
ChatRWKV is like ChatGPT but powered by RWKV (100% RNN) language model, and open source.
MetaGPT
🌟 The Multi-Agent Framework: Given one line Requirement, return PRD, Design, Tasks, Repo
stable-diffusion-webui
Stable Diffusion web UI
VisualGLM-6B
Chinese and English bilingual multimodal conversational language model
nunif
Miscellaneous; contains the latest version of waifu2x.
Fay
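Fay is a complete open-source project that includes the Fay controller and digital-human models, which can be flexibly combined for different application scenarios: virtual streamer, live on-site sales, product shopping guide, voice assistant, remote voice assistant, digital-human interaction, digital-human interviewer and psychological assessment, Jarvis, Her. This is an open-source project, not a product trial!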
GLM-130B
GLM-130B: An Open Bilingual Pre-Trained Model (ICLR 2023)
RapidOCR
A cross platform OCR Library based on PaddleOCR & OnnxRuntime & OpenVINO.
openai-cookbook
Examples and guides for using the OpenAI API
stablediffusion
High-Resolution Image Synthesis with Latent Diffusion Models
Auto-GPT
An experimental open-source attempt to make GPT-4 fully autonomous.
llama.cpp
Port of Facebook's LLaMA model in C/C++
Grounded-Segment-Anything
Marrying Grounding DINO with Segment Anything & Stable Diffusion & Tag2Text & BLIP & Whisper & ChatBot - Automatically Detect, Segment and Generate Anything with Image, Text, and Audio Inputs
MiniGPT-4
MiniGPT-4: Enhancing Vision-language Understanding with Advanced Large Language Models