ドーム's repositories
bytetrack_cpp
This project combines YOLOv8 with ByteTrack to achieve multi-target tracking
DiffPPO
Combining Diffusion Models with PPO to Improve Sample Efficiency and Exploration in Reinforcement Learning
dynamicPDB
Dynamic PDB datasets
EAGLE
EAGLE: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders
engy
Engy is an AI-powered development tool that generates fully functional web applications from natural language, streamlining the process from idea to working prototype.
Everlyn-1
The first open autoregressive foundational video AI model.
gaio
High-performance, minimalist async I/O (proactor) networking for Golang.
Gengine
Unleashing the Power of Distributed Content Management and Transformation
GitHub-Stats-SVG
A highly customizable GitHub stats SVG generator: Most readme card projects on GitHub look B-O-R-I-N-G, so I made a cool one myself. Cyberpunk style :)
hallo2
Hallo2: Long-Duration and High-Resolution Audio-driven Portrait Image Animation
im-server
A high-performance IM server.
Kaggle-4th-Place-Solution-LMSYS-Chatbot-Arena-Human-Preference-Predictions
4th Place Solution for the Kaggle Competition: LMSYS - Chatbot Arena Human Preference Predictions
LeetCode-Solver-Bot
Effortlessly solve LeetCode problems with the power of automation! LeetCode Solver Bot automates fetching problems, generating solutions, debugging, and submission. No more manual coding or debugging—just sit back and let the bot handle the heavy lifting.
LightRAG
"LightRAG: Simple and Fast Retrieval-Augmented Generation"
Magic-BI
One-stop data intelligence agent, providing insights from all mainstream data formats in a single dialogue box, including documents, databases, business systems, and images.
mini-omni2
Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities.
nexa-sdk
Nexa SDK is a comprehensive toolkit for supporting ONNX and GGML models. It supports text generation, image generation, vision-language models (VLM), auto-speech-recognition (ASR), and text-to-speech (TTS) capabilities.
On-Device-FinLLM
OD-FinLLM is a refined model derived from the LLaMA series, with specific enhancements for Chinese financial knowledge. This model is built by fine-tuning LLaMA using a specialized instruction dataset created from publicly available Chinese financial Q&A data and additional web-scraped financial information.
petereport-zh
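Chinese edition of PeTeReport: supports the penetration testing process, generates penetration test reports in one click, and helps safeguard network security!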
PUDM
[A Conditional Denoising Diffusion Probabilistic Model for Point Cloud Upsampling, 2024, CVPR]
rag-men
A Contextual RAG Bot Framework
RoboTwin
RoboTwin: Dual-Arm Robot Benchmark with Generative Digital Twins
Sutra_QAS
A system demo based on Retrieval-Augmented Generation for answering questions about Buddhism
threestudio-dreambeast
🐱🐶🐲🐮🐷Implementation of DreamBeast: Distilling 3D Fantastical Animals with Part-Aware Knowledge Transfer
tx-parser
A powerful library for parsing on-chain transactions into clear, human-readable actions, streamlining blockchain data analysis and interpretation. 🐋
virtual_human_stream
The "virtual_human_stream" project is a real-time digital human system supporting audio-video dialogue. It integrates models like ernerf, musetalk, and wav2lip for voice cloning, video stitching, and streaming via RTMP/WebRTC. It’s optimized for high performance and easy customization, with support for ChatGPT dialogue integration.
vllm-mixed-precision
Support mixed-precision inference with vLLM