hsaigroup's repositories
allenai_molmo_wrapper
As you can see, an allenai Molmo wrapper
bolt.new-any-llm
Prompt, run, edit, and deploy full-stack web applications using any LLM you want!
ComfyUI-Manager
ComfyUI-Manager is an extension designed to enhance the usability of ComfyUI. It offers management functions to install, remove, disable, and enable various custom nodes of ComfyUI. Furthermore, this extension provides a hub feature and convenience functions to access a wide range of information within ComfyUI.
ComfyUI-Molmo
Generate detailed image descriptions and analysis using Molmo models in ComfyUI.
ComfyUI-PixtralLlamaMolmoVision
For loading and running Pixtral models
Computer-Vision-Server
An API server for Molmo 7B - Describe web pages or computer screenshots and point to elements using Molmo 7B - a multimodal vision model which can describe real and virtual images and point at objects
computer_use_ootb
An out-of-the-box (OOTB) version of Anthropic Claude Computer Use for Windows and macOS
cursor-free-vip
[Support 0.46.10](Reset Cursor AI MachineID & Auto Sign Up / In) Automatically registers Cursor AI accounts, automatically resets the machine ID, and upgrades to Pro features for free: You've reached your trial request limit. / Too many free trial accounts used on this machine. Please upgrade to pro. We have this limit in place to prevent abuse. Please let us know if you believe this is a mistake.
cursor-vip
Enjoy VIP features in the Cursor IDE
DeepClaude
Unleash Next-Level AI! 🚀 💻 Code Generation: DeepSeek r1 + Claude 3.5 Sonnet - Unparalleled Performance! 📝 Content Creation: DeepSeek r1 + Gemini 2.0 - Superior Quality! 🔌 OpenAI-Compatible. 🌊 Streaming & Non-Streaming Support. ✨ Experience the Future of AI – Today! Click to Try Now! ✨
Deepseek-with-camera
A framework combining the abilities of the QwenVL and DeepSeek APIs to enable visual interaction using the DeepSeek model.
DesktopAI
A common framework for voice and text interactions with LLMs
Echo-Voice-Cloning-Soundboard
Echo - Voice Cloning Soundboard for Call of Duty, MW3, Black Ops,
Kmars.ai_AI_Image_Analyzer
AI Image Analyzer for ollama, mistral.rs, and Molmo on a Mac M2 Max (Screen Capture Analyzer, Camera Capture Analyzer)
llama.cpp
LLM inference in C/C++
Molmo-Finetune
An open-source implementation for fine-tuning Molmo-7B-D and Molmo-7B-O by allenai.
Open-LLM-VTuber
Talk to any LLM with hands-free voice interaction, voice interruption, and a Live2D talking face, running locally across platforms
Project_Miao
Let's raise an AI cat with its own dedicated memory together!
py-xiaozhi
A Python version of Xiaozhi AI, mainly for people who have no hardware but still want to try Xiaozhi's features
SAM_Molmo_Whisper
An integration of Segment Anything Model, Molmo, and Whisper to segment objects using voice and natural language.
SpeakControl
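SpeakControl is a voice-control tool built on ssfrpa that supports arbitrary custom commands. A command can be as simple as launching a program or as complex as a multi-step workflow.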
Streamer-Sales
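Streamer-Sales (Top Seller): an LLM for livestream sales hosts 🛒🎁 that presents a product based on its given features, from the angle of stimulating users' desire to buy. 🚀⭐ Includes a detailed data-generation pipeline❗ 📦 It also integrates LMDeploy accelerated inference 🚀, RAG (retrieval-augmented generation) 📚, TTS (text-to-speech) 🔊, digital-human generation 🦸, an Agent that queries the web for real-time information 🌐, ASR (speech-to-text) 🎙️, a Vue-based frontend 🍍, a FastAPI backend 🗝️, and Docker Compose packaging and deployment 🐋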
TEN-Agent
TEN Agent is the world’s first real-time multimodal agent integrated with the OpenAI Realtime API, RTC, and features weather checks, web search, vision, and RAG capabilities.
xiaozhi-esp32-server
Backend service for xiaozhi-esp32 that helps you quickly build an ESP32 device control server.
xiaozhi-VLPro
Helps visually impaired people identify objects and alerts them to obstacles