BenJamesbabala's repositories
Agently-Daily-News-Collector
An open-source, LLM-based showcase of an automated daily news collection workflow, powered by the Agently AI application development framework.
Automatic_Speech_Annotator
Automatic speech annotator that processes speech with voice activity detection, overlapping speech detection, speaker diarization, and automatic speech recognition.
Awesome-AISourceHub
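This repository collects high-quality information sources in the AI and technology field. It can help keep information sources in sync, avoiding information gaps and information cocoons.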
Awesome-Chinese-LLM
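A curated collection of open-source Chinese large language models, focusing on models that are smaller in scale, privately deployable, and cheaper to train, including base models, vertical-domain fine-tunes and applications, datasets, and tutorials.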
barkour_robot
Barkour Robot: Agile Quadruped Robots by Google DeepMind
BrowserGym
BrowserGym, a gym environment for web task automation in the Chromium browser.
Chenyme-AAVT
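This is a fully automatic (audio) video translation project. It uses Whisper to recognize speech, an AI large language model to translate the subtitles, and finally merges the subtitles with the video to produce the translated video.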
EmoLLM
Mental health large language model, LLM, The Big Model of Mental Health, Finetune, InternLM2, Qwen, ChatGLM, Baichuan, DeepSeek, Mixtral, LLama3
gpupixel
Real-Time video and image AI beauty filter library that achieves commercial-grade beauty effects. It is written in C++11 and based on OpenGL/ES.
Hitomi-Downloader
:cake: Desktop utility to download images/videos/music/text from various websites, and more.
HowToLiveLonger
A programmer's guide to living longer
IntelliQ
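Advanced Multi-Turn QA System with LLM and Intent Recognition. Implements multi-turn QA and NL2API using LLM-based intent recognition and parameter extraction combined with slot filling. Aims to be a best practice for Function Call multi-turn QA.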
LLaVA-Llama-3
Reproduction of LLaVA-v1.5 based on Llama-3-8b LLM backbone.
luna-ai
Luna AI - A fully automatic AI live-streaming system
MeloTTS
High-quality multi-lingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese and Korean.
NeuScraper
This is the code repo for our paper "Cleaner Pretraining Corpus Curation with Neural Web Scraping".
nicer-slam
[3DV'24 Best Paper Honorable Mention] NICER-SLAM: Neural Implicit Scene Encoding for RGB SLAM
open-interpreter
A natural language interface for computers
OpenCodeInterpreter
OpenCodeInterpreter is a suite of open-source code generation systems aimed at bridging the gap between large language models and sophisticated proprietary systems like the GPT-4 Code Interpreter. It significantly enhances code generation capabilities by integrating execution and iterative refinement functionalities.
OpenVoice
Instant voice cloning by MyShell.
ott
API tool for local, offline text translation supporting multiple languages
plvs
PLVS is a real-time SLAM system with points, lines, volumetric mapping and 3D unsupervised incremental segmentation.
text_blind_watermark
Text blind watermark: hides information inside text by embedding an invisible blind watermark.
TexTeller
TexTeller converts images to LaTeX formulas (image2latex, LaTeX OCR) with high accuracy and strong generalization ability, enabling it to cover most usage scenarios.
UPOCR
Official implementation of UPOCR: Towards unified pixel-level OCR interface (ICML 2024)
web-check
🕵️‍♂️ All-in-one OSINT tool for analysing any website
Windrecorder
Windrecorder is a memory search app that records everything on your screen at a small storage size, letting you rewind what you have seen, search through OCR text or image descriptions, and get activity statistics. Developed as a Windows alternative to the macOS app Rewind.ai.