RayJue's repositories
byzer-llm
Easy, fast, and cheap pretraining, fine-tuning, and serving for everyone
ChatTTS
ChatTTS is a generative speech model for daily dialogue.
flowgen
AutoGen Visualized - Visual Tools for Multi-Agent Development.
FunClip
Open-source, accurate, and easy-to-use video speech recognition and clipping tool, with LLM-based AI clipping integrated.
G-LLaVA
Official GitHub repo of G-LLaVA
GPT-SoVITS
One minute of voice data is enough to train a good TTS model! (few-shot voice cloning)
ImagesAnnotation
Image annotation tool using LLaVA
InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4V. A commercially usable open-source multimodal dialogue model approaching GPT-4V performance.
llamafile
Distribute and run LLMs with a single file.
LLaVA-Med
Large Language-and-Vision Assistant for Biomedicine, built towards multimodal GPT-4 level capabilities.
LLM-Data-Cleaner
Use LLMs to batch-process, generate, or clean data for academic use. Supports OCR with a range of large models, including Qwen (Tongyi Qianwen), Moonshot, Baidu PaddleOCR, OpenAI, and LLaVA.
llmware
Unified framework for building enterprise RAG pipelines with small, specialized models
Math-LLaVA
Code for Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models
meet-libai
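Li Bai, an outstanding poet of the Tang dynasty, holds an important place in literary history. In recent years, with the rapid development of digital technology and artificial intelligence, the ways in which traditional culture is popularized and promoted have also had to innovate and change. Research on Li Bai's poetry, both in China and abroad, is already extensive, but its digital and intelligent popularization still falls short. This project therefore aims to build a Li Bai knowledge graph and combine it with large-model training to produce a specialized AI agent, promoting Li Bai's culture in the form of a generative dialogue application.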
mlp
The Multilayer Perceptron Language Model
moa
MOA is an open source framework for Big Data stream mining. It includes a collection of machine learning algorithms (classification, regression, clustering, outlier detection, concept drift detection and recommender systems) and tools for evaluation.
MoA_New
Together Mixture-Of-Agents (MoA) – 65.1% on AlpacaEval with OSS models
MOELoRA-peft
[SIGIR'24] The official implementation code of MOELoRA.
Multimodal-Table-Understanding
Dataset and Code for our ACL 2024 paper: "Multimodal Table Understanding". We propose the first large-scale Multimodal IFT and PT Dataset for table understanding and develop a generalist tabular MLLM Table-LLaVA.
OfflineRL
A collection of offline reinforcement learning algorithms.
Omost
Your image is almost there!
OpenDevin
🐚 OpenDevin: Code Less, Make More
Qwen2-UIE
Universal information extraction (entity, relation, and event extraction) based on the Qwen2 model
Streamer-Sales
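Streamer-Sales 销冠 ("sales champion"): a sales-streamer LLM 🛒🎁 that, given a product's features, presents the product in a way designed to spark viewers' desire to buy. 🚀⭐ Includes a detailed data-generation pipeline ❗ 📦 Also integrates LMDeploy accelerated inference 🚀, RAG (retrieval-augmented generation) 📚, TTS (text-to-speech) 🔊, digital-human generation 🦸, an agent that queries the web for real-time information 🌐, and ASR (speech-to-text) 🎙️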
TinyLLaVA_Factory
A Framework of Small-scale Large Multimodal Models
unitycatalog
Open, Multi-modal Catalog for Data & AI
Video-ChatGPT
[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.
wine-label-recognizer
Wine Label Recognizer: Extract the wine name, vintage, and producer from label images using OpenAI's GPT-4o API.