Beast code in Giters

LLM-Data-Cleaner

Batch-process data with large language models: generate or clean data for academic use. Supports OCR with Qwen (Tongyi Qianwen), Moonshot, Baidu PaddleOCR, OpenAI, and LLaVA.

MIT

llmware

Unified framework for building enterprise RAG pipelines with small, specialized models

Apache-2.0

Math-LLaVA

Code for Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models

Apache-2.0

meet-libai

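Li Bai, an outstanding poet of the Tang dynasty, holds an important place in literary history. With the rapid development of digital technology and artificial intelligence in recent years, the ways in which traditional culture is popularized and promoted are also facing innovation and change. Although research on Li Bai's poetry in China and abroad is already quite deep, its digital and intelligent popularization still falls short. This project therefore aims to build a Li Bai knowledge graph and combine it with large-model training to produce a specialized AI agent, delivered as a generative conversational application, to promote and popularize Li Bai's culture.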

GPL-3.0

mlp

The Multilayer Perceptron Language Model

moa

MOA is an open source framework for Big Data stream mining. It includes a collection of machine learning algorithms (classification, regression, clustering, outlier detection, concept drift detection and recommender systems) and tools for evaluation.

GPL-3.0

MoA_New

Together Mixture-Of-Agents (MoA) – 65.1% on AlpacaEval with OSS models

Apache-2.0

MOELoRA-peft

[SIGIR'24] The official implementation code of MOELoRA.

MIT

Multimodal-Table-Understanding

Dataset and Code for our ACL 2024 paper: "Multimodal Table Understanding". We propose the first large-scale Multimodal IFT and PT Dataset for table understanding and develop a generalist tabular MLLM Table-LLaVA.

OfflineRL

A collection of offline reinforcement learning algorithms.

Apache-2.0

Omost

Your image is almost there!

Apache-2.0

OpenDevin

🐚 OpenDevin: Code Less, Make More

MIT

Qwen2-UIE

Universal information extraction (entity, relation, and event extraction) based on the Qwen2 model.

small-language-models

Apache-2.0

Streamer-Sales

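Streamer-Sales 销冠 (Sales Champion): a livestream sales host LLM 🛒🎁 that, given a product's features, presents the product from the angle of stimulating the user's desire to buy. 🚀⭐ Includes a detailed data generation pipeline ❗ 📦 It also integrates LMDeploy accelerated inference 🚀, RAG (retrieval-augmented generation) 📚, TTS (text-to-speech) 🔊, digital human generation 🦸, an Agent that queries the web for real-time information 🌐, and ASR (speech-to-text) 🎙️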

Apache-2.0

TinyLLaVA_Factory

A Framework of Small-scale Large Multimodal Models

Apache-2.0

unitycatalog

Open, Multi-modal Catalog for Data & AI

Apache-2.0

Video-ChatGPT

[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.

CC-BY-4.0

wine-label-recognizer

Wine Label Recognizer: Extract wine name, vintage, producer from label images using OpenAI's GPT-4o API.

RayJue

RayJue's repositories
