hata8210

hata8210's starred repositories

dataherald

Interact with your SQL database, Natural Language to SQL using LLMs

Language: Python | Apache-2.0 | 322700

data-juicer

A one-stop data processing system to make data higher-quality, juicier, and more digestible for (multimodal) LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

Language: Python | Apache-2.0 | 176800

calcite

Apache Calcite

Language: Java | Apache-2.0 | 447000

swift

ms-swift: Use PEFT or Full-parameter to finetune 300+ LLMs or 50+ MLLMs. (Qwen2, GLM4v, Internlm2.5, Yi, Llama3, Llava-Video, Internvl2, MiniCPM-V, Deepseek, Baichuan2, Gemma2, Phi3-Vision, ...)

Language: Python | Apache-2.0 | 240900

superset

Apache Superset is a Data Visualization and Data Exploration Platform

Language: TypeScript | Apache-2.0 | 6071700

ragflow

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.

Language: Python | Apache-2.0 | 1220300

transformer-debugger

Language: Python | MIT | 397600

sqlcoder

SoTA LLM for converting natural language questions to SQL queries

Language: Jupyter Notebook | Apache-2.0 | 312600

sqlglot

Python SQL Parser and Transpiler

Language: Python | MIT | 605900

self-rag

The original implementation of SELF-RAG: Learning to Retrieve, Generate and Critique through Self-Reflection, by Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, and Hannaneh Hajishirzi.

Language: Python | MIT | 163600

lm-evaluation-harness

A framework for few-shot evaluation of language models.

Language: Python | MIT | 586500

litgpt

20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.

Language: Python | Apache-2.0 | 875800

ragas

Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines

Language: Python | Apache-2.0 | 580600

flash-attention

Fast and memory-efficient exact attention

Language: Python | BSD-3-Clause | 1239400

sql-eval

Evaluate the accuracy of LLM generated outputs

Language: Jupyter Notebook | Apache-2.0 | 45800

stanford_alpaca

Code and documentation to train Stanford's Alpaca models, and generate the data.

Language: Python | Apache-2.0 | 2916800

alpaca-lora

Instruct-tune LLaMA on consumer hardware

Language: Jupyter Notebook | Apache-2.0 | 1841500

DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Language: Python | Apache-2.0 | 3389800

ColossalAI

Making large AI models cheaper, faster and more accessible

Language: Python | Apache-2.0 | 3836200

MindMap

MindMap: Knowledge Graph Prompting Sparks Graph of Thoughts in Large Language Models

Language: Python | 18900

Grapher

Code that implements efficient knowledge graph extraction from textual descriptions

Language: Python | Apache-2.0 | 13500

chroma

the AI-native open-source embedding database

Language: Rust | Apache-2.0 | 1365300

LASER

Language-Agnostic SEntence Representations

Language: Jupyter Notebook | NOASSERTION | 356500

Chinese-Word-Vectors

100+ Chinese Word Vectors (over a hundred pre-trained Chinese word vectors)

Language: Python | Apache-2.0 | 1169600

Qwen

The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.

Language: Python | Apache-2.0 | 1266100

text-generation-webui

A Gradio web UI for Large Language Models.

Language: Python | AGPL-3.0 | 3847800

gradio

Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!

Language: Python | Apache-2.0 | 3099000

fast-stable-diffusion

fast-stable-diffusion + DreamBooth

Language: Python | MIT | 741100

stablediffusion

High-Resolution Image Synthesis with Latent Diffusion Models

Language: Python | MIT | 3763800

lora-scripts

LoRA & Dreambooth training scripts and GUI for diffusion models, using kohya-ss's trainer.

Language: Python | AGPL-3.0 | 410900