MaoXianXin's starred repositories

bonito

A lightweight library for generating synthetic instruction tuning datasets for your data without GPT.

Language: Python | License: BSD-3-Clause | Stargazers: 636 | Issues: 0

Scrapegraph-ai

Python scraper based on AI

Language: Python | License: MIT | Stargazers: 13876 | Issues: 0

Perplexica

Perplexica is an AI-powered search engine. It is an open-source alternative to Perplexity AI.

Language: TypeScript | License: MIT | Stargazers: 12382 | Issues: 0

fabric

fabric is an open-source framework for augmenting humans using AI. It provides a modular framework for solving specific problems using a crowdsourced set of AI prompts that can be used anywhere.

Language: Go | Stargazers: 20453 | Issues: 0

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language: C++ | License: Apache-2.0 | Stargazers: 7935 | Issues: 0

text-generation-inference

Large Language Model Text Generation Inference

Language: Python | License: Apache-2.0 | Stargazers: 8602 | Issues: 0

lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Language: Python | License: Apache-2.0 | Stargazers: 3898 | Issues: 0

graphrag

A modular graph-based Retrieval-Augmented Generation (RAG) system

Language: Python | License: MIT | Stargazers: 15242 | Issues: 0

instructor

Structured outputs for LLMs

Language: Python | License: MIT | Stargazers: 7158 | Issues: 0

awesome-SynthText

A curated list of awesome synthetic data for text location and recognition

Stargazers: 326 | Issues: 0

SynthText

Code for generating synthetic text images as described in "Synthetic Data for Text Localisation in Natural Images", Ankush Gupta, Andrea Vedaldi, Andrew Zisserman, CVPR 2016.

Language: Python | License: Apache-2.0 | Stargazers: 2001 | Issues: 0

TextRecognitionDataGenerator

A synthetic data generator for text recognition

Language: Python | License: MIT | Stargazers: 3188 | Issues: 0

donut

Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022

Language: Python | License: MIT | Stargazers: 5631 | Issues: 0

datatrove

Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.

Language: Python | License: Apache-2.0 | Stargazers: 1856 | Issues: 0

phi3_vision_language

Experiments with the Microsoft Phi-3 vision language model: image captioning, OCR, and data extraction.

Language: Jupyter Notebook | Stargazers: 5 | Issues: 0

florence2-finetuning

A quick exploration into fine-tuning Florence-2

Language: Jupyter Notebook | License: MIT | Stargazers: 233 | Issues: 0

annotated-transformer

An annotated implementation of the Transformer paper.

Language: Jupyter Notebook | License: MIT | Stargazers: 5492 | Issues: 0

matmulfreellm

Implementation for MatMul-free LM.

Language: Python | License: Apache-2.0 | Stargazers: 2814 | Issues: 0

LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Language: Python | License: Apache-2.0 | Stargazers: 18762 | Issues: 0

mixtral-offloading

Run Mixtral-8x7B models in Colab or on consumer desktops

Language: Python | License: MIT | Stargazers: 2276 | Issues: 0

textgrad

TextGrad: Automatic "Differentiation" via Text, using large language models to backpropagate textual gradients.

Language: Python | License: MIT | Stargazers: 1395 | Issues: 0

RAG-chatbot-Speckly

This project presents a RAG chat app for the Speckle Developer Documentation.

Language: Jupyter Notebook | License: MIT | Stargazers: 19 | Issues: 0

MiniCPM-V

MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone

Language: Python | License: Apache-2.0 | Stargazers: 10684 | Issues: 0

llm.c

LLM training in simple, raw C/CUDA

Language: Cuda | License: MIT | Stargazers: 22674 | Issues: 0

Qwen-VL

The official repo of Qwen-VL (通义千问-VL), the chat and pretrained large vision language model proposed by Alibaba Cloud.

Language: Python | License: NOASSERTION | Stargazers: 4563 | Issues: 0

OCRDatasets

A collection of OCR-related datasets

Stargazers: 89 | Issues: 0

ollama

Get up and running with Llama 3.1, Mistral, Gemma 2, and other large language models.

Language: Go | License: MIT | Stargazers: 85481 | Issues: 0

Playground

Text WebUI extension to add clever Notebooks to Chat mode

Language: Python | Stargazers: 126 | Issues: 0

Qwen2

Qwen2 is the large language model series developed by the Qwen team, Alibaba Cloud.

Language: Shell | Stargazers: 6919 | Issues: 0

lorax

Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs

Language: Python | License: Apache-2.0 | Stargazers: 2012 | Issues: 0