thapecroth's starred repositories

mergekit

Tools for merging pretrained large language models.

Language: Python | License: LGPL-3.0 | Stargazers: 4290 | Issues: 0

llm-twin-course

🤖 Learn for free how to build an end-to-end production-ready LLM & RAG system using LLMOps best practices: ~ source code + 12 hands-on lessons

Language: Python | License: MIT | Stargazers: 2149 | Issues: 0

jsonformer

A Bulletproof Way to Generate Structured JSON from Language Models

Language: Jupyter Notebook | License: MIT | Stargazers: 4231 | Issues: 0

dockerc

container image to single executable compiler

Language: Zig | License: GPL-3.0 | Stargazers: 2623 | Issues: 0

sglang

SGLang is yet another fast serving framework for large language models and vision language models.

Language: Python | License: Apache-2.0 | Stargazers: 4026 | Issues: 0

FerretDB

A truly Open Source MongoDB alternative

Language: Go | License: Apache-2.0 | Stargazers: 8868 | Issues: 0

mlc-llm

Universal LLM Deployment Engine with ML Compilation

Language: Python | License: Apache-2.0 | Stargazers: 18316 | Issues: 0

gptzip

Losslessly encode text natively with arithmetic coding and HuggingFace Transformers

Language: Python | License: Apache-2.0 | Stargazers: 63 | Issues: 0

MINT-1T

MINT-1T: A one trillion token multimodal interleaved dataset.

Stargazers: 683 | Issues: 0

thefuzz

Fuzzy String Matching in Python

Language: Python | License: MIT | Stargazers: 2702 | Issues: 0

llm2vec

Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders'

Language: Python | License: MIT | Stargazers: 1036 | Issues: 0

xcopa

XCOPA: A Multilingual Dataset for Causal Commonsense Reasoning

License: CC-BY-4.0 | Stargazers: 97 | Issues: 0

code_mcts

👩🏻‍💻 LLM + MCTS for HumanEval

Language: Jupyter Notebook | Stargazers: 7 | Issues: 0

cloudflare-saas-stack

Quickly make and deploy full-stack apps with database, auth, styling, storage etc. figured out for you.

Language: TypeScript | License: MIT | Stargazers: 2626 | Issues: 0

MemGPT

Create LLM agents with long-term memory and custom tools 📚🦙

Language: Python | License: Apache-2.0 | Stargazers: 11069 | Issues: 0

spider

The fastest, most efficient web crawler and scraper written in Rust. Maintained by @a11ywatch.

Language: Rust | License: MIT | Stargazers: 877 | Issues: 0

crawlee-python

Crawlee: A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.

Language: Python | License: Apache-2.0 | Stargazers: 3528 | Issues: 0

omniparse

Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks

Language: Python | License: GPL-3.0 | Stargazers: 4625 | Issues: 0

firecrawl

🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.

Language: TypeScript | License: AGPL-3.0 | Stargazers: 8881 | Issues: 0

agentic

AI agent stdlib that works with any LLM and TypeScript AI SDK.

Language: TypeScript | License: MIT | Stargazers: 16110 | Issues: 0

Language: Jupyter Notebook | License: MIT | Stargazers: 17 | Issues: 0

SimPO

SimPO: Simple Preference Optimization with a Reference-Free Reward

Language: Python | License: MIT | Stargazers: 591 | Issues: 0

langflow

⛓️ Langflow is a visual framework for building multi-agent and RAG applications. It's open-source, Python-powered, fully customizable, model and vector store agnostic.

Language: JavaScript | License: MIT | Stargazers: 23929 | Issues: 0

swapy

✨ A framework-agnostic tool that converts any layout into a drag-to-swap one with just a few lines of code https://swapy.tahazsh.com/

Language: TypeScript | License: MIT | Stargazers: 1941 | Issues: 0

RAGatouille

Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-of-use, backed by research.

Language: Python | License: Apache-2.0 | Stargazers: 2589 | Issues: 0

llama_parse

Parse files for optimal RAG

Language: Python | License: MIT | Stargazers: 2098 | Issues: 0

cake

Distributed LLM inference for mobile, desktop and server.

Language: Rust | License: NOASSERTION | Stargazers: 2340 | Issues: 0

colpali

The code used to train and run inference with the ColPali architecture.

Language: Python | License: MIT | Stargazers: 245 | Issues: 0

surya

OCR, layout analysis, reading order, line detection in 90+ languages

Language: Python | License: GPL-3.0 | Stargazers: 9445 | Issues: 0

outlines

Structured Text Generation

Language: Python | License: Apache-2.0 | Stargazers: 7767 | Issues: 0