Sigrid Jin (ง'̀-'́)ง oO's repositories
candle-vllm
Efficient platform for inference and serving of local LLMs, including an OpenAI-compatible API server.
mpc-uniqueness-check
MPC Uniqueness Check
smol-vision
Recipes for shrinking, optimizing, customizing cutting edge vision models. 💜
cuda_practice
CUDA Playground
1.5-Pints
A compact LLM pretrained in 9 days using high-quality data
chatbot-starter
Minimal NextJS chatbot starter template
ComfyUI-Docker
🐳 Dockerfile for 🎨 ComfyUI. | Container image and launch scripts
dom-to-semantic-markdown
DOM to Semantic-Markdown for use in LLMs
ebpf_exporter
Prometheus exporter for custom eBPF metrics
freezegun
Let your Python tests travel through time
gpt_server
gpt_server is an open-source framework for production-grade deployment of LLMs or embedding models.
Liger-Kernel
Efficient Triton Kernels for LLM Training
llamatutor
An AI personal tutor built with Llama 3.1
llm-search
Querying local documents, powered by LLMs
mako
An extremely fast, production-grade web bundler based on Rust.
marlin
FP16xINT4 LLM inference kernel that achieves near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.
Minitron
A family of compressed models obtained via pruning and knowledge distillation
rank_llm
RankLLM is a Python toolkit for reproducible information retrieval research using rerankers, with a focus on listwise reranking.
semantic-grep
grep for words with similar meaning to the query
sglang
SGLang is yet another fast serving framework for large language models and vision language models.
SmoothMQ
A drop-in replacement for SQS designed for great developer experience and efficiency.
spark-instructor
A library for building structured LLM responses with Spark
stable-diffusion.cpp
Stable Diffusion in pure C/C++
swiftide
Fast, streaming indexing and query library for AI (RAG) applications, written in Rust
tevatron
Tevatron - A flexible toolkit for neural retrieval research and development.
text-embeddings-inference
A blazing fast inference solution for text embeddings models