RossSong's repositories
autofaiss
Automatically create Faiss knn indices with optimal similarity search parameters.
AutoPrompt
A framework for prompt tuning using Intent-based Prompt Calibration
awesome-japanese-llm
Overview of Japanese LLMs
BitNet
Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in PyTorch
chatgpt_system_prompt
Stores all agents' system prompts
codel
✨ Fully autonomous AI Agent that can perform complicated tasks and projects using terminal, browser, and editor.
devika
Devika is an Agentic AI Software Engineer that can understand high-level human instructions, break them down into steps, research relevant information, and write code to achieve the given objective. Devika aims to be a competitive open-source alternative to Devin by Cognition AI.
embedding_studio
Embedding Studio is a framework that allows you to transform your Vector Database into a feature-rich Search Engine.
flashinfer
FlashInfer: Kernel Library for LLM Serving
Gemma-EasyLM
Train GEMMA on TPU/GPU! (Codebase for training Gemma-Ko Series)
gpt-pilot
Dev tool that writes scalable apps from scratch while the developer oversees the implementation
GPTs
Leaked prompts of GPTs
HRC
#HumanRightsCorpus
hugging-chat-api
HuggingChat Python API🤗
Kiwi
Kiwi (intelligent Korean morphological analyzer)
LLaMA-Factory
Unify Efficient Fine-tuning of 100+ LLMs
MemGPT
Teaching LLMs memory management for unbounded context 📚🦙
metal-flash-attention
Faster alternative to Metal Performance Shaders
minbpe
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
MultiModalMamba
A novel implementation of fusing ViT with Mamba into a fast, agile, and high performance Multi-Modal Model. Powered by Zeta, the simplest AI framework ever.
OpenDevin
🐚 OpenDevin: Code Less, Make More
RepoForLLMs
Repository featuring fine-tuning code for various LLMs, complemented by occasional explanations and deep dives.
RingAttention
Transformers with Arbitrarily Large Context
search_with_lepton
Building a quick conversation-based search demo with Lepton AI.
streaming-llm
Efficient Streaming Language Models with Attention Sinks
visionOS_30Days
visionOS 30 days challenge.
ZLUDA
CUDA on AMD GPUs