Myungchul Shin (dsindex)

Company: https://github.com/kakaobrain

Location: Pangyo-dong

Home Page: http://dsindex.github.io

Myungchul Shin's starred repositories

gpt-crawler

Crawl a site to generate knowledge files to create your own custom GPT from a URL

Language: TypeScript | License: ISC | Stargazers: 18383 | Issues: 118 | Issues: 115

sentence-transformers

Multilingual Sentence & Image Embeddings with BERT

Language: Python | License: Apache-2.0 | Stargazers: 14706 | Issues: 138 | Issues: 2102

mamba

Mamba SSM architecture

Language: Python | License: Apache-2.0 | Stargazers: 12246 | Issues: 99 | Issues: 474

Language: Python | License: NOASSERTION | Stargazers: 8269 | Issues: 157 | Issues: 0

PowerInfer

High-speed Large Language Model Serving on PCs with Consumer-grade GPUs

Language: C++ | License: MIT | Stargazers: 7798 | Issues: 76 | Issues: 154

rags

Build ChatGPT over your data, all with natural language

Language: Python | License: MIT | Stargazers: 6168 | Issues: 55 | Issues: 39

LLMLingua

To speed up LLM inference and improve LLMs' perception of key information, LLMLingua compresses the prompt and KV cache, achieving up to 20x compression with minimal performance loss.

Language: Python | License: MIT | Stargazers: 4375 | Issues: 32 | Issues: 111

mamba-minimal

Simple, minimal implementation of the Mamba SSM in one file of PyTorch.

Language: Python | License: Apache-2.0 | Stargazers: 2510 | Issues: 23 | Issues: 26

prompt2model

prompt2model - Generate Deployable Models from Natural Language Instructions

Language: Python | License: Apache-2.0 | Stargazers: 1936 | Issues: 25 | Issues: 167

KwaiAgents

A generalized information-seeking agent system with Large Language Models (LLMs).

Language: Python | License: NOASSERTION | Stargazers: 1062 | Issues: 21 | Issues: 41

ATLAS

A principled instruction benchmark on formulating effective queries and prompts for large language models (LLMs). Our paper: https://arxiv.org/abs/2312.16171

Language: Python | License: Apache-2.0 | Stargazers: 893 | Issues: 24 | Issues: 8

GPT-RAG

Sharing the lessons we have gathered along the way to enable Azure OpenAI at enterprise scale in a secure manner. The GPT-RAG core is a Retrieval-Augmented Generation pattern running in Azure, using Azure Cognitive Search for retrieval and Azure OpenAI large language models to power ChatGPT-style and Q&A experiences.

Language: Bicep | License: MIT | Stargazers: 823 | Issues: 15 | Issues: 68

mamba-chat

Mamba-Chat: A chat LLM based on the state-space model architecture 🐍

Language: Python | License: Apache-2.0 | Stargazers: 756 | Issues: 3 | Issues: 16

HALOs

A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs).

Language: Python | License: Apache-2.0 | Stargazers: 669 | Issues: 6 | Issues: 20

LLM-Shearing

[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning

Language: Python | License: MIT | Stargazers: 524 | Issues: 24 | Issues: 69

textbook_quality

Generate textbook-quality synthetic LLM pretraining data

Language: Python | License: MIT | Stargazers: 473 | Issues: 8 | Issues: 6

deita

Deita: Data-Efficient Instruction Tuning for Alignment [ICLR2024]

Language: Python | License: Apache-2.0 | Stargazers: 455 | Issues: 6 | Issues: 24

MathPile

Generative AI for Math: MathPile

Language: Python | License: Apache-2.0 | Stargazers: 367 | Issues: 7 | Issues: 5

galactic

Data cleaning and curation for unstructured text

Language: Python | License: Apache-2.0 | Stargazers: 323 | Issues: 8 | Issues: 4

RetNet

Huggingface compatible implementation of RetNet (Retentive Networks, https://arxiv.org/pdf/2307.08621.pdf) including parallel, recurrent, and chunkwise forward.

Language: Jupyter Notebook | License: MIT | Stargazers: 225 | Issues: 5 | Issues: 31

sodaverse

🥤🧑🏻‍🚀Code and dataset for our EMNLP 2023 paper - "SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization"

Language: Python | License: MIT | Stargazers: 217 | Issues: 18 | Issues: 8

T-Eval

[ACL2024] T-Eval: Evaluating Tool Utilization Capability of Large Language Models Step by Step

Language: Python | License: Apache-2.0 | Stargazers: 199 | Issues: 3 | Issues: 49

EETQ

Easy and Efficient Quantization for Transformers

Language: C++ | License: Apache-2.0 | Stargazers: 170 | Issues: 6 | Issues: 22

Language: Python | License: Apache-2.0 | Stargazers: 130 | Issues: 6 | Issues: 11

MetaTool

[ICLR 2024] MetaTool Benchmark for Large Language Models: Deciding Whether to Use Tools and Which to Use

Language: Python | License: MIT | Stargazers: 56 | Issues: 2 | Issues: 9

TrFr

Truth Forest: Toward Multi-Scale Truthfulness in Large Language Models through Intervention without Tuning

mixtral-vis-moe

Visualize expert firing frequencies across sentences in the Mixtral MoE model

Language: Python | License: Apache-2.0 | Stargazers: 17 | Issues: 1 | Issues: 0

Data_StrategyQA

Translation of the StrategyQA dataset

Stargazers: 12 | Issues: 0 | Issues: 0