Adaxry

Company: WeChat AI, Tencent Inc., China

Location: Beijing

Adaxry's starred repositories

ollama

Get up and running with Llama 3.1, Mistral, Gemma 2, and other large language models.
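
For a sense of the workflow, here is a minimal sketch of calling a locally running ollama server from Python; the port, endpoint, and model name follow the project's documented defaults, but treat them as assumptions about your local setup:

# Minimal sketch: querying a local ollama server over its REST API.
# Assumes `ollama serve` is running and the model has been pulled;
# port 11434 and /api/generate are the documented defaults.
import json
import urllib.request

def generate(prompt: str, model: str = "llama3.1") -> str:
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

print(generate("Why is the sky blue?"))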

FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Language: Python | License: Apache-2.0 | Stargazers: 36177 | Issues: 348 | Issues: 1748
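
A hedged sketch of talking to FastChat's OpenAI-compatible API server; the launch commands in the comments and the localhost:8000 address follow the project's README and are assumptions about how the server was started:

# Minimal sketch: calling a FastChat OpenAI-compatible API server.
# Assumes the server stack from the FastChat README is running, e.g.:
#   python3 -m fastchat.serve.controller
#   python3 -m fastchat.serve.model_worker --model-path lmsys/vicuna-7b-v1.5
#   python3 -m fastchat.serve.openai_api_server --host localhost --port 8000
import json
import urllib.request

payload = json.dumps({
    "model": "vicuna-7b-v1.5",
    "messages": [{"role": "user", "content": "Hello! Who are you?"}],
}).encode()
req = urllib.request.Request(
    "http://localhost:8000/v1/chat/completions",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    reply = json.loads(resp.read())
print(reply["choices"][0]["message"]["content"])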

DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Language: Python | License: Apache-2.0 | Stargazers: 34357 | Issues: 340 | Issues: 2908
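
The library's entry point is deepspeed.initialize, which wraps a plain PyTorch model in a training engine. A minimal sketch with a toy model; the config values (ZeRO stage, batch size, learning rate) are placeholders, not recommendations:

# Minimal sketch: wrapping a plain PyTorch model with DeepSpeed.
# Typically launched via the `deepspeed` CLI so distributed state
# is set up for you.
import torch
import deepspeed

model = torch.nn.Linear(512, 512)  # stand-in for a real network

ds_config = {
    "train_micro_batch_size_per_gpu": 8,
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 2},  # shard optimizer state + gradients
}

engine, optimizer, _, _ = deepspeed.initialize(
    model=model, model_parameters=model.parameters(), config=ds_config
)

x = torch.randn(8, 512).to(engine.device).half()  # fp16 input to match config
loss = engine(x).float().pow(2).mean()
engine.backward(loss)  # DeepSpeed handles loss scaling / ZeRO partitioning
engine.step()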

immersive-translate

Immersive bilingual web page translation extension, supporting input-box translation, mouse-hover translation, and translation of PDF, EPUB, subtitle, and TXT files - Immersive Dual Web Page Translation Extension

mamba

Mamba SSM architecture

Language: Python | License: Apache-2.0 | Stargazers: 12179 | Issues: 99 | Issues: 471

sentencepiece

Unsupervised text tokenizer for Neural Network-based text generation.

Language: C++ | License: Apache-2.0 | Stargazers: 9958 | Issues: 124 | Issues: 735
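
A minimal sketch of the Python bindings: train a small subword model on a text file and round-trip a sentence. The corpus path, model prefix, and vocabulary size are placeholders:

# Minimal sketch of the sentencepiece Python bindings.
import sentencepiece as spm

# Train a subword model from raw text (one sentence per line).
spm.SentencePieceTrainer.train(
    input="corpus.txt", model_prefix="toy", vocab_size=800, model_type="unigram"
)

sp = spm.SentencePieceProcessor(model_file="toy.model")
pieces = sp.encode("This is a test.", out_type=str)  # subword strings
ids = sp.encode("This is a test.", out_type=int)     # integer ids
print(pieces)
print(sp.decode(ids))  # lossless round-trip back to the original text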

Megatron-LM

Ongoing research training transformer models at scale

Language: Python | License: NOASSERTION | Stargazers: 9699 | Issues: 162 | Issues: 656

WizardLM

LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language: C++ | License: Apache-2.0 | Stargazers: 7935 | Issues: 85 | Issues: 1715
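
As a hedged sketch of the high-level Python API described above (assuming a recent release that ships the tensorrt_llm.LLM entry point and a supported NVIDIA GPU; the model name and sampling values are placeholders):

# Sketch of TensorRT-LLM's high-level LLM API; the lower-level builder
# workflow compiles engines explicitly instead.
from tensorrt_llm import LLM, SamplingParams

llm = LLM(model="meta-llama/Meta-Llama-3.1-8B-Instruct")  # builds/loads an engine
params = SamplingParams(temperature=0.8, max_tokens=64)

for out in llm.generate(["Summarize speculative decoding in one sentence."], params):
    print(out.outputs[0].text)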

TinyLlama

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

Language: Python | License: Apache-2.0 | Stargazers: 7508 | Issues: 109 | Issues: 152

Qwen2

Qwen2 is the large language model series developed by the Qwen team at Alibaba Cloud.

streaming-llm

[ICLR 2024] Efficient Streaming Language Models with Attention Sinks

Language: Python | License: MIT | Stargazers: 6429 | Issues: 62 | Issues: 79
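
The core idea of the paper is a KV-cache eviction policy: keep a few initial "attention sink" tokens plus a sliding window of the most recent tokens. An illustrative sketch of that policy (not the repo's API; the budget numbers are arbitrary):

# Illustrative sketch of attention-sink KV-cache eviction: when the cache
# exceeds its budget, keep the first n_sink entries plus the last `window`
# entries and evict everything in between.
def evict_kv_cache(cache: list, n_sink: int = 4, window: int = 1020) -> list:
    """cache is an ordered list of per-token KV entries, oldest first."""
    budget = n_sink + window
    if len(cache) <= budget:
        return cache
    # Sink tokens anchor attention; recent tokens carry the local context.
    return cache[:n_sink] + cache[-window:]

cache = list(range(2000))         # stand-in for 2000 cached tokens
cache = evict_kv_cache(cache)
assert cache[:4] == [0, 1, 2, 3]  # sink tokens retained
assert len(cache) == 1024         # fixed cache budget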

MiniCPM

MiniCPM-2B: An end-side LLM outperforming Llama2-13B.

Language: Python | License: Apache-2.0 | Stargazers: 4631 | Issues: 52 | Issues: 148

RedPajama-Data

The RedPajama-Data repository contains code for preparing large datasets for training large language models.

Language: Python | License: Apache-2.0 | Stargazers: 4489 | Issues: 76 | Issues: 88

List-of-Dirty-Naughty-Obscene-and-Otherwise-Bad-Words

List of Dirty, Naughty, Obscene, and Otherwise Bad Words

LLM4Decompile

Reverse Engineering: Decompiling Binary Code with Large Language Models

Language: Python | License: MIT | Stargazers: 2858 | Issues: 31 | Issues: 20

Megatron-DeepSpeed

Ongoing research training transformer language models at scale, including: BERT & GPT-2

Language: Python | License: NOASSERTION | Stargazers: 1793 | Issues: 24 | Issues: 173

Skywork

Skywork series models are pre-trained on 3.2TB of high-quality multilingual (mainly Chinese and English) and code data. We have open-sourced the model, training data, evaluation data, evaluation methods, etc.

Language: Python | License: NOASSERTION | Stargazers: 1197 | Issues: 23 | Issues: 63

InfiniteBench

Code for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K Tokens": https://arxiv.org/abs/2402.13718

Language: Python | License: MIT | Stargazers: 228 | Issues: 9 | Issues: 18

Spec-Bench

Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings)

Language: Python | License: Apache-2.0 | Stargazers: 136 | Issues: 1 | Issues: 11
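
For context, an illustrative sketch of the greedy speculative-decoding loop such benchmarks measure, independent of Spec-Bench's own harness: a cheap draft model proposes k tokens, the target model verifies them, and the longest agreeing prefix plus one corrected token is kept, so the output matches the target model exactly. The toy deterministic "models" below are stand-ins for real LMs:

def draft_model(ctx):   # fast but imperfect next-token guesser
    return (ctx[-1] * 3 + 1) % 50

def target_model(ctx):  # the model whose output we must reproduce
    return (ctx[-1] * 3 + 1) % 53

def speculative_decode(ctx, n_tokens, k=4):
    out = list(ctx)
    while len(out) < len(ctx) + n_tokens:
        draft = []
        for _ in range(k):                   # 1) draft k tokens cheaply
            draft.append(draft_model(out + draft))
        accepted = []
        for i, tok in enumerate(draft):      # 2) verify against the target
            if target_model(out + draft[:i]) == tok:
                accepted.append(tok)
            else:                            # 3) first mismatch: take the target's token
                accepted.append(target_model(out + draft[:i]))
                break
        out.extend(accepted)
    return out[: len(ctx) + n_tokens]

baseline = [7]
for _ in range(16):
    baseline.append(target_model(baseline))
assert speculative_decode([7], 16) == baseline  # lossless: outputs match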

optimum-habana

Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU)

Language: Python | License: Apache-2.0 | Stargazers: 136 | Issues: 43 | Issues: 96

EasyTranslator

Your Companion for Multilingual Reading

Language: Python | License: GPL-3.0 | Stargazers: 124 | Issues: 1 | Issues: 1

self-speculative-decoding

Code associated with the paper "Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding"

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 122 | Issues: 3 | Issues: 17

fast_robust_early_exit

Fast and Robust Early-Exiting Framework for Autoregressive Language Models with Synchronized Parallel Decoding (EMNLP 2023 Long)

Mango

Code for "Comments as Natural Logic Pivots: Improve Code Generation via Comment Perspective"

Language: Python | Stargazers: 2 | Issues: 1 | Issues: 0