Adaxry

Company: WeChat AI, Tencent Inc., China

Location: Beijing

Adaxry's starred repositories

ollama

Get up and running with Llama 3.1, Mistral, Gemma 2, and other large language models.
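
For a sense of the workflow, here is a minimal sketch of calling a locally running ollama server from Python; the port, endpoint, and model name follow the project's documented defaults, but treat them as assumptions about your local setup:

# Minimal sketch: querying a local ollama server over its REST API.
# Assumes `ollama serve` is running and the model has been pulled;
# port 11434 and /api/generate are the documented defaults.
import json
import urllib.request

def generate(prompt: str, model: str = "llama3.1") -> str:
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

print(generate("Why is the sky blue?"))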

FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Language: Python | License: Apache-2.0 | Stargazers: 36177 | Issues: 348 | Issues: 1748
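
A hedged sketch of talking to FastChat's OpenAI-compatible API server; the launch commands in the comments and the localhost:8000 address follow the project's README and are assumptions about how the server was started:

# Minimal sketch: calling a FastChat OpenAI-compatible API server.
# Assumes the server stack from the FastChat README is running, e.g.:
#   python3 -m fastchat.serve.controller
#   python3 -m fastchat.serve.model_worker --model-path lmsys/vicuna-7b-v1.5
#   python3 -m fastchat.serve.openai_api_server --host localhost --port 8000
import json
import urllib.request

payload = json.dumps({
    "model": "vicuna-7b-v1.5",
    "messages": [{"role": "user", "content": "Hello! Who are you?"}],
}).encode()
req = urllib.request.Request(
    "http://localhost:8000/v1/chat/completions",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    reply = json.loads(resp.read())
print(reply["choices"][0]["message"]["content"])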

DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Language: Python | License: Apache-2.0 | Stargazers: 34357 | Issues: 340 | Issues: 2908
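
The library's entry point is deepspeed.initialize, which wraps a plain PyTorch model in a training engine. A minimal sketch with a toy model; the config values (ZeRO stage, batch size, learning rate) are placeholders, not recommendations:

# Minimal sketch: wrapping a plain PyTorch model with DeepSpeed.
# Typically launched via the `deepspeed` CLI so distributed state
# is set up for you.
import torch
import deepspeed

model = torch.nn.Linear(512, 512)  # stand-in for a real network

ds_config = {
    "train_micro_batch_size_per_gpu": 8,
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 2},  # shard optimizer state + gradients
}

engine, optimizer, _, _ = deepspeed.initialize(
    model=model, model_parameters=model.parameters(), config=ds_config
)

x = torch.randn(8, 512).to(engine.device).half()  # fp16 input to match config
loss = engine(x).float().pow(2).mean()
engine.backward(loss)  # DeepSpeed handles loss scaling / ZeRO partitioning
engine.step()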

immersive-translate

Immersive bilingual web page translation extension, supporting input-box translation, mouse-hover translation, and translation of PDF, EPUB, subtitle, and TXT files - Immersive Dual Web Page Translation Extension

mamba

Mamba SSM architecture

Language: Python | License: Apache-2.0 | Stargazers: 12179 | Issues: 99 | Issues: 471

sentencepiece

Unsupervised text tokenizer for Neural Network-based text generation.

Language: C++ | License: Apache-2.0 | Stargazers: 9958 | Issues: 124 | Issues: 735
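
A minimal sketch of the Python bindings: train a small subword model on a text file and round-trip a sentence. The corpus path, model prefix, and vocabulary size are placeholders:

# Minimal sketch of the sentencepiece Python bindings.
import sentencepiece as spm

# Train a subword model from raw text (one sentence per line).
spm.SentencePieceTrainer.train(
    input="corpus.txt", model_prefix="toy", vocab_size=800, model_type="unigram"
)

sp = spm.SentencePieceProcessor(model_file="toy.model")
pieces = sp.encode("This is a test.", out_type=str)  # subword strings
ids = sp.encode("This is a test.", out_type=int)     # integer ids
print(pieces)
print(sp.decode(ids))  # lossless round-trip back to the original text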

Megatron-LM

Ongoing research training transformer models at scale

Language: Python | License: NOASSERTION | Stargazers: 9699 | Issues: 162 | Issues: 656

WizardLM

LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language: C++ | License: Apache-2.0 | Stargazers: 7935 | Issues: 85 | Issues: 1715
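
As a hedged sketch of the high-level Python API described above (assuming a recent release that ships the tensorrt_llm.LLM entry point and a supported NVIDIA GPU; the model name and sampling values are placeholders):

# Sketch of TensorRT-LLM's high-level LLM API; the lower-level builder
# workflow compiles engines explicitly instead.
from tensorrt_llm import LLM, SamplingParams

llm = LLM(model="meta-llama/Meta-Llama-3.1-8B-Instruct")  # builds/loads an engine
params = SamplingParams(temperature=0.8, max_tokens=64)

for out in llm.generate(["Summarize speculative decoding in one sentence."], params):
    print(out.outputs[0].text)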

TinyLlama

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

Language: Python | License: Apache-2.0 | Stargazers: 7508 | Issues: 109 | Issues: 152

Qwen2

Qwen2 is the large language model series developed by the Qwen team at Alibaba Cloud.

streaming-llm

[ICLR 2024] Efficient Streaming Language Models with Attention Sinks

Language: Python | License: MIT | Stargazers: 6429 | Issues: 62 | Issues: 79
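
The core idea of the paper is a KV-cache eviction policy: keep a few initial "attention sink" tokens plus a sliding window of the most recent tokens. An illustrative sketch of that policy (not the repo's API; the budget numbers are arbitrary):

# Illustrative sketch of attention-sink KV-cache eviction: when the cache
# exceeds its budget, keep the first n_sink entries plus the last `window`
# entries and evict everything in between.
def evict_kv_cache(cache: list, n_sink: int = 4, window: int = 1020) -> list:
    """cache is an ordered list of per-token KV entries, oldest first."""
    budget = n_sink + window
    if len(cache) <= budget:
        return cache
    # Sink tokens anchor attention; recent tokens carry the local context.
    return cache[:n_sink] + cache[-window:]

cache = list(range(2000))         # stand-in for 2000 cached tokens
cache = evict_kv_cache(cache)
assert cache[:4] == [0, 1, 2, 3]  # sink tokens retained
assert len(cache) == 1024         # fixed cache budget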

MiniCPM

MiniCPM-2B: An end-side LLM outperforming Llama2-13B.

Language: Python | License: Apache-2.0 | Stargazers: 4631 | Issues: 52 | Issues: 148

RedPajama-Data

The RedPajama-Data repository contains code for preparing large datasets for training large language models.

Language: Python | License: Apache-2.0 | Stargazers: 4489 | Issues: 76 | Issues: 88

List-of-Dirty-Naughty-Obscene-and-Otherwise-Bad-Words

List of Dirty, Naughty, Obscene, and Otherwise Bad Words

LLM4Decompile

Reverse Engineering: Decompiling Binary Code with Large Language Models

Language: Python | License: MIT | Stargazers: 2858 | Issues: 31 | Issues: 20

Megatron-DeepSpeed

Ongoing research training transformer language models at scale, including: BERT & GPT-2

Language: Python | License: NOASSERTION | Stargazers: 1793 | Issues: 24 | Issues: 173

Skywork

Skywork series models are pre-trained on 3.2TB of high-quality multilingual (mainly Chinese and English) and code data. We have open-sourced the model, training data, evaluation data, evaluation methods, etc.

Language: Python | License: NOASSERTION | Stargazers: 1197 | Issues: 23 | Issues: 63

InfiniteBench

Code for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K Tokens": https://arxiv.org/abs/2402.13718

Language: Python | License: MIT | Stargazers: 228 | Issues: 9 | Issues: 18

Spec-Bench

Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings)

Language: Python | License: Apache-2.0 | Stargazers: 136 | Issues: 1 | Issues: 11
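
For context, an illustrative sketch of the greedy speculative-decoding loop such benchmarks measure, independent of Spec-Bench's own harness: a cheap draft model proposes k tokens, the target model verifies them, and the longest agreeing prefix plus one corrected token is kept, so the output matches the target model exactly. The toy deterministic "models" below are stand-ins for real LMs:

def draft_model(ctx):   # fast but imperfect next-token guesser
    return (ctx[-1] * 3 + 1) % 50

def target_model(ctx):  # the model whose output we must reproduce
    return (ctx[-1] * 3 + 1) % 53

def speculative_decode(ctx, n_tokens, k=4):
    out = list(ctx)
    while len(out) < len(ctx) + n_tokens:
        draft = []
        for _ in range(k):                   # 1) draft k tokens cheaply
            draft.append(draft_model(out + draft))
        accepted = []
        for i, tok in enumerate(draft):      # 2) verify against the target
            if target_model(out + draft[:i]) == tok:
                accepted.append(tok)
            else:                            # 3) first mismatch: take the target's token
                accepted.append(target_model(out + draft[:i]))
                break
        out.extend(accepted)
    return out[: len(ctx) + n_tokens]

baseline = [7]
for _ in range(16):
    baseline.append(target_model(baseline))
assert speculative_decode([7], 16) == baseline  # lossless: outputs match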

optimum-habana

Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU)

Language: Python | License: Apache-2.0 | Stargazers: 136 | Issues: 43 | Issues: 96

EasyTranslator

Your Companion for Multilingual Reading

Language: Python | License: GPL-3.0 | Stargazers: 124 | Issues: 1 | Issues: 1

self-speculative-decoding

Code associated with the paper "Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding"

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 122 | Issues: 3 | Issues: 17

fast_robust_early_exit

Fast and Robust Early-Exiting Framework for Autoregressive Language Models with Synchronized Parallel Decoding (EMNLP 2023 Long)

Mango

Code for "Comments as Natural Logic Pivots: Improve Code Generation via Comment Perspective"

Language: Python | Stargazers: 2 | Issues: 1 | Issues: 0