z562's starred repositories

DeepSeek-Coder-V2

DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence

License: MIT · Stars: 1746 · Issues: 0

LLMTest_NeedleInAHaystack

Simple retrieval tests against LLMs at various context lengths to measure accuracy

Language: Jupyter Notebook · License: NOASSERTION · Stars: 1400 · Issues: 0

ChunkLlama

[ICML'24] Data and code for our paper "Training-Free Long-Context Scaling of Large Language Models"

Language: Python · License: Apache-2.0 · Stars: 323 · Issues: 0

Awesome-Multimodal-Large-Language-Models

✨✨ Latest Advances on Multimodal Large Language Models

Stars: 11285 · Issues: 0

flash-attention

Fast and memory-efficient exact attention

Language: Python · License: BSD-3-Clause · Stars: 13055 · Issues: 0

pandas-ai

Chat with your database (SQL, CSV, pandas, Polars, MongoDB, NoSQL, etc.). PandasAI makes data analysis conversational using LLMs (GPT-3.5 / 4, Anthropic, Vertex AI) and RAG.

Language: Python · License: NOASSERTION · Stars: 12396 · Issues: 0

RedPajama-Data

The RedPajama-Data repository contains code for preparing large datasets for training large language models.

Language: Python · License: Apache-2.0 · Stars: 4491 · Issues: 0

open_llama

OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset

License: Apache-2.0 · Stars: 7329 · Issues: 0

LEval

[ACL'24 Outstanding] Data and code for L-Eval, a comprehensive long context language models evaluation benchmark

Language: Python · License: GPL-3.0 · Stars: 332 · Issues: 0

llm-foundry

LLM training code for Databricks foundation models

Language: Python · License: Apache-2.0 · Stars: 3927 · Issues: 0

chain-of-thought-hub

Benchmarking large language models' complex reasoning ability with chain-of-thought prompting

Language: Jupyter Notebook · License: MIT · Stars: 2482 · Issues: 0

FewCLUE

FewCLUE: a few-shot learning evaluation benchmark for Chinese

Language: Python · Stars: 489 · Issues: 0

AutoGPT

AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools so that you can focus on what matters.

Language: Python · License: MIT · Stars: 165960 · Issues: 0

LLaMA-Adapter

[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters

Language: Python · License: GPL-3.0 · Stars: 5669 · Issues: 0

Alpaca-CoT

We unified the interfaces of instruction-tuning data (e.g., CoT data), multiple LLMs, and parameter-efficient methods (e.g., LoRA, P-Tuning) for easy use. We welcome open-source enthusiasts to open any meaningful PR on this repo and help integrate as many LLM-related techniques as possible. (We have built a fine-tuning platform that makes it easy for researchers to get started with and use large models; we welcome open-source enthusiasts to submit any meaningful PR!)

Language: Jupyter Notebook · License: Apache-2.0 · Stars: 2545 · Issues: 0

triton

Development repository for the Triton language and compiler

Language: C++ · License: MIT · Stars: 12302 · Issues: 0

stanford_alpaca

Code and documentation to train Stanford's Alpaca models and generate the data.

Language: Python · License: Apache-2.0 · Stars: 29284 · Issues: 0

FLASHQuad_pytorch

FLASHQuad_pytorch

Language: Python · License: MIT · Stars: 65 · Issues: 0

Tnn

[ICLR 2023] Official implementation of Transnormer in our ICLR 2023 paper - Toeplitz Neural Network for Sequence Modeling

Language: Python · Stars: 70 · Issues: 0

RWKV-LM

RWKV is an RNN with transformer-level LLM performance. It can be trained directly like a GPT (parallelizable), combining the best of RNNs and transformers: great performance, fast inference, low VRAM use, fast training, "infinite" ctx_len, and free sentence embeddings.

Language: Python · License: Apache-2.0 · Stars: 12131 · Issues: 0

knn-box

An easy-to-use kNN-MT toolkit

Language: Python · License: MIT · Stars: 101 · Issues: 0

Image-Super-Resolution-via-Iterative-Refinement

Unofficial PyTorch implementation of Image Super-Resolution via Iterative Refinement

Language: Python · License: Apache-2.0 · Stars: 3501 · Issues: 0

stable-diffusion

A latent text-to-image diffusion model

Language: Jupyter Notebook · License: NOASSERTION · Stars: 67197 · Issues: 0

transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Language: Python · License: Apache-2.0 · Stars: 130945 · Issues: 0

DiffuSeq

[ICLR'23] DiffuSeq: Sequence to Sequence Text Generation with Diffusion Models

Language: Python · License: MIT · Stars: 705 · Issues: 0

CoNT

[NeurIPS'22 Spotlight] Data and code for our paper CoNT: Contrastive Neural Text Generation

Language: Python · Stars: 150 · Issues: 0

ParaSCI

A large scientific paraphrase dataset for longer paraphrase generation

Stars: 37 · Issues: 0

cosFormer

[ICLR 2022] Official implementation of cosformer-attention in cosFormer: Rethinking Softmax in Attention

Language: Python · License: Apache-2.0 · Stars: 176 · Issues: 0