Simiao Zhang's starred repositories
TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
Megatron-LM
Ongoing research training transformer models at scale
cs249r_book
Collaborative book Machine Learning Systems
LLMs_interview_notes
This repository mainly collects interview questions for Large Language Model (LLM) algorithm engineers.
flash-linear-attention
Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton
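Linear attention replaces the softmax over all key–query pairs with a feature map φ, so a causal pass can be computed as a running recurrence in O(N) time instead of O(N²). A minimal pure-Python sketch of that recurrence is below; the ReLU+1 feature map and the function name are illustrative assumptions, not this repo's API, and the real implementations run fused Triton kernels rather than Python loops.

```python
def causal_linear_attention(qs, ks, vs):
    """O(N) causal attention via the linear-attention recurrence.

    Keeps a running matrix S = sum_t phi(k_t) v_t^T and a running
    normalizer z = sum_t phi(k_t); each output is phi(q)^T S / (phi(q)^T z).
    """
    # Illustrative positive feature map (ReLU + 1); real repos vary.
    phi = lambda vec: [max(x, 0.0) + 1.0 for x in vec]
    k_dim, v_dim = len(qs[0]), len(vs[0])
    S = [[0.0] * v_dim for _ in range(k_dim)]  # running sum of phi(k) v^T
    z = [0.0] * k_dim                          # running sum of phi(k)
    out = []
    for q, k, v in zip(qs, ks, vs):
        fq, fk = phi(q), phi(k)
        for i in range(k_dim):                 # update the running state
            z[i] += fk[i]
            for j in range(v_dim):
                S[i][j] += fk[i] * v[j]
        denom = sum(fq[i] * z[i] for i in range(k_dim)) or 1.0
        out.append([sum(fq[i] * S[i][j] for i in range(k_dim)) / denom
                    for j in range(v_dim)])
    return out
```

With a single token the output reduces to exactly that token's value vector, since numerator and denominator share the same φ(q)·φ(k) factor.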
awesome-llm-powered-agent
Awesome things about LLM-powered agents. Papers / Repos / Blogs / ...
LLMAgentPapers
Must-read Papers on LLM Agents.
LLM-Agents-Papers
A repo listing papers related to LLM-based agents
llama3-from-scratch
A llama3 implementation, one matrix multiplication at a time
tortoise-tts
A multi-voice TTS system trained with an emphasis on quality
CrossViVit
This repository contains code for the paper "Improving day-ahead Solar Irradiance Time Series Forecasting by Leveraging Spatio-Temporal Context"
AgentBench
A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)
torch-discounted-cumsum
Fast Discounted Cumulative Sums in PyTorch
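The operation this repo accelerates is the discounted cumulative sum, the recurrence y[t] = x[t] + γ·y[t-1] familiar from reinforcement-learning returns. A minimal pure-Python sketch of the semantics (the repo itself ships parallel CUDA/PyTorch kernels, and this function name is an assumption, not its API):

```python
def discounted_cumsum(xs, gamma):
    """Sequential reference: y[t] = x[t] + gamma * y[t-1], with y[-1] = 0."""
    ys, acc = [], 0.0
    for x in xs:
        acc = x + gamma * acc
        ys.append(acc)
    return ys
```

With gamma = 0 this degenerates to the identity, and with gamma = 1 it is an ordinary cumulative sum.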
flash-attention
Fast and memory-efficient exact attention
vector-quantize-pytorch
Vector (and Scalar) Quantization, in PyTorch
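The core step of vector quantization is snapping an input vector to its nearest entry in a learned codebook. A minimal sketch of that lookup in pure Python, using squared Euclidean distance; the function name is hypothetical and the actual repo works on batched PyTorch tensors with a straight-through gradient estimator:

```python
def quantize(vec, codebook):
    """Return (index, entry) of the codebook vector nearest to vec (squared L2)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    idx = min(range(len(codebook)), key=lambda i: sq_dist(vec, codebook[i]))
    return idx, codebook[idx]
```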
reformer-pytorch
Reformer, the efficient Transformer, in PyTorch
transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
performer-pytorch
An implementation of Performer, a linear-attention-based Transformer, in PyTorch
long-range-arena
Long Range Arena for Benchmarking Efficient Transformers
google-research
Google Research