yueyericardo

Your AI second brain. Get answers to your questions, whether from the web or from your own notes. Use online AI models (e.g. gpt4) or private, local LLMs (e.g. llama3). Self-host locally or use our cloud instance. Access from Obsidian, Emacs, the desktop app, the web, or WhatsApp.

Language: Python | AGPL-3.0 | 11888 70 405

llama3-from-scratch

llama3 implementation one matrix multiplication at a time

Language: Jupyter Notebook | MIT | 10797 74 12

phidata

Build AI Assistants with memory, knowledge and tools.

Language: Python | MPL-2.0 | 10388 81 130

WizardLM

LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath

Language: Python | 9041 111 189

minbpe

Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.

Language: Python | MIT | 8655 81 34

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language: C++ | Apache-2.0 | 7345 84 1524

LWM

Language: Python | Apache-2.0 | 6987 65 66

mlx-examples

Examples in the MLX framework

Language: Python | MIT | 5487 57 392

graphcast

Language: Python | Apache-2.0 | 4358 73 73

llm-viz

3D Visualization of a GPT-style LLM

Language: TypeScript | 3180 27 14

ompi

Open MPI main development repository

Language: C | NOASSERTION | 2065 118 3565

llama2.mojo

Inference Llama 2 in one file of pure 🔥

Language: Mojo | MIT | 2051 27 44

penzai

A JAX research toolkit for building, editing, and visualizing neural networks.

Language: Python | Apache-2.0 | 1530 17 12

cali.so

Open-source project for Cali's personal website

Language: TypeScript | 1473 9 31

materials_discovery

Language: Python | Apache-2.0 | 835 49 20

benchmark

TorchBench is a collection of open source benchmarks used to evaluate PyTorch performance.

Language: Python | BSD-3-Clause | 815 227 860

chemcrow-public

ChemCrow

Language: Python | MIT | 502 15 16

cuda-training-series

Training materials associated with NVIDIA's CUDA Training Series (www.olcf.ornl.gov/cuda-training-series/)

Language: Cuda | 416 180

AgentKit

An intuitive LLM prompting framework for multifunctional agents, by explicitly constructing a complex "thought process" from simple natural language prompts.

Language: Python | CC-BY-4.0 | 268 8 9

westpa

WESTPA: The Weighted Ensemble Simulation Toolkit with Parallelization and Analysis

Language: Python | MIT | 183 24 68

flash-llm

Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity

Language: Cuda | Apache-2.0 | 155 5 4

fp6_llm

Efficient GPU support for LLM inference with x-bit quantization (e.g. FP6, FP5).

Language: Cuda | Apache-2.0 | 149 4 7

mm

Language: HTML | MIT | 107 1 2

ohara

Collection of autoregressive model implementations

Language: Python | 6100