Abhilash Majumder (abhilash1910)


Company: Compilers & Frameworks @NVIDIA | ex-@intel | @MorganStanley | @HSBC

Location: India

Home Page: https://linktr.ee/abhilashmajumder

Twitter: @abhilash1396


Abhilash Majumder's repositories

llm.tpc

Habana Gaudi2 TPC port of llm.c

Language: C++ | Stargazers: 5 | Issues: 1 | Issues: 0

llama.cpp

Port of Facebook's LLaMA model in C/C++

Language: C++ | License: MIT | Stargazers: 3 | Issues: 1 | Issues: 0

llm.sycl

LLM training in SYCL

Language: C++ | License: MIT | Stargazers: 3 | Issues: 0 | Issues: 0

accelerate

🚀 A simple way to train and use PyTorch models with multi-GPU, TPU, and mixed precision

Language: Python | License: Apache-2.0 | Stargazers: 1 | Issues: 1 | Issues: 0

transformers

🤗 Transformers: State-of-the-art Natural Language Processing for PyTorch and TensorFlow 2.0.

Language: Python | License: Apache-2.0 | Stargazers: 1 | Issues: 1 | Issues: 0

ao

PyTorch native quantization and sparsity for training and inference

Language: Python | License: BSD-3-Clause | Stargazers: 0 | Issues: 0 | Issues: 0

AutoGPTQ

An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

BitNet

Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in PyTorch

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0
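The 1-bit scheme named in the repository's title can be sketched in plain Python: per the BitNet paper, weights are binarized to {-1, +1} and rescaled by their mean absolute value, so W ≈ α·sign(W). A minimal illustration under that assumption; the function names are mine, not taken from this repository:

```python
# Hedged sketch of BitNet-style 1-bit weight quantization:
# store only the signs plus one scale factor per weight group.

def binarize(weights):
    """Quantize a list of float weights to +/-1 with a single scale alpha."""
    alpha = sum(abs(w) for w in weights) / len(weights)  # mean |W|
    signs = [1.0 if w >= 0 else -1.0 for w in weights]   # sign(W)
    return signs, alpha

def dequantize(signs, alpha):
    """Reconstruct the approximation W~ = alpha * sign(W)."""
    return [s * alpha for s in signs]

w = [0.4, -0.2, 0.1, -0.5]
signs, alpha = binarize(w)   # signs = [1, -1, 1, -1], alpha ~ 0.3
w_hat = dequantize(signs, alpha)
```

The payoff is storage: each weight shrinks from 16 or 32 bits to a single sign bit plus one shared scale per group.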

bitsandbytes

8-bit CUDA functions for PyTorch

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0
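The core idea behind 8-bit tensor storage of the kind bitsandbytes provides is absmax quantization: divide by the tensor's absolute maximum so every value fits int8's [-127, 127] range. A minimal pure-Python sketch (the real library implements this as CUDA kernels over tensors; these names are illustrative):

```python
# Hedged sketch of absmax 8-bit quantization on a flat list of floats.

def quantize_absmax(xs):
    """Map floats to the int8 range [-127, 127] using the absolute maximum."""
    scale = max(abs(x) for x in xs) / 127.0  # one scale for the whole block
    q = [round(x / scale) for x in xs]       # int8 codes
    return q, scale

def dequantize_absmax(q, scale):
    """Recover approximate floats from int8 codes and the stored scale."""
    return [v * scale for v in q]

q, scale = quantize_absmax([0.0, -1.27, 0.635])  # scale ~ 0.01
approx = dequantize_absmax(q, scale)             # close to the inputs
```

The largest-magnitude value maps exactly to ±127; everything else is rounded, which is why production libraries quantize in small blocks to keep the per-block error low.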

bitsandbytes-SYCL

Hosts SYCL kernels for bitsandbytes, for experimental purposes.

Language: C++ | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

cuda-samples

Samples for CUDA developers demonstrating features in the CUDA Toolkit

License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

cutlass-fork

CUDA Templates for Linear Algebra Subroutines

License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

flash-attention

Fast and memory-efficient exact attention

Language: Python | License: BSD-3-Clause | Stargazers: 0 | Issues: 0 | Issues: 0

flashinfer

FlashInfer: Kernel Library for LLM Serving

Language: Cuda | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

Liger-Kernel

Efficient Triton Kernels for LLM Training

Language: Python | License: BSD-2-Clause | Stargazers: 0 | Issues: 0 | Issues: 0

llvm

Intel staging area for llvm.org contribution. Home for Intel LLVM-based projects.

License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

marlin

FP16xINT4 LLM inference kernel that achieves near-ideal ~4x speedups at small-to-medium batch sizes of 16-32 tokens.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0
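An FP16xINT4 kernel keeps weights in 4-bit form, two values per byte, and dequantizes to FP16 on the fly, which is where the ~4x bandwidth saving over FP16 comes from. A minimal sketch of the nibble packing in plain Python, not marlin's actual tiled memory layout:

```python
# Hedged sketch of INT4 packing: two unsigned 4-bit codes per byte,
# low nibble first. Real kernels use interleaved layouts tuned for
# GPU memory access; this only shows the 2-per-byte principle.

def pack_int4(values):
    """Pack unsigned 4-bit ints (0..15), two per byte."""
    assert all(0 <= v <= 15 for v in values) and len(values) % 2 == 0
    return bytes(values[i] | (values[i + 1] << 4)
                 for i in range(0, len(values), 2))

def unpack_int4(packed):
    """Split each byte back into its low and high nibbles."""
    out = []
    for b in packed:
        out.append(b & 0x0F)         # low nibble
        out.append((b >> 4) & 0x0F)  # high nibble
    return out

vals = [3, 12, 0, 15]
packed = pack_int4(vals)             # 2 bytes instead of 4
assert unpack_int4(packed) == vals   # lossless round-trip
```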

ollama

Get up and running with Llama 3.3, Mistral, Gemma 2, and other large language models.

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

optimum-habana

Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU)

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

PaddleCustomDevice

PaddlePaddle custom device implementation (custom hardware integration for the PaddlePaddle framework).

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

ScaleLLM

A high-performance inference system for large language models, designed for production environments.

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

SqueezeLLM

[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

sycl-for-cuda

Codeplay project for contributions to the LLVM SYCL implementation

Language: C++ | Stargazers: 0 | Issues: 0 | Issues: 0

tgi-gaudi

Large Language Model Text Generation Inference on Habana Gaudi

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

torch-mlir

The Torch-MLIR project aims to provide first-class support from the PyTorch ecosystem to the MLIR ecosystem.

Language: C++ | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0