Xumi (XuMicoder)



0 followers · 0 following


Xumi's repositories

AITemplate

AITemplate is a Python framework that renders neural networks into high-performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0

CuAssembler

An unofficial CUDA assembler, for all generations of SASS, hopefully :)

Language: Python · License: MIT · Stargazers: 0 · Issues: 0

DIS

This is the repo for our new project, Highly Accurate Dichotomous Image Segmentation.

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0

Eureka

Official Repository for "Eureka: Human-Level Reward Design via Coding Large Language Models"

Language: Jupyter Notebook · License: MIT · Stargazers: 0 · Issues: 0

FasterTransformer

Transformer related optimization, including BERT, GPT

Language: C++ · License: Apache-2.0 · Stargazers: 0 · Issues: 0

FGVC-PIM

PyTorch implementation of "A Novel Plug-in Module for Fine-Grained Visual Classification", for the fine-grained visual classification task.

Language: Python · License: MIT · Stargazers: 0 · Issues: 0

gpt4all

gpt4all: an ecosystem of open-source chatbots trained on a massive collection of clean assistant data, including code, stories, and dialogue.

Language: C++ · License: MIT · Stargazers: 0 · Issues: 0

how-to-optimize-gemm

Row-major SGEMM optimization

Language: C++ · License: MIT · Stargazers: 0 · Issues: 0

MNN

MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba

Language: C++ · Stargazers: 0 · Issues: 0

ncnn

ncnn is a high-performance neural network inference framework optimized for the mobile platform

License: NOASSERTION · Stargazers: 0 · Issues: 0

stanford_alpaca

Code and documentation to train Stanford's Alpaca models, and generate the data.

License: Apache-2.0 · Stargazers: 0 · Issues: 0

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

License: Apache-2.0 · Stargazers: 0 · Issues: 0

tensorrtx

Implementation of popular deep learning networks with TensorRT network definition API

License: MIT · Stargazers: 0 · Issues: 0

yolov7

Implementation of the paper "YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors"

License: GPL-3.0 · Stargazers: 0 · Issues: 0