Mr-Nineteen

Mr-Nineteen's starred repositories

RecSysPapers

A collection of industry classics and cutting-edge papers in the fields of recommendation, advertising, and search.

Language: Python | BSD-2-Clause | 126800

code-samples

Source code examples from the Parallel Forall Blog

Language: HTML | BSD-3-Clause | 123000

generative-recommenders

Repository hosting code used to reproduce results in "Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations" (https://arxiv.org/abs/2402.17152).

Language: Python | Apache-2.0 | 70500

Time-Series-Library

A Library for Advanced Deep Time Series Models.

Language: Python | MIT | 671000

iTransformer

Official implementation for "iTransformer: Inverted Transformers Are Effective for Time Series Forecasting" (ICLR 2024 Spotlight), https://openreview.net/forum?id=JePfAI8fah

Language: Python | MIT | 122200

parallel-hashmap

A family of header-only, very fast and memory-friendly hashmap and btree containers.

Language: C++ | Apache-2.0 | 251700

gpu-sum-reduction

CUDA implementation of the fundamental sum reduce operation. Aims to be as optimized as reasonable.

Language: Cuda | 3500

how-to-optim-algorithm-in-cuda

How to optimize some algorithms in CUDA.

Language: Cuda | 152100

sampleQAT

Inference of quantization aware trained networks using TensorRT

Language: Python | Apache-2.0 | 7700

CUDALibrarySamples

CUDA Library Samples

Language: Cuda | NOASSERTION | 157800

tensorrtx

Implementation of popular deep learning networks with TensorRT network definition API

Language: C++ | MIT | 693900

onnx-simplifier

Simplify your onnx model

Language: C++ | Apache-2.0 | 382600

CUDA-Programming-Guide-in-Chinese

This is a Chinese translation of the CUDA programming guide

122400

llama

Inference code for Llama models

Language: Python | NOASSERTION | 5609000

flash-attention

Fast and memory-efficient exact attention

Language: Python | BSD-3-Clause | 1382800

trt-samples-for-hackathon-cn

Simple samples for TensorRT programming

Language: Python | Apache-2.0 | 149500

alpaca-rlhf

Finetuning LLaMA with RLHF (Reinforcement Learning with Human Feedback) based on DeepSpeed Chat

Language: Python | MIT | 10600

optimate

A collection of libraries to optimise AI model performances

Language: Python | Apache-2.0 | 837900

TensorRT-Developer_Guide_in_Chinese

22100

Linly

Chinese-LLaMA 1&2 and Chinese-Falcon base models; ChatFlow Chinese dialogue model; Chinese OpenLLaMA model; NLP pre-training and instruction fine-tuning datasets.

Language: Python | 302600

Fengshenbang-LM

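Fengshenbang-LM (封神榜大模型) is an open-source large-model ecosystem led by the Cognitive Computing and Natural Language Research Center of the IDEA Research Institute, serving as infrastructure for Chinese AIGC and cognitive intelligence.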

Language: Python | Apache-2.0 | 401100

stable-diffusion-webui

Stable Diffusion web UI

Language: Python | AGPL-3.0 | 14149200

onnxruntime-training-examples

Examples for using ONNX Runtime for model training.

Language: C# | MIT | 30900

onnx-tensorrt

ONNX-TensorRT: TensorRT backend for ONNX

Language: C++ | Apache-2.0 | 293400

custom-op

Guide for building custom op for TensorFlow

Language: Smarty | Apache-2.0 | 37800

io

Dataset, streaming, and file system extensions maintained by TensorFlow SIG-IO

Language: C++ | Apache-2.0 | 70400

HierarchicalKV

HierarchicalKV is a part of NVIDIA Merlin and provides hierarchical key-value storage to meet RecSys requirements. Its key capability is storing key-value feature embeddings in the high-bandwidth memory (HBM) of GPUs and in host memory. It can also be used as generic key-value storage.

Language: Cuda | Apache-2.0 | 13000