Xuweijia-buaa's repositories

alphaFM

Multi-threaded implementation of Factorization Machines with FTRL for binary classification problems.

Language: C++ · License: MIT · Stargazers: 0 · Issues: 0 · Issues: 0

cpufp

A CPU tool for benchmarking peak floating-point performance

Language: C++ · License: GPL-3.0 · Stargazers: 0 · Issues: 1 · Issues: 0
Language: C++ · Stargazers: 0 · Issues: 0 · Issues: 0

cutlass

CUDA Templates for Linear Algebra Subroutines

Language: C++ · License: NOASSERTION · Stargazers: 0 · Issues: 0 · Issues: 0

deep-learning-framework-needle

A Torch-like framework that can train CNN, LSTM, and other networks.

Language: Jupyter Notebook · Stargazers: 0 · Issues: 0 · Issues: 0

DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0 · Issues: 0

dgl

Python package built to ease deep learning on graphs, on top of existing DL frameworks.

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0 · Issues: 0
Language: Python · Stargazers: 0 · Issues: 0 · Issues: 0
Stargazers: 0 · Issues: 0 · Issues: 0

flash-attention

Fast and memory-efficient exact attention

Language: Python · License: BSD-3-Clause · Stargazers: 0 · Issues: 0 · Issues: 0

graph-rec

Senior Capstone Project: Graph-Based Product Recommendation

Language: Jupyter Notebook · Stargazers: 0 · Issues: 1 · Issues: 0

How_to_optimize_in_GPU-qiqizi

This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several basic kernel optimizations, including: elementwise, reduce, sgemv, sgemm, etc. The performance of these kernels is basically at or near the theoretical limit.

License: Apache-2.0 · Stargazers: 0 · Issues: 0 · Issues: 0

KB2E

Knowledge Graph Embeddings including TransE, TransH, TransR and PTransE

Language: C++ · License: MIT · Stargazers: 0 · Issues: 0 · Issues: 0
Stargazers: 0 · Issues: 0 · Issues: 0

LightGBM

A fast, distributed, high-performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks. It is under the umbrella of the DMTK (http://github.com/microsoft/dmtk) project of Microsoft.

Language: C++ · License: MIT · Stargazers: 0 · Issues: 0 · Issues: 0
License: Apache-2.0 · Stargazers: 0 · Issues: 0 · Issues: 0

ncnn

ncnn is a high-performance neural network inference framework optimized for the mobile platform

License: NOASSERTION · Stargazers: 0 · Issues: 0 · Issues: 0

PaddleRec

Recommendation Algorithm: a large-scale recommendation algorithm library covering classic and state-of-the-art recommender-system algorithms, including LR, Wide&Deep, DSSM, TDM, MIND, Word2Vec, Bert4Rec, DeepWalk, SSR, AITM, DSIN, SIGN, IPREC, GRU4Rec, Youtube_dnn, NCF, GNN, FM, FFM, DeepFM, DCN, DIN, DIEN, DLRM, MMOE, PLE, ESMM, ESCMM, MAML, xDeepFM, DeepFEFM, NFM, AFM, RALM, DMR, GateNet, NAML, DIFM, Deep Crossing, PNN, BST, AutoInt, FGCNN, FLEN, Fibinet, ListWise, DeepRec, ENSFM, TiSAS, AutoFIS, and others.

License: Apache-2.0 · Stargazers: 0 · Issues: 0 · Issues: 0

parallel-decoding

Repository of the paper "Accelerating Transformer Inference for Translation via Parallel Decoding"

License: MIT · Stargazers: 0 · Issues: 0 · Issues: 0

pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration

License: NOASSERTION · Stargazers: 0 · Issues: 0 · Issues: 0

pytorch-examples

A set of examples around pytorch in Vision, Text, Reinforcement Learning, etc.

Language: Python · License: BSD-3-Clause · Stargazers: 0 · Issues: 0 · Issues: 0

pytorch-extension-cpp

C++ extensions in PyTorch

Stargazers: 0 · Issues: 0 · Issues: 0

sglang

SGLang is a structured generation language designed for large language models (LLMs). It makes your interaction with LLMs faster and more controllable.

License: Apache-2.0 · Stargazers: 0 · Issues: 0 · Issues: 0

tensorRT-learn

tensorRT-learn start-from-trt-competition

Language: Python · Stargazers: 0 · Issues: 0 · Issues: 0

torchrec

Pytorch domain library for recommendation systems

License: BSD-3-Clause · Stargazers: 0 · Issues: 0 · Issues: 0

trt-samples-for-hackathon-cn

Simple samples for TensorRT programming

License: Apache-2.0 · Stargazers: 0 · Issues: 0 · Issues: 0

VisCPM

Chinese and English Multimodal Large Model Series (Chat and Paint), based on the CPM foundation models

Language: Python · Stargazers: 0 · Issues: 0 · Issues: 0

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0 · Issues: 0

xgboost

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow

Language: C++ · License: Apache-2.0 · Stargazers: 0 · Issues: 0 · Issues: 0