Zhang Jun's repositories

zhangjun.github.io

https://zhangjun.github.io

stable_diffusion_compile

Compile Stable Diffusion to run faster

Language: Python · Stargazers: 1

TensorRT-Server

TensorRT Server

Language: C++ · License: Apache-2.0 · Stargazers: 1

WeChatMsg

Extract WeChat chat history and export it to HTML, Word, or CSV documents for permanent storage; analyze the chat history to generate an annual chat report

Language: Python · License: GPL-3.0 · Stargazers: 1

Paddle

PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (the PaddlePaddle core framework: high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)

Language: C++ · License: Apache-2.0 · Stargazers: 0

Paddle-Lite

Multi-platform, high-performance deep learning inference engine for PaddlePaddle (飞桨)

Language: C++ · License: Apache-2.0 · Stargazers: 0

BaiduPCS-Go

Builds on the original iikira/BaiduPCS-Go, adding support for saving shared links and rapid-upload ("秒传") links to one's own account

Language: Go · License: Apache-2.0 · Stargazers: 0

community

PaddlePaddle Developer Community

Language: Jupyter Notebook · License: Apache-2.0 · Stargazers: 0

FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and FastChat-T5.

Language: Python · License: Apache-2.0 · Stargazers: 0

FasterTransformer

Transformer-related optimizations, including BERT and GPT

Language: C++ · License: Apache-2.0 · Stargazers: 0

kernl

Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackable.

License: Apache-2.0 · Stargazers: 0
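The description claims a single-line speedup; below is a minimal sketch of that usage, assuming kernl's documented `optimize_model` entry point. The stand-in model and the import fallback are illustrative additions, not part of kernl, since the real library needs PyTorch and a recent NVIDIA GPU.

```python
# Hedged sketch of kernl's advertised one-line optimization call.
# The import guard swaps in a no-op so the snippet runs even where
# kernl (and a GPU) is unavailable.
try:
    from kernl.model_optimization import optimize_model
except ImportError:
    def optimize_model(model):
        """Stand-in used when kernl is missing: leaves the model unchanged."""
        return model

class TinyModel:
    """Stand-in for a PyTorch transformer model (illustrative only)."""
    def forward(self, x):
        return x

model = TinyModel()
optimize_model(model)  # the "single line of code"; inference then uses `model` as usual
```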

lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs

Language: C++ · License: Apache-2.0 · Stargazers: 0

Megatron-LM

Ongoing research on training transformer models at scale

Language: Python · License: NOASSERTION · Stargazers: 0

oneflow-diffusers

OneFlow backend for 🤗 Diffusers and ComfyUI

Language: Python · Stargazers: 0

PaddleFleetX

Paddle distributed training examples: ResNet, BERT, GPT, MoE; DataParallel, ModelParallel, PipelineParallel, HybridParallel, AutoParallel; ZeRO sharding, Recompute, GradientMerge, Offload, AMP, DGC, LocalSGD, Wide&Deep

License: Apache-2.0 · Stargazers: 0

PaddleNLP

👑 Easy-to-use and powerful NLP library with 🤗 Awesome model zoo, supporting a wide range of NLP tasks from research to industrial applications, including 🗂 Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis, and 🖼 Diffusion AIGC systems, etc.

Language: Python · License: Apache-2.0 · Stargazers: 0

stable-diffusion-webui-docker

Docker setup for the Stable Diffusion web UI

Language: Shell · License: Apache-2.0 · Stargazers: 0

stable-fast

An ultra lightweight inference performance optimization framework for HuggingFace Diffusers on NVIDIA GPUs.

Language: Python · License: MIT · Stargazers: 0

StableTriton

The first open-source Triton inference engine for Stable Diffusion, specifically for SDXL

License: Apache-2.0 · Stargazers: 0

Taipy-Chatbot-Demo

A template for creating LLM inference web apps using only Python

Stargazers: 0

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language: Python · License: Apache-2.0 · Stargazers: 0

transformer_framework

A framework for plug-and-play use of various transformers (vision and NLP) with FSDP

Language: Python · License: Apache-2.0 · Stargazers: 0

TransformerEngine

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper GPUs, to provide better performance with lower memory utilization in both training and inference.

Language: CUDA · License: Apache-2.0 · Stargazers: 0

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

License: Apache-2.0 · Stargazers: 0
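As a hedged sketch of what "serving engine" means in practice, the snippet below uses vllm's documented offline batch-generation API (`LLM` and `SamplingParams`); the model id is only an example, and the fallback branch is an illustrative addition so the snippet runs even where vllm and a CUDA GPU are unavailable.

```python
# Sketch of batch generation with vllm's offline API, guarded so it
# degrades gracefully when vllm is not installed.
def generate_or_explain(prompts):
    """Return one generated string per prompt, or a note per prompt if vllm is missing."""
    try:
        from vllm import LLM, SamplingParams
    except ImportError:
        return ["vllm is not installed; see the repository for setup"] * len(prompts)
    llm = LLM(model="facebook/opt-125m")            # any Hugging Face model id
    params = SamplingParams(temperature=0.8, max_tokens=32)
    return [out.outputs[0].text for out in llm.generate(prompts, params)]

print(generate_or_explain(["What does a serving engine do?"]))
```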