Beast code in Giters

Divebomb's starred repositories

KunQuant

A compiler, optimizer and executor for financial expressions and factors

Language: Python | License: Apache-2.0 | 74 | 0 | 0

MambaOut

MambaOut: Do We Really Need Mamba for Vision?

Language: Python | License: Apache-2.0 | 1899 | 0 | 0

Knowledge-Distillation-Zoo

Pytorch implementation of various Knowledge Distillation (KD) methods.

Language: Python | 1561 | 0 | 0

Awesome-Quantization-Papers

List of papers related to neural network quantization in recent AI conferences and journals.

License: MIT | 379 | 0 | 0

A list of papers, docs, codes about model quantization. This repo is aimed to provide the info for model quantization research, we are continuously improving the project. Welcome to PR the works (papers, repositories) that are missed by the repo.

1741 | 0 | 0

Faster-LLM-Survey

Language: Python | 36 | 0 | 0

grok-1

Grok open release

Language: Python | License: Apache-2.0 | 49207 | 0 | 0

Efficient-LLMs-Survey

[TMLR 2024] Efficient Large Language Models: A Survey

873 | 0 | 0

UER-py

Open Source Pre-training Model Framework in PyTorch & Pre-trained Model Zoo

Language: Python | License: Apache-2.0 | 2944 | 0 | 0

ET-BERT

The repository of ET-BERT, a network traffic classification model on encrypted traffic. The work has been accepted as The Web Conference (WWW) 2022 accepted paper.

Language: Python | License: MIT | 317 | 0 | 0

line_profiler

Line-by-line profiling for Python

Language: Python | License: NOASSERTION | 2595 | 0 | 0

alpa

Training and serving large-scale neural networks with auto parallelization.

Language: Python | License: Apache-2.0 | 3020 | 0 | 0

Megatron-LM

Ongoing research training transformer models at scale

Language: Python | License: NOASSERTION | 9510 | 0 | 0

ray

Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Language: Python | License: Apache-2.0 | 32252 | 0 | 0

gemma

Open weights LLM from Google DeepMind.

Language: Jupyter Notebook | License: Apache-2.0 | 2274 | 0 | 0

Made-With-ML

Learn how to design, develop, deploy and iterate on production-grade ML applications.

Language: Jupyter Notebook | License: MIT | 36692 | 0 | 0

SoraReview

The official GitHub page for the review paper "Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models".

480 | 0 | 0

BERT-LoRA-TensorRT

This repository contains a custom implementation of the BERT model, fine-tuned for specific tasks, along with an implementation of Low Rank Approximation (LoRA). The models are optimized for high performance using NVIDIA's TensorRT.

Language: Jupyter Notebook | License: Apache-2.0 | 47 | 0 | 0

agents

An Open-source Framework for Data-centric, Self-evolving Autonomous Language Agents

Language: Python | License: Apache-2.0 | 4982 | 0 | 0

KG-MM-Survey

Knowledge Graphs Meet Multi-Modal Learning: A Comprehensive Survey

License: MIT | 255 | 0 | 0

data-centric-AI

A curated, but incomplete, list of data-centric AI resources.

1012 | 0 | 0

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python | License: Apache-2.0 | 23749 | 0 | 0

smoothquant

[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models

Language: Python | License: MIT | 1121 | 0 | 0

llm-action

This project aims to share the technical principles behind large models, along with hands-on experience in applying them.

Language: HTML | License: Apache-2.0 | 8124 | 0 | 0

ann-benchmarks

Benchmarks of approximate nearest neighbor libraries in Python

Language: Python | License: MIT | 4774 | 0 | 0

DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Language: Python | License: Apache-2.0 | 34035 | 0 | 0

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language: C++ | License: Apache-2.0 | 7691 | 0 | 0
