Sihan Chen (Spycsh)

Company: Intel

Location: Shanghai

Sihan Chen's starred repositories

prometheus-fastapi-instrumentator

Instrument your FastAPI with Prometheus metrics.

Language: Python | License: ISC | Stargazers: 882 | Issues: 0

GenAIEval

Evaluation, benchmarking, and scorecards, targeting throughput and latency performance, accuracy on popular evaluation harnesses, safety, and hallucination

Language: Python | License: Apache-2.0 | Stargazers: 14 | Issues: 0

jaeger

CNCF Jaeger, a Distributed Tracing Platform

Language: Go | License: Apache-2.0 | Stargazers: 19914 | Issues: 0

GenAIComps

GenAI components at micro-service level; GenAI service composer to create mega-service

Language: Python | License: Apache-2.0 | Stargazers: 24 | Issues: 0

docarray

Represent, send, store and search multimodal data

Language: Python | License: Apache-2.0 | Stargazers: 2863 | Issues: 0

llama3

The official Meta Llama 3 GitHub site

Language: Python | License: NOASSERTION | Stargazers: 23414 | Issues: 0

GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Language: Python | License: MIT | Stargazers: 29492 | Issues: 0

gemma_pytorch

The official PyTorch implementation of Google's Gemma models

Language: Python | License: Apache-2.0 | Stargazers: 5179 | Issues: 0

gpt-fast

Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.

Language: Python | License: BSD-3-Clause | Stargazers: 5375 | Issues: 0

LoRA

Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"

Language: Python | License: MIT | Stargazers: 9858 | Issues: 0

kserve

Standardized Serverless ML Inference Platform on Kubernetes

Language: Python | License: Apache-2.0 | Stargazers: 3310 | Issues: 0

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language: C++ | License: Apache-2.0 | Stargazers: 7614 | Issues: 0

FasterTransformer

Transformer related optimization, including BERT, GPT

Language: C++ | License: Apache-2.0 | Stargazers: 5670 | Issues: 0

server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.

Language: Python | License: BSD-3-Clause | Stargazers: 7796 | Issues: 0

noisereduce

Noise reduction in python using spectral gating (speech, bioacoustics, audio, time-domain signals)

Language: Jupyter Notebook | License: MIT | Stargazers: 1358 | Issues: 0

NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Language: Python | License: Apache-2.0 | Stargazers: 10990 | Issues: 0

text-to-text-transfer-transformer

Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"

Language: Python | License: Apache-2.0 | Stargazers: 6037 | Issues: 0

tensorflow-onnx

Convert TensorFlow, Keras, Tensorflow.js and Tflite models to ONNX

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 2267 | Issues: 0

notebooks

Jupyter notebooks for the Natural Language Processing with Transformers book

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 3722 | Issues: 0

intel-extension-for-transformers

⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡

Language: Python | License: Apache-2.0 | Stargazers: 2072 | Issues: 0

oneDNN

oneAPI Deep Neural Network Library (oneDNN)

Language: C++ | License: Apache-2.0 | Stargazers: 3531 | Issues: 0

optimization-manual

Contains the source code examples described in the "Intel® 64 and IA-32 Architectures Optimization Reference Manual"

Language: Assembly | License: 0BSD | Stargazers: 758 | Issues: 0

Awesome-Pruning

A curated list of neural network pruning resources.

Stargazers: 2291 | Issues: 0

awesome-ml-model-compression

Awesome machine learning model compression research papers, quantization, tools, and learning material.

License: MIT | Stargazers: 455 | Issues: 0

Language: C++ | License: BSL-1.0 | Stargazers: 237 | Issues: 0

oneAPI_course

oneAPI - Data Parallel C++ course for students

Language: C++ | License: Apache-2.0 | Stargazers: 35 | Issues: 0

streamingbook

Code snippets from the Streaming Systems book (streamingbook.net).

Language: Java | License: Apache-2.0 | Stargazers: 239 | Issues: 0

manim

Animation engine for explanatory math videos

Language: Python | License: MIT | Stargazers: 60842 | Issues: 0

neural-compressor

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

Language: Python | License: Apache-2.0 | Stargazers: 2086 | Issues: 0

Language: Java | Stargazers: 7 | Issues: 0