mathpopo's repositories
Llama2-Chinese
Llama Chinese community: the best Chinese Llama large language model, fully open source and available for commercial use
AudioGPT
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
Awesome-Linux-Software-zh_CN
🐧 A collection of awesome applications, software, tools, and other resources for Linux.
codellama
Inference code for CodeLlama models
cuda-samples
Samples for CUDA developers that demonstrate features in the CUDA Toolkit
CV-CUDA
CV-CUDA™ is an open-source, GPU accelerated library for cloud-scale image processing and computer vision.
CVCUDA_FaceStoreHelper-release
Source release of "CVCUDA_FaceStoreHelper" from Psyche AI Inc
DTLN
TensorFlow 2.x implementation of the DTLN real-time speech denoising model, with TF-lite, ONNX, and real-time audio processing support.
EfficientSAM
EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything
efficientvit
EfficientViT is a new family of vision models for efficient high-resolution vision.
Experiments-with-Gemma-2B
I’ll be testing different Gemma models and sharing the results here and on my Hugging Face space. Stay tuned for updates!
gemma.cpp
Lightweight, standalone C++ inference engine for Google's Gemma models.
gpt-engineer
Specify what you want it to build, the AI asks for clarification, and then builds it.
infinigen
Infinite Photorealistic Worlds using Procedural Generation
llama
Inference code for LLaMA models
magic-avatar
MagicAvatar: Multimodal Avatar Generation and Animation
MetaTransformer
Meta-Transformer for Unified Multimodal Learning
nvm
Node Version Manager - POSIX-compliant bash script to manage multiple active node.js versions
pandas-llm
Pandas-LLM
project-based-learning
Curated list of project-based tutorials
python-docs-samples
Code samples used on cloud.google.com
Real-ESRGAN
Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.
Real-Gemini
A framework for real-time video understanding and interaction built on a large multimodal model; it answers questions about and communicates with the world through text, audio, images, and video.
recognize-anything
Code for the Recognize Anything Model (RAM) and Tag2Text Model
Retrieval-based-Voice-Conversion-WebUI
Voice data <= 10 mins can also be used to train a good VC model!
seamless_communication
Foundational Models for State-of-the-Art Speech and Text Translation
torchexplorer
Interactively inspect module inputs, outputs, parameters, and gradients.
Waifu2x-Extension-GUI
Video, image, and GIF upscaling/enlarging (super-resolution) and video frame interpolation, achieved with Waifu2x, Real-ESRGAN, Real-CUGAN, RTX Video Super Resolution VSR, SRMD, RealSR, Anime4K, RIFE, IFRNet, CAIN, DAIN, and ACNet.
yolo-world-with-efficientvit-sam
YOLO-World + EfficientViT SAM