Faizan Amin's starred repositories
awesome-multimodal-in-medical-imaging
A collection of resources on applications of multi-modal learning in medical imaging.
gemma_pytorch
The official PyTorch implementation of Google's Gemma models
GPT-SoVITS
As little as 1 minute of voice data can be used to train a good TTS model (few-shot voice cloning).
LLM-Finetuning
LLM fine-tuning with PEFT
VoiceTyping
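Real-time text input by voice (speaking). Built as a secondary development on the PaddleSpeech project; supports offline, network-isolated deployment and GPU inference. The client currently supports Windows only.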
CDCN-Face-Anti-Spoofing.pytorch
Applies the Central Difference Convolutional Network (CDCN) to face anti-spoofing
WarpFusion
WarpFusion
Flask-React-Google-Login
Google Login using React, Flask, Google OAuth, JWT
generative-models
Generative Models by Stability AI
faster-whisper
Faster Whisper transcription with CTranslate2
audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
segment-anything
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
SoundStorm
The reproduced code for Google's SoundStorm
yolov8-object-tracking
YOLOv8 Object Tracking Using PyTorch, OpenCV and Ultralytics
MAXINE-AR-SDK
NVIDIA AR SDK - API headers and sample applications
Silent-Face-Anti-Spoofing
Silent liveness detection (Silent-Face-Anti-Spoofing)
FaceBagNet
FaceBagNet - Patch-based Methods for Multi-modal Face Anti-spoofing (FAS)