Awsaf's starred repositories
VoiceCraft
Zero-Shot Speech Editing and Text-to-Speech in the Wild
MLQuestions
Machine Learning and Computer Vision Engineer - Technical Interview Questions
uvadlc_notebooks
Repository of Jupyter notebook tutorials for teaching the Deep Learning Course at the University of Amsterdam (MSc AI), Fall 2023
DeepSeek-VL
DeepSeek-VL: Towards Real-World Vision-Language Understanding
multimodal
TorchMultimodal is a PyTorch library for training state-of-the-art multimodal multi-task models at scale.
torchtitan
A native PyTorch library for large model training
Awesome-Foundation-Models
A curated list of foundation models for vision and language tasks
awesome-visual-question-answering
A curated list of Visual Question Answering (VQA) (Image/Video Question Answering), Visual Question Generation, Visual Dialog, Visual Commonsense Reasoning, and related areas.
ml-mobileclip
This repository contains the official implementation of the research paper "MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training" (CVPR 2024).
DesignEdit
Code for DesignEdit
ml-tic-clip
Repository for the paper: "TiC-CLIP: Continual Training of CLIP Models".
diffusion_memorization
Official repo for Detecting, Explaining, and Mitigating Memorization in Diffusion Models (ICLR 2024)