Mehdi Cherti's starred repositories
TextAttack
TextAttack 🐙 is a Python framework for adversarial attacks, data augmentation, and model training in NLP https://textattack.readthedocs.io/en/master/
mlm-scoring
Python library & examples for Masked Language Model Scoring (ACL 2020)
schedule_free
Schedule-Free Optimization in PyTorch
khoj
Your AI second brain. Get answers to your questions, whether they be online or in your own notes. Use online AI models (e.g. gpt4) or private, local LLMs (e.g. llama3). Self-host locally or use our cloud instance. Access from Obsidian, Emacs, Desktop app, Web or WhatsApp.
CounterCurate
This is the implementation of CounterCurate, a data curation pipeline for both physical and semantic counterfactual image-caption pairs.
ml-tic-clip
Repository for the paper: "TiC-CLIP: Continual Training of CLIP Models".
DMimageDetection
On the detection of synthetic images generated by diffusion models
SyntheticData
Is synthetic data from generative models ready for image recognition?
generative-robustness
Create generated datasets and train robust classifiers
digital_chirality
Testing the chirality of digital imaging operations.
Perplexica
Perplexica is an AI-powered search engine. It is an open-source alternative to Perplexity AI.
why-winoground-hard
Code for 'Why is Winoground Hard? Investigating Failures in Visuolinguistic Compositionality', EMNLP 2022
llama3-from-scratch
llama3 implementation one matrix multiplication at a time
instruction-datasets
All available datasets for Instruction Tuning of Large Language Models
data-is-better-together
Let's build better datasets, together!
llm-reasoners
A library for advanced large language model reasoning