Suzana Ilić's starred repositories
annotated_deep_learning_paper_implementations
🧑‍🏫 60 Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans (cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠
tokenizers
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
Transformers-Tutorials
This repository contains demos I made with the Transformers library by HuggingFace.
accelerate
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support
knockknock
🚪✊Knock Knock: Get notified when your training ends with only two additional lines of code
promptsource
Toolkit for creating, sharing and using natural language prompts.
awesome-papers
Papers & presentation materials from Hugging Face's internal science day
huggingface_hub
The official Python client for the Hugging Face Hub.
bigscience
Central place for the engineering/scaling WG: documentation, SLURM scripts and logs, compute environment and data.
biomedical
Tools for curating biomedical training data for large-scale language modeling
data_tooling
Tools for managing datasets for governance and training.
evaluation
Code and Data for Evaluation WG
edgeai-lab-microcontroller-series
This repository shares the EdgeAI Lab with Microcontrollers Series material with the entire community, including documents, presentations, and source code for two demo applications.
bigscience-workshop.github.io
Alternative to https://github.com/Dynalon/mdwiki-seed
catalogue_data
Scripts to prepare catalogue data
historical_texts
BigScience working group on language models for historical texts
datasets_stats
Generate statistics over datasets used in the context of BigScience