Bram Vanroy's repositories
spacy_conll
Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Doc and its sentences and tokens. Can also be used as a command-line tool.
mateo-demo
MAchine Translation Evaluation Online (MATEO)
mai-simplification-nl-2023
Sentence-Level Text Simplification for Dutch
astred-demo
Demo app to illustrate ASTrED
alignment-handbook
Robust recipes to align language models with human and AI preferences
bitsandbytes
8-bit CUDA functions for PyTorch
datatrove
Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
distilabel
Distilabel is a framework for synthetic data and AI feedback for AI engineers that require high-quality outputs, full data ownership, and overall efficiency
distilabel-helm-instruct-adaptable-evaluation-criteria
A repo that implements Stanford CRFM's HELM Instruct with adaptable evaluation criteria
llama.cpp
LLM inference in C/C++
optimum
Accelerate training and inference of Transformers and Diffusers with easy to use hardware optimization tools
outlines
Generative Model Programming
penman
PENMAN notation (e.g. AMR) in Python
sacrebleu
Reference BLEU implementation that auto-downloads test sets and reports a version string to facilitate cross-lab comparisons
transformers
Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.