KennethEnevoldsen / KennethEnevoldsen

Personal repository

Kenneth Enevoldsen

Researcher, scholar, teacher




The following are projects I am actively maintaining or contributing to. More may have been added since this list was last updated.

| Name | Description |
| --- | --- |
| MTEB | The Massive Text Embedding Benchmark for evaluating document embeddings, e.g. for RAG systems. |
| Scandinavian Embedding Benchmark | A Scandinavian benchmark for evaluating document embeddings |
| DaCy | The state-of-the-art Danish NLP pipeline for SpaCy |
| tomsup | Theory of Mind Simulation using Python. A package that allows for easy agent-based modeling of recursive Theory of Mind agents |
| Augmenty | A structured augmentation library for augmenting both the texts and the annotations |
| TextDescriptives | A Python library for calculating a large variety of metrics from text |
| timeseriesflattener | Converts irregularly spaced time series, such as electronic health records, into statically shaped data frames. |
| Asent | An educational library for performing transparent sentiment analysis |
| ScandEval | An evaluation benchmark for Scandinavian and Germanic language models, evaluating natural language understanding and generation. |
| swift-python-cookiecutter | The cookie-cutter template I actively use for my packages |
| UD_Danish-DDT | The Danish Universal Dependencies Treebank, a high-quality linguistic resource |


A selection of contributions to open-source libraries, besides the ones to which I am actively contributing.

| Library | Contribution |
| --- | --- |
| Huggingface Libraries: | |
| datasets | Fix for a minor compatibility issue with numpy >=2.0.0 |
| transformers | Bugfixes for training masked language models using flax |
| SpaCy core libraries: | |
| spacy-transformers | Allow passing arguments to the transformer backend to obtain attention weights |
| confection | Fixed an issue where the config could not be filled |
| spacy-curated-transformers | Added support for ELECTRA tokenizers |
| curated-transformers | Added ELECTRA |

