Todor Mihaylov's repositories
overleaf-backup-tool
A script that automatically backs up all Overleaf projects to a local folder. It works.
CLIP_prefix_caption
Simple image captioning model
comet-commonsense
Code for the ACL 2019 paper "COMET: Commonsense Transformers for Automatic Knowledge Graph Construction" https://arxiv.org/abs/1906.05317
discourse-aware-semantic-self-attention
Repository for code and data from the EMNLP-IJCNLP 2019 paper "Discourse-aware Semantic Self-Attention for Narrative Reading Comprehension"
img2dataset
Easily turn large sets of image URLs into an image dataset. Can download, resize and package 100M URLs in 20h on one machine.
jeopardy_clue_dataset
A dataset containing 473,000 Jeopardy! clues (1984–2023).
natural-instructions
Expanding natural instructions
OpenBookQA-1
Code for experiments on OpenBookQA from the EMNLP 2018 paper "Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering"
qa_datasets_converter
Format converter from one type of QA task dataset to another
quynh-and-todor
A basic wedding website I created for myself and Bec using the Bulma CSS framework, Particles.js, jQuery.countdown, Google Satisfy Font and FontAwesome icons.
selenium-aws-fargate-demo
Run a Python Selenium web scraper on AWS Fargate
SentiWordNet
The SentiWordNet sentiment lexicon
serializable-self-attentive-parser
High-accuracy NLP parser with models for 11 languages.
trafilatura
Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments
transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
webdataset
A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch.