NLP AUEB's repositories
edgar-crawler
The only open-source toolkit that can download EDGAR financial reports and extract textual data from specific item sections into clean JSON files.
greek-bert
A Greek edition of the BERT pre-trained language model
deep-relevance-ranking
Deep Relevance Ranking Using Enhanced Document-Query Interactions
bio_image_caption
Biomedical Image Captioning
gr-nlp-toolkit
A Transformer-based natural language processing toolkit for (modern) Greek.
multi-eurlex
MultiEURLEX - A multi-lingual and multi-label legal document classification dataset for zero-shot cross-lingual transfer
bioCaption
Diagnostic Captioning
aueb-bioasq6
AUEB at BioASQ 6: Document and Snippet Retrieval
aueb-bioasq7
AUEB at BioASQ 7: Document and Snippet Retrieval
multiple-choice-mutation
Multiple Choice Mutation (MCM) is a technique for generating good-quality, domain-specific synthetic data with an LLM.