This repository contains resources from the paper "Sentence Embedding Models for Ancient Greek Using Multilingual Knowledge Distillation": parallel sentence data (Ancient Greek - English) and evaluation datasets for Ancient Greek sentence embeddings.
The parallel sentence data was primarily produced using a modified Bertalign implementation for sentence alignment.
See also the shlm-grc-en sentence embedding model, which was trained on the parallel sentence data in this repository.
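
For readers unfamiliar with how parallel sentences are used in multilingual knowledge distillation, the sketch below shows the general shape of the objective. It assumes the common formulation in which a student model is trained so that its embeddings of both the English sentence and its Ancient Greek translation match a teacher model's embedding of the English sentence. The function names and the toy vectors are hypothetical and for illustration only; this is not the training code used for the paper.

```python
from typing import Sequence

Vector = Sequence[float]


def mse(a: Vector, b: Vector) -> float:
    """Mean squared error between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def distillation_loss(teacher_en: Vector, student_en: Vector, student_grc: Vector) -> float:
    """Loss for one parallel sentence pair.

    Both student embeddings (English and Ancient Greek) are pulled
    toward the teacher's embedding of the English sentence.
    """
    return mse(teacher_en, student_en) + mse(teacher_en, student_grc)


# Toy 3-dimensional embeddings (made-up numbers, not model output).
teacher_en = [1.0, 0.0, 2.0]
student_en = [1.0, 0.0, 2.0]
student_grc = [0.0, 0.0, 2.0]

loss = distillation_loss(teacher_en, student_en, student_grc)
```

In this toy case the student's English embedding already matches the teacher, so the whole loss comes from the Ancient Greek embedding; minimizing it moves translations toward the same point in the embedding space as their English counterparts.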