The ONLP Lab's repositories
NEMO-Corpus
Named entity recognition (NER) annotations of the Hebrew Treebank (Haaretz newspaper) corpus, including morpheme- and token-level NER labels, nested mentions, and more.
Hebrew-Sentiment-Data
A cleaned version of the Hebrew Sentiment data set published by Amram, A., Ben-David, A., and Tsarfaty, R. (2018).
he_treebanks
The UD (Universal Dependencies) Hebrew treebank and a modification tool.
Modality-Corpus
A processed version of the GME corpus.
morphodetection
Code for the paper at https://arxiv.org/abs/2104.08512
LemmaSplitting
Repository for the paper at https://aclanthology.org/2022.acl-short.96.pdf
AlephBERT-demo
Demo for AlephBERT: Language Model Pre-training and Evaluation from Sub-Word to Sentence Level (ACL 2022)
HeTrue
This repository hosts the HeTrue dataset and the accompanying credibility assessment model, as presented in our EMNLP publication. The dataset targets credibility assessment in Hebrew and is the product of our collaboration with "The Whistle" from Globes. We encourage the academic community to use these resources and provide feedback on them.