Norod / hebrew-gpt_neo

Hebrew text generation models based on EleutherAI's gpt-neo. Each was trained on a TPUv3-8 made available via the TPU Research Cloud Program.

Home Page: https://huggingface.co/models?filter=gpt_neo,he&sort=modified

hebrew-gpt_neo

Hebrew text generation models based on EleutherAI's gpt-neo. Each was trained on a TPUv3-8 that was made available to me via the TPU Research Cloud Program.

JS Colab notebook: Open in Google Colab

Gradio Colab notebook: Open in Google Colab

Datasets

  1. An assortment of various Hebrew corpora - I have made it available here

  2. oscar / unshuffled_deduplicated_he - Homepage | Dataset Permalink

The Open Super-large Crawled ALMAnaCH coRpus (OSCAR) is a huge multilingual corpus obtained by language classification and filtering of the Common Crawl corpus using the goclassy architecture.
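For readers who want to inspect the OSCAR subset named above, here is a minimal sketch that streams it with the Hugging Face `datasets` library. It assumes that `datasets` is installed and that the installed version still supports the script-based `oscar` loader (recent releases may not). The helper names `iter_texts` and `stream_oscar_hebrew` and the `min_chars` filter are hypothetical, introduced only for illustration; they are not part of this repository's training pipeline.

```python
from itertools import islice
from typing import Iterable, Iterator


def iter_texts(records: Iterable[dict], min_chars: int = 1) -> Iterator[str]:
    """Yield the stripped "text" field of every record with at least min_chars characters.

    Hypothetical helper for illustration; OSCAR records carry "id" and "text" fields.
    """
    for record in records:
        text = record.get("text", "").strip()
        if len(text) >= min_chars:
            yield text


def stream_oscar_hebrew():
    """Open the Hebrew deduplicated OSCAR subset in streaming mode (no full download)."""
    # Imported lazily so iter_texts can be used without the datasets library installed.
    from datasets import load_dataset

    # Assumption: the installed datasets version can still load the "oscar" dataset script.
    return load_dataset(
        "oscar", "unshuffled_deduplicated_he", split="train", streaming=True
    )


if __name__ == "__main__":
    # Print the first 80 characters of three reasonably long documents.
    for text in islice(iter_texts(stream_oscar_hebrew(), min_chars=200), 3):
        print(text[:80])
```

Streaming avoids downloading the whole subset just to look at a few documents.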

Models

hebrew-gpt_neo-xl

hebrew-gpt_neo-small

hebrew-gpt_neo-tiny
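As a rough usage sketch, the models listed above can be loaded with the `transformers` library. The Hub owner name `Norod78` and the resulting model IDs are assumptions based on the Hugging Face listing linked in this README, so verify them there before use. The helper names `hub_id` and `generate` and the sampling parameters are illustrative choices, not settings recommended by the author. Running `generate` requires `transformers` and PyTorch.

```python
MODEL_SIZES = ("xl", "small", "tiny")


def hub_id(size: str, owner: str = "Norod78") -> str:
    """Build a Hub model ID such as "Norod78/hebrew-gpt_neo-small".

    The owner name is an assumption; check the model listing linked in this README.
    """
    if size not in MODEL_SIZES:
        raise ValueError(f"unknown size {size!r}; expected one of {MODEL_SIZES}")
    return f"{owner}/hebrew-gpt_neo-{size}"


def generate(prompt: str, size: str = "small", max_new_tokens: int = 50) -> str:
    """Sample a continuation of prompt from one of the hebrew-gpt_neo models."""
    # Imported lazily so hub_id stays usable without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = hub_id(size)
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,  # sample instead of greedy decoding
        top_p=0.95,  # illustrative nucleus-sampling value
        pad_token_id=tokenizer.eos_token_id,  # avoids a missing-pad-token warning
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    # Downloads the model weights on first run.
    print(generate("עוד בימי קדם", size="tiny"))
```

Starting with the tiny or small variant keeps the download and memory footprint modest; the xl variant will need considerably more memory.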

License: MIT License


Languages

Jupyter Notebook 91.8%

Python 5.4%

HTML 2.4%

Dockerfile 0.2%

Shell 0.1%

Makefile 0.1%