sbmaruf / roots_data_download

Sample data from the ROOTS corpus

Roots Data Download

[1] Go to the datasets section of https://huggingface.co/bigscience-data

[2] Open each data split you need and accept the BigScience Ethical Charter. Otherwise you will not be able to download the data.

[3] Open the notebook, then load and sample the data as you wish. Note that the notebook supports sampling from a multinomial distribution.
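
As a rough illustration of what multinomial sampling across splits can look like, here is a minimal sketch using only the Python standard library. It is not taken from the notebook: the function names, the `alpha` smoothing exponent, and the split names and sizes are all hypothetical, chosen only to show the idea of drawing a per-split sample budget and then picking example indices within each split.

```python
import random
from collections import Counter


def multinomial_sample_counts(split_sizes, total_samples, alpha=1.0, seed=0):
    """Draw how many examples to take from each split.

    split_sizes: dict mapping a split name to its number of examples.
    alpha: hypothetical smoothing exponent (an assumption, not from the
           notebook). 1.0 samples proportionally to split size; values
           below 1.0 give small splits a larger share.
    Returns (counts, probs), both dicts keyed by split name.
    """
    names = list(split_sizes)
    weights = [split_sizes[name] ** alpha for name in names]
    total = sum(weights)
    probs = [w / total for w in weights]

    # total_samples independent categorical draws form one multinomial draw.
    rng = random.Random(seed)
    draws = rng.choices(names, weights=probs, k=total_samples)
    tally = Counter(draws)

    counts = {name: tally.get(name, 0) for name in names}
    return counts, dict(zip(names, probs))


def sample_indices(split_size, count, seed=0):
    """Pick distinct example indices from one split, capped at its size."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(split_size), min(count, split_size)))


if __name__ == "__main__":
    # Hypothetical split names and sizes, for illustration only.
    sizes = {"split_a": 1000, "split_b": 200, "split_c": 50}
    counts, probs = multinomial_sample_counts(sizes, total_samples=100, alpha=0.5)
    for name in sizes:
        picked = sample_indices(sizes[name], counts[name])
        print(name, round(probs[name], 3), counts[name], picked[:5])
```

The indices returned for each split could then be used to select rows from the corresponding downloaded split. Capping the count at the split size avoids requesting more distinct examples than a small split contains.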

Languages

Jupyter Notebook 59.1%, Python 40.9%