transformerlab / transformerlab-app

Experiment with Large Language Models

More flexible custom dataset file structures

dadmobile opened this issue

Our underlying code uses Hugging Face's load_dataset, which allows flexible file structures for custom local datasets:

https://huggingface.co/docs/hub/en/datasets-file-names-and-splits

But our app and API code force the user into one very specific format: exactly one file each of <dataset_id>[train|eval].jsonl.
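
As a rough illustration of what a more permissive layout could look like, the sketch below walks a local dataset directory and groups data files by split based on keywords in their path. The function name `find_split_files` and the `SPLIT_KEYWORDS` map are hypothetical and not part of the app or API; the keyword list is only loosely modelled on the naming conventions in the linked docs and is an assumption, not a copy of the library's own matching rules.

```python
import re
from pathlib import Path

# Hypothetical keyword map (an assumption for this sketch, not the library's rules).
SPLIT_KEYWORDS = {
    "train": ("train", "training"),
    "validation": ("validation", "valid", "val", "dev"),
    "test": ("test", "testing", "eval", "evaluation"),
}

DATA_SUFFIXES = {".jsonl", ".json", ".csv"}


def find_split_files(dataset_dir):
    """Group data files under dataset_dir by split, based on keywords in their path."""
    splits = {}
    for path in sorted(Path(dataset_dir).rglob("*")):
        if not path.is_file() or path.suffix not in DATA_SUFFIXES:
            continue
        # Tokenise the relative path so that both "train/part-0.jsonl"
        # and "my_train.jsonl" yield the token "train".
        rel = path.relative_to(dataset_dir).with_suffix("")
        tokens = set(re.split(r"[^a-z]+", str(rel).lower()))
        for split, keywords in SPLIT_KEYWORDS.items():
            if tokens & set(keywords):
                splits.setdefault(split, []).append(str(path))
                break
        # Files that match no keyword are skipped in this sketch.
    return splits
```

The resulting mapping has the shape that load_dataset accepts for its data_files argument, so something like `load_dataset("json", data_files=find_split_files(dataset_dir))` could be a starting point for JSON/JSONL datasets. Whether unmatched files should be skipped or treated as training data is a design choice left open here.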