Broken dataset in LLM 04

Question

romainfut-db opened this issue 7 months ago

In the LLM 04 demo, we call imdb_ds = load_dataset("imdb") to load our fine-tuning dataset.
It looks like this dataset was updated, and this line now throws the error ExpectedMoreSplits: {'unsupervised'}.

This can be fixed by forcing a reinstall of the latest version of Hugging Face's datasets library. However, doing so breaks the code further down, which can no longer find the train and validation splits in the dataset object.
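
For anyone hitting the second problem, here is a rough sketch of one possible workaround, not the notebook's actual code. It looks up the split names at runtime instead of assuming them, and carves a validation set out of the training data if none is present. The helper names first_available_split and load_imdb_train_and_validation, the 10% validation fraction, and the seed are all made up for illustration. It assumes the datasets library is installed and the Hub is reachable.

```python
from typing import Iterable, Mapping, Optional


def first_available_split(
    splits: Mapping[str, object], candidates: Iterable[str]
) -> Optional[str]:
    """Return the first candidate split name present in `splits`, else None."""
    for name in candidates:
        if name in splits:
            return name
    return None


def load_imdb_train_and_validation(validation_fraction: float = 0.1, seed: int = 42):
    """Hypothetical helper: return (train, validation) datasets for IMDB.

    Assumption: the Hugging Face `datasets` library is installed.
    """
    # Imported lazily so first_available_split has no third-party dependency.
    from datasets import load_dataset

    imdb_ds = load_dataset("imdb")
    # Print the split names that actually exist instead of guessing them.
    print("available splits:", list(imdb_ds.keys()))

    train_name = first_available_split(imdb_ds, ["train"])
    if train_name is None:
        raise KeyError("no 'train' split found in the loaded dataset")

    val_name = first_available_split(imdb_ds, ["validation", "valid", "dev"])
    if val_name is not None:
        return imdb_ds[train_name], imdb_ds[val_name]

    # No ready-made validation split: hold out part of the training data.
    parts = imdb_ds[train_name].train_test_split(
        test_size=validation_fraction, seed=seed
    )
    return parts["train"], parts["test"]
```

Calling train_ds, val_ds = load_imdb_train_and_validation() would then give the two datasets the later cells expect, although the exact variable names used further down in the notebook are not shown in this issue and may need adapting.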

Marco Graziano commented 3 months ago

+1