Can't load file medical_meadow_small.json
davidlee1102 opened this issue · comments
Please re-check your training code sample. The following call fails to load the dataset:
from datasets import load_dataset
data = load_dataset("json", data_files="/kaggle/working/medAlpaca/medical_meadow_small.json")
---ERROR---
File /opt/conda/lib/python3.10/site-packages/datasets/packaged_modules/json/json.py:150, in Json._generate_tables(self, files)
145 except json.JSONDecodeError:
146 raise e
147 raise ValueError(
148 f"Not able to read records in the JSON file at {file}. "
149 f"You should probably indicate the field of the JSON file containing your records. "
--> 150 f"This JSON file contain the following fields: {str(list(dataset.keys()))}. "
151 f"Select the correct one and provide it as field='XXX' to the dataset loading method. "
152 ) from None
153 # Uncomment for debugging (will print the Arrow table size and elements)
154 # logger.warning(f"pa_table: {pa_table} num rows: {pa_table.num_rows}")
155 # logger.warning('\n'.join(str(pa_table.slice(i, 1).to_pydict()) for i in range(pa_table.num_rows)))
156 yield (file_idx, batch_idx), self._cast_classlabels(pa_table)
AttributeError: 'list' object has no attribute 'keys'
I used the same dataset you used. After checking, I think the error comes from the environment on Google Colab and Kaggle. Would you mind trying it on Google Colab or Kaggle?
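A note on the traceback: the AttributeError is raised while the loader is building its own error message. It calls dataset.keys(), but dataset is a list, which means the top-level value in the JSON file is a list of records rather than a dict. The underlying parse failure that sent the loader down this path is not visible in the output above. As a possible workaround (a minimal sketch, not a confirmed fix), the file could be rewritten as JSON Lines, one object per line, using only the standard library. The function name convert_list_json_to_jsonl and the output path are hypothetical.

```python
import json


def convert_list_json_to_jsonl(src_path, dst_path):
    """Return (top-level type name, number of records written).

    If the top-level JSON value in src_path is a list, write each element
    as one line of JSON to dst_path. Otherwise write nothing.
    """
    with open(src_path, encoding="utf-8") as f:
        data = json.load(f)  # raises json.JSONDecodeError on invalid JSON

    top_level = type(data).__name__
    if not isinstance(data, list):
        # A dict at the top level is the case the loader's field='XXX' hint targets.
        return top_level, 0

    with open(dst_path, "w", encoding="utf-8") as f:
        for record in data:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return top_level, len(data)


# Hypothetical usage (output path is an assumption, not from the repo):
# convert_list_json_to_jsonl(
#     "/kaggle/working/medAlpaca/medical_meadow_small.json",
#     "/kaggle/working/medical_meadow_small.jsonl",
# )
# data = load_dataset("json", data_files="/kaggle/working/medical_meadow_small.jsonl")
```

If the function reports "list" and the converted file still fails to load, the records themselves (for example, a field with inconsistent types across records) would be the next thing to check. Comparing the installed datasets version between environments may also help, since the failure appears on Colab and Kaggle.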