error while loading qed in sourcing
shanyas10 opened this issue · comments
Hi, I'm getting this error while using qed
as the dataset in sourcing view:
ArrowInvalid: Could not convert in with type str: tried to convert to boolean
Traceback:
File "/Users/s0s0cr3/Library/Python/3.9/lib/python/site-packages/streamlit/script_runner.py", line 354, in _run_script
exec(code, module.__dict__)
File "/Users/s0s0cr3/Documents/GitHub/promptsource/promptsource/app.py", line 260, in <module>
dataset = get_dataset(dataset_key, str(conf_option.name) if conf_option else None)
File "/Users/s0s0cr3/Library/Python/3.9/lib/python/site-packages/streamlit/caching.py", line 543, in wrapped_func
return get_or_create_cached_value()
File "/Users/s0s0cr3/Library/Python/3.9/lib/python/site-packages/streamlit/caching.py", line 527, in get_or_create_cached_value
return_value = func(*args, **kwargs)
File "/Users/s0s0cr3/Documents/GitHub/promptsource/promptsource/utils.py", line 49, in get_dataset
builder_instance.download_and_prepare()
File "/Users/s0s0cr3/Library/Python/3.9/lib/python/site-packages/datasets/builder.py", line 607, in download_and_prepare
self._download_and_prepare(
File "/Users/s0s0cr3/Library/Python/3.9/lib/python/site-packages/datasets/builder.py", line 697, in _download_and_prepare
self._prepare_split(split_generator, **prepare_split_kwargs)
File "/Users/s0s0cr3/Library/Python/3.9/lib/python/site-packages/datasets/builder.py", line 1106, in _prepare_split
num_examples, num_bytes = writer.finalize()
File "/Users/s0s0cr3/Library/Python/3.9/lib/python/site-packages/datasets/arrow_writer.py", line 456, in finalize
self.write_examples_on_file()
File "/Users/s0s0cr3/Library/Python/3.9/lib/python/site-packages/datasets/arrow_writer.py", line 325, in write_examples_on_file
pa_array = pa.array(typed_sequence)
File "pyarrow/array.pxi", line 222, in pyarrow.lib.array
File "pyarrow/array.pxi", line 110, in pyarrow.lib._handle_arrow_array_protocol
File "/Users/s0s0cr3/Library/Python/3.9/lib/python/site-packages/datasets/arrow_writer.py", line 121, in __arrow_array__
out = pa.array(cast_to_python_objects(self.data, only_1d_for_numpy=True), type=type)
File "pyarrow/array.pxi", line 305, in pyarrow.lib.array
File "pyarrow/array.pxi", line 39, in pyarrow.lib._sequence_to_array
File "pyarrow/error.pxi", line 122, in pyarrow.lib.pyarrow_internal_check_status
File "pyarrow/error.pxi", line 84, in pyarrow.lib.check_status
I've been able to use this dataset fine while prompting but not sure what's breaking now. I also tried deleting the cache and downloading again but it's still the same.
Maybe a temporary thing? It just downloaded for me.
I just tested and realized that it's a new issue since datasets==1.15.0
.
We may have to pin to datasets==1.14.0
i am running into the same issue and i have re-opened the issue on datasets
(see huggingface/datasets#3346 (comment)).
@shanyas10 let's put that dataset aside for now
fixed huggingface/datasets#3417
you'll need to be on datasets' master @shanyas10