Selecting fields in scan does not work

Question

djouallah opened this issue 4 months ago

Mimoune commented 4 months ago

Apache Iceberg version

0.6.0 (latest release)

Please describe the bug 🐞

Running this:

table.scan(selected_fields=('file'),).to_pandas()
I get this error:

KeyError: 'f'

The above exception was the direct cause of the following exception:

ValueError                                Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/pyiceberg/schema.py in select(self, case_sensitive, *names)
    308                 ids = {self._lazy_name_to_id_lower[name.lower()] for name in names}
    309         except KeyError as e:
--> 310             raise ValueError(f"Could not find column: {e}") from e
    311 
    312         return prune_columns(self, ids)

ValueError: Could not find column: 'f'

The field exists for sure. I tried other columns and got the same error.

Kevin Liu · Answer 1 · Sun May 12 2024 01:25:51 GMT+0800 (China Standard Time)

KeyError: 'f'

Feels like it's treating the string as a list of characters. I think selected_fields should be a tuple of strings.

table.scan(selected_fields=('file',),).to_pandas()

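Notice the trailing comma after 'file', which turns it into a one-element tuple. Without it, the parentheses are just grouping and the value is still a plain string.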
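
To see why the error names 'f' rather than 'file', here is a minimal sketch that mimics the lookup shown in the traceback. The `name_to_id` dict and this `select` function are hypothetical stand-ins, not pyiceberg's actual code; the sketch assumes only that the names are looked up one at a time, as the traceback suggests.

```python
# Hypothetical stand-in for the schema's name-to-id mapping (not pyiceberg's real data).
name_to_id = {"file": 1, "size": 2}


def select(*names):
    """Look up each name, mirroring the error handling seen in the traceback."""
    try:
        return {name_to_id[name] for name in names}
    except KeyError as e:
        raise ValueError(f"Could not find column: {e}") from e


selected_fields = ('file')  # parentheses only group; this is the string 'file'

try:
    # Unpacking a string yields its characters: 'f', 'i', 'l', 'e'
    select(*selected_fields)
except ValueError as err:
    message = str(err)

# With the trailing comma the value is a one-element tuple, so the lookup succeeds.
ids = select(*('file',))
```

The first character looked up is 'f', which is not a column name, so the lookup fails before it ever sees the full string.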

Kevin Liu · Answer 2 · Sun May 12 2024 01:37:41 GMT+0800 (China Standard Time)

>>> type(('file'))
<class 'str'>
>>> type(('file',))
<class 'tuple'>

Because Python :)
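
If you want to guard against this in your own code, one option is to normalize the argument before passing it on. This is only a sketch; `as_field_tuple` is a made-up helper name, not part of pyiceberg.

```python
def as_field_tuple(fields):
    """Return fields as a tuple of names, wrapping a lone string in a tuple."""
    if isinstance(fields, str):
        # A bare string would otherwise be iterated character by character.
        return (fields,)
    return tuple(fields)


single = as_field_tuple('file')
several = as_field_tuple(['file', 'size'])
```

You could then call table.scan(selected_fields=as_field_tuple('file')) with the same table as in the question.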

Mimoune · Answer 3 · Sun May 12 2024 07:48:52 GMT+0800 (China Standard Time)

Thanks!!! Sorry for the noise.