xhochy / fletcher

Pandas ExtensionDType/Array backed by Apache Arrow

Home Page: https://fletcher.readthedocs.io/

Unexpected type `FletcherChunkedArray` when converting `pa.Table`

dhirschfeld opened this issue

Repro:

In [42]: import io
    ...: import pandas as pd
    ...: import pyarrow as pa
    ...: import pyarrow.csv  # makes the pa.csv submodule available
    ...: import fletcher as fr

In [43]: csv = io.BytesIO(b"""
    ...: Simulation,Date,Period,Value
    ...: 1,11/03/2021,1,37.7589267496136
    ...: 1,11/03/2021,2,38.1660319483619
    ...: 1,11/03/2021,3,37.7823226220566
    ...: 1,11/03/2021,4,40.0238438477384
    ...: """)

In [44]: schema = pa.schema([
    ...:     ("Simulation", pa.int32()),
    ...:     ("Date", pa.timestamp('s')),
    ...:     ("Period", pa.int8()),
    ...:     ("Value", pa.float64()),
    ...: ])

In [45]: convert_options = pa.csv.ConvertOptions(timestamp_parsers=["%d/%m/%Y"], column_types=schema)

In [46]: table = pa.csv.read_csv(csv, convert_options=convert_options)

In [47]: df = fr.pandas_from_arrow(table)
Traceback (most recent call last):

  File "<ipython-input-47-597dd8da5169>", line 1, in <module>
    df = fr.pandas_from_arrow(table)

  File "C:\Users\dhirschfeld\envs\dev\lib\site-packages\fletcher\base.py", line 1744, in pandas_from_arrow
    return pd.DataFrame(data)

  File "C:\Users\dhirschfeld\envs\dev\lib\site-packages\pandas\core\frame.py", line 529, in __init__
    mgr = init_dict(data, index, columns, dtype=dtype)

  File "C:\Users\dhirschfeld\envs\dev\lib\site-packages\pandas\core\internals\construction.py", line 287, in init_dict
    return arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)

  File "C:\Users\dhirschfeld\envs\dev\lib\site-packages\pandas\core\internals\construction.py", line 95, in arrays_to_mgr
    return create_block_manager_from_arrays(arrays, arr_names, axes)

  File "C:\Users\dhirschfeld\envs\dev\lib\site-packages\pandas\core\internals\managers.py", line 1706, in create_block_manager_from_arrays
    raise construction_error(len(arrays), arrays[0].shape, axes, e)

  File "C:\Users\dhirschfeld\envs\dev\lib\site-packages\pandas\core\internals\managers.py", line 1701, in create_block_manager_from_arrays
    blocks = _form_blocks(arrays, names, axes)

  File "C:\Users\dhirschfeld\envs\dev\lib\site-packages\pandas\core\internals\managers.py", line 1781, in _form_blocks
    for i, _, array in items_dict["DatetimeTZBlock"]

  File "C:\Users\dhirschfeld\envs\dev\lib\site-packages\pandas\core\internals\managers.py", line 1781, in <listcomp>
    for i, _, array in items_dict["DatetimeTZBlock"]

  File "C:\Users\dhirschfeld\envs\dev\lib\site-packages\pandas\core\internals\blocks.py", line 2732, in make_block
    return klass(values, ndim=ndim, placement=placement)

  File "C:\Users\dhirschfeld\envs\dev\lib\site-packages\pandas\core\internals\blocks.py", line 1693, in __init__
    super().__init__(values, placement, ndim=ndim)

  File "C:\Users\dhirschfeld\envs\dev\lib\site-packages\pandas\core\internals\blocks.py", line 139, in __init__
    self.values = self._maybe_coerce_values(values)

  File "C:\Users\dhirschfeld\envs\dev\lib\site-packages\pandas\core\internals\blocks.py", line 2345, in _maybe_coerce_values
    values = self._holder(values)

  File "C:\Users\dhirschfeld\envs\dev\lib\site-packages\pandas\core\arrays\datetimes.py", line 245, in __init__
    f"Unexpected type '{type(values).__name__}'. 'values' must be "

ValueError: Unexpected type 'FletcherChunkedArray'. 'values' must be a DatetimeArray ndarray, or Series or Index containing one of those.

Environment: Python 3.7 on 64-bit Windows

In [53]: pd.__version__
Out[53]: '1.2.2'

In [54]: pa.__version__
Out[54]: '3.0.0'

In [55]: fr.__version__
Out[55]: '0.7.2'

This project has been archived; development ceased around 2021.
Now that pandas supports Apache Arrow-backed extension arrays, the major goal of this project has been fulfilled.