Annotate API-exposed Items from pandas.core.api
WillAyd opened this issue
We currently expose all of the following items from pandas.core.api via the API:
from pandas.core.api import (
    # dtype
    Int8Dtype, Int16Dtype, Int32Dtype, Int64Dtype, UInt8Dtype,
    UInt16Dtype, UInt32Dtype, UInt64Dtype, CategoricalDtype,
    PeriodDtype, IntervalDtype, DatetimeTZDtype,
    # missing
    isna, isnull, notna, notnull,
    # indexes
    Index, CategoricalIndex, Int64Index, UInt64Index, RangeIndex,
    Float64Index, MultiIndex, IntervalIndex, TimedeltaIndex,
    DatetimeIndex, PeriodIndex, IndexSlice,
    # tseries
    NaT, Period, period_range, Timedelta, timedelta_range,
    Timestamp, date_range, bdate_range, Interval, interval_range,
    DateOffset,
    # conversion
    to_numeric, to_datetime, to_timedelta,
    # misc
    np, Grouper, factorize, unique, value_counts, NamedAgg,
    array, Categorical, set_eng_float_format, Series, DataFrame,
    Panel)
A pseudo-prioritized list of the annotations I think we need is below. I'm open to suggestions on how to prioritize, and community PRs are of course very welcome!
- DataFrame
- Series
- Index
- MultiIndex
- Categorical
- CategoricalIndex
- Datetimelike indices
- Numeric indices
...
These don't necessarily need to be completed in order. I'll continue to expand the checklist as we tackle more items, so if you see something you'd like to tackle, feel free to call it out.
@WillAyd I am not sure what exactly needs to be done. Do we need to annotate every attribute/method of the classes DataFrame, Index, etc.? Can you provide an example (mypy docs or an SO link, maybe)?
Yes, that's correct. We would want to add annotations to the methods of these objects (and to attributes where inference may not work).
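For anyone unsure what that looks like in practice, here is a minimal sketch. TinyFrame and its methods are hypothetical stand-ins invented for illustration, not pandas code; the point is only the shape of the work: parameter and return annotations on methods, plus an explicit annotation on an attribute whose type a checker may not be able to infer from its initial assignment.

```python
from typing import Dict, List, Optional


class TinyFrame:
    """Hypothetical stand-in for a pandas object; not pandas code."""

    # Assigned None in __init__, so the intended type is not obvious from
    # the assignment alone; an explicit annotation states it.
    _cache: Optional[Dict[str, float]]

    def __init__(self, data: Dict[str, List[float]]) -> None:
        self._data = data  # type follows from the annotated parameter
        self._cache = None

    def column(self, name: str) -> List[float]:
        # Return a copy of the values stored under `name`.
        return list(self._data[name])

    def mean(self, name: str) -> float:
        # Compute the mean once per column and remember it.
        if self._cache is None:
            self._cache = {}
        if name not in self._cache:
            values = self._data[name]
            self._cache[name] = sum(values) / len(values)
        return self._cache[name]


frame = TinyFrame({"a": [1.0, 2.0, 3.0]})
result = frame.mean("a")
```

With signatures like these in place, a call such as frame.mean(3) is something a type checker like mypy can flag before the code runs.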
I'm unpinning this to make room for #31879. Will re-pin tomorrow.
@WillAyd I am not quite sure that I understand the meaning of this issue.
Is it about creating something like this?
import datetime
import numpy as np
import pandas as pd

class MyDataFrame(pd.DataFrame):
    col_foo: datetime.datetime

def func(df: MyDataFrame):
    df['col_foo'].dt

func(MyDataFrame())  # mypy passes
func(pd.DataFrame(columns=['col_foo'], dtype=np.datetime64))  # mypy passes?
func(pd.DataFrame(columns=['col_foo']))  # mypy raises error?
I was looking for something that imitates the dataclass / NamedTuple usage API.
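For reference, the NamedTuple pattern being alluded to looks roughly like the sketch below. Row and year_of are hypothetical names chosen for illustration; this shows how per-field types are declared in the standard library, not anything pandas provides for columns.

```python
import datetime
from typing import NamedTuple


class Row(NamedTuple):
    # Each field carries a declared type that a checker can rely on.
    col_foo: datetime.datetime


def year_of(row: Row) -> int:
    # A type checker knows row.col_foo is a datetime, so .year is valid.
    return row.col_foo.year


row = Row(col_foo=datetime.datetime(2020, 1, 1))
year = year_of(row)
```

The question in this comment is whether a DataFrame's columns could be described to a type checker in a similar declarative way.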
We can close this issue; I haven't tracked it in quite some time
@WillAyd which issue tracks development of something similar to the type annotations I have mentioned above?
Sounds good. Feel free to submit PRs to improve annotations - they are always welcome
I think this has served its purpose. Closing.