HazyResearch / meerkat

Creative interactive views of any dataset.

[BUG] from_pandas without reset_index

seyuboglu opened this issue

When using meerkat's from_pandas, things break if you have just run a filter and do not call reset_index(). You get an ambiguous key error when calling from_pandas. I would add a check with a better error message for when a user passes a dataframe with a non-sequential index.
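A minimal sketch of what such a check could look like, using only pandas. The helper name check_sequential_index and the error wording are hypothetical, not part of meerkat, and this assumes "sequential" means the index equals 0..n-1:

```python
import numpy as np
import pandas as pd


def check_sequential_index(df: pd.DataFrame) -> None:
    """Hypothetical helper: raise a descriptive error unless the index is 0..n-1."""
    if not df.index.equals(pd.RangeIndex(len(df))):
        raise ValueError(
            "DataFrame index is not sequential (0..n-1); "
            "call df.reset_index(drop=True) before converting."
        )


df = pd.DataFrame({"a": np.arange(16), "b": np.arange(16)})
filtered = df[df["a"] > 3]  # boolean filtering keeps the original labels 4..15

# After reset_index(drop=True) the index is 0..11 again, so the check passes.
check_sequential_index(filtered.reset_index(drop=True))
```

Calling check_sequential_index(filtered) directly would raise the ValueError, which points the user at reset_index instead of an ambiguous key error.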

Trying to reproduce this with:

import numpy as np
import pandas as pd
import meerkat as mk
df = pd.DataFrame({"a": np.arange(16), "b": np.arange(16)})
df = df.filter([0, 3, 4, 5], axis="index")
dp = mk.DataPanel.from_pandas(df)

but things seem to be working fine.

@lorr1 any ideas on how to reproduce this?

Interesting. Ya, I think it's more complex. Try this:

df = pd.DataFrame({"a": np.arange(16), "b": np.arange(16), "c": ["abcdefghijklmnopqrstuvwxyz"[i] for i in range(16)]})
df2 = df[(df["a"] < 12) & (df["b"] > 1)]
df3 = df2[["a", "c"]]
dp = mk.DataPanel.from_pandas(df3)

This seems to be working now, so I'm going to close this for now.