databricks / koalas

Koalas: pandas API on Apache Spark

Allow index in replace(to_replace)

amueller opened this issue · comments

Hey. I ran series.replace(some_things, "other"), which works in pandas when some_things is an index, but not in Koalas.
Is there a plan to support it?
In particular, this was an Int64Index that came out of a value_counts operation.

My pandas code was

import numpy as np
import pandas as pd

series = pd.Series(np.random.randint(0, 10, 40))
counts = series.value_counts()
series.replace(counts.index[counts < 4], -1)

In Koalas you can't subscript an index, so I expected the following to work instead:

import numpy as np
import databricks.koalas as ks

series = ks.Series(np.random.randint(0, 10, 40))
counts = series.value_counts()
series.replace(counts[counts < 4].index, -1)

but it gives "to_replace should be one of str, list, dict, int, float".

Koalas version is 1.4.0.

I wanted to use .tolist() as a workaround, but that is not available in 1.4.0. In which version was it added?

Potentially related to #1516.

Looks like it was added in 1.5.0 (https://github.com/databricks/koalas/releases/tag/v1.5.0). But yeah, ideally we should make it work, I agree.
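
For reference, here is a minimal sketch of the .tolist() workaround discussed above. It is written against pandas so it runs without a Spark session, and the sample data is made up for illustration. On Koalas 1.5.0 or later the same pattern should presumably work with ks.Series, assuming Index.tolist() behaves as described in the release notes. That part is untested here.

```python
import pandas as pd

# Illustrative data (not from the original report):
# 1 occurs 4 times, 2 occurs 5 times, 3 once, 7 twice.
series = pd.Series([1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 7, 7])
counts = series.value_counts()

# Convert the index of rare values to a plain Python list, which is
# one of the types named in the Koalas error message for to_replace.
rare_values = counts[counts < 4].index.tolist()

# Replace every rare value with -1.
replaced = series.replace(rare_values, -1)
print(replaced.tolist())
```

Note that in Koalas, converting an index to a list collects its values onto the driver, so this is only reasonable when the set of values to replace is small, as it is for the output of value_counts on a low-cardinality column.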