databricks / koalas

Koalas: pandas API on Apache Spark

Is Koalas still being worked on, or is the project on pause at the moment?

Trodenn opened this issue

As the title says, I would like to know more about the current status of the Koalas package. My team's project is looking to transition to Spark, and since most of us have pandas experience, Koalas seemed like the perfect fit. However, beyond simple data manipulations, I have noticed that there are still big differences: sometimes code that works in pandas does not work in Koalas.

Is it a good idea to continue using Koalas, or would it be better to move to Spark's own pandas package?

Migrating to PySpark itself is encouraged, since PySpark now includes the pandas API (as `pyspark.pandas`, starting with Spark 3.2). Koalas releases may still happen for security or other critical issues, but otherwise not much will be updated.
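
For a gradual migration, one option is a small import shim that prefers `pyspark.pandas` and falls back to the legacy `databricks.koalas` package when only that is installed. The following is a minimal sketch under that assumption, not an official migration tool; `find_pandas_on_spark` and `load_pandas_on_spark` are hypothetical helper names that do not exist in either library.

```python
import importlib
import importlib.util

# Candidate modules in order of preference. `pyspark.pandas` ships with
# PySpark 3.2+; `databricks.koalas` is the legacy standalone package.
CANDIDATES = ("pyspark.pandas", "databricks.koalas")


def find_pandas_on_spark(candidates=CANDIDATES):
    """Return the name of the first importable candidate, or None.

    Hypothetical helper for illustration only.
    """
    for name in candidates:
        try:
            if importlib.util.find_spec(name) is not None:
                return name
        except ImportError:
            # The parent package (e.g. `pyspark`) is not installed.
            continue
    return None


def load_pandas_on_spark():
    """Import and return the preferred pandas-on-Spark module."""
    name = find_pandas_on_spark()
    if name is None:
        raise ImportError(
            "neither pyspark.pandas nor databricks.koalas is installed"
        )
    return importlib.import_module(name)
```

With this in place, code that used to start with `import databricks.koalas as ks` could instead use `ps = load_pandas_on_spark()` and then call `ps.DataFrame(...)` as before. Once every environment runs Spark 3.2 or later, the shim can be replaced by a plain `import pyspark.pandas as ps`.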