py-why / dowhy

DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions. DoWhy is based on a unified language for causal inference, combining causal graphical models and potential outcomes frameworks.

Home Page: https://www.pywhy.org/dowhy


Support polars data frames

krz opened this issue

Polars is a high-performance data frame library for Python, known for fast data processing and a concise syntax. Its multi-threaded query engine and integration with the Python ecosystem make it a good choice for handling large datasets.

While many popular libraries such as scikit-learn and seaborn support polars data frames, dowhy currently does not.
The current way to use a polars data frame with dowhy is to convert it to pandas first (e.g. polars_df.to_pandas()).

Please support polars natively, as its popularity is increasing.

Thanks for raising this @krz. Can you give more details on how scikit-learn supports polars DFs? Do they have a common API that can support both pandas and polars (if installed)?

Also, we'd love to have contributions. Would you like to start a PR to support polars?

Thanks for your reply. scikit-learn made sure that all their code supports the Python dataframe interchange protocol. See commits scikit-learn/scikit-learn#26464 and scikit-learn/scikit-learn#27315 and discussion scikit-learn/scikit-learn#25896.
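To illustrate what the interchange protocol makes possible, here is a minimal sketch of an input-normalizing helper. The name `to_pandas_frame` is hypothetical and is not part of dowhy or scikit-learn. It assumes pandas 1.5 or later, which provides `pandas.api.interchange.from_dataframe`. Any object that exposes `__dataframe__` (polars data frames do) could be passed in without the library importing polars itself.

```python
import pandas as pd
from pandas.api.interchange import from_dataframe


def to_pandas_frame(data):
    """Hypothetical helper: return a pandas DataFrame for any input that
    is already pandas or implements the dataframe interchange protocol."""
    if isinstance(data, pd.DataFrame):
        # Already pandas: nothing to convert.
        return data
    if hasattr(data, "__dataframe__"):
        # Interchange-protocol object (e.g. a polars DataFrame).
        return from_dataframe(data)
    raise TypeError(
        f"Expected a dataframe supporting __dataframe__, got {type(data).__name__}"
    )


# Usage with a pandas frame; a polars frame would take the second branch.
df = pd.DataFrame({"treatment": [0, 1, 1], "outcome": [1.0, 2.5, 3.0]})
converted = to_pandas_frame(df)
```

This only normalizes the input at the boundary. Code further down that relies on pandas-specific behaviour would still run on the converted frame, so it is a stopgap rather than native support.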

I think an important first step for dowhy would be to remove functionality that relies solely on pandas, such as #1135.