pwwang / datar

A Grammar of Data Manipulation in python

Home Page: https://pwwang.github.io/datar/

[BUG] if_else() is evaluating `true statement` even when the `condition` is `false`

GitHunter0 opened this issue · comments

datar version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of datar and its backends.

Issue Description

Hi @pwwang , it seems if_else() is trying to evaluate the true statement even when the condition is false.

Consider this MWE

import pandas as pd
import datar.all as d
from datar import f

df = pd.DataFrame(dict(date = ["2023-01-01", "2023-01-25"]))

(df 
    >> d.mutate(date = d.if_else(
        d.is_numeric(f['date']),
        true = f['date'].astype(int),
        false = 'Not numeric',
        )
    )
)

It is returning:
#> ValueError: invalid literal for int() with base 10: '2023-01-01'

Expected Behavior

On its own, f['date'].astype(int) is expected to raise an error. But since the condition is not true, that branch should be bypassed and never evaluated, so no error should be thrown. Am I missing something?

Installed Versions

python : 3.11.3 | packaged by Anaconda, Inc. | (main, May 15 2023, 15:41:31) [MSC v.1916 64 bit (AMD64)]
datar : 0.15.0
simplug : 0.3.2
executing : 2.0.0
pipda : 0.13.0
datar-numpy : 0.3.1
numpy : 1.25.0
datar-pandas: 0.5.0
pandas : 2.0.2

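It's because f['date'].astype(int) is not evaluated lazily: both the true and false arguments are computed for the whole column before if_else() picks values from them. It's the same behavior in R.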
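
To illustrate the point about eager evaluation, here is a minimal pure-Python sketch. The `if_else` and `lazy_if_else` functions below are hypothetical toy helpers, not datar's implementation; they only show that arguments are computed before a function body runs, and that passing callables is one way to defer the work.

```python
def if_else(condition, true, false):
    """Toy element-wise selector (hypothetical, not datar's code)."""
    return [t if c else f for c, t, f in zip(condition, true, false)]


def lazy_if_else(condition, true, false, values):
    """Toy lazy variant: `true` and `false` are callables applied per element."""
    return [true(v) if c else false(v) for c, v in zip(condition, values)]


dates = ["2023-01-01", "2023-01-25"]
condition = [False, False]  # no element is numeric

error = None
try:
    # The `true` argument is built in full before if_else() is even entered,
    # so int() fails although every condition is False.
    if_else(condition, [int(d) for d in dates], ["Not numeric"] * len(dates))
except ValueError as exc:
    error = str(exc)

# With callables, int() is only invoked where the condition holds.
lazy_result = lazy_if_else(condition, int, lambda v: "Not numeric", dates)

print(error)
print(lazy_result)
```

The eager call fails with the same kind of ValueError as in the report, while the callable-based variant never touches int() for these rows.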

The difference is that as.integer() in R produces NAs for values it cannot convert, whereas .astype(int) in Python raises an error.
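
One possible workaround, sketched here in plain pandas rather than datar, is to use a conversion that yields missing values instead of raising, which is closer to what as.integer() does in R. The extra "42" row is an assumption added for illustration and is not part of the original MWE.

```python
import pandas as pd

# "42" is an illustrative extra row, not part of the original report.
df = pd.DataFrame({"date": ["2023-01-01", "2023-01-25", "42"]})

# errors="coerce" turns unparseable strings into NaN instead of raising.
as_number = pd.to_numeric(df["date"], errors="coerce")

# Keep converted values where parsing succeeded, otherwise use a label.
# Casting to object lets numbers and strings share one column.
result = as_number.astype(object).where(as_number.notna(), "Not numeric")

print(result.tolist())
```

Because both branches are still computed for every row, this only works when neither branch can raise; the coercing conversion is what makes that safe.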

Thank you, I thought the evaluation was lazy.