unionai-oss / pandera

A light-weight, flexible, and expressive statistical data testing library

Home Page: https://www.union.ai/pandera


Incorrect validation passes pandera=0.19.0b3

obiii opened this issue

Describe the bug
pandera: 0.19.0b3
Python: 3.11
polars: 0.20.23

We are using DataFrameModel to perform some data validation. The validation does not work as expected:

import polars as pl
import pandera.polars as pa

# Both columns are declared non-nullable and unique.
class CaseSchema(pa.DataFrameModel):
    case_id: str = pa.Field(nullable=False, unique=True)
    gdwh_portfolio_id: str = pa.Field(nullable=False, unique=True)

# case_id contains a duplicate ("case1") and a null, so it violates both constraints.
lf = pl.LazyFrame({
    "case_id": ["case1", "case1", None],
    "gdwh_portfolio_id": ["portfolio1", "portfolio2", "portfolio3"]
})

# Expected to raise a validation error, but completes without one.
CaseSchema.validate(lf).collect()

The same happens without .collect():
CaseSchema.validate(lf)

No validation error is raised in either case.

Observed behaviour

No error is raised, so we assume the data passes validation.

Expected behavior

Validation should fail because case_id is not unique and contains None.
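
To make the expectation concrete, here is a minimal plain-Python sketch of the two checks (non-null and unique) applied to the same sample data. It does not use pandera or polars at all. The helper name find_violations and the violations dict are hypothetical, introduced only for illustration, and say nothing about how pandera implements these checks internally.

```python
from collections import Counter

# Same sample data as in the reproduction above.
data = {
    "case_id": ["case1", "case1", None],
    "gdwh_portfolio_id": ["portfolio1", "portfolio2", "portfolio3"],
}


def find_violations(column):
    """Return (has_nulls, duplicated_values) for a list of values.

    Hypothetical helper for illustration only; not part of pandera.
    """
    has_nulls = any(value is None for value in column)
    # Count only non-null values so that nulls are reported separately.
    counts = Counter(value for value in column if value is not None)
    duplicates = sorted(value for value, n in counts.items() if n > 1)
    return has_nulls, duplicates


violations = {name: find_violations(values) for name, values in data.items()}

for name, (has_nulls, duplicates) in violations.items():
    print(f"{name}: has_nulls={has_nulls}, duplicates={duplicates}")
```

On this data, case_id fails both checks (it has a null and "case1" appears twice), while gdwh_portfolio_id passes both, which is the outcome we expected the schema validation to report.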

Additional context

We are utilizing polars.