'drop_invalid_rows' always False with from_json() and to_json()
Nico-VC opened this issue
Describe the bug
A 'drop_invalid_rows=True' argument set at the DataFrameSchema level gets reset to False when the schema is loaded with .from_json(). Serializing a schema defined in Python with 'drop_invalid_rows=True' via .to_json() does not preserve the flag either.
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandera.
- (optional) I have confirmed this bug exists on the master branch of pandera.
Using .to_json() with this inferred schema ignores the 'drop_invalid_rows=True' argument:
from pandera import DataFrameSchema, Column, Index
import numpy as np

schema = DataFrameSchema(
    columns={
        "Model": Column(
            dtype=np.int32,
            checks=None,
            nullable=False,
            unique=False,
            coerce=True,
            required=True,
            regex=True,
            description=None,
            title=None,
        ),
        "ID": Column(
            dtype=np.int32,
            checks=None,
            nullable=False,
            unique=False,
            coerce=False,
            required=True,
            regex=False,
            description=None,
            title=None,
        ),
    },
    checks=None,
    index=Index(
        dtype="int64",
        checks=[],
        nullable=False,
        coerce=False,
        name=None,
        description=None,
        title=None,
    ),
    dtype=None,
    coerce=True,
    strict=True,
    name=None,
    ordered=False,
    unique=None,
    report_duplicates="all",
    unique_column_names=False,
    add_missing_columns=False,
    title=None,
    description=None,
    drop_invalid_rows=True,  # not reflected in the JSON output
)

schema.to_json()
The same behavior is observed when the argument is set at the Column level. As a workaround, I end up manually setting the value back to True on the schema object.