tensorflow / data-validation

Library for exploring and validating machine learning data

Custom Data Validation

mmithra opened this issue

Hi, I was wondering how I would write a custom data validation. For example, if I want to write a KL divergence function to detect data skew and drift, how would I do that?
I couldn't find anything on Stack Overflow or in the documentation.
If you have any resources or code samples, I would really appreciate it.

Hi -- TFDV does not currently provide support for doing custom data validation, but we are planning to do so in the future.

Just in case you haven't already seen it in the TFDV docs, TFDV currently supports detecting skew/drift in numeric features using Jensen-Shannon divergence (which is based on KL divergence), and in categorical/string features using L-infinity distance. See https://www.tensorflow.org/tfx/data_validation/get_started#checking_data_skew_and_drift for more info.
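To make the three measures mentioned in this thread concrete, here is a minimal standalone sketch in plain Python. It is not TFDV code and does not reproduce how TFDV computes these values internally. The function names (`kl_divergence`, `js_divergence`, `l_infinity_distance`, `normalize`) and the example counts are hypothetical, and it assumes both datasets have already been reduced to counts over the same ordered buckets or categories.

```python
import math


def normalize(counts):
    """Turn raw bucket/category counts into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p, q):
    """KL(p || q) in bits for two discrete distributions of equal length.

    Returns infinity if q assigns zero probability to an outcome that p
    considers possible, which is the usual convention.
    """
    result = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue  # a zero-probability outcome contributes nothing
        if qi == 0:
            return math.inf
        result += pi * math.log2(pi / qi)
    return result


def js_divergence(p, q):
    """Jensen-Shannon divergence: a symmetrized, bounded variant of KL.

    With log base 2 the value lies between 0 (identical) and 1 (disjoint).
    """
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def l_infinity_distance(p, q):
    """Largest absolute difference in probability for any single outcome."""
    return max(abs(pi - qi) for pi, qi in zip(p, q))


# Hypothetical counts per category in two datasets (illustrative values only).
train = normalize([50, 30, 20])
serving = normalize([20, 30, 50])

print("KL(train || serving):", kl_divergence(train, serving))
print("JS divergence:", js_divergence(train, serving))
print("L-infinity distance:", l_infinity_distance(train, serving))
```

Unlike plain KL divergence, the Jensen-Shannon value is symmetric and stays finite even when one dataset is missing a category entirely, which makes it easier to compare against a fixed threshold.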