ydataai / ydata-profiling

1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.

Home Page: https://docs.profiling.ydata.ai

Feature Request

gonzalezhomar opened this issue · comments

Missing functionality

After exporting a report (HTML or JSON), there is no option to read it back from the file.
Reading it back would be extremely useful for comparing data long after it has been updated, rather than only data from the same session.

Proposed feature

comparison_report = transformed_report.compare('original_report.html', from_file=True)
... or something like that

Alternatives considered

No response

Additional context

I think it's a great idea to read back a profile report from file. When comparing two "live" DataFrames in memory, the code is very straightforward:
from ydata_profiling import ProfileReport

# df and df_transformed are pandas DataFrames already loaded in this session
original_report = ProfileReport(df, title="Original Data")
transformed_report = ProfileReport(df_transformed, title="Transformed Data")
comparison_report = original_report.compare(transformed_report)
comparison_report.to_file("original_vs_transformed.html")

But what if the reports are on the same data at different times? Today I run the original report and save it; after my findings, the original data gets updated to address my observations. I would then like to compare the new report against the saved original to see whether the changes were effective.
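
To make the use case concrete, here is a rough sketch of what I currently have to do by hand. It assumes both reports were exported as JSON, and it uses only the Python standard library. `flatten` and `diff_saved_reports` are hypothetical helper names of my own, not part of ydata-profiling, and the sketch makes no assumption about which keys the exported JSON contains.

```python
import json
from pathlib import Path
from typing import Any, Dict, Tuple

# Placeholder used when a key exists in only one of the two reports.
MISSING = "<missing>"


def flatten(node: Any, prefix: str = "") -> Dict[str, Any]:
    """Flatten nested dicts into {"a.b.c": leaf}; lists are kept as leaves."""
    if isinstance(node, dict):
        flat: Dict[str, Any] = {}
        for key, value in node.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            flat.update(flatten(value, path))
        return flat
    return {prefix: node}


def diff_saved_reports(old_path: str, new_path: str) -> Dict[str, Tuple[Any, Any]]:
    """Return {dotted_key: (old_value, new_value)} for every leaf that differs."""
    old = flatten(json.loads(Path(old_path).read_text(encoding="utf-8")))
    new = flatten(json.loads(Path(new_path).read_text(encoding="utf-8")))
    changes: Dict[str, Tuple[Any, Any]] = {}
    for key in sorted(set(old) | set(new)):
        before = old.get(key, MISSING)
        after = new.get(key, MISSING)
        if before != after:
            changes[key] = (before, after)
    return changes
```

For example, `diff_saved_reports("original_report.json", "updated_report.json")` (illustrative file names) would list every statistic that changed between the two saved exports. This is only a raw key-by-key diff, not the side-by-side comparison report that `compare` produces, which is why built-in support for reading a saved report back would be much nicer.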