The validateDataFrame function pulls the entire dataset into the Driver
joshvfleming opened this issue
The following code attempts to pull the entire dataset into the Driver before running validations:
We should instead map over the DataFrame rows so that the validations run in a distributed fashion on the executors, and then collect
only the validation results on the Driver.
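
As a rough illustration of the approach proposed above, here is a minimal, Spark-free sketch in Python. It is not the project's actual code: the rule set `RULES`, the helpers `validate_partition` and `validate_distributed`, and the `id` and `age` columns are all hypothetical. Plain lists stand in for DataFrame partitions so the example runs without a cluster. The point it tries to show is that each partition is validated where it lives, and only the (usually small) list of failures is gathered at the end.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Row = Dict[str, object]
# A rule is a (name, predicate) pair. Both rules below are hypothetical.
Rule = Tuple[str, Callable[[Row], bool]]

RULES: List[Rule] = [
    ("id_present", lambda r: r.get("id") is not None),
    ("age_non_negative",
     lambda r: isinstance(r.get("age"), int) and r["age"] >= 0),
]


def validate_partition(rows: Iterable[Row]) -> Iterator[Tuple[object, str]]:
    """Yield (row id, failed rule name) for every failure in one partition."""
    for row in rows:
        for name, predicate in RULES:
            if not predicate(row):
                yield (row.get("id"), name)


def validate_distributed(
    partitions: Iterable[Iterable[Row]],
) -> List[Tuple[object, str]]:
    """Validate each partition independently, then gather only the failures.

    Assumption: in PySpark this loop would roughly correspond to
    df.rdd.mapPartitions(...).collect(), with each Row converted via
    row.asDict() before the rules are applied.
    """
    failures: List[Tuple[object, str]] = []
    for part in partitions:
        failures.extend(validate_partition(part))
    return failures


# Two lists standing in for two partitions of a DataFrame.
partitions = [
    [{"id": 1, "age": 30}, {"id": 2, "age": -5}],
    [{"id": None, "age": 41}, {"id": 4, "age": 7}],
]
failures = validate_distributed(partitions)
print(failures)
```

Because `validate_partition` takes an iterator of rows and yields results lazily, it never needs a whole partition in memory at once, and the Driver only ever sees the failure records rather than the full dataset.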
Resolved by #307