Creating multiple instances with different generalization rules fails with a "KeyError" exception
iflow opened this issue
Anton Jungwirth commented
Steps to reproduce
- Anonymize data
generalization_rules = {
    'sex': GenRule([])  # 1 level
}
adult_anonymised = anonymize(adult, generalization_rules=generalization_rules, k=2, max_sup=0.0, info_loss=entropy_loss)
- Anonymize data with different rule set
generalization_rules = {
    'race': GenRule([])
}
adult_anonymised = anonymize(adult, generalization_rules=generalization_rules, k=2, max_sup=0.0, info_loss=entropy_loss)
Result:
KeyError: 'sex'
Expected:
Anonymized data with new rule set.
Solution
Search for the function _k_min and replace its first lines with this:
def _k_min(b_node, t_node, k, max_sup, k_min_set=None):
    """ Core of OLA's operation: build k-minimal set with binary search in generalization
    strategies of lattice """
    if k_min_set is None:
        k_min_set = set()
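The None-sentinel pattern in this fix usually addresses Python's shared mutable default argument, which would explain why state from the first anonymize call (the 'sex' rule) shows up in the second one. That diagnosis is an assumption here, since the original signature is not shown in this thread. A minimal sketch of the effect, independent of the library (collect_shared and collect_fresh are hypothetical names used only for illustration):

```python
def collect_shared(item, seen=set()):
    # The default set is created once, when the function is defined,
    # so every call that omits `seen` mutates the same object.
    seen.add(item)
    return seen


def collect_fresh(item, seen=None):
    # A None sentinel gives each call its own new set.
    if seen is None:
        seen = set()
    seen.add(item)
    return seen


# Two independent calls, mirroring the two anonymize runs above.
first_shared = sorted(collect_shared("sex"))
second_shared = sorted(collect_shared("race"))  # also contains "sex"

first_fresh = sorted(collect_fresh("sex"))
second_fresh = sorted(collect_fresh("race"))    # contains only "race"

print(second_shared)
print(second_fresh)
```

With the shared default, the second call still sees the entry left behind by the first. With the sentinel, each call starts from an empty set.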
Leonardo Mazzone commented
Hi @iflow, thank you for your report. Would you be open to contributing a PR?
Anton Jungwirth commented
Yes, I would have done it yesterday already :)
However, I could not reproduce the issue with your example.
I will try again today and commit the solution.
Anton Jungwirth commented
With the previous commits, the issue seems to be fixed. I extended your example to test the behaviour.