AI-SDC / SACRO-ML

Collection of tools and resources for managing the statistical disclosure control of trained machine learning models

Home Page: https://ai-sdc.github.io/SACRO-ML/


Attribute Attack should report confidence that training set is not more vulnerable than test

jim-smith opened this issue

At the moment we effectively run a worst-case attack: a simulated attacker has the model (which outputs probabilities) and a record with the target label known but the value of one feature missing.
A 'competent' published model may increase the likelihood that an attacker can estimate the missing value for a record more reliably than they could without the model.

So the question is: is this risk different for items that were in the training set than it is for the general population?

We assess this risk separately for each attribute - assuming the TRE may set a different risk appetite for each.
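For concreteness, here is a minimal sketch of one way a record could be flagged as 'vulnerable' for a single attribute: the simulated attacker tries every candidate value for the missing feature and keeps the one the model scores most highly for the known label. Function and parameter names here are illustrative only (not the SACRO-ML API), and the model is assumed to expose a scikit-learn style `predict_proba`.

```python
import numpy as np

def is_vulnerable(model, record, feature_idx, true_value, true_label, candidate_values):
    """Can an attacker with the model, the record's label, and all other
    feature values recover the missing value at position feature_idx?"""
    scores = []
    for value in candidate_values:
        probe = np.array(record, dtype=float)
        probe[feature_idx] = value                      # fill in the candidate value
        proba = model.predict_proba(probe.reshape(1, -1))[0, true_label]
        scores.append(proba)                            # model confidence in the known label
    best_guess = candidate_values[int(np.argmax(scores))]
    return best_guess == true_value                     # attacker's best guess is correct
```

Counting such records over the training and test sets gives the $v_{tr}$ and $v_{te}$ used in the procedure below.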

Procedure:

  1. Compute the number of vulnerable train and test records ($v_{tr}$ and $v_{te}$ respectively)
  2. Assess the proportion of 'vulnerable' training set items: $p_{tr} = v_{tr}/n_{tr}$
  3. Assess the proportion of 'vulnerable' test set items: $p_{te} = v_{te}/n_{te}$

Currently we report the ratio of the two fractions, $\frac{p_{tr}}{p_{te}}$.

We should report the probability that the observed difference in proportions is significant

  • using a one-tailed test, i.e. is the training data more vulnerable than the test data?

-- some code examples are in metrics.py for the pdf, or see the description here

  • Null hypothesis $H_0: p_{tr} \leq p_{te}$; alternative $H_1: p_{tr} > p_{te}$ (the training data is more vulnerable)
  • pooled proportion $p = \frac{v_{tr} + v_{te}}{n_{tr} + n_{te}}$
  • standard error $SE = \sqrt{p (1 - p) \left[ \frac{1}{n_{tr}} + \frac{1}{n_{te}} \right]}$
  • test statistic $z = (p_{tr} - p_{te}) / SE$
  • the p-value is the probability, under the null, of a z-score at least as large as the observed $z$, i.e. $P(Z \geq z) = 1 - \Phi(z)$

Using `norm` from `scipy.stats` (note that $z$ is already standardised by dividing by $SE$, so the standard normal applies, and the one-tailed p-value is the upper tail):

p_value = norm.sf(z)  # equivalently 1 - norm.cdf(z)
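Putting the steps together, a self-contained sketch of the calculation (variable and function names are illustrative, not the existing metrics.py API):

```python
import numpy as np
from scipy.stats import norm

def train_vulnerability_pvalue(v_tr, n_tr, v_te, n_te):
    """One-tailed two-proportion z-test: is the training set more vulnerable?

    Returns the p-value for H1: p_tr > p_te; a small value means the excess
    vulnerability of the training records is unlikely under the null.
    """
    p_tr = v_tr / n_tr                                  # vulnerable proportion, train
    p_te = v_te / n_te                                  # vulnerable proportion, test
    p = (v_tr + v_te) / (n_tr + n_te)                   # pooled proportion
    se = np.sqrt(p * (1 - p) * (1 / n_tr + 1 / n_te))   # standard error
    z = (p_tr - p_te) / se                              # test statistic
    return norm.sf(z)                                   # upper tail: P(Z >= z)

# e.g. 40/200 vulnerable train records vs 25/200 vulnerable test records
print(train_vulnerability_pvalue(40, 200, 25, 200))     # ~0.021: significant at 95%, not 99%
```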

Then for the report we have to decide whether to use 95% or 99% confidence.
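Whichever level we settle on, the report flag could be derived along these lines (alpha values and names assumed for illustration, reusing the sketch above):

```python
ALPHA = 0.05   # 95% confidence; use 0.01 for 99%
p_value = train_vulnerability_pvalue(40, 200, 25, 200)  # function from the sketch above
training_more_vulnerable = bool(p_value < ALPHA)         # flag to include in the report
```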