tensorflow / privacy

Library for training machine learning models with privacy for training data

Exposure computation in secret sharer colab

lxuechen opened this issue · comments

Hi,

I'm attempting to use the secret sharer utilities that were just released (located here).

I'm reading the colab notebook, and I am somewhat confused by the perplexity computation.

Specifically, I'm referring to this chunk of code in the function compute_perplexity_for_secret:

  cce = tf.keras.losses.SparseCategoricalCrossentropy(reduction=tf.keras.losses.Reduction.NONE)
  ppls = [{} for _ in range(len(secrets))]
  ppls_reference = [None] * len(secrets)
  for i, secret in enumerate(secrets):
    for r in secret.datasets.keys():
      d = secret.datasets[r]
      loss = np.mean(cce(d['label'], prediction_model.predict(d['data'])),
                     axis=1)
      loss = list(loss)
      if r == 0:
        ppls_reference[i] = loss
      else:
        ppls[i][r] = loss

The perplexity variables are directly assigned the cross-entropy losses, but cross entropy and perplexity are not the same quantity: perplexity is the exponential of the (per-token) cross entropy.

It also appears that this isn't just a terminology issue. The approximation-by-distribution model of exposure in Section 4.3 is based on estimating the density of the random variable Px_theta(s[t]), where t is drawn uniformly from the space R. Fitting a skew-normal distribution to the negative log-probabilities (essentially the cross-entropy losses) would yield a different fitted distribution than fitting one to the actual probability values, and this would likely change the CDF value obtained afterwards.
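To make the concern concrete, here is a toy sketch of what I mean (made-up numbers and scipy's skew-normal; none of the names below come from the library): fitting the skew-normal to the losses versus to the exponentiated negative losses gives genuinely different fitted distributions, and therefore different exposure estimates.

import numpy as np
from scipy.stats import skewnorm

# Toy stand-ins for the per-canary cross-entropy losses from the snippet above.
rng = np.random.default_rng(0)
losses = rng.gamma(5.0, 0.5, size=1000)   # negative log-probabilities
probs = np.exp(-losses)                   # the corresponding probabilities

fit_loss = skewnorm.fit(losses)
fit_prob = skewnorm.fit(probs)

canary_loss = 1.0
# Exposure estimate from the loss fit: -log2 Pr[X <= canary loss].
exposure_from_losses = -np.log2(skewnorm.cdf(canary_loss, *fit_loss))
# Exposure estimate from the probability fit: -log2 Pr[X >= canary probability].
exposure_from_probs = -np.log2(skewnorm.sf(np.exp(-canary_loss), *fit_prob))
print(exposure_from_losses, exposure_from_probs)  # generally not equal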

Would appreciate some insight on this issue in the example. Thanks in advance.

Hi,

Thank you for your interest, and sorry for the confusion!
It is actually the log-perplexity (rather than the perplexity) that we compute and use for the distribution modeling. We should have made this clear in the code and will rename the variable soon. I think the log-perplexity is the cross-entropy loss, and according to the paper it might follow the skew-normal distribution. Does that sound correct to you?
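To spell out what I mean (a small self-contained sketch, not the colab's exact code): the per-token sparse categorical cross entropy is the negative log-probability the model assigns to the true token, so aggregating it over time gives the sequence's log-perplexity.

import tensorflow as tf

# Two time steps over a vocabulary of size 4; rows are the model's
# predicted probability distributions at each step.
probs = tf.constant([[0.7, 0.1, 0.1, 0.1],
                     [0.2, 0.5, 0.2, 0.1]])
labels = tf.constant([0, 1])  # the true tokens

cce = tf.keras.losses.SparseCategoricalCrossentropy(
    reduction=tf.keras.losses.Reduction.NONE)
per_token_loss = cce(labels, probs)        # [-log 0.7, -log 0.5]

# Summing (or, in the colab, averaging) over time gives the negative
# log-probability of the whole sequence, i.e. its log-perplexity.
log_perplexity = tf.reduce_sum(per_token_loss)
print(per_token_loss.numpy(), log_perplexity.numpy())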

Best,
Shuang

Thanks for the quick reply. In the paper, the skew-normal density approximates P[ Px_theta(s[t]) = v ], so it seems the skew-normal approximation should be fitted to sampled values of Px_theta(s[t]). This quantity is the exponentiated negative cross entropy, not the cross entropy per se. The code I'm confused by is this line.

Does that make sense?

Sorry, I didn't quite get that. I think Px_theta is the log-perplexity in the paper, and in the code we pass the cross-entropy loss to compute_exposure_extrapolation. Do you mean we should use the perplexity instead of the log-perplexity in that function?

Thanks for the quick response. I think you're right; I misread Px_theta as a probability. There is also another minor caveat: the paper sums the per-token log-probabilities over time, whereas the code takes the mean. This would not matter if all canaries are of the same length, which in the present case is indeed true.
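For what it's worth, a quick toy check of why that washes out when all canaries have the same length (again made-up numbers, not the colab's code):

import numpy as np
from scipy.stats import skewnorm

rng = np.random.default_rng(0)
mean_log_ppls = rng.gamma(5.0, 0.5, size=1000)  # per-token means, as in the colab
seq_len = 12                                    # every canary has the same length
sum_log_ppls = seq_len * mean_log_ppls          # the summed version from the paper

canary_mean = 1.0
exposure_mean = -np.log2(skewnorm.cdf(canary_mean, *skewnorm.fit(mean_log_ppls)))
exposure_sum = -np.log2(skewnorm.cdf(seq_len * canary_mean,
                                     *skewnorm.fit(sum_log_ppls)))
# The skew-normal family is closed under rescaling, so the two estimates
# agree up to fitting noise.
print(exposure_mean, exposure_sum)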

Thanks again.