tensorflow / privacy

Library for training machine learning models with privacy for training data

Exposure computation in secret sharer colab

lxuechen opened this issue · comments

Hi,

I'm attempting to use the secret sharer utilities that were just released (located here).

I'm reading the colab notebook, and I am somewhat confused by the perplexity computation.

Specifically, I'm referring to this chunk of code in the function compute_perplexity_for_secret:

  cce = tf.keras.losses.SparseCategoricalCrossentropy(reduction=tf.keras.losses.Reduction.NONE)
  ppls = [{} for _ in range(len(secrets))]
  ppls_reference = [None] * len(secrets)
  for i, secret in enumerate(secrets):
    for r in secret.datasets.keys():
      d = secret.datasets[r]
      loss = np.mean(cce(d['label'], prediction_model.predict(d['data'])),
                     axis=1)
      loss = list(loss)
      if r == 0:
        ppls_reference[i] = loss
      else:
        ppls[i][r] = loss

The perplexity variables are directly assigned the cross-entropy losses, but cross entropy and perplexity are not the same quantity: perplexity is the exponential of the (per-token) cross entropy.

It also appears that this isn't just a terminology issue. The approximation-by-distribution model of exposure in Section 4.3 is based on estimating the density of the random variable Px_theta(s[t]), where t is drawn uniformly from the space R. Fitting a skew-normal distribution to the negative log-probabilities (essentially the cross-entropy losses) would yield a different fitted distribution than fitting one to the actual probability values, and this would likely change the CDF value obtained afterwards.
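To make the concern concrete, here is a toy sketch of what I mean (made-up numbers and scipy's skew-normal; none of the names below come from the library): fitting the skew-normal to the losses versus to the exponentiated negative losses gives genuinely different fitted distributions, and therefore different exposure estimates.

import numpy as np
from scipy.stats import skewnorm

# Toy stand-ins for the per-canary cross-entropy losses from the snippet above.
rng = np.random.default_rng(0)
losses = rng.gamma(5.0, 0.5, size=1000)   # negative log-probabilities
probs = np.exp(-losses)                   # the corresponding probabilities

fit_loss = skewnorm.fit(losses)
fit_prob = skewnorm.fit(probs)

canary_loss = 1.0
# Exposure estimate from the loss fit: -log2 Pr[X <= canary loss].
exposure_from_losses = -np.log2(skewnorm.cdf(canary_loss, *fit_loss))
# Exposure estimate from the probability fit: -log2 Pr[X >= canary probability].
exposure_from_probs = -np.log2(skewnorm.sf(np.exp(-canary_loss), *fit_prob))
print(exposure_from_losses, exposure_from_probs)  # generally not equal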

Would appreciate some insight on this issue in the example. Thanks in advance.

Hi,

Thank you for your interest, and sorry for the confusion!
It is actually the log-perplexity (rather than the perplexity) that we compute and use for the distribution modeling. We should have made this clear in the code and will rename the variable soon. I think the log-perplexity is the cross-entropy loss, and according to the paper it might follow the skew-normal distribution. Does that sound correct to you?
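To spell out what I mean (a small self-contained sketch, not the colab's exact code): the per-token sparse categorical cross entropy is the negative log-probability the model assigns to the true token, so aggregating it over time gives the sequence's log-perplexity.

import tensorflow as tf

# Two time steps over a vocabulary of size 4; rows are the model's
# predicted probability distributions at each step.
probs = tf.constant([[0.7, 0.1, 0.1, 0.1],
                     [0.2, 0.5, 0.2, 0.1]])
labels = tf.constant([0, 1])  # the true tokens

cce = tf.keras.losses.SparseCategoricalCrossentropy(
    reduction=tf.keras.losses.Reduction.NONE)
per_token_loss = cce(labels, probs)        # [-log 0.7, -log 0.5]

# Summing (or, in the colab, averaging) over time gives the negative
# log-probability of the whole sequence, i.e. its log-perplexity.
log_perplexity = tf.reduce_sum(per_token_loss)
print(per_token_loss.numpy(), log_perplexity.numpy())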

Best,
Shuang

Thanks for the quick reply. In the paper, the skew-normal density approximates P[ Px_theta(s[t]) = v ], so it seems the skew-normal approximation should be fitted to sampled values of Px_theta(s[t]). This quantity is the exponentiated negative cross entropy, not the cross entropy per se. The code I'm confused by is this line.

Does that make sense?

Sorry, I didn't quite get that. I think Px_theta is the log-perplexity in the paper, and in the code we pass the cross-entropy loss to compute_exposure_extrapolation. Do you mean we should use the perplexity instead of the log-perplexity in that function?

Thanks for the quick response. I think you're right; I misread Px_theta as a probability. There is also another minor caveat: the paper sums the per-token log-probabilities over time, whereas the code takes the mean. This would not matter if all canaries are of the same length, which in the present case is indeed true.
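For what it's worth, a quick toy check of why that washes out when all canaries have the same length (again made-up numbers, not the colab's code):

import numpy as np
from scipy.stats import skewnorm

rng = np.random.default_rng(0)
mean_log_ppls = rng.gamma(5.0, 0.5, size=1000)  # per-token means, as in the colab
seq_len = 12                                    # every canary has the same length
sum_log_ppls = seq_len * mean_log_ppls          # the summed version from the paper

canary_mean = 1.0
exposure_mean = -np.log2(skewnorm.cdf(canary_mean, *skewnorm.fit(mean_log_ppls)))
exposure_sum = -np.log2(skewnorm.cdf(seq_len * canary_mean,
                                     *skewnorm.fit(sum_log_ppls)))
# The skew-normal family is closed under rescaling, so the two estimates
# agree up to fitting noise.
print(exposure_mean, exposure_sum)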

Thanks again.