kumarkrishna / fastssl

Fast SSL

Ranges used in function fit_powerlaw() and stringer_get_powerlaw()

SirNader opened this issue · comments

Dear authors,

I have recently been following your work "Assessing Representation Quality in Self-Supervised Learning by measuring eigenspectrum decay", and I am trying to apply it to compute alpha in order to assess the representation quality of a self-supervised model.

I computed the covariance matrix of the model features, then applied torch.linalg.eigvals() to calculate the eigenspectrum, using the following code:

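import torch

cov = torch.zeros(768, 768)  # accumulator for the 768-dim feature covariance
N = len(val_data_loader)     # number of batches in the loader
for i, x in enumerate(val_data_loader):
    inputs, batchLabels = x
    features = backbone.features(inputs.to('cuda'))

    features = features.detach().cpu()
    cov += torch.mm(features.T, features) / N
# eigvals() returns complex-valued eigenvalues in no guaranteed order
eigenspectrum = torch.linalg.eigvals(cov).detach().cpu().numpy()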

Now, to calculate alpha with the functions fit_powerlaw() and stringer_get_powerlaw() from your code, I can see that both take the eigenspectrum and another argument that specifies a range.

My question is: what does this range mean, and what range should I use to calculate alpha properly when assessing the quality of the model's features?

Thanks!
Nader

Hi Nader,

Thanks for your interest in our work. :)

We recommend using the stringer_get_powerlaw() function. If you plot your eigenspectrum (sorted from highest to lowest eigenvalue), you might see that the first few eigenvalues do not follow power-law-like behavior, unlike the rest of the eigenvalues. This is typical of long-tailed distributions. In our work, we measure the power-law fit to the tail of the eigenspectrum, and the range argument effectively defines the eigenvalue indices that correspond to the tail.
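To make the "sorted eigenspectrum plus index range" idea concrete, here is a minimal NumPy sketch. The helper names sorted_eigenspectrum and select_tail are hypothetical and are not part of fastssl. The sketch assumes the covariance matrix is symmetric and that the range holds 0-based indices into the descending-sorted spectrum; please check the repository code for the exact convention.

```python
import numpy as np

def sorted_eigenspectrum(cov):
    """Return the real eigenvalues of a symmetric matrix, largest first.

    Hypothetical helper for illustration, not a fastssl function.
    """
    cov = np.asarray(cov, dtype=np.float64)
    # eigvalsh assumes a symmetric (Hermitian) input and returns real
    # eigenvalues in ascending order, so reverse for descending order.
    return np.linalg.eigvalsh(cov)[::-1]

def select_tail(eigenspectrum, index_range):
    """Pick the eigenvalues at the given indices (assumed 0-based)."""
    return np.asarray(eigenspectrum)[np.asarray(index_range)]

# Toy example: a diagonal covariance whose eigenvalues are 1/k, k = 1..200.
toy_cov = np.diag(1.0 / np.arange(1, 201))
spectrum = sorted_eigenspectrum(toy_cov)
tail = select_tail(spectrum, np.arange(10, 100))
```

Plotting spectrum against rank on log-log axes is then a quick way to see where the leading eigenvalues stop deviating from a straight line.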

For our work, we generally used np.arange(10,100) as the range. But it is probably best to plot the eigenspectrum and decide where the tail starts and ends. We recommend keeping a decent range of values (roughly one order of magnitude; the more the better) for a good power-law fit.

Hope this helps. Please reach out if this wasn't clear or if you have more questions. :)
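As a rough illustration of how a range turns into an alpha estimate, the sketch below fits a straight line to log(eigenvalue) versus log(rank) over the chosen indices with ordinary least squares. This is a simplified stand-in and not the stringer_get_powerlaw() implementation, which may weight or index the points differently. The name fit_tail_alpha is hypothetical, and the sketch assumes a descending-sorted spectrum with strictly positive eigenvalues in the chosen range.

```python
import numpy as np

def fit_tail_alpha(eigenspectrum, index_range):
    """Estimate alpha in lambda_k ~ k^(-alpha) via a log-log linear fit.

    Hypothetical helper for illustration, not a fastssl function.
    Assumes eigenvalues are sorted in descending order and positive
    at every index in index_range.
    """
    idx = np.asarray(index_range)
    eigs = np.asarray(eigenspectrum, dtype=np.float64)
    ranks = idx + 1  # 1-based rank, so log(rank) is defined at index 0
    slope, intercept = np.polyfit(np.log(ranks), np.log(eigs[idx]), deg=1)
    return -slope, intercept

# Synthetic check: an exact power law with alpha = 1.
spectrum = 1.0 / np.arange(1, 201)
alpha, _ = fit_tail_alpha(spectrum, np.arange(10, 100))
```

Trying a few different start and end indices and checking how much alpha moves is a simple sanity check on the chosen range.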

Thanks a lot Arna!
I appreciate your answer 😄 and will let you know if anything else comes up.

Cheers,
Nader