lxuechen / private-transformers

A codebase that makes differentially private training of transformers easy.

Home Page: https://arxiv.org/abs/2110.05679

Setting another seed doesn't change the result

JunyiZhu-AI opened this issue · comments

Hi Xuechen,

I have another issue, this time with the training seed. I would like to vary the random seed so that I can collect statistical results. I have tried many different approaches, but even after commenting out the set_seed() function, the eval accuracy is identical down to the last digit. How can I vary the random seed? I'm running experiments in examples/classification.

Thanks!

set_seed should be the only place where the seed (and randomness) is controlled. Have you tried using larger and more diverse values for the seed argument?

yes, here is what I have tried:

 # Set seed
 import numpy as np

 # Draw a fresh seed on every run so repeated runs should differ.
 seed = np.random.randint(0, 1000000)
 set_seed(seed)
...
 set_seed(seed)

I have run it several times, and the eval accuracy is always the same.

That sounds really strange. Could you remove the set_seed calls altogether?

It's really hard to pinpoint the problem without additional context. Does this still happen if you use a different output_dir for each seed?

One additional thing to check is whether the model checkpoint stored in output_dir is updated after you re-run your script. This may affect evaluation, since the checkpoint is restored before evaluation.
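
A quick way to verify this is to watch the checkpoint's modification time between runs. A minimal sketch, assuming the usual transformers layout (the output_dir path and the pytorch_model.bin filename are assumptions; adjust to your setup):

 import os
 import time

 output_dir = "output"  # assumed: your actual --output_dir
 ckpt = os.path.join(output_dir, "pytorch_model.bin")  # assumed checkpoint name
 if os.path.exists(ckpt):
     # If this timestamp doesn't change between runs, a stale checkpoint is
     # being restored before evaluation, which would mask any seed change.
     print("checkpoint last modified:", time.ctime(os.path.getmtime(ckpt)))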

Indeed, I tried commenting out set_seed and running rm -rf $output_dir, then re-running the algorithm. The eval accuracy is still the same. But when I declare a seed argument and pass it a large random number, the result changes. This was done after commenting out the set_seed function, so I guess the seed is being used somewhere else. Here is how I declare the argument:

from dataclasses import dataclass, field

from transformers import TrainingArguments


@dataclass
class DynamicTrainingArguments(TrainingArguments):
    # Overrides the inherited seed so it can be set from the command line.
    seed: int = field(
        default=0,
        metadata={"help": "Seed."}
    )
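
For context, here is a minimal sketch of how such a field reaches set_seed when parsed from the command line (the entry point and flags are illustrative, not the repo's actual code):

 from transformers import HfArgumentParser, set_seed

 parser = HfArgumentParser(DynamicTrainingArguments)
 (training_args,) = parser.parse_args_into_dataclasses()
 # Whatever --seed is passed on the command line ends up here.
 set_seed(training_args.seed)

Running the script with, say, --seed 424242 --output_dir output would then seed everything from 424242.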

I'm closing this issue since the problem is solved. Thanks for the response!

I think I've pinpointed the issue. If you're using the latest transformers library, then Trainer also sets the seed in its __init__ function; see this line.
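
Given that, one way to get genuinely different runs is to route a fresh seed through TrainingArguments rather than calling set_seed yourself. A minimal sketch, assuming a recent transformers version where Trainer.__init__ re-seeds from args.seed (the output_dir naming is illustrative):

 import numpy as np
 from transformers import TrainingArguments

 # Draw a fresh seed per run and pass it through TrainingArguments,
 # so the re-seeding inside Trainer.__init__ picks it up.
 fresh_seed = int(np.random.randint(0, 1_000_000))
 training_args = TrainingArguments(
     output_dir=f"output/seed_{fresh_seed}",  # separate dir per run (illustrative)
     seed=fresh_seed,
 )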

I think that must be the cause of this issue. Thanks for the information!