automl / NASLib

NASLib is a Neural Architecture Search (NAS) library for facilitating NAS research for the community by providing interfaces to several state-of-the-art NAS search spaces and optimizers.

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.DoubleTensor [1, 512, 8, 8]], which is output 0 of ReluBackward0, is at version 1; expected version 0 instead.

thompsondd opened this issue · comments

I ran Nasbench101 with the zero-cost predictors in NASLib and got this error.

Has anyone run into this problem?

Hi @thompsondd,

Could you please tell us which proxy you were using? Looks to me like removing an inplace relu operation somewhere in the Nasbench101 graph will fix the issue.
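For context, this failure mode can be reproduced in plain PyTorch, outside NASLib. The sketch below uses a toy model, not the Nasbench101 graph: a second `nn.ReLU(inplace=True)` overwrites the output that the first ReLU saved for its backward pass, which raises the same kind of RuntimeError. `disable_inplace_relu` is a hypothetical helper written for this illustration, not a NASLib function.

```python
import torch
import torch.nn as nn


def disable_inplace_relu(model: nn.Module) -> int:
    """Switch every nn.ReLU in `model` to out-of-place mode; return how many changed."""
    changed = 0
    for module in model.modules():
        if isinstance(module, nn.ReLU) and module.inplace:
            module.inplace = False
            changed += 1
    return changed


# Toy stand-in (assumption): the in-place ReLU overwrites the previous ReLU's output.
toy = nn.Sequential(nn.Linear(4, 4), nn.ReLU(), nn.ReLU(inplace=True)).double()
x = torch.ones(1, 4, dtype=torch.double)

try:
    toy(x).sum().backward()
    failed = False
except RuntimeError as err:
    # Autograd notices that a tensor it saved was modified after the forward pass.
    failed = "inplace operation" in str(err)

n_changed = disable_inplace_relu(toy)
toy(x).sum().backward()  # succeeds once no saved ReLU output is overwritten
```

If the helper reports zero changed modules on a real model and the error persists, the in-place write is probably coming from somewhere other than an `nn.ReLU` module.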

Thanks!

Thank you for your reply, @Neonkraft.

I am trying to use the Synflow proxy on Nasbench101, but the architecture "(0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 4, 3, 3, 1)" raises this error. (It is just one of the failing cases I have run into.)

Following the source code, I removed the inplace option from the relu operation in https://github.com/automl/NASLib/blob/8c45f19dc259956c3bd253071135c798ad3df8ce/naslib/search_spaces/nasbench101/base_ops.py#L18C3-L19C10, but nothing changed.

Could you please tell me what you modified in the code?
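One possible explanation, offered as an assumption rather than a diagnosis of NASLib: the tensor named in the error is a ReLU output, so any later in-place write to it (for example `+=` when summing two branches) would raise the error even when every ReLU is out-of-place. The sketch below shows that pattern on a made-up function, `branch_sum`, and wraps the backward pass in `torch.autograd.detect_anomaly()`, which makes PyTorch also report the forward-pass traceback of the operation whose backward failed.

```python
import torch


def branch_sum(x: torch.Tensor, in_place: bool) -> torch.Tensor:
    """Hypothetical stand-in for a cell that sums two branches."""
    a = torch.relu(x)
    b = torch.relu(2 * x)
    if in_place:
        a += b  # overwrites the ReLU output that autograd saved for backward
        return a
    return a + b  # allocates a new tensor, so the saved output stays intact


def backward_ok(in_place: bool) -> bool:
    """Return True if the backward pass completes without a RuntimeError."""
    x = torch.ones(1, 4, dtype=torch.double, requires_grad=True)
    try:
        # Anomaly mode is slow; it is meant for debugging only.
        with torch.autograd.detect_anomaly():
            branch_sum(x, in_place).sum().backward()
        return True
    except RuntimeError:
        return False


print("out-of-place sum ok:", backward_ok(False))
print("in-place sum ok:", backward_ok(True))
```

Running the failing zero-cost query inside the same context manager should point at the line in the forward pass that produced the overwritten tensor.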

Hi @thompsondd,

I tried to reproduce your error but had no problem evaluating the zero-cost score for that architecture. Here is the snippet I used; please correct me if it doesn't exactly match your case.

import logging
from naslib.predictors import ZeroCost
from naslib import utils
from naslib.utils import setup_logger, get_dataset_api
from naslib.search_spaces.nasbench101.conversions import convert_tuple_to_spec
from naslib.search_spaces import NasBench101SearchSpace


# Load the zero-cost config and set up logging
config = utils.get_config_from_args(config_type="zc")
logger = setup_logger(config.save + "/log.log")
logger.setLevel(logging.INFO)

utils.set_seed(config.seed)
utils.log_args(config)

# Build the search space and set the architecture reported in the issue
dataset_api = get_dataset_api("nasbench101", config.dataset)
graph = NasBench101SearchSpace(n_classes=10)
test_arch = (0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0,
             0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0,
             0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1,
             0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 4, 3, 3, 1)
spec = convert_tuple_to_spec(test_arch)
graph.set_spec(spec)

# Score the architecture with the synflow zero-cost proxy
predictor = ZeroCost(method_type="synflow")
train_loader, _, test_loader, _, _ = utils.get_train_val_loaders(config)
graph.parse()
score = predictor.query(graph, train_loader)
print("Zero cost score:", score)
logger.info('Test experiment complete.')

I got a synflow score of 125.99.

Is it possible that I missed something, or that this is a version-related problem? Could you try running the same snippet and tell us what you get?
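To compare environments, it may help to post the installed package versions along with the output. The following is a small sketch using only the standard library; the package names in the default tuple are assumptions about what is relevant here, and `installed_versions` is a hypothetical helper, not part of NASLib.

```python
import platform
from importlib import metadata


def installed_versions(packages=("torch", "naslib", "numpy")):
    """Return a dict mapping 'python' and each package name to its installed version."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed under this name.
            versions[name] = "not installed"
    return versions


for name, version in installed_versions().items():
    print(f"{name}: {version}")
```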