The predict method of the base task expects to build a logger from a temporary folder
samuelbortolinUH opened this issue
Samuel Bortolin commented
I'm submitting a
- bug report
Issue Description
- When Issue Happens
When loading an api_task and running predictions on another device, or on the same device after the temporary folder has been cleaned.
- Steps To Reproduce
- Store an api_task using `pickle.dump`
- Load the api_task on another device using `pickle.load`
- Call the `.predict` method
Expected Behavior
The prediction is computed correctly.
Current Behavior
An error is raised because the log file in the temporary folder cannot be found, so the handler 'distributed_logfile' cannot be configured.
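The underlying failure can be reproduced with the standard library alone: when a file handler points into a directory that no longer exists, `logging.config.dictConfig` wraps the `FileNotFoundError` in a `ValueError`, just as in the traceback below. The handler name matches autoPyTorch's config; the paths here are illustrative:

```python
import logging.config
import os
import shutil
import tempfile

# Simulate the cleaned /tmp folder: create a temporary directory,
# then delete it before the logging config points a FileHandler into it.
tmp_dir = tempfile.mkdtemp(prefix="autoPyTorch_tmp_")
shutil.rmtree(tmp_dir)

config = {
    "version": 1,
    "handlers": {
        "distributed_logfile": {
            "class": "logging.FileHandler",
            "filename": os.path.join(tmp_dir, "distributed.log"),
        },
    },
    "root": {"handlers": ["distributed_logfile"]},
}

try:
    # FileHandler cannot open a file in a missing directory, and
    # dictConfig re-raises that as a ValueError for the handler.
    logging.config.dictConfig(config)
except ValueError as err:
    print(err)  # Unable to configure handler 'distributed_logfile'
```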
Possible Solution
Remove the line:
self._logger = self._get_logger("Predict-Logger")
from the base task's predict method, since the logger should already be configured, or configure it only if one is not present.
Error message
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/autoPyTorch_tmp_27bb5c73-aed0-11ed-ba38-0242ac130004/distributed.log'
Full Traceback:
FileNotFoundError Traceback (most recent call last)
File /usr/lib/python3.8/logging/config.py:563, in DictConfigurator.configure(self)
562 try:
--> 563 handler = self.configure_handler(handlers[name])
564 handler.name = name
File /usr/lib/python3.8/logging/config.py:744, in DictConfigurator.configure_handler(self, config)
743 try:
--> 744 result = factory(**kwargs)
745 except TypeError as te:
File /usr/lib/python3.8/logging/__init__.py:1147, in FileHandler.__init__(self, filename, mode, encoding, delay)
1146 else:
-> 1147 StreamHandler.__init__(self, self._open())
File /usr/lib/python3.8/logging/__init__.py:1176, in FileHandler._open(self)
1172 """
1173 Open the current base file with the (original) mode and encoding.
1174 Return the resulting stream.
1175 """
-> 1176 return open(self.baseFilename, self.mode, encoding=self.encoding)
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/autoPyTorch_tmp_27bb5c73-aed0-11ed-ba38-0242ac130004/distributed.log'
The above exception was the direct cause of the following exception:
ValueError Traceback (most recent call last)
Cell In[71], line 1
----> 1 y_pred = loaded_model.predict(X_test)
File ~/.local/lib/python3.8/site-packages/autoPyTorch/api/tabular_classification.py:490, in TabularClassificationTask.predict(self, X_test, batch_size, n_jobs)
486 raise ValueError("predict() is only supported after calling search. Kindly call first "
487 "the estimator search() method.")
489 X_test = self.input_validator.feature_validator.transform(X_test)
--> 490 predicted_probabilities = super().predict(X_test, batch_size=batch_size,
491 n_jobs=n_jobs)
493 if self.input_validator.target_validator.is_single_column_target():
494 predicted_indexes = np.argmax(predicted_probabilities, axis=1)
File ~/.local/lib/python3.8/site-packages/autoPyTorch/api/base_task.py:1851, in BaseTask.predict(self, X_test, batch_size, n_jobs)
1838 """Generate the estimator predictions.
1839 Generate the predictions based on the given examples from the test set.
1840
(...)
1846 Array with estimator predictions.
1847 """
1849 # Parallelize predictions across models with n_jobs processes.
1850 # Each process computes predictions in chunks of batch_size rows.
-> 1851 self._logger = self._get_logger("Predict-Logger")
1853 if self.ensemble_ is None and not self._load_models():
1854 raise ValueError("No ensemble found. Either fit has not yet "
1855 "been called or no ensemble was fitted")
File ~/.local/lib/python3.8/site-packages/autoPyTorch/api/base_task.py:516, in BaseTask._get_logger(self, name)
511 logger_name = 'AutoPyTorch:%s:%d' % (name, self.seed)
513 # Setup the configuration for the logger
514 # This is gonna be honored by the server
515 # Which is created below
--> 516 setup_logger(
517 filename='%s.log' % str(logger_name),
518 logging_config=self.logging_config,
519 output_dir=self._backend.temporary_directory,
520 )
522 # As AutoPyTorch works with distributed process,
523 # we implement a logger server that can receive tcp
524 # pickled messages. They are unpickled and processed locally
525 # under the above logging configuration setting
526 # We need to specify the logger_name so that received records
527 # are treated under the logger_name ROOT logger setting
528 context = multiprocessing.get_context(self._multiprocessing_context)
File ~/.local/lib/python3.8/site-packages/autoPyTorch/utils/logging_.py:41, in setup_logger(output_dir, filename, distributedlog_filename, logging_config)
37 distributedlog_filename = logging_config['handlers']['distributed_logfile']['filename']
38 logging_config['handlers']['distributed_logfile']['filename'] = os.path.join(
39 output_dir, distributedlog_filename
40 )
---> 41 logging.config.dictConfig(logging_config)
File /usr/lib/python3.8/logging/config.py:808, in dictConfig(config)
806 def dictConfig(config):
807 """Configure logging using a dictionary."""
--> 808 dictConfigClass(config).configure()
File /usr/lib/python3.8/logging/config.py:570, in DictConfigurator.configure(self)
568 deferred.append(name)
569 else:
--> 570 raise ValueError('Unable to configure handler '
571 '%r' % name) from e
573 # Now do any that were deferred
574 for name in deferred:
ValueError: Unable to configure handler 'distributed_logfile'
Version installed of autoPyTorch
- autoPyTorch==0.2.1