Warning printed too many times (`RuntimeWarning: invalid value encountered in scalar divide ....`)
npatki opened this issue
Neha Patki commented
Environment Details
- SDV version: 1.2.0 (latest)
- Python version: 3.10
- Operating System: Darwin (MacOS)
Error Description
Sometimes, I see a `RuntimeWarning` repeated many times during the fitting phase.
- In `HMASynthesizer`, it interrupts the progress bar.
- The warning is not useful to me. It seems to be related to the mathematics in copulas, so there's nothing I can do to get rid of it.
We should silence this warning since the software is still working as intended. We can consider logging it (`logger.INFO`) instead.
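As a rough sketch of the logging idea, the context manager below captures warnings raised inside a block and re-emits any `RuntimeWarning` as a log record at INFO level (the standard library constant is `logging.INFO`). The function name `log_runtime_warnings` and the logger name are hypothetical and are not part of SDV or copulas; this only illustrates the approach, not how the fix should be implemented.

```python
import logging
import warnings
from contextlib import contextmanager

# Hypothetical logger name, chosen only for this illustration.
LOGGER = logging.getLogger("warning_to_log_demo")


@contextmanager
def log_runtime_warnings(logger, level=logging.INFO):
    """Record warnings raised in the block; log RuntimeWarnings, re-emit the rest."""
    caught = []
    try:
        with warnings.catch_warnings(record=True) as caught:
            # Record every warning instead of printing it to stderr.
            warnings.simplefilter("always")
            yield
    finally:
        for w in caught:
            if issubclass(w.category, RuntimeWarning):
                logger.log(level, "%s:%s: %s", w.filename, w.lineno, w.message)
            else:
                # Warnings of other categories go back through the normal machinery.
                warnings.warn_explicit(w.message, w.category, w.filename, w.lineno)
```

Under these assumptions, the fitting call would be wrapped as `with log_runtime_warnings(LOGGER): ...`, so the progress bar output is no longer interleaved with warning text. Note that `warnings.catch_warnings` modifies global state and is not thread-safe.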
Root Cause
I suspect this is coming from the Gaussian Copula synthesizer. In this synthesizer, we are silencing warnings coming from scipy in this line. For some reason, the `RuntimeWarning` is still coming through.
This only appears to happen for the `'truncnorm'` distribution.
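One possible explanation, offered as an assumption rather than a confirmed diagnosis: if the existing filter is scoped to the scipy module (for example `warnings.filterwarnings('ignore', module='scipy')`), it would not match a warning that is attributed to `copulas/univariate/truncated_gaussian.py`, which is where the output in this report places it. The standard-library sketch below simulates warnings attributed to two module names and shows which ones survive each kind of filter. The helper names `emit` and `surviving` are hypothetical, and the scipy module path is only a stand-in.

```python
import warnings


def emit(module_name):
    """Emit a RuntimeWarning attributed to the given (simulated) module name."""
    warnings.warn_explicit(
        "invalid value encountered in scalar divide",
        RuntimeWarning,
        filename=module_name.replace(".", "/") + ".py",
        lineno=45,
        module=module_name,
    )


def surviving(**filter_kwargs):
    """Return the filenames of warnings that get past an 'ignore' filter."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        # Inserted at the front of the filter list, so it is checked first.
        warnings.filterwarnings("ignore", **filter_kwargs)
        emit("scipy.stats._distn_infrastructure")  # stand-in scipy module
        emit("copulas.univariate.truncated_gaussian")
    return [w.filename for w in caught]


# A module-scoped filter only hides the warning attributed to scipy.
print(surviving(module="scipy"))
# A category-scoped filter hides both.
print(surviving(category=RuntimeWarning))
```

If this is indeed the cause, widening the filter to cover the copulas module or the `RuntimeWarning` category would be one way to silence it.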
Steps to reproduce
from sdv.datasets.demo import download_demo
from sdv.multi_table import HMASynthesizer

real_data, metadata = download_demo(
    modality='multi_table', dataset_name='fake_hotels')

synthesizer = HMASynthesizer(metadata)
synthesizer.set_table_parameters(
    table_name='hotels',
    table_parameters={'default_distribution': 'truncnorm'})
synthesizer.fit(real_data)
Output:
Preprocess Tables: 100%|███████████████| 2/2 [00:00<00:00, 19.58it/s]
Learning relationships:
(1/1) Tables 'hotels' and 'guests' ('hotel_id'): 100%|███████████████| 10/10 [00:01<00:00, 6.78it/s]
Modeling Tables: 0%| | 0/1 [00:00<?, ?it/s]/Users/npatki/Documents/DataCebo/SDV/misc/env/lib/python3.10/site-packages/copulas/univariate/truncated_gaussian.py:45: RuntimeWarning: invalid value encountered in scalar divide
a = (self.min - loc) / scale
/Users/npatki/Documents/DataCebo/SDV/misc/env/lib/python3.10/site-packages/copulas/univariate/truncated_gaussian.py:46: RuntimeWarning: divide by zero encountered in scalar divide
b = (self.max - loc) / scale
/Users/npatki/Documents/DataCebo/SDV/misc/env/lib/python3.10/site-packages/copulas/univariate/truncated_gaussian.py:45: RuntimeWarning: divide by zero encountered in scalar divide
a = (self.min - loc) / scale
/Users/npatki/Documents/DataCebo/SDV/misc/env/lib/python3.10/site-packages/copulas/univariate/truncated_gaussian.py:46: RuntimeWarning: invalid value encountered in scalar divide
b = (self.max - loc) / scale
Modeling Tables: 100%|███████████████| 1/1 [00:00<00:00, 13.64it/s]
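The pattern of the four warnings above is consistent with `scale` being 0 while `loc` sits first at the lower bound and then at the upper bound, although that is an inference from the messages and not something verified against the copulas source. The NumPy sketch below mirrors the two divisions from the traceback; the function name `standardized_bounds` and the parameters `lo` and `hi` (standing in for `self.min` and `self.max`) are hypothetical.

```python
import warnings

import numpy as np


def standardized_bounds(lo, hi, loc, scale):
    """Compute (lo - loc) / scale and (hi - loc) / scale, collecting any warnings."""
    lo, hi, loc, scale = (np.float64(v) for v in (lo, hi, loc, scale))
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        a = (lo - loc) / scale
        b = (hi - loc) / scale
    return a, b, [str(w.message) for w in caught]


# loc at the lower bound, scale 0: a is 0/0 (invalid value), b is 10/0 (divide by zero).
print(standardized_bounds(0, 10, loc=0, scale=0))

# loc at the upper bound, scale 0: a is -10/0 (divide by zero), b is 0/0 (invalid value).
print(standardized_bounds(0, 10, loc=10, scale=0))
```

In both cases the computation still returns `nan` or an infinity instead of raising, which matches the observation that fitting completes despite the warnings.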