sdv-dev / Copulas

A library to model multivariate data using copulas.

Warning printed too many times (`RuntimeWarning: invalid value encountered in scalar divide ....`)

npatki opened this issue

Environment Details

  • SDV version: 1.2.0 (latest)
  • Python version: 3.10
  • Operating System: Darwin (MacOS)

Error Description

Sometimes, I see a RuntimeWarning repeated many times during the fitting phase.

  1. In HMASynthesizer, it interrupts the progress bar
  2. The warning is not useful to me. It seems to be related to the mathematics in copulas, so there's nothing I can do to get rid of it.

We should silence this warning, since the software is still working as intended. We could consider logging it at the INFO level (e.g. `logger.info`) instead.
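As a rough illustration of the "log it instead" idea, the sketch below captures any RuntimeWarning raised while a function runs and forwards it to a logger at INFO level. It uses only the standard library. The helper `run_and_log_runtime_warnings`, the logger name, and the stand-in `noisy_fit` function are all hypothetical names made up for this example; this is not how SDV or Copulas implements its warning handling.

```python
import logging
import warnings

logger = logging.getLogger("warning_redirect_sketch")  # hypothetical logger name


def run_and_log_runtime_warnings(func, *args, **kwargs):
    """Call func, log any RuntimeWarning at INFO level, and re-emit other warnings."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning raised inside the call instead of printing it.
        warnings.simplefilter("always")
        result = func(*args, **kwargs)

    for item in caught:
        if issubclass(item.category, RuntimeWarning):
            logger.info("%s: %s", item.category.__name__, item.message)
        else:
            # Anything that is not a RuntimeWarning is passed through unchanged.
            warnings.warn_explicit(
                item.message, item.category, item.filename, item.lineno)
    return result


def noisy_fit():
    """Hypothetical stand-in for a fitting step that emits a RuntimeWarning."""
    warnings.warn("invalid value encountered in scalar divide", RuntimeWarning)
    return "fitted"


result = run_and_log_runtime_warnings(noisy_fit)
```

With a wrapper like this, nothing is written to stderr during the call, so a progress bar would not be interrupted, while the message is still available to anyone who enables INFO logging.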

Root Cause

I suspect this is coming from the Gaussian Copula synthesizer. In this synthesizer, we are silencing warnings coming from scipy in this line. For some reason, the RuntimeWarning is still coming through.

This only appears to happen for the 'truncnorm' distribution.

Steps to reproduce

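from sdv.datasets.demo import download_demo
from sdv.multi_table import HMASynthesizer

real_data, metadata = download_demo(
  modality='multi_table', dataset_name='fake_hotels')

synthesizer = HMASynthesizer(metadata)

# The call that received these parameters is missing from the report as captured.
# Assumption: set_table_parameters was used; the table name is unknown, so both tables are set.
for table_name in ('hotels', 'guests'):
  synthesizer.set_table_parameters(
    table_name=table_name,
    table_parameters={ 'default_distribution': 'truncnorm'})

synthesizer.fit(real_data)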


Preprocess Tables: 100%|███████████████| 2/2 [00:00<00:00, 19.58it/s]

Learning relationships:
(1/1) Tables 'hotels' and 'guests' ('hotel_id'): 100%|███████████████| 10/10 [00:01<00:00,  6.78it/s]

Modeling Tables:   0%|                                                                                                                                        | 0/1 [00:00<?, ?it/s]/Users/npatki/Documents/DataCebo/SDV/misc/env/lib/python3.10/site-packages/copulas/univariate/ RuntimeWarning: invalid value encountered in scalar divide
  a = (self.min - loc) / scale
/Users/npatki/Documents/DataCebo/SDV/misc/env/lib/python3.10/site-packages/copulas/univariate/ RuntimeWarning: divide by zero encountered in scalar divide
  b = (self.max - loc) / scale
/Users/npatki/Documents/DataCebo/SDV/misc/env/lib/python3.10/site-packages/copulas/univariate/ RuntimeWarning: divide by zero encountered in scalar divide
  a = (self.min - loc) / scale
/Users/npatki/Documents/DataCebo/SDV/misc/env/lib/python3.10/site-packages/copulas/univariate/ RuntimeWarning: invalid value encountered in scalar divide
  b = (self.max - loc) / scale
Modeling Tables: 100%|███████████████| 1/1 [00:00<00:00, 13.64it/s]
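
The two expressions in the output, `(self.min - loc) / scale` and `(self.max - loc) / scale`, have the form of the standardized bounds that scipy's truncnorm expects. A minimal sketch of how such a division can produce both warning messages is below. It assumes NumPy floating-point scalars and a `scale` of zero, which is a guess about the inputs and is not confirmed by the report; the function name and the input values are hypothetical. It also shows that `np.errstate` silences these particular warnings at the point where they are raised.

```python
import warnings

import numpy as np


def standardized_bounds(min_value, max_value, loc, scale):
    """Compute a and b with the same expressions as the traceback: (bound - loc) / scale."""
    a = (np.float64(min_value) - np.float64(loc)) / np.float64(scale)
    b = (np.float64(max_value) - np.float64(loc)) / np.float64(scale)
    return a, b


# Hypothetical inputs: scale is 0 and loc equals the minimum.
# a becomes 0 / 0 (an "invalid value") and b becomes 1 / 0 (a "divide by zero").
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    a, b = standardized_bounds(min_value=0.0, max_value=1.0, loc=0.0, scale=0.0)

for item in caught:
    print(item.category.__name__, item.message)

# The same computation under np.errstate raises no warnings.
with warnings.catch_warnings(record=True) as silenced:
    warnings.simplefilter("always")
    with np.errstate(divide="ignore", invalid="ignore"):
        standardized_bounds(min_value=0.0, max_value=1.0, loc=0.0, scale=0.0)
```

If `loc` equals the maximum instead, the two messages swap between `a` and `b`, which would match the second pair of warnings in the output. Whether a zero `scale` is what actually happens inside the truncnorm fit is not established here.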