huawei-noah / trustworthyAI

Trustworthy AI related projects

the data of causal discovery

sususnow opened this issue

Hi, I would like to know the data requirements for causal discovery in the web GUI. data1 and data3 below show the format of my local datasets; data2 shows the format produced by the data generation task.
Causal discovery runs successfully when I select data1 as an external training dataset, or data2 as a built-in training dataset.
When I select data3 as an external training dataset, the web GUI always shows "校验结果为false" ("validation result is false"), and the terminal shows "POST /task/check_dataset HTTP/1.1" 200 -. When I download the data2 file to my local machine and then select it as an external training dataset, the web GUI and the terminal show the same failure.

data1:[0 1 0 0 0 0 0 1 0 0 0 0 0 -1]
data2:[-0.516906642 0.498168803 -0.228214563 0.752834357 0.592922701]
data3:[0.363636364 0 0.090909091 0 0 0 0.090909091 0 0.090909091 0.363636364 0 0 0 -3]

Hello,

I believe this could be a bug in how the data format is checked here:

data_type = str(data_df.dtypes.unique()[0])
if len(data_df.dtypes.unique()) == 1 and ('float' in data_type or 'int' in data_type):
    return data_df.shape[1]

The current code version only allows for a single data type in the dataframe (either float or int), not both at the same time.
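To illustrate the behaviour described above, here is a minimal sketch that wraps the quoted check in a hypothetical function, check_columns, and runs it on two small tables. The function name, the None return on failure, and the assumption that the data is read as a headerless CSV are mine, not taken from the package; the sample values only loosely follow data1 and data3.

```python
import io
import pandas as pd


def check_columns(data_df):
    """Hypothetical wrapper around the quoted check.

    Returns the number of columns if every column shares one
    float or int dtype, otherwise None (an assumed failure value).
    """
    data_type = str(data_df.dtypes.unique()[0])
    if len(data_df.dtypes.unique()) == 1 and ('float' in data_type or 'int' in data_type):
        return data_df.shape[1]
    return None


# Assumed input format: comma-separated values without a header row.
int_csv = "0,1,0,-1\n1,0,0,-1\n"                  # every column parses as int64
mixed_csv = "0.363636364,0,0.090909091,-3\n0.1,1,0.2,-1\n"  # float64 and int64 columns

int_df = pd.read_csv(io.StringIO(int_csv), header=None)
mixed_df = pd.read_csv(io.StringIO(mixed_csv), header=None)

int_result = check_columns(int_df)      # single dtype, so the check passes
mixed_result = check_columns(mixed_df)  # two dtypes, so the check fails
```

The mixed table fails only because pandas infers int64 for columns that contain no decimal point, which gives the dataframe two distinct dtypes.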
For now, you can convert all of the data to float and it should work. For example, append .0 to the first value of every column that would otherwise be read as int (0 becomes 0.0), so that the whole column is parsed as float.
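Rather than editing values by hand, the same workaround can be sketched with pandas. This is only an illustration: the helper name to_all_float_csv is hypothetical, and it assumes the dataset is a headerless, comma-separated file. In practice you would pass file paths to read_csv and to_csv instead of strings.

```python
import io
import pandas as pd


def to_all_float_csv(csv_text):
    """Hypothetical helper: re-serialise a headerless CSV so every column is float."""
    df = pd.read_csv(io.StringIO(csv_text), header=None)
    # Cast every column to float so integer columns are written as 0.0, -3.0, ...
    df_float = df.astype(float)
    # With no path argument, to_csv returns the CSV as a string.
    return df_float.to_csv(index=False, header=False)


raw = "0.363636364,0,0.090909091,-3\n0.1,1,0.2,-1\n"
converted = to_all_float_csv(raw)

# Reading the converted text back should now yield a single float dtype.
reread = pd.read_csv(io.StringIO(converted), header=None)
```

After conversion, integer-looking values carry an explicit decimal point, so the file should pass a check that requires one dtype for the whole dataframe.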

We will update the code for the next version of the package to allow a mix of int and float columns.

A fix for this issue has now been added. Using a mix of int and float columns should now work as expected.
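For readers on an older version, a relaxed check that accepts a mix of int and float columns could look roughly like the sketch below. This is not the code that was merged into the package; the function name count_numeric_columns and the None return value are assumptions made for illustration.

```python
import pandas as pd
from pandas.api.types import is_float_dtype, is_integer_dtype


def count_numeric_columns(data_df):
    """Hypothetical relaxed check: accept any mix of int and float columns.

    Returns the number of columns if every column is int or float,
    otherwise None (an assumed failure value).
    """
    has_columns = data_df.shape[1] > 0
    all_numeric = all(
        is_float_dtype(dtype) or is_integer_dtype(dtype)
        for dtype in data_df.dtypes
    )
    if has_columns and all_numeric:
        return data_df.shape[1]
    return None


mixed_df = pd.DataFrame({"a": [0.5, 0.25], "b": [0, -3]})   # float64 and int64
text_df = pd.DataFrame({"a": [0.5, 0.25], "b": ["x", "y"]})  # float64 and object

mixed_ok = count_numeric_columns(mixed_df)
text_ok = count_numeric_columns(text_df)
```

Checking each column's dtype separately, instead of requiring a single dtype for the whole dataframe, is what lets the mixed table through while still rejecting non-numeric columns.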