mortonjt / q2-differential

A collection of common differential abundance techniques

'InferenceData' object has no attribute 'observed_data'

jindongmin opened this issue

Hi, after I ran https://github.com/jindongmin/q2-differential/blob/main/scripts/disease_parallel.py, it created an inference object, but that object does not include observed_data.

Hi @jindongmin, could you provide everything required to reproduce your error? It looks like the command line options and datasets all need to be specified, no?

Although, I'm a little confused by the following:

    dat = {
        "F":table.shape[0],#number of features
        "N":table.shape[1],#number of samples
    }
    obs = az.from_dict(
    #    observed_data={"observed": dat["y"]},
        coords={"tbl_sample": table.ids(axis="sample")},
        dims={"observed": ["tbl_sample","feature"]}
    )
    inference = az.concat(samples, obs)

I'm not exactly sure what this is going to do -- I would think that the inference object should already have the simulated data returned. Let me dig.

Alright, so I think your inference objects already have the simulated data. It is found under inference.posterior_predictive.

Below is the command that you probably want:

    y_pred = inference.posterior_predictive.stack(sample=("chain", "draw"))
    y_pred = y_pred.stack(feature=('feature', 'tbl_sample'))['y_predict']
    y_pred = y_pred.fillna(0)  # get rid of nans if any pop up

Your simulation dataset may be large (since you are combining chains with draws). But see if you can run these commands.

If you want to compute r2_score, you can run the following:

    import biom
    import xarray as xr
    table = biom.load_table('<your biom table>')
    y_obs = xr.DataArray(table.matrix_data.todense())
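The snippet above loads the observed counts but stops before the score itself. A minimal sketch of the remaining step follows, assuming that r2_score means the coefficient of determination (as in scikit-learn's `sklearn.metrics.r2_score`) and that the posterior predictive draws are summarized by their mean before scoring. Both are assumptions, not something the thread confirms. The toy arrays and the helper `r2` are hypothetical stand-ins for the real table and inference object.

```python
import numpy as np
import xarray as xr

rng = np.random.default_rng(0)
n_features, n_samples = 4, 5

# Toy stand-in for xr.DataArray(table.matrix_data.todense()):
# rows are features, columns are samples.
y_obs = xr.DataArray(
    rng.poisson(5.0, size=(n_features, n_samples)).astype(float),
    dims=("feature", "tbl_sample"),
)

# Toy stand-in for the posterior predictive draws (2 chains, 3 draws each).
y_pred = xr.DataArray(
    y_obs.values + rng.normal(0.0, 1.0, size=(2, 3, n_features, n_samples)),
    dims=("chain", "draw", "feature", "tbl_sample"),
)

# Assumption: summarize the draws by their mean before scoring.
y_mean = y_pred.mean(dim=("chain", "draw"))


def r2(obs, pred):
    """Coefficient of determination, 1 - SS_res / SS_tot, over all entries."""
    obs = np.asarray(obs, dtype=float).ravel()
    pred = np.asarray(pred, dtype=float).ravel()
    ss_res = np.sum((obs - pred) ** 2)
    ss_tot = np.sum((obs - obs.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Put both arrays in the same dimension order before flattening them.
score = r2(
    y_obs.transpose("feature", "tbl_sample"),
    y_mean.transpose("feature", "tbl_sample"),
)
print(score)
```

This version flattens the whole matrix into one vector. scikit-learn's `r2_score` on 2-D input averages a per-column score by default, so the two can differ.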

There is an error from the second line:

    In [49]: y_pred = inference.posterior_predictive.stack(sample=("chain", "draw"))

    In [50]: y_pred = y_pred.stack(feature=('feature', 'tbl_sample'))['y_pred']
    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-50-e32f7aa2cb5b> in <module>
    ----> 1 y_pred = y_pred.stack(feature=('feature', 'tbl_sample'))['y_pred']

    ~/miniconda3/envs/birdman3.8/lib/python3.8/site-packages/xarray/core/dataset.py in stack(self, dimensions, **dimensions_kwargs)
       3882         result = self
       3883         for new_dim, dims in dimensions.items():
    -> 3884             result = result._stack_once(dims, new_dim)
       3885         return result
       3886

    ~/miniconda3/envs/birdman3.8/lib/python3.8/site-packages/xarray/core/dataset.py in _stack_once(self, dims, new_dim)
       3828                     shape = [self.dims[d] for d in vdims]
       3829                     exp_var = var.set_dims(vdims, shape)
    -> 3830                     stacked_var = exp_var.stack(**{new_dim: dims})
       3831                     variables[name] = stacked_var
       3832                 else:

    ~/miniconda3/envs/birdman3.8/lib/python3.8/site-packages/xarray/core/variable.py in stack(self, dimensions, **dimensions_kwargs)
       1554         result = self
       1555         for new_dim, dims in dimensions.items():
    -> 1556             result = result._stack_once(dims, new_dim)
       1557         return result
       1558

    ~/miniconda3/envs/birdman3.8/lib/python3.8/site-packages/xarray/core/variable.py in _stack_once(self, dims, new_dim)
       1506
       1507         if new_dim in self.dims:
    -> 1508             raise ValueError(
       1509                 "cannot create a new dimension with the same "
       1510                 "name as an existing dimension"

    ValueError: cannot create a new dimension with the same name as an existing dimension
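The traceback points at the name of the stacked dimension: the call asks xarray to create a new dimension called `feature` while a `feature` dimension already exists. A minimal sketch of one way around this is to give the stacked dimension a name that is not yet in use. The name `cell` below is a hypothetical choice, the toy dataset is only a stand-in for `inference.posterior_predictive`, and the variable name `y_predict` is taken from the earlier comment (the real object may use a different name).

```python
import numpy as np
import xarray as xr

# Toy stand-in for inference.posterior_predictive:
# 2 chains, 3 draws, 4 features, 5 samples.
rng = np.random.default_rng(0)
posterior_predictive = xr.Dataset(
    {
        "y_predict": (
            ("chain", "draw", "feature", "tbl_sample"),
            rng.poisson(5.0, size=(2, 3, 4, 5)).astype(float),
        )
    }
)

# Collapse chains and draws into a single "sample" dimension, as in the thread.
y_pred = posterior_predictive.stack(sample=("chain", "draw"))

# Stack feature and tbl_sample under a name that does not exist yet.
# Reusing "feature" here is what triggers the ValueError.
y_pred = y_pred.stack(cell=("feature", "tbl_sample"))["y_predict"]
y_pred = y_pred.fillna(0)  # replace NaNs, if any, with zeros

print(y_pred.dims, y_pred.shape)
```

The result has one dimension for posterior draws and one for every (feature, sample) pair, which is the layout the original command was aiming for.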

Since this looks to be working, I'm going to close this. Thanks @jindongmin