Harmonypy results with anndata
321356766 opened this issue
Once the adjusted PCs are calculated...
import harmonypy as hm
data_mat = adata.obsm['X_pca']
meta_data = adata.obs
vars_use = ['batch']
ho = hm.run_harmony(data_mat, meta_data, vars_use)
...how do I integrate the results back into the anndata object (adata) to proceed with the workflow? I am attempting to use harmonypy in the scanpy workflow but am very new at this.
Thanks in advance!
Have you tried replacing the original PCs with the harmonized PCs?
adjusted_pcs = pd.DataFrame(ho.Z_corr)
adata.obsm['X_pca'] = adjusted_pcs
You might also be able to add a new entry in the obsm slot:
adata.obsm['X_pca_harmonized'] = adjusted_pcs
I did not test these snippets, so I don't know if they will work or not.
Thanks for your quick response!
In both cases I get a length error:
adjusted_pcs = pd.DataFrame(ho.Z_corr)
adata.obsm['X_pca'] = adjusted_pcs
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-93-d54e748e808d> in <module>
1 adjusted_pcs = pd.DataFrame(ho.Z_corr)
----> 2 adata.obsm['X_pca'] = adjusted_pcs
~/anaconda3/lib/python3.7/site-packages/anndata/_core/aligned_mapping.py in __setitem__(self, key, value)
148
149 def __setitem__(self, key: str, value: V):
--> 150 value = self._validate_value(value, key)
151 self._data[key] = value
152
~/anaconda3/lib/python3.7/site-packages/anndata/_core/aligned_mapping.py in _validate_value(self, val, key)
206 hasattr(val, "index")
207 and isinstance(val.index, cabc.Collection)
--> 208 and not (val.index == self.dim_names).all()
209 ):
210 # Could probably also re-order index if it’s contained
~/anaconda3/lib/python3.7/site-packages/pandas/core/indexes/base.py in cmp_method(self, other)
103 if isinstance(other, (np.ndarray, Index, ABCSeries, ExtensionArray)):
104 if other.ndim > 0 and len(self) != len(other):
--> 105 raise ValueError("Lengths must match to compare")
106
107 if is_object_dtype(self) and isinstance(other, ABCCategorical):
ValueError: Lengths must match to compare
---------------------------------------------------------------------------
A similar shape error occurs if I pass the underlying NumPy array instead:
adata.obsm['X_pca'] = adjusted_pcs.values
ValueError: Value passed for key 'X_pca' is of incorrect shape. Values of obsm must match dimensions (0,) of parent. Value had shape (50, 29552) while it should have had (29552,).
Thanks for your time.
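I think a simple transpose fixed the issue. ho.Z_corr has shape (50, 29552), i.e. PCs by cells, whereas obsm expects one row per cell, i.e. (29552, 50):
adjusted_pcs = pd.DataFrame(ho.Z_corr).T
adata.obsm['X_pca'] = adjusted_pcs.values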
Thanks!
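For anyone hitting the same error, here is a minimal sketch of the shape fix discussed above. It assumes, as the error message in this thread indicates, that ho.Z_corr comes back as (n_pcs, n_cells). The helper name to_obsm_layout and the random stand-in array are hypothetical and only there so the snippet runs without harmonypy or anndata installed.

```python
import numpy as np


def to_obsm_layout(z_corr, n_obs):
    """Return z_corr as an (n_obs, n_pcs) array, transposing only if needed.

    Hypothetical helper, not part of harmonypy or anndata.
    If n_obs happens to equal n_pcs the orientation is ambiguous and the
    input is returned unchanged.
    """
    z = np.asarray(z_corr)
    if z.ndim != 2:
        raise ValueError(f"expected a 2-D array, got shape {z.shape}")
    if z.shape[0] == n_obs:
        return z        # already one row per cell
    if z.shape[1] == n_obs:
        return z.T      # PCs x cells -> cells x PCs
    raise ValueError(f"neither axis of shape {z.shape} matches n_obs={n_obs}")


# Stand-in for ho.Z_corr (assumed layout: PCs in rows, cells in columns).
n_pcs, n_cells = 50, 200
fake_z_corr = np.random.default_rng(0).normal(size=(n_pcs, n_cells))

embedding = to_obsm_layout(fake_z_corr, n_obs=n_cells)
print(embedding.shape)  # (200, 50)
```

With a real object this would presumably be used as adata.obsm['X_pca_harmonized'] = to_obsm_layout(ho.Z_corr, adata.n_obs). Writing to a new key rather than overwriting 'X_pca' keeps the uncorrected PCs around for comparison. Passing a plain array also sidesteps the DataFrame index check that raised "Lengths must match to compare" earlier in the thread.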
Thanks for sharing the results. Glad you got it working.