mvinyard / torch-adata

Create PyTorch Datasets from AnnData

Home Page: https://torch-adata.readthedocs.io/en/latest/index.html

Installation

Install from PyPI (current version: 0.0.24):

pip install torch-adata

Install the developer version:

git clone https://github.com/mvinyard/torch-adata.git; cd torch-adata;
pip install -e .

The main API

The primary class is AnnDataset, a subclass of the widely used torch.utils.data.Dataset. Building on the PyTorch Dataset class lets AnnData objects plug into PyTorch's data-loading utilities (for example, multi-process loading through torch.utils.data.DataLoader), which helps standardize workflows and supports reproducibility.

import anndata as a
import torch_adata

adata = a.read_h5ad("/path/to/data.h5ad")
dataset = torch_adata.AnnDataset(adata, use_key="X_pca", groupby="time", obs_keys=["affinity"])

[ torch-adata ]: AnnDataset object with 7131 samples
----------------------------------------------------
Grouped by: 'time' with attributes:
 - X (use_key = 'X_pca') torch.Size([3, 7131, 50])
 - obs: affinity: torch.Size([3, 7131, 1])
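
Since AnnDataset subclasses torch.utils.data.Dataset, it should be possible to pass it to a standard torch.utils.data.DataLoader for batching and shuffling. The sketch below illustrates that pattern with a hypothetical stand-in class, ToyGroupedDataset, which is not torch-adata's implementation. It assumes tensors laid out as [groups, samples, features], mirroring the shapes printed above on a smaller scale, and it assumes that indexing selects one sample across all groups. How AnnDataset itself indexes is documented in the project's documentation.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class ToyGroupedDataset(Dataset):
    """Hypothetical stand-in for illustration; NOT torch-adata's AnnDataset."""

    def __init__(self, X: torch.Tensor, obs: torch.Tensor):
        # Assumed layout: X is [n_groups, n_samples, n_features],
        # obs is [n_groups, n_samples, 1].
        self.X = X
        self.obs = obs

    def __len__(self) -> int:
        # The sample axis is the second dimension.
        return self.X.shape[1]

    def __getitem__(self, idx: int):
        # Return one sample across all groups (an assumption of this sketch).
        return self.X[:, idx], self.obs[:, idx]


# Random data with 3 groups, 100 samples, and 50 features.
X = torch.randn(3, 100, 50)
obs = torch.rand(3, 100, 1)
dataset = ToyGroupedDataset(X, obs)

# num_workers=0 loads in the main process; a larger value would turn on
# DataLoader's multi-process loading.
loader = DataLoader(dataset, batch_size=16, shuffle=True, num_workers=0)
X_batch, obs_batch = next(iter(loader))
print(X_batch.shape, obs_batch.shape)
```

The default collate function stacks the per-sample tensors along a new leading batch dimension, so each batch carries the batch axis first, followed by the group and feature axes.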

An alternative approach is AnnLoader, highlighted by Sergei Rybakov in "Interfacing pytorch models with anndata".

For more information, please visit the documentation!

Problem? Open an issue

About

License: GNU Affero General Public License v3.0


Languages

Jupyter Notebook 89.3%, Python 10.7%