Exploring the scLUCA dataset

For building coabundance networks.

We build on the wonderful work by Salcher, Sturm, Horvath et al. (2022).

The workflow for getting from the raw AnnData files to the coabundance graphs follows the folders listed below.

Folder structure

nb_filter: Filters the cells and genes by predefined quality control metrics.
nb_annot: Annotates the tissue from every study with Lung Atlas reference maps.
nb_ikarus: Runs the ikarus prediction for every sample. The prediction uses a logistic regression and network projection to predict tumor cells.
nb_infercnv: Runs InferCNV on every sample. From the transcripts, it infers the places in the chromosomes where there should be copy number variations.
nb_DE: Extracts marker genes from (TODO) hardcoded cell annotations. It also enriches for Hallmark gene sets.
nb_tumorUMAP: Notebook to check the tumor predictions. It has the DE part integrated. Tumor_Annot.ipynb contains explanations of the methods used.
outputARACNE: Has all the files for the generation and the output of the networks by ARACNE, as well as the functionally enriched output.
metadata: Contains info about the studies used and data about the groups.
utils: Contains custom plotting and analysis functions.
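
To make the nb_infercnv step more concrete, here is a toy sketch of the general idea behind expression-based copy number inference: center a cell's expression on a reference profile, then smooth along genomic order so that runs of neighbouring genes that are jointly up or down stand out. This is not the InferCNV implementation. The function name, the window size and the input layout (one value per gene, already sorted by chromosome position) are assumptions made for illustration.

```python
from statistics import mean


def smoothed_cnv_signal(cell_expr, reference_expr, window=3):
    """Toy copy-number signal for one cell (hypothetical helper).

    cell_expr, reference_expr: log-expression values, one per gene,
    with genes ordered by genomic position.
    window: odd number of neighbouring genes to average over.
    """
    if len(cell_expr) != len(reference_expr):
        raise ValueError("cell and reference must cover the same genes")
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")

    # Expression relative to the (assumed normal) reference profile.
    relative = [c - r for c, r in zip(cell_expr, reference_expr)]

    # Moving average along the genome; the window shrinks at the edges.
    half = window // 2
    smoothed = []
    for i in range(len(relative)):
        lo = max(0, i - half)
        hi = min(len(relative), i + half + 1)
        smoothed.append(mean(relative[lo:hi]))
    return smoothed


if __name__ == "__main__":
    cell = [1.0, 1.0, 3.0, 3.0, 3.0, 1.0, 1.0]
    reference = [1.0] * 7
    print(smoothed_cnv_signal(cell, reference, window=3))
```

Real tools use much wider windows and many reference cells; the small window here only keeps the example readable.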

Troubleshooting

Because of the long training and annealing times, jlab sometimes cannot connect to the kernel. Use this to attach a console to the running kernel: ipython console --existing /root/.local/share/jupyter/runtime/kernel-50c440a3-554d-4d98-bb90-bdda9a8923d5.json

This could be easier in a newer version of jlab. To locate the corresponding JSON file, you can use htop with the option to hide userland process threads and check how much memory each process is using.
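
As an alternative to reading htop by eye, the same lookup can be sketched in Python. The snippet below is a minimal, Linux-only sketch that scans /proc for processes whose command line mentions a kernel-*.json connection file and sorts them by resident memory. The function names are hypothetical, and it assumes the kernel was started with the connection file path as a command-line argument, as in the path shown above.

```python
import re
from pathlib import Path
from typing import List, Optional, Tuple

# Matches names such as kernel-50c440a3-554d-4d98-bb90-bdda9a8923d5.json
KERNEL_FILE_RE = re.compile(r"kernel-[0-9a-f-]+\.json")


def connection_file_from_cmdline(cmdline: bytes) -> Optional[str]:
    """Return the first argument of a raw /proc cmdline that names a kernel file."""
    for arg in cmdline.split(b"\0"):
        text = arg.decode(errors="replace")
        if KERNEL_FILE_RE.search(text):
            return text
    return None


def rss_kib_from_status(status_text: str) -> Optional[int]:
    """Extract the resident set size in KiB from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return None


def running_kernels(proc_root: str = "/proc") -> List[Tuple[int, int, str]]:
    """List (pid, rss_kib, connection_file) tuples, largest memory first."""
    found = []
    for pid_dir in Path(proc_root).iterdir():
        if not pid_dir.name.isdigit():
            continue
        try:
            conn = connection_file_from_cmdline((pid_dir / "cmdline").read_bytes())
            rss = rss_kib_from_status((pid_dir / "status").read_text())
        except OSError:
            # The process exited or is not readable by this user.
            continue
        if conn is not None:
            found.append((int(pid_dir.name), rss or 0, conn))
    return sorted(found, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    for pid, rss_kib, conn in running_kernels():
        print(f"{pid}\t{rss_kib / 1024:.0f} MiB\t{conn}")
```

The kernel holding the loaded data is usually the one with the largest memory footprint. Any process that mentions the file will match, including a console already attached with --existing.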

The image is based on python:3.11.4. It is important to install the NVIDIA Docker package so that the containers can use your CUDA installation.

We are using an image generated by the Dockerfile in this repo.
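
A quick way to confirm that the GPU setup described above reached the container is to ask nvidia-smi from Python. This is a minimal sketch; it assumes the NVIDIA runtime exposes the nvidia-smi binary inside the container, which may not hold for every configuration, and the function name is hypothetical.

```python
import shutil
import subprocess


def visible_gpus():
    """Return one line per GPU reported by `nvidia-smi -L`, or [] if none."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        # Assumption: no nvidia-smi on PATH means the GPU was not passed through.
        return []
    result = subprocess.run([exe, "-L"], capture_output=True, text=True, check=False)
    if result.returncode != 0:
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    gpus = visible_gpus()
    if gpus:
        print("\n".join(gpus))
    else:
        print("No GPU visible inside this container.")
```

An empty list suggests the container was started without GPU access or that the NVIDIA package is missing on the host.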
