AlexsLemonade / refinebio-examples

Example workflows for refine.bio data

Home Page: https://www.refine.bio

Update analysis example: Switch WGCNA dataset to something that doesn't have technical replicates

cansavvy opened this issue · comments

Background

#364 (comment)
There are some technical replicates in SRP133573.

Problem

We could deal with replicates by collapsing them, but I think this example is already pretty long and complicated as it is (even though it is an advanced topics example). I think we can switch this out for a dataset that is less complicated and then deal with the collapsing replicates issue separately.
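For context on what "collapsing" would involve, here is a minimal Python sketch, not the approach used in the example notebook (which is written in R). It assumes normalized expression values and averages the technical replicates that map to the same biological sample; for raw counts, summing is a common alternative. The function name `collapse_replicates`, the run and patient IDs, and the numbers are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def collapse_replicates(expression, sample_to_group):
    """Average technical replicates per gene.

    expression: dict of gene -> dict of sample ID -> normalized value.
    sample_to_group: dict of sample ID -> biological sample ID.
    Samples missing from sample_to_group are kept as their own group.
    """
    collapsed = {}
    for gene, values in expression.items():
        grouped = defaultdict(list)
        for sample, value in values.items():
            grouped[sample_to_group.get(sample, sample)].append(value)
        collapsed[gene] = {group: mean(vals) for group, vals in grouped.items()}
    return collapsed


# Made-up values for illustration only: run1 and run2 are replicates of patient1
expression = {
    "GENE_A": {"run1": 10.0, "run2": 12.0, "run3": 7.0},
    "GENE_B": {"run1": 3.0, "run2": 5.0, "run3": 4.0},
}
sample_to_group = {"run1": "patient1", "run2": "patient1", "run3": "patient2"}

collapsed = collapse_replicates(expression, sample_to_group)
print(collapsed)
```

Even in this reduced form, collapsing needs a reliable mapping from runs to biological samples, which is part of why it is simpler to handle as a separate issue.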

What potential "gotchas" do we know of?

The dataset should be sufficiently large (more than 15 samples) but not so large that someone couldn't run it locally.
For reference

What are the recommended next steps?

Step 0) After #363 and #364 are merged, this can be addressed. (Easier to take it one step at a time).
Step 1) Find a suitable dataset replacement
Step 2) Try running it in the notebook. If there's not an R^2 above 0.80, then it's probably a no for that dataset.
Step 3) Change module explorations -- see how the plots look.
Step 4) If that dataset otherwise looks good, update all the wording and dataset descriptions.
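
The go/no-go check in Step 2 can be sketched as a small decision rule. This assumes the R^2 in question is the fit value that WGCNA reports for each candidate soft-threshold power, which the issue does not state explicitly. The notebook itself is in R; the Python below is only an illustration, and the function name `first_power_passing` and the numbers are made up.

```python
from typing import Optional, Sequence, Tuple

R2_CUTOFF = 0.80  # threshold named in Step 2


def first_power_passing(
    fit: Sequence[Tuple[int, float]], cutoff: float = R2_CUTOFF
) -> Optional[int]:
    """Return the lowest power whose R^2 is above the cutoff, or None if none is."""
    passing = [power for power, r2 in fit if r2 > cutoff]
    return min(passing) if passing else None


# Hypothetical (power, R^2) pairs, not results from any real dataset
example_fit = [(2, 0.35), (4, 0.62), (6, 0.78), (8, 0.84), (10, 0.88)]

chosen = first_power_passing(example_fit)
if chosen is None:
    print("No power reaches the cutoff; probably a no for this dataset")
else:
    print(f"Lowest power above the cutoff: {chosen}")
```

A dataset for which this returns `None` would be dropped at Step 2 before any time is spent on Steps 3 and 4.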

There's this dataset that has a two-time-point variable that seems reasonable to use for our differential expression step. It also has 62 samples, which should be plenty: https://www.refine.bio/experiments/SRP140558/acute-viral-bronchiolitis-pbmc

It's a bit metadata poor otherwise, but that's going to be the case for a lot of the RNA-seq datasets.

Another dataset with two time points: https://www.refine.bio/experiments/SRP132018/in-vitro-stimulation-of-healthy-donor-blood-with-il-3-cytokine

It has more metadata labels than that previous dataset but still has 56 samples.

I looked at some more datasets, but this one seems like it should be fine so I'm going to give it a whirl.

Edit: It looks like it's a 2x2 model: two time points and treatment/control. So never mind. Will try https://www.refine.bio/experiments/SRP140558/acute-viral-bronchiolitis-pbmc now.

This has been wrapped up by #379