rarefaction for first quartile?
marwa38 opened this issue · comments
I want to rarefy my samples to the first quartile.
i don't understand how mp_cal_rarecurve()
works with this?
mouse.time.mpse %<>%
mp_cal_rarecurve(
.abundance = RareAbundance,
chunks = 400
)
I tried to do the rarefaction before converting my phyloseq to mpse but I could not go futher the downstream analysis from there as I get an error
summary(sample_sums(ps.prev))
ps.prev.rare <- rarefy_even_depth(ps.prev, sample.size = 3501, rngseed = 50)
Min. 1st Qu. Median Mean 3rd Qu. Max.
641 3501 20304 52969 68311 321517
`set.seed(50)` was used to initialize repeatable random subsampling.
Please record this for your records so others can reproduce.
Try `set.seed(50); .Random.seed` for the full vector
...
32 samples removedbecause they contained fewer reads than `sample.size`.
Up to first five removed samples are:
M-T1-Rep4M-T5-Rep2M-T5-Rep3M-T5-Rep4M-T5-Rep6
...
2025OTUs were removed because they are no longer
present in any sample after random subsampling
mpse.rare <- ps.prev.rare %>% as.MPSE()
mpse.rare %>% print(width=180)
# A MPSE-tibble (MPSE object) abstraction: 286,732 × 20
# OTU=2956 | Samples=97 | Assays=Abundance | Taxonomy=Kingdom, Phylum, Class, Order,
# Family, Genus, Species
OTU Sample Abundance sample.name sample.no Sample.x Phase Region Regime
<chr> <chr> <dbl> <chr> <int> <fct> <fct> <fct> <fct>
1 ASV1 feed-M-Rep1 6 M1 170 feed feed feed M
2 ASV3 feed-M-Rep1 424 M1 170 feed feed feed M
3 ASV5 feed-M-Rep1 0 M1 170 feed feed feed M
4 ASV6 feed-M-Rep1 0 M1 170 feed feed feed M
5 ASV7 feed-M-Rep1 0 M1 170 feed feed feed M
6 ASV8 feed-M-Rep1 0 M1 170 feed feed feed M
7 ASV9 feed-M-Rep1 0 M1 170 feed feed feed M
8 ASV10 feed-M-Rep1 0 M1 170 feed feed feed M
9 ASV11 feed-M-Rep1 0 M1 170 feed feed feed M
10 ASV12 feed-M-Rep1 0 M1 170 feed feed feed M
Sample_Regime sample_regime sample sample_or_control Kingdom
<fct> <chr> <chr> <chr> <chr>
1 feed.M feed.M feed sample k__Bacteria
2 feed.M feed.M feed sample k__Bacteria
3 feed.M feed.M feed sample k__Bacteria
4 feed.M feed.M feed sample k__Bacteria
5 feed.M feed.M feed sample k__Bacteria
6 feed.M feed.M feed sample k__Bacteria
7 feed.M feed.M feed sample k__Bacteria
8 feed.M feed.M feed sample k__Bacteria
9 feed.M feed.M feed sample k__Bacteria
10 feed.M feed.M feed sample k__Bacteria
Phylum Class Order Family Genus Species
<chr> <chr> <chr> <chr> <chr> <chr>
1 p__Firmicutes c__Clostridia o__Oscillospirales f__Ru… g__u… s__un_…
2 p__Firmicutes c__Bacilli o__Lactobacillales f__La… g__L… s__un_…
3 p__Proteobacteria c__Alphaproteobacteria o__Rhodobacterales f__Rh… g__P… s__un_…
4 p__Proteobacteria c__Gammaproteobacteria o__Aeromonadales f__Ae… g__A… s__un_…
5 p__Bacteroidota c__Bacteroidia o__Cytophagales f__Sp… g__A… s__un_…
6 p__Proteobacteria c__Gammaproteobacteria o__Burkholderiales f__Co… g__P… s__un_…
7 p__Verrucomicrobiota c__Verrucomicrobiae o__Verrucomicrobia… f__Ru… g__L… s__un_…
8 p__Proteobacteria c__Alphaproteobacteria o__Rhodobacterales f__Rh… g__R… s__un_…
9 p__Proteobacteria c__Gammaproteobacteria o__Enterobacterales f__En… g__E… s__un_…
10 p__Proteobacteria c__Gammaproteobacteria o__Burkholderiales f__Su… g__A… s__un_…
# … with 286,722 more rows
# ℹ Use `print(n = ...)` to see more rows
mpse.rare %>%
mp_plot_rarecurve(
.rare = Abundance,
.alpha = Observe,
)
Error in `dplyr::select()`:
! Can't subset columns that don't exist.
✖ Column `Abundance` doesn't exist.
Backtrace:
1. mpse.rare %>% ...
8. dplyr:::select.data.frame(., !!rlang::sym("Sample"), !!.rare)
Error in dplyr::select(., !!rlang::sym("Sample"), !!.rare) :
update: I used mp_rrarefy(raresize = 3073) but still getting the error Error in if (obj %>% rowSums %>% var != 0 && !force) { :
missing value where TRUE/FALSE needed
ps.prev <- readRDS("ps.prev.f3.rds")
ps.prev.intes <- subset_samples(ps.prev, Sample == "intestine")
ps.prev.intes <- prune_taxa(taxa_sums(ps.prev.intes) > 0, ps.prev.intes)
summary(sample_sums(ps.prev.intes))
mpse <- ps.prev.intes %>% as.MPSE()
mpse %<>% mp_rrarefy(raresize = 3073)
mpse %<>%
mp_cal_rarecurve(
.abundance = RareAbundance,
chunks = 400
)
Error in if (obj %>% rowSums %>% var != 0 && !force) { :
missing value where TRUE/FALSE needed
mpse %>% print(width=180)
Sorry, I can not generate the same error without your input. Would you mind seeding the input object to me? Or you can set force = TRUE
in mp_cal_rarecurve
since you rarefy your samples to the first quartile, which causes the sum reads of each sample still is unequal. You can refer to the following codes.
> data(mouse.time.mpse)
> mouse.time.mpse
# A MPSE-tibble (MPSE object) abstraction: 4,142 × 11
# OTU=218 | Samples=19 | Assays=Abundance | Taxonomy=Kingdom, Phylum, Class, Order, Family, Genus, Species
OTU Sample Abundance time Kingdom Phylum Class Order Family Genus Species
<chr> <chr> <int> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 OTU_1 F3D0 579 Early k__Bac… p__Ba… c__B… o__B… f__Mu… g__u… s__un_…
2 OTU_2 F3D0 345 Early k__Bac… p__Ba… c__B… o__B… f__Mu… g__u… s__un_…
3 OTU_3 F3D0 449 Early k__Bac… p__Ba… c__B… o__B… f__Mu… g__u… s__un_…
4 OTU_4 F3D0 430 Early k__Bac… p__Ba… c__B… o__B… f__Mu… g__u… s__un_…
5 OTU_5 F3D0 154 Early k__Bac… p__Ba… c__B… o__B… f__Ba… g__B… s__un_…
6 OTU_6 F3D0 470 Early k__Bac… p__Ba… c__B… o__B… f__Mu… g__u… s__un_…
7 OTU_7 F3D0 282 Early k__Bac… p__Ba… c__B… o__B… f__Mu… g__u… s__un_…
8 OTU_8 F3D0 184 Early k__Bac… p__Ba… c__B… o__B… f__Ri… g__A… s__un_…
9 OTU_9 F3D0 45 Early k__Bac… p__Ba… c__B… o__B… f__Mu… g__u… s__un_…
10 OTU_10 F3D0 158 Early k__Bac… p__Ba… c__B… o__B… f__Mu… g__u… s__un_…
# … with 4,132 more rows
# ℹ Use `print(n = ...)` to see more rows
> mouse.time.mpse %>% mp_extract_assays(.abundance=Abundance) %>% colSums %>% quantile()
0% 25% 50% 75% 100%
2518.0 4048.0 5017.0 6603.5 16835.0
> mouse.time.mpse %>% mp_rrarefy(.abundance=Abundance, raresize=4048) %>% mp_cal_rarecurve(.abundance=RareAbundance, force=T)
# A MPSE-tibble (MPSE object) abstraction: 4,142 × 13
# OTU=218 | Samples=19 | Assays=Abundance, RareAbundance | Taxonomy=Kingdom, Phylum, Class, Order, Family, Genus, Species
OTU Sample Abund…¹ RareA…² time RareAb…³ Kingdom Phylum Class Order Family
<chr> <chr> <int> <int> <chr> <list> <chr> <chr> <chr> <chr> <chr>
1 OTU_1 F3D0 579 343 Early <tibble> k__Bac… p__Ba… c__B… o__B… f__Mu…
2 OTU_2 F3D0 345 204 Early <tibble> k__Bac… p__Ba… c__B… o__B… f__Mu…
3 OTU_3 F3D0 449 291 Early <tibble> k__Bac… p__Ba… c__B… o__B… f__Mu…
4 OTU_4 F3D0 430 277 Early <tibble> k__Bac… p__Ba… c__B… o__B… f__Mu…
5 OTU_5 F3D0 154 85 Early <tibble> k__Bac… p__Ba… c__B… o__B… f__Ba…
6 OTU_6 F3D0 470 290 Early <tibble> k__Bac… p__Ba… c__B… o__B… f__Mu…
7 OTU_7 F3D0 282 179 Early <tibble> k__Bac… p__Ba… c__B… o__B… f__Mu…
8 OTU_8 F3D0 184 117 Early <tibble> k__Bac… p__Ba… c__B… o__B… f__Ri…
9 OTU_9 F3D0 45 27 Early <tibble> k__Bac… p__Ba… c__B… o__B… f__Mu…
10 OTU_… F3D0 158 100 Early <tibble> k__Bac… p__Ba… c__B… o__B… f__Mu…
# … with 4,132 more rows, 2 more variables: Genus <chr>, Species <chr>, and
# abbreviated variable names ¹Abundance, ²RareAbundance,
# ³RareAbundanceRarecurve
# ℹ Use `print(n = ...)` to see more rows.
sorry, I thought I attached it earlier
Kindly find attached one of my phyloseq object
ps.prev.rooted.zip