avanwallendael / mash_sim

mash_sim

Part 1: simulated reads

We simulated reads in Python using ipcoal and toytree. The scripts were run in a Jupyter notebook: mash_sim_github.ipynb
We cleaned these reads for analysis in R: process_sim_polyploid.R
The cleaned outputs were analyzed in two pipelines, (1) mash and (2) alignment:

  1. Simulated haploid and polyploid data analyzed with mash: mash_sim_github.bash
  2. Simulated haploid and polyploid data analyzed with alignment tools: polymiss_github.bash

Next, we removed reads from the simulated polyploid data and tested the performance of each pipeline again: make_missing_github.R
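
The read-removal step can be illustrated with a minimal Python sketch. This is not the code in make_missing_github.R (which is written in R); the function name `drop_reads` and its parameters are hypothetical, and we assume here that reads are removed uniformly at random.

```python
import random


def drop_reads(reads, fraction_missing, seed=0):
    """Return a copy of `reads` with a fixed fraction removed at random.

    `drop_reads` is a hypothetical helper for illustration only; uniform
    random removal is an assumption, not the documented behavior of
    make_missing_github.R.
    """
    if not 0.0 <= fraction_missing <= 1.0:
        raise ValueError("fraction_missing must be between 0 and 1")
    rng = random.Random(seed)  # seeded so the subsample is reproducible
    n_keep = round(len(reads) * (1.0 - fraction_missing))
    # Sample indices rather than reads so the original read order is preserved.
    keep = sorted(rng.sample(range(len(reads)), n_keep))
    return [reads[i] for i in keep]


if __name__ == "__main__":
    reads = [f"read{i}" for i in range(100)]
    kept = drop_reads(reads, fraction_missing=0.25)
    print(len(kept), "reads kept out of", len(reads))
```

Fixing the seed makes it possible to rerun both pipelines on exactly the same reduced read set.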

Finally, we calculated distances for the aligned data and compared results using polymiss_dist_github.R
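
As an illustration of a distance computed from aligned data, here is a minimal Python sketch of the uncorrected p-distance between two aligned sequences. This README does not state which distance polymiss_dist_github.R computes, so the choice of p-distance, the function name `p_distance`, and the set of characters treated as missing are all assumptions.

```python
def p_distance(seq_a, seq_b, missing="-N?"):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or ambiguous character are
    skipped. `p_distance` is a hypothetical helper, not a function from
    this repository.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in missing or b in missing:
            continue  # ignore sites with missing data in either sequence
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        return float("nan")  # no comparable sites
    return differences / compared


if __name__ == "__main__":
    print(p_distance("ACGTACGT", "ACGTACGA"))
```

Skipping sites with missing data matters for this comparison, because removing reads increases the number of gaps in the alignment.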

Part 2: real reads

We downloaded reads from published datasets for three studies in Panicum, Capsella, and Reynoutria. SRA projects: PRJNA622568, PRJNA299253, PRJNA574173.
We performed mash distance estimation (e.g., capsella_mash.bash), then cleaned and visualized the results by incorporating published metadata (e.g., mash_clean_viz.R).
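
For readers unfamiliar with what mash estimates, the following Python sketch shows the idea behind the Mash distance: build a bottom-s MinHash sketch of each sequence's k-mers, estimate the Jaccard index j from the sketches, and convert it with D = -(1/k) ln(2j / (1 + j)). It is a simplified illustration, not the mash program itself: it hashes with SHA-1 instead of MurmurHash3, does not use canonical k-mers, and all function names are hypothetical.

```python
import hashlib
import math


def kmer_hashes(seq, k):
    """Hash every k-mer in `seq` to a 64-bit integer (SHA-1 is an assumption)."""
    seq = seq.upper()
    hashes = set()
    for i in range(len(seq) - k + 1):
        digest = hashlib.sha1(seq[i:i + k].encode()).digest()
        hashes.add(int.from_bytes(digest[:8], "big"))
    return hashes


def sketch(seq, k=21, size=1000):
    """Bottom-s sketch: the `size` smallest k-mer hashes."""
    return sorted(kmer_hashes(seq, k))[:size]


def mash_distance(sketch_a, sketch_b, k=21, size=1000):
    """Estimate the Mash distance from two sketches built with the same k."""
    set_a, set_b = set(sketch_a), set(sketch_b)
    # The smallest hashes of the merged sketches approximate a sketch of the union.
    union = sorted(set_a | set_b)[:size]
    if not union:
        return 1.0
    shared = sum(1 for h in union if h in set_a and h in set_b)
    j = shared / len(union)  # Jaccard estimate
    if j == 0.0:
        return 1.0  # no shared k-mers: maximum distance
    if j == 1.0:
        return 0.0
    return -math.log(2.0 * j / (1.0 + j)) / k


if __name__ == "__main__":
    a = sketch("AAAAACCCCC", k=5, size=100)
    b = sketch("AAAAAGGGGG", k=5, size=100)
    print(mash_distance(a, b, k=5, size=100))
```

Both sketches must be built with the same k and sketch size for the estimate to be meaningful.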
