shishenyxx / PASM

Scripts for PASM, TAS/TASeq, and MPAS. We provide Perl + R versions, Python versions, and an extended Snakemake version with additional annotations. The scripts calculate variant allelic fractions for the validation and quantification of mosaic mutations.


PASM, TAS/TASeq, and MPAS

1. Overview:

These scripts were initially written for Postzygotic Amplicon Sequencing for Mosaicism (PASM) and Targeted Amplicon Sequencing (TAS/TASeq), and include some code for the method we now define as Massive Parallel Amplicon Sequencing (MPAS). We provide a Perl + R version, two standalone Python versions, and a Snakemake pipeline. The scripts and pipelines calculate the variant allelic fraction (AF) from amplicon-based deep sequencing data. They can also be used for AF estimation and annotation in postzygotic mosaic variant studies based on other kinds of Next Generation Sequencing (NGS) data.


2. Versions and updates:

For the calculation of confidence intervals (CIs), you can choose either the exact binomial confidence interval in R (the Clopper-Pearson interval by default) or the iterative method, which considers the base quality of each base, described in Xu, Yang, and Wu et al. Wei and Zhang. 2015 and Yang and Liu et al. Wu, Wei, and Zhang. 2017. Different versions of the scripts are available.
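To make the exact binomial option concrete, here is a minimal standard-library sketch of a Clopper-Pearson interval for an AF estimate. It is not the code shipped in this repository; the function names `binom_cdf` and `clopper_pearson` and the bisection approach are assumptions made for illustration only.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), with each term computed in log space."""
    if k < 0:
        return 0.0
    if k >= n:
        return 1.0
    if p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 0.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(i + 1)
                    - math.lgamma(n - i + 1) + i * log_p + (n - i) * log_q)
        total += math.exp(log_term)
    return min(total, 1.0)


def clopper_pearson(alt, depth, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for alt / depth (hypothetical helper)."""
    if depth <= 0 or not 0 <= alt <= depth:
        raise ValueError("need depth > 0 and 0 <= alt <= depth")

    def solve(k, target):
        # binom_cdf(k, depth, p) decreases as p grows, so bisect on p.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if binom_cdf(k, depth, mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: P(X >= alt | p) = alpha / 2; upper bound: P(X <= alt | p) = alpha / 2.
    lower = 0.0 if alt == 0 else solve(alt - 1, 1 - alpha / 2)
    upper = 1.0 if alt == depth else solve(alt, alpha / 2)
    return lower, upper
```

For example, `clopper_pearson(50, 1000)` returns the 95% interval around an observed AF of 0.05. The interval computed this way uses only the alternative-allele count and the depth, so it ignores base qualities, unlike the iterative method cited above.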

Snakemake pipelines:

A Snakemake pipeline with exact binomial CIs and detailed annotations was implemented by Xin Xu and Xiaoxu Yang, with great input from Martin Breuss. The pipeline is based on the Python scripts written by Xin Xu and a previous Snakemake pipeline written by Martin Breuss and Renee D. George. (2019-08-12)

Python versions:

A new Python version with exact binomial CIs was implemented by Xin Xu, supervised by Xiaoxu Yang (2019-07-24), and fixed by Jiawei Shen (2022-04-22).

A Python version was implemented by Xianing Zheng, supervised by Xiaoxu Yang. (2016-04-17)

Perl and R versions:

Before you start the Perl + R version:

Note that the Perl package Statistics::R is used to call the yyxMosaicHunter package in R written by Adam Yongxin Ye.
The dependencies of yyxMosaicHunter 0.1.4 are Rcpp and pryr. (2014-11-11)

Instructions for the Perl + R version:

The first part of the Perl version is a pileup filter. It takes SAMtools mpileup results as input and parses the different pileup characters to count the bases. It was written by Jiarui Li and modified by Xiaoxu Yang and Xianing Zheng. (2015-03-24)

You can also output only the base qualities and process them in R.
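As an illustration of what a pileup filter has to do, the following Python sketch parses one line of `samtools mpileup` text output and collects base counts and base qualities. It is not a port of the Perl script. The function names, the grouping of placeholder symbols under `"other"`, the Phred+33 offset, and the AF definition in `alt_fraction` are assumptions made for this example.

```python
import re
from collections import Counter

INDEL_LEN = re.compile(r"\d+")


def parse_pileup_line(line, phred_offset=33):
    """Count bases and collect base qualities from one mpileup line (sketch)."""
    chrom, pos, ref, _depth, bases, quals = line.rstrip("\n").split("\t")[:6]
    ref = ref.upper()
    counts = Counter()
    base_quals = {}
    i = 0  # index into the read-bases column
    q = 0  # index into the base-qualities column
    while i < len(bases):
        c = bases[i]
        if c == "^":      # read start; the next character is a mapping quality
            i += 2
            continue
        if c == "$":      # read end
            i += 1
            continue
        if c in "+-":     # indel: sign, length, then that many bases
            m = INDEL_LEN.match(bases, i + 1)
            i += 1 + len(m.group()) + int(m.group())
            continue
        if c in ".,":     # match to the reference on either strand
            base = ref
        elif c in "ACGTNacgtn":
            base = c.upper()
        else:             # placeholders such as '*', '<', '>'
            base = "other"
        counts[base] += 1
        # Assumes every read-level symbol has one quality character.
        base_quals.setdefault(base, []).append(ord(quals[q]) - phred_offset)
        q += 1
        i += 1
    return chrom, int(pos), ref, counts, base_quals


def alt_fraction(counts, alt):
    """AF as alt count over all A/C/G/T/N calls (an assumed definition)."""
    total = sum(counts[b] for b in "ACGTN")
    return counts[alt] / total if total else float("nan")
```

A line such as `chr1<TAB>100<TAB>A<TAB>5<TAB>.,^IG$t-1c.<TAB>IIII5` yields three `A`, one `G`, and one `T`, with the per-base qualities kept in `base_quals` so they can be passed on to a quality-aware model.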

If you want to calculate the CIs with the PASM Bayesian model, you can use this Perl script or an older version of the Perl script.


3. Example usage:

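For the Perl version:

samtools mpileup -r ${chr}:${pos}-${pos} -f <reference_file> -Q0 -q0 -AB -d3000 <input_bam> | ./get_ref_alt_baseQ_corrected_calculate_only_2016_12_03.pl

Here -Q0 and -q0 disable the minimum base quality and mapping quality filters, -A keeps anomalous read pairs, -B disables BAQ computation, and -d3000 sets the maximum depth to 3000 reads per position.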


4. Experimental design:

Primers for the amplicons are designed with the command-line version of Primer3.
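As a rough illustration of how a candidate variant could be handed to the Primer3 command line, the sketch below writes one Boulder-IO input record that targets a single position. The helper name `primer3_record`, the 100-300 bp product size range, and the choice of tags are assumptions for this example and do not reflect the settings used in the publications.

```python
def primer3_record(seq_id, template, target_start, target_len=1,
                   product_range=(100, 300)):
    """Build one Boulder-IO record for primer3_core (hypothetical helper).

    target_start is 0-based, matching PRIMER_FIRST_BASE_INDEX=0 below.
    """
    if target_start < 0 or target_start + target_len > len(template):
        raise ValueError("target must lie inside the template")
    lines = [
        f"SEQUENCE_ID={seq_id}",
        f"SEQUENCE_TEMPLATE={template}",
        f"SEQUENCE_TARGET={target_start},{target_len}",
        "PRIMER_FIRST_BASE_INDEX=0",
        "PRIMER_TASK=generic",
        # Assumed amplicon size range; adjust to the sequencing read length.
        f"PRIMER_PRODUCT_SIZE_RANGE={product_range[0]}-{product_range[1]}",
        "PRIMER_NUM_RETURN=1",
        "=",  # record terminator
    ]
    return "\n".join(lines) + "\n"
```

The returned text can be saved to a file and passed to Primer3 on standard input, for example `primer3_core < input.txt`.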


5. Related publications:


6. Contact:

📧 Xiaoxu Yang: u6055394@utah.edu, xiaoxuyanglab@gmail.com


7. Cite the code:


License: GNU General Public License v3.0


Languages: Python 82.4%, Perl 16.9%, Cython 0.7%