Zilong-Li / BioScripts

Cool Bioinformatics Scripts

You can make a QQ plot in any of the following ways.

  • one-liner that reads hundreds of millions of P values from a pipe
# python 
zcat pval.txt.gz | qqplot.py -out test -title "QQ plot on the fly"
# julia (running it in the REPL is recommended)
zcat pval.txt.gz | qqplot.jl --out test --title "QQ plot on the fly"

Warning: if you have on the order of 100 billion P values to process, use qqplot.jl rather than qqplot.py. On my server the Julia version processes about 5 billion lines per hour, while the Python version manages only about 700 million.

  • run it in a Julia REPL (recommended)
include("qqplot.jl")
cmd = pipeline(`zcat pval.gz`, `awk 'NR>1{print $10}'`)
sigp, expp = qqfly("test", cmd=cmd)
  • use qqplot.py in your script
import numpy as np
from qqplot import qq
p = np.random.random(1000000)
qq(x=p, figname="test.png")

image/qqplot.png
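
For readers who want to see what a QQ plot of P values computes, here is a minimal sketch in plain Python. It is not the code from qqplot.py or qqplot.jl: the function name `qq_points` and the choice of `(rank - 0.5) / n` as the expected uniform quantile are assumptions made for illustration, and it holds all values in memory rather than streaming them.

```python
import math
import random

def qq_points(pvals):
    """Return (expected, observed) -log10 P values, largest first."""
    # Keep only valid P values in (0, 1]; a zero would give an infinite -log10.
    p = sorted(x for x in pvals if 0.0 < x <= 1.0)
    n = len(p)
    # Observed values: the sorted P values on the -log10 scale.
    observed = [-math.log10(x) for x in p]
    # Expected values: quantiles of a uniform distribution, (rank - 0.5) / n.
    expected = [-math.log10((i - 0.5) / n) for i in range(1, n + 1)]
    return expected, observed

# Under the null hypothesis P values are uniform, so the points
# should fall close to the diagonal expected == observed.
random.seed(0)
expected, observed = qq_points([random.random() for _ in range(10000)])
```

Plotting `observed` against `expected` with any plotting library gives the QQ plot; points that rise above the diagonal at the high end indicate more small P values than expected by chance.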

Usage: downsample.sh [-b <bamlist>] [-d <depth>] [-n <cores>] [-o <outdir>]
Usage: beagle3-imputation.sh [options]
Pipeline for genotype refinement of medium-depth sequencing data using BEAGLE 3

-h,          Display help
-i,          Input VCF/BCF file
-o,          Output folder
-f,          MAF filter applied before imputation

When you run an imputation analysis with BEAGLE (or another imputation tool), you may want to know the distribution of genotype discordance between the original VCF and the imputed VCF.

usage: calc_imputed_gt_discord.py [-h] [-chr STRING] VCF1 VCF2 OUT

Warning: before running the script, make sure the two VCFs contain exactly the same sites and samples for each chromosome.

image/calc_imputed_gt_discord.png
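
As a rough illustration of what genotype discordance means at a single site, the sketch below compares two lists of GT strings. It is not taken from calc_imputed_gt_discord.py: the helper names `parse_gt` and `discordance` are hypothetical, and it assumes that phase is ignored and that samples with a missing call in either file are skipped.

```python
def parse_gt(gt):
    """Turn a GT string such as '0|1' or '1/0' into a sorted allele tuple, or None if missing."""
    alleles = gt.replace("|", "/").split("/")
    if "." in alleles:
        return None
    return tuple(sorted(int(a) for a in alleles))

def discordance(gts1, gts2):
    """Fraction of samples whose genotypes differ at one site."""
    if len(gts1) != len(gts2):
        raise ValueError("both sites must list the same samples")
    compared = mismatched = 0
    for a, b in zip(gts1, gts2):
        ga, gb = parse_gt(a), parse_gt(b)
        if ga is None or gb is None:
            continue  # skip samples with a missing call in either VCF
        compared += 1
        mismatched += ga != gb
    return mismatched / compared if compared else float("nan")

original = ["0/0", "0/1", "1/1", "./."]
imputed = ["0|0", "1|0", "0|1", "0|0"]
print(discordance(original, imputed))
```

Here three samples are comparable and one of them changes genotype, so the site-level discordance is one third. The comparison is positional, which is why the two files need identical sites and samples.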

Plot INFO/R2 against MAF after imputation with BEAGLE or a similar tool.

image/r2-vs-maf.png
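
One common way to summarise R2 against MAF is to average R2 within MAF bins before plotting. The sketch below shows that idea in plain Python; the function name `mean_r2_by_maf` and the bin edges are assumptions chosen for illustration, not the repository's plotting script, and extracting the (MAF, R2) pairs from the VCF is left out.

```python
from collections import defaultdict

def mean_r2_by_maf(sites, edges=(0.0, 0.01, 0.05, 0.1, 0.2, 0.5)):
    """Average R2 within MAF bins; `sites` is an iterable of (maf, r2) pairs."""
    sums = defaultdict(lambda: [0.0, 0])
    for maf, r2 in sites:
        for lo, hi in zip(edges, edges[1:]):
            # Bins are (lo, hi]; monomorphic sites with MAF == 0 are skipped.
            if lo < maf <= hi:
                sums[(lo, hi)][0] += r2
                sums[(lo, hi)][1] += 1
                break
    return {b: total / n for b, (total, n) in sorted(sums.items())}

sites = [(0.005, 0.4), (0.008, 0.6), (0.3, 0.9)]
binned = mean_r2_by_maf(sites)
```

The result maps each non-empty `(lo, hi)` bin to its mean R2, which can then be drawn as a line or bar chart.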

image/njtree-circular.png

Before running bcftools merge, you may need to fix the REF and ALT alleles and the corresponding genotypes; otherwise bcftools may give you unexpected results.

usage: fixref.py [-h] REF_VCF IN_VCF OUT_VCF
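
To make the REF/ALT fix concrete, here is a minimal sketch for one biallelic site. It is an illustration under stated assumptions rather than the logic of fixref.py: the function name `fix_site` is hypothetical, only a plain REF/ALT swap is handled, and strand flips or multiallelic sites are returned as unresolved.

```python
def fix_site(ref_alleles, in_alleles, gts):
    """Align a biallelic site to the reference VCF's (REF, ALT) order.

    Returns (alleles, genotypes), or None when the alleles cannot be reconciled.
    """
    if in_alleles == ref_alleles:
        return in_alleles, gts
    if in_alleles == ref_alleles[::-1]:
        # REF and ALT are swapped: restore the order and recode 0 <-> 1.
        flip = str.maketrans("01", "10")
        return ref_alleles, [gt.translate(flip) for gt in gts]
    return None  # different alleles: leave the decision to the caller

fixed = fix_site(("A", "G"), ("G", "A"), ["0/0", "0|1", "./."])
```

In this example the input site lists G as REF and A as ALT, so the alleles are swapped back and every genotype index is flipped; missing calls pass through unchanged.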


License:GNU General Public License v3.0

