mebh / pypgen

A population genetics module written in Python

Home Page: www.ngcrawford.com

Welcome to Pypgen (v0.2.1) BETA

Pypgen provides utilities for estimating standard genetic diversity measures, including Gst, G'st, G''st, and Jost's D, from large genomic datasets (Hedrick, 2005; Jost, 2008; Nei, 1973; Nei & Chesser, 1983). Pypgen operates on individual SNPs as well as on user-defined regions (e.g., five-kilobase windows tiled across each chromosome). For windowed analyses, pypgen estimates the multi-locus version of each estimator.
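As a rough illustration of what these estimators measure, here is a minimal sketch that computes the uncorrected forms of Gst, G'st, G''st, and Jost's D from per-population allele frequencies. It is not pypgen's API: the function names (heterozygosities, differentiation, multilocus) are hypothetical, populations are weighted equally, and the sample-size bias corrections of Nei & Chesser (1983) are omitted. The multi-locus step assumes one common convention, averaging Hs and Ht across loci before forming the ratios, which may differ from what pypgen actually does.

```python
def heterozygosities(pop_freqs):
    """Return (Hs, Ht) for one locus.

    pop_freqs is a list with one entry per population; each entry is a
    list of allele frequencies (summing to 1) in a shared allele order.
    """
    k = len(pop_freqs)
    n_alleles = len(pop_freqs[0])
    # Hs: mean expected heterozygosity within populations.
    hs = sum(1.0 - sum(p * p for p in freqs) for freqs in pop_freqs) / k
    # Ht: expected heterozygosity of the mean allele frequencies.
    mean_freqs = [sum(freqs[i] for freqs in pop_freqs) / k
                  for i in range(n_alleles)]
    ht = 1.0 - sum(p * p for p in mean_freqs)
    return hs, ht


def differentiation(hs, ht, k):
    """Return uncorrected estimators for k populations, or None if Ht == 0."""
    if ht == 0.0:
        return None  # monomorphic locus: the ratios are undefined
    gst = (ht - hs) / ht
    return {
        "Gst": gst,
        # Hedrick (2005): Gst divided by its maximum given Hs and k.
        "G'st": gst * (k - 1 + hs) / ((k - 1) * (1.0 - hs)),
        # Meirmans & Hedrick style G''st.
        "G''st": k * (ht - hs) / ((k * ht - hs) * (1.0 - hs)),
        # Jost (2008).
        "D": ((ht - hs) / (1.0 - hs)) * (k / (k - 1)),
    }


def multilocus(loci):
    """Average Hs and Ht over loci, then compute the estimators (assumed convention)."""
    k = len(loci[0])
    pairs = [heterozygosities(locus) for locus in loci]
    mean_hs = sum(hs for hs, _ in pairs) / len(pairs)
    mean_ht = sum(ht for _, ht in pairs) / len(pairs)
    return differentiation(mean_hs, mean_ht, k)
```

For two populations fixed for different alleles, heterozygosities([[1.0, 0.0], [0.0, 1.0]]) gives Hs = 0 and Ht = 0.5, so every estimator equals 1; for two populations with identical frequencies, Hs = Ht and every estimator equals 0.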

Features:

  • Handles multiallelic SNP calls
  • Allows a single VCF file to contain multiple populations
  • Operates on SNP calls in standard VCF (Variant Call Format)
  • Uses bgzipped input for fast random access
  • Takes advantage of multiple processor cores
  • Calculates additional metrics:
    • SNP count per window
    • mean read depth (+/- STDEV) per window
    • populations with fixed alleles per SNP
    • more as I think of them
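
To make the per-window metrics concrete, the following is a small sketch of tiling fixed-size windows across a chromosome and summarizing the SNPs that fall in each one. It does not read VCF files and is not pypgen's implementation: the function names (tile_windows, summarize_window) and the (position, depth) tuple layout are assumptions made for illustration, and positions are treated as 1-based and inclusive, as in VCF.

```python
from statistics import mean, stdev


def tile_windows(chrom_length, window_size=5000):
    """Yield 1-based inclusive (start, stop) windows tiled across a chromosome."""
    for start in range(1, chrom_length + 1, window_size):
        # The last window is truncated at the chromosome end.
        yield start, min(start + window_size - 1, chrom_length)


def summarize_window(snps, start, stop):
    """Summarize SNPs in [start, stop].

    snps is an iterable of (position, read_depth) tuples (hypothetical layout).
    """
    depths = [depth for pos, depth in snps if start <= pos <= stop]
    return {
        "snp_count": len(depths),
        # Mean depth is undefined for an empty window.
        "mean_depth": mean(depths) if depths else None,
        # Sample standard deviation needs at least two values.
        "stdev_depth": stdev(depths) if len(depths) > 1 else None,
    }


if __name__ == "__main__":
    snps = [(100, 10), (4000, 20), (7000, 30)]
    for start, stop in tile_windows(12000):
        print(start, stop, summarize_window(snps, start, stop))
```

Because each window is summarized independently of the others, this kind of loop is a natural fit for distributing windows across processor cores.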

Important Note:

PYPGEN IS STILL IN ACTIVE DEVELOPMENT AND ALMOST CERTAINLY CONTAINS BUGS. If you find a bug, please file a report in the issues section of the GitHub repository and I'll address it as soon as I can.

Enclosed Scripts:

  • Sliding window analysis (vcf_sliding_window.py)
  • Per SNP analysis (vcf_snpwise_fstats.py)

Dependencies:

Documentation:

Detailed documentation is available on ReadTheDocs. It includes a tutorial and installation instructions.

License: BSD 3-Clause "New" or "Revised" License