cjfields / genomediff-python

GenomeDiff (*.gd) file parser for Python

genomediff-python

genomediff-python parses files in the GenomeDiff format generated by the breseq variant caller for haploid microbial organisms.

Installation

pip3 install genomediff

Only Python 3.x is tested.

Usage

GenomeDiff files are read with GenomeDiff.read(file), where file is an open file object. The resulting GenomeDiff object holds a metadata dict along with mutations, evidence and validation lists, each containing the records of that type. Records can be accessed through these lists or by ID. A GenomeDiff object is also iterable; iterating over it yields every record of every type.

>>> from genomediff import GenomeDiff
>>> with open('MyDiff.gd', 'r', encoding='utf-8') as fh:
...     document = GenomeDiff.read(fh)
...
>>> document.metadata
{'GENOME_DIFF': '1.0', 'AUTHOR': ''}
>>> document.mutations[0]
Record('SNP', 1, [191], new_seq='A', seq_id='NC_000913', snp_type='intergenic', position=12346)
>>> document.mutations[0].parent_ids
[191]
>>> document[191]
Record('RA', 191, None, tot_cov='46/42', new_base='A', insert_position=0, ref_base='G', seq_id='NC_000913', quality=252.9, position=12345)
>>> document.mutations[0].parents
[Record('RA', 191, None, tot_cov='46/42', new_base='A', insert_position=0, ref_base='G', seq_id='NC_000913', quality=252.9, position=12345)]
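
The access patterns in the session above (per-type lists, lookup by ID, iteration over all records, and resolving parent_ids to parent records) can be illustrated with a small stand-in model. This is only a sketch and not the library's implementation: MiniRecord, MiniDocument, the add() method and the iteration order (mutations, then evidence, then validation) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional


@dataclass
class MiniRecord:
    """Hypothetical stand-in for a parsed record (not the library's Record class)."""
    type: str
    id: int
    parent_ids: Optional[List[int]] = None
    attributes: Dict[str, object] = field(default_factory=dict)


class MiniDocument:
    """Hypothetical stand-in that mimics the access patterns described above."""

    def __init__(self) -> None:
        self.mutations: List[MiniRecord] = []
        self.evidence: List[MiniRecord] = []
        self.validation: List[MiniRecord] = []
        self._by_id: Dict[int, MiniRecord] = {}

    def add(self, record: MiniRecord, kind: str) -> None:
        """Append a record to one of the three lists and index it by ID."""
        if kind not in ("mutations", "evidence", "validation"):
            raise ValueError(f"unknown record kind: {kind!r}")
        getattr(self, kind).append(record)
        self._by_id[record.id] = record

    def __getitem__(self, record_id: int) -> MiniRecord:
        # Lookup by ID, analogous to document[191] in the session above.
        return self._by_id[record_id]

    def __iter__(self) -> Iterator[MiniRecord]:
        # Assumed order: mutations, then evidence, then validation.
        yield from self.mutations
        yield from self.evidence
        yield from self.validation

    def parents(self, record: MiniRecord) -> List[MiniRecord]:
        """Resolve a record's parent_ids to the records they refer to."""
        return [self[pid] for pid in (record.parent_ids or [])]


# Rebuild the two records shown in the session above (attributes abbreviated).
doc = MiniDocument()
doc.add(MiniRecord("RA", 191, None, {"ref_base": "G", "new_base": "A", "position": 12345}), "evidence")
doc.add(MiniRecord("SNP", 1, [191], {"new_seq": "A", "position": 12346}), "mutations")

snp = doc.mutations[0]
parent_types = [p.type for p in doc.parents(snp)]  # evidence supporting the SNP
all_ids = [r.id for r in doc]                      # every record, all types
```

Keeping a single ID index next to the per-type lists is one way to make both document.mutations[0] and document[191] style access cheap; whether genomediff itself is structured this way is not stated here.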

Contribution

Contributions to this project are welcome. Wishlist:

  • Writing GD files
  • Python 2.x support

License: MIT License

