allista / GetIsolationSources

This is a small command-line utility that generates the distribution of isolation sources from FASTA files containing GenBank IDs in their sequence descriptions.

GetIsolationSources is a small command-line utility that, given FASTA files containing GenBank IDs in their sequence descriptions, generates a per-sequence list of isolation sources and their distribution (i.e., the number of sequences per isolation source).
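To illustrate what "distribution" means here, the following minimal sketch counts sequences per isolation source. The function name `source_distribution` and the sample IDs and sources are hypothetical and are not taken from the tool itself.

```python
from collections import Counter


def source_distribution(per_sequence):
    """Count how many sequences share each isolation source.

    `per_sequence` maps a sequence ID to its isolation source
    (hypothetical input shape, chosen for illustration).
    """
    return Counter(per_sequence.values())


# Made-up example data, not real GenBank records.
example = {
    "AB000001": "soil",
    "AB000002": "soil",
    "AB000003": "seawater",
}
distribution = source_distribution(example)
for source, count in distribution.most_common():
    print(f"{source}\t{count}")
```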

It searches for IDs using regular expressions in accordance with NCBI specifications, so the format of description strings does not matter.
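As a rough sketch of this kind of search, the pattern below matches a few common GenBank nucleotide accession shapes (one letter plus five digits, or two letters plus six or eight digits, with an optional version suffix). It is a simplified assumption for illustration, not the expression the tool actually uses, and it does not cover every accession format NCBI defines.

```python
import re

# Simplified, assumed pattern: 1 letter + 5 digits, or 2 letters + 6 or 8 digits,
# optionally followed by a version such as ".1".
ACCESSION_RE = re.compile(
    r"\b(?:[A-Z]\d{5}|[A-Z]{2}\d{6}|[A-Z]{2}\d{8})(?:\.\d+)?\b"
)


def find_accessions(description):
    """Return every accession-like token found in a FASTA description line."""
    return ACCESSION_RE.findall(description)


# The ID is found regardless of where it sits in the description string.
print(find_accessions(">gi|12345|gb|AB123456.1| Some organism"))
print(find_accessions(">clone 7 from soil, accession=KF123456"))
```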

To obtain the needed information, it uses automated Entrez queries, so you need a working Internet connection to perform the analysis. Queries are made in accordance with NCBI load-balancing regulations, so processing several thousand records may take several minutes or even longer.
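A batched, throttled Entrez lookup of this kind might look like the sketch below, which uses Biopython's `Bio.Entrez` and `Bio.SeqIO`. The function names, the batch size, the delay, and the "unknown" fallback are all assumptions made for illustration; the tool's own query logic may differ.

```python
import time


def batches(ids, size=100):
    """Yield successive chunks of at most `size` IDs."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def fetch_isolation_sources(ids, email, batch_size=100, delay=0.4):
    """Map record IDs to their isolation_source qualifier (requires network access).

    Batch size and delay are assumed values, chosen to keep the request
    rate low; they are not taken from the tool.
    """
    # Imported here so the batching helper works without Biopython installed.
    from Bio import Entrez, SeqIO

    Entrez.email = email  # NCBI asks callers to identify themselves
    sources = {}
    for chunk in batches(ids, batch_size):
        handle = Entrez.efetch(
            db="nucleotide", id=",".join(chunk), rettype="gb", retmode="text"
        )
        for record in SeqIO.parse(handle, "genbank"):
            for feature in record.features:
                if feature.type == "source":
                    # Qualifier values are lists; fall back when the field is absent.
                    values = feature.qualifiers.get("isolation_source", ["unknown"])
                    sources[record.id] = values[0]
                    break
        handle.close()
        time.sleep(delay)  # pause between requests to stay within rate limits
    return sources
```

Fetching in batches keeps the number of requests small, which is why several thousand records translate into minutes rather than seconds once the pause between requests is included.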

It is distributed as source code that supports Python setuptools.

GetIsolationSources uses Biopython, so if you are using the source code distribution, the latest version of Biopython should be installed.

Downloads


GetIsolationSources by Allis Tauri is licensed under the MIT license.

License:MIT License


Languages

Language:Python 100.0%