mmferrari / SequenceExtractor

Utility to extract MIC subsequences from mds_ies_db

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Sequence Extractor

Description

Extract MIC subsequences from the <mds_ies_db>.

Input file must contain coordinates in BED format, while the output will use 1-based coordinates system.

Usage

usage: sequence_extractor.py [-h] -i INPUT_FILE -o OUTPUT_FILE -f
                             INPUT_FILE_FORMAT [-min MIN_LENGTH]
                             [-max MAX_LENGTH] [-p PREFIX_LENGTH]
                             [-s SUFFIX_LENGTH] -d FOLDER

Extract subsequence

optional arguments:
  -h, --help            show this help message and exit
  -i INPUT_FILE, --input-file INPUT_FILE
                        File containing DNA sequence info with coordinates in
                        BED format (0-based exclusive)
  -o OUTPUT_FILE, --output-file OUTPUT_FILE
                        Output file with coordinates in 1-based inclusive
                        system
  -f INPUT_FILE_FORMAT, --input-file-format INPUT_FILE_FORMAT
                        Input file format
  -min MIN_LENGTH, --min-length MIN_LENGTH
                        Minimum subsequence length
  -max MAX_LENGTH, --max-length MAX_LENGTH
                        Maximum subsequence length
  -p PREFIX_LENGTH, --prefix-length PREFIX_LENGTH
                        Length of the prefix to be added to subsequence
  -s SUFFIX_LENGTH, --suffix-length SUFFIX_LENGTH
                        Length of the suffix to be added to subsequence
  -d FOLDER, --folder FOLDER
                        Folder with full sequences. File names format is
                        "<SEQUENCE_NAME>_sequences.fasta"

About

Utility to extract MIC subsequences from mds_ies_db

License:GNU General Public License v3.0


Languages

Language:Python 100.0%