nordhuang / MutScan

Detect important mutations by scanning FastQ files directly

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

MutScan

Detect important mutations by scanning FastQ files directly

  • Ultra sensitive
  • 20X+ faster than normal pipeline (i.e. BWA + Samtools + GATK/VarScan/Mutect)
  • Very easy to use. Need nothing else. No alignment, no reference assembly, no variant call, no pileup...
  • Beautiful HTML report
  • Multi-threading support
  • Support both single-end and pair-end data
  • For pair-end data, MutScan will try to merge each pair, and do quality adjustment and error correction

Download

# download use http
https://github.com/OpenGene/MutScan/archive/master.zip

# or download use git
git clone https://github.com/OpenGene/MutScan.git

Build

cd MutScan
make

#Usage

usage: mutscan -1 <read1_file> -2 <read2_file> -m <mutation_file> -h <html_report_file> -t <thread> 
options:
  -1, --read1       read1 file name (string)
  -2, --read2       optional, read2 file name (string)
  -m, --mutation    optional, mutation file name (string)
  -h, --html        optional, filename of html report, no html report if not specified (string)
  -?, --help        print this message
  -t, --thread      thread number, default 4 (int)

The plain text result, contains the detected mutations and their support reads, will be printed directly. You can use > to redirect output to a file, like:

mutscan -1 <read1_file_name> -2 <read2_file_name> -m <mutation_file_name> > result.txt

And you can make a HTML file report with -h argument, like:

mutscan -1 <read1_file_name> -2 <read2_file_name> -m <mutation_file_name> -h report.html

single-end and pair-end

For single-end sequencing data, -2 argument is omitted:

mutscan -1 <read1_file_name> -m <mutation_file_name>

multi-threading

-t argument specify how many worker threads will be launched. The default thread number is 4. Suggest to use a number less than the CPU cores of your system.

Mutation file

A CSV file with columns of name, left_seq_of_mutation_point, mutation_seq and right_seq_of_mutation_point

#name, left_seq_of_mutation_point, mutation_seq, right_seq_of_mutation_point
NRAS-neg-1-115258748-2-c.34G>A-p.G12S-COSM563, GGATTGTCAGTGCGCTTTTCCCAACACCAC, T, TGCTCCAACCACCACCAGTTTGTACTCAGT
NRAS-neg-1-115252203-2-c.437C>T-p.A146V-COSM4170228, TGAAAGCTGTACCATACCTGTCTGGTCTTG, A, CTGAGGTTTCAATGAATGGAATCCCGTAAC
BRAF-neg-7-140453136-15-c.1799T>A -V600E-COSM476, AACTGATGGGACCCACTCCATCGAGATTTC, T, CTGTAGCTAGACCAAAATCACCTATTTTTA
EGFR-pos-7-55241677-18-c.2125G>A-p.E709K-COSM12988, CCCAACCAAGCTCTCTTGAGGATCTTGAAG, A, AAACTGAATTCAAAAAGATCAAAGTGCTGG
EGFR-pos-7-55241707-18-c.2155G>A-p.G719S-COSM6252, GAAACTGAATTCAAAAAGATCAAAGTGCTG, A, GCTCCGGTGCGTTCGGCACGGTGTATAAGG
EGFR-pos-7-55241707-18-c.2155G>T-p.G719C-COSM6253, GAAACTGAATTCAAAAAGATCAAAGTGCTG, T, GCTCCGGTGCGTTCGGCACGGTGTATAAGG

A default CSV file contains important actionable cancer gene targets is already provided in mutation/cancer.csv. If you want to use this mutation file directly, the argument mutation_file_name can be omitted:

mutscan -1 <read1_file_name> -2 <read2_file_name>

HTML output

If -h or --html argument is given, then a HTML report will be generated, and written to the given filename. A sample report is given here:

image

  • The color of each base indicates its quality, and the quality will be shown when mouse over.
  • Click on any row, the original read/pair will be displayed
  • In first column, d means the edit distance of match, and --> means forward, <-- means reverse

About

Detect important mutations by scanning FastQ files directly

License:MIT License


Languages

Language:C 58.3%Language:C++ 41.5%Language:Makefile 0.2%