xjtu-omics / Mako

A graph-based pattern growth approach for CSV discovery

Mako is a bottom-up guided, model-free tool for detecting complex structural variants (CSVs). It first builds a mutational signal graph, then uses pattern growth to detect maximal subgraphs, which it reports as CSVs.
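As a rough intuition for "maximal subgraphs in a signal graph", here is a toy sketch. It is not Mako's pattern growth algorithm: it swaps in a plain connected-component search, and the node ids, positions, signal type names, and edges are all hypothetical values chosen for illustration.

```python
from collections import defaultdict


def maximal_connected_subgraphs(nodes, edges):
    """Group node ids into connected components of an undirected graph.

    Simplified stand-in for subgraph discovery, NOT Mako's pattern growth.
    """
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        stack, component = [start], []
        seen.add(start)
        while stack:  # depth-first walk from the start node
            node = stack.pop()
            component.append(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(sorted(component))
    return components


# Hypothetical signal nodes: id -> (genomic position, signal type).
signal_nodes = {
    0: (1000, "split_read"),
    1: (1250, "discordant_pair"),
    2: (1400, "clipped_read"),
    3: (90000, "discordant_pair"),
}
# Hypothetical edges linking signals that support each other.
signal_edges = [(0, 1), (1, 2)]

# Keep only subgraphs made of more than one signal node.
candidates = [
    c for c in maximal_connected_subgraphs(signal_nodes, signal_edges)
    if len(c) > 1
]
print(candidates)
```

In this toy input, nodes 0 to 2 form one linked group while node 3 stands alone, so only the first group survives the size filter.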

[Figure: Mako workflow]

Please check the wiki page for more details.

License

Mako is free for non-commercial use by academic, government, and non-profit/not-for-profit institutions. A commercial version of the software is available and licensed through Xi’an Jiaotong University. For more information, please contact Jiadong Lin (jiadong324@stu.xjtu.edu.cn) or Kai Ye (kaiye@xjtu.edu.cn).

Citation

If you use Mako or its results, please cite the original paper.

Jiadong Lin, Xiaofei Yang, Walter Kosters, Tun Xu, Yanyan Jia, Songbo Wang, Qihui Zhu, Mallory Ryan, Li Guo, Chengsheng Zhang, Charles Lee, Scott E. Devine, Evan E. Eichler, Kai Ye, Mako: A Graph-based Pattern Growth Approach to Detect Complex Structural Variants, Genomics, Proteomics & Bioinformatics, 2021

Install and run

Mako requires a Java JDK (>=1.8). A prebuilt JAR package, Mako.jar, is provided for direct use; please check the release page.

Dependency

  • htsjdk (https://github.com/samtools/htsjdk): A Java API for processing high-throughput sequencing (HTS) data.
  • Python (>=3.6): required for creating the Mako configuration file.
    • Required packages: pysam, pandas, numpy
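Before creating a configuration file, it can help to confirm that the Python side of the dependencies is in place. The following is a minimal sketch using only the standard library; the function names are hypothetical and not part of Mako.

```python
import importlib.util
import sys

# Package names and minimum version taken from the dependency list above.
REQUIRED_PACKAGES = ("pysam", "pandas", "numpy")
MIN_PYTHON = (3, 6)


def missing_packages(packages=REQUIRED_PACKAGES):
    """Return the top-level packages that cannot be found for import."""
    return [name for name in packages if importlib.util.find_spec(name) is None]


def python_is_recent_enough(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Compare the (major, minor) part of a version against the minimum."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    if not python_is_recent_enough():
        print("Python >= 3.6 is required")
    for name in missing_packages():
        print("Missing package:", name)
```

This only checks that the packages can be located; it does not verify their versions.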

Usage

NOTE: The BAM file should be under your working directory.

# Configuration
python ParseMako.py config -b sample.bam -n 30000 -w ./working_dir/ -s sampleName -f /path/to/ref.fa.fai
# Detection
java -jar Mako.jar -R /path/to/ref.fa -F /path/to/sampleName.mako.cfg
# Convert to VCF format (optional)
python ParseMako.py tovcf -m sampleName_mako_calls.txt -o sampleName_mako.vcf
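When running several samples, the three steps above can be assembled programmatically. The sketch below only builds the command lines, using the sub-commands and flags exactly as they appear above; the helper name `mako_commands` and its parameters are hypothetical, and the locations of the `.mako.cfg` and `_mako_calls.txt` files are assumptions the caller must supply.

```python
import shlex


def mako_commands(bam, sample, ref_fasta, ref_fai, cfg_path, calls_path,
                  vcf_path, work_dir="./working_dir/", n=30000):
    """Build the configuration, detection, and VCF conversion commands.

    cfg_path and calls_path are passed in explicitly because where Mako
    writes these files is an assumption here, not something this sketch knows.
    """
    config = ["python", "ParseMako.py", "config", "-b", bam, "-n", str(n),
              "-w", work_dir, "-s", sample, "-f", ref_fai]
    detect = ["java", "-jar", "Mako.jar", "-R", ref_fasta, "-F", cfg_path]
    to_vcf = ["python", "ParseMako.py", "tovcf", "-m", calls_path,
              "-o", vcf_path]
    return [config, detect, to_vcf]


commands = mako_commands(
    bam="sample.bam",
    sample="sampleName",
    ref_fasta="/path/to/ref.fa",
    ref_fai="/path/to/ref.fa.fai",
    cfg_path="/path/to/sampleName.mako.cfg",
    calls_path="sampleName_mako_calls.txt",
    vcf_path="sampleName_mako.vcf",
)
for cmd in commands:
    # Print shell-safe command lines; pass each list to subprocess.run to execute.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Keeping each command as an argument list avoids shell quoting problems with paths that contain spaces.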

Run demo data

# Create configuration file
python ParseMako.py config -b NA19240.30X.chr20.1000K-2000K.bam -n 30000 -w ./working_dir/ -s NA19240 -f /demo.fa.fai

# Run Mako
java -jar /path/to/Mako.jar -R /path/to/GRCh38_full_analysis_set_plus_decoy_hla.fa -F /path/to/NA19240.mako.cfg

Known issues

  1. Please make sure the reference used to run Mako is identical to the one used for alignment.
  2. ...
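For the first issue, one quick sanity check is to compare contig names and lengths between the reference index and the BAM header. The sketch below is an illustration with hypothetical function names and toy input values; it assumes the header text was obtained separately (for example with `samtools view -H sample.bam`) and relies only on the standard `.fai` and SAM `@SQ` layouts. Matching names and lengths is a necessary condition, not proof that the sequences are identical.

```python
def contigs_from_fai(fai_text):
    """Parse a FASTA index: the first two tab-separated columns are name and length."""
    contigs = {}
    for line in fai_text.splitlines():
        if line.strip():
            fields = line.split("\t")
            contigs[fields[0]] = int(fields[1])
    return contigs


def contigs_from_sam_header(header_text):
    """Parse @SQ lines of a SAM/BAM header into {SN: LN}."""
    contigs = {}
    for line in header_text.splitlines():
        if line.startswith("@SQ"):
            tags = dict(field.split(":", 1) for field in line.split("\t")[1:])
            contigs[tags["SN"]] = int(tags["LN"])
    return contigs


def reference_mismatches(fai_contigs, bam_contigs):
    """List contigs in the BAM header that are absent or differ in length."""
    problems = []
    for name, length in bam_contigs.items():
        if name not in fai_contigs:
            problems.append(f"{name}: missing from reference index")
        elif fai_contigs[name] != length:
            problems.append(
                f"{name}: length {length} in BAM vs {fai_contigs[name]} in reference"
            )
    return problems


# Toy inputs for illustration only.
fai_text = "chrA\t1000\t6\t60\t61\nchrB\t500\t1030\t60\t61\n"
header_text = "@HD\tVN:1.6\tSO:coordinate\n@SQ\tSN:chrA\tLN:1000\n@SQ\tSN:chrB\tLN:480\n"

problems = reference_mismatches(
    contigs_from_fai(fai_text), contigs_from_sam_header(header_text)
)
print(problems)
```

An empty list means every contig in the BAM header was found in the reference index with the same length.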

Contact

If you have any questions, please feel free to contact Jiadong Lin (jiadong66@stu.xjtu.edu.cn) or Kai Ye (kaiye@xjtu.edu.cn).

