slw287r / kmsk

Mask homologous sequences in query fasta(s) by kmers of subject fasta

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

kmsk

Mask homologous sequences in query fasta(s) by kmers (k≤31) of subject fasta

Build

git clone https://github.com/slw287r/kmsk.git
cd kmsk
make

Usage

  • Output masked query fastas
kmsk -s t2t.fna -p t2tmsk -o . -t 8 562.fna 1280.fna 5476.fna
# output
562.t2tmsk.fna
1280.t2tmsk.fna
5476.t2tmsk.fna
  • Output masked region bed file
kmsk -r -s t2t.fna -p t2tmsk -o . -t 8 562.fna 1280.fna 5476.fna
# output
562.t2tmsk.bed
1280.t2tmsk.bed
5476.t2tmsk.bed

* ~40GB memory is used for the Homo sapiens subject.

About

Mask homologous sequences in query fasta(s) by kmers of subject fasta

License:MIT License


Languages

Language:C 99.7%Language:Makefile 0.3%