MACREL Benchmarking (2019/20)

This repository includes code for benchmarking MACREL.

This is a companion repository to:

Santos-Júnior CD, Pan S, Zhao X, Coelho LP. 2020. Macrel: antimicrobial peptide screening in genomes and metagenomes. PeerJ 8:e10555. DOI: 10.7717/peerj.10555

It contains the rules to rebuild the benchmarks in the paper.

However, instead of just running the code, we strongly recommend that you read it, as some steps depend on inputs obtained through manual curation.

To evaluate the benchmarking results for the tested AMP and hemolytic peptide prediction models, please refer to the "train" folder in Macrel.

The other results shown in the MACREL benchmarking can be reproduced by running the scripts in the following order:

(1) Benchmark.sh

(2) Macrel_in_real_metagenomes.sh

(3) Annotation_rules.sh
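
The ordering above can be sketched as a small Python driver. This is only an illustration and not part of the repository: it assumes the three scripts sit in the repository root, that they can be invoked with bash, and that any manually curated inputs they expect are already in place. The function names `build_commands` and `run_in_order` are hypothetical.

```python
import subprocess
from pathlib import Path

# Script names and their order are taken from the list above.
SCRIPTS = ("Benchmark.sh", "Macrel_in_real_metagenomes.sh", "Annotation_rules.sh")


def build_commands(scripts=SCRIPTS, shell="bash"):
    """Return one argv list per script, preserving the given order."""
    # Assumption: the scripts are run through bash; adjust if they need another shell.
    return [[shell, name] for name in scripts]


def run_in_order(repo_dir=".", scripts=SCRIPTS):
    """Run each script from repo_dir, stopping at the first failure."""
    repo = Path(repo_dir)
    for argv in build_commands(scripts):
        script = repo / argv[1]
        if not script.is_file():
            raise FileNotFoundError(f"missing script: {script}")
        print("Running", " ".join(argv))
        # check=True raises CalledProcessError on a non-zero exit status,
        # so a later script never runs on top of a failed earlier step.
        subprocess.run(argv, cwd=repo, check=True)
```

Calling `run_in_order(".")` from the repository root would run the three steps in sequence. Stopping at the first failure is a deliberate choice here, since later steps presumably consume the outputs of earlier ones.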

-- To generate Figure 3, please run:

$ python3 Figure_3_rendering.py

-- To generate Figure 4, please run:

$ python3 Figure_4_rendering.py

Homology effect

To check the effect of homology between the training and testing data sets, go to the "homology effects" folder and run:

$ ./retrain_complete.sh

This will retrain all models from MACREL, iAMP-2L and AMP Scanner v.2 with the non-redundant data sets, which were previously clustered with cd-hit at 80% identity. Accuracy, precision, and the confusion matrices are also reported. Be aware that some of these results are generated at different times during the run and are printed to the screen.
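
For readers who want to relate the reported accuracy and precision to a confusion matrix, the following minimal sketch uses the standard textbook definitions for a binary classifier. It is not taken from `retrain_complete.sh`, and the function name `summarize` and the matrix layout (rows are true classes, columns are predicted classes) are assumptions made for this example.

```python
def summarize(tp, fp, tn, fn):
    """Compute accuracy and precision from binary confusion-matrix counts.

    tp/fp/tn/fn are the numbers of true positives, false positives,
    true negatives, and false negatives (e.g. AMP = positive class).
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    # Accuracy: fraction of all predictions that are correct.
    accuracy = (tp + tn) / total
    # Precision: fraction of positive predictions that are correct.
    # It is undefined when nothing was predicted positive.
    precision = tp / (tp + fp) if (tp + fp) > 0 else float("nan")
    return {
        "accuracy": accuracy,
        "precision": precision,
        # Assumed layout: rows = true class, columns = predicted class.
        "confusion_matrix": [[tn, fp], [fn, tp]],
    }
```

For example, `summarize(tp=8, fp=2, tn=9, fn=1)` describes 20 peptides of which 17 are classified correctly, and 8 of the 10 positive predictions are true positives.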

Third-party software

To run all of the code, you will need the following:

Spurio
ArtMountRainier
BlastAll+
pigz
R v3.5+
samtools
Conda
Macrel
Python 3+
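
Before running the scripts, it may help to check which of these tools are visible on your PATH. The sketch below is a hypothetical helper, not part of the repository. The executable names in `ASSUMED_EXECUTABLES` are guesses based on the list above and may differ on your system (in particular those for Spurio, ART, and BLAST), so adjust them to match your installation.

```python
import shutil

# Assumption: executable names guessed from the tool list above.
# Edit these to match how each tool is installed on your system.
ASSUMED_EXECUTABLES = {
    "Spurio": "spurio",
    "ArtMountRainier": "art_illumina",
    "BlastAll+": "blastall",
    "pigz": "pigz",
    "R": "Rscript",
    "samtools": "samtools",
    "Conda": "conda",
    "Macrel": "macrel",
    "Python 3": "python3",
}


def missing_tools(executables=ASSUMED_EXECUTABLES, which=shutil.which):
    """Return the tool names whose executable is not found on PATH."""
    # shutil.which returns None when the executable cannot be located.
    return [tool for tool, exe in executables.items() if which(exe) is None]
```

Calling `missing_tools()` returns a list of tool names that could not be found. This check does not verify versions (for example R v3.5+), only that an executable with the assumed name exists.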
