Bioinformatic Tools |
OmicTools |
Collection of many tools that can be useful for bioinformatic analyses |
4 |
16S pipeline |
Gloor Lab dada2 pipeline |
This pipeline will take your paired fastq reads (from Illumina MiSeq or HiSeq) and generate an OTU counts table with an approximate taxonomy assignment. The reads have to have been generated using the Gloor Lab Illumina SOP so that the reads are paired, overlapping, and contain the barcode and primer information (have not been demultiplexed or had primers or barcodes removed). |
8 |
Metagenomics |
SingleM |
SingleM is a tool to find the abundances of discrete operational taxonomic units (OTUs) directly from shotgun metagenome data, without heavy reliance on reference sequence databases. It is able to differentiate closely related species even if those species are from lineages new to science. |
13 |
Gene annotation |
Pulpy |
An automated, reproducible and scalable prediction of Polysaccharide Utilisation Loci (PUL) in 5414 public Bacteroidetes genomes. The predictions are fully open and can be accessed and used by any researcher, commercial or otherwise. |
17, 18, 19; preprint 20 |
16S pipeline |
mare |
The mare R package is an easy-to-use pipeline for microbiota analysis based on 16S-amplicon reads. It takes the raw reads, creates taxonomic tables, visualises the results, and finally identifies organisms significantly associated with variables of interest. For read processing, OTU clustering, and taxonomic annotation |
32 |
WGS assembly pipeline |
pgap |
The official bacterial whole genome assembly pipeline of NCBI |
33, 674 |
r-package |
picante |
Phylocom integration, community analyses, null-models, traits and evolution in R |
39 |
tree-modeling |
iq-tree |
Fast and effective stochastic algorithm to reconstruct phylogenetic trees by maximum likelihood. IQ-TREE compares favorably to RAxML and PhyML in terms of likelihood while requiring a similar amount of computing time
45 |
modeling |
PartitionFinder2 |
PartitionFinder2 is a program for selecting best-fit partitioning schemes and models of evolution for nucleotide, amino acid, and morphology alignments. |
47 |
Function Prediction |
PICRUSt
Predicts functions of total genomes based on 16S sequences |
49 |
Function Prediction |
Tax4Fun |
Predicts functions of total genomes based on 16S sequences |
50 |
ML-classifier |
MicroPheno |
is a reference- and alignment-free approach for predicting the environment or host phenotype from microbial community samples based on k-mer distributions in shallow sub-samples of 16S rRNA data. |
54, 55 |
OTU-generator |
DiTaxa |
Alignment- and reference-free, subsequence-based 16S rRNA data analysis, as a new paradigm for microbiome phenotype and biomarker detection
56 |
OTU-generator
HmmUFOtu |
An HMM and phylogenetic placement based ultra-fast taxonomic assignment and OTU picking tool for microbiome amplicon sequencing studies |
58 |
OTU-generator |
otu2ot |
Oligotyping for R |
59 |
Microbiomics SOP |
Microbiome_helper |
Microbiome Helper is a repository that contains several resources to help researchers working with microbial sequencing data |
62 |
16S Pipeline |
SeekDeep |
A single command-line program containing several subprograms that together make up the SeekDeep targeted sequencing analysis pipeline
67, 68 |
R Package - ShinyApp |
FastqCleaner |
An interactive web application for quality control, filtering and trimming of FASTQ files. |
81, 82 |
Preprocessing tool |
fastp |
A tool designed to provide fast all-in-one preprocessing for FASTQ files; mainly used to correct R1 and R2 reads for better merging
83, 84 |
Python tool |
ncbi-genome-download |
Some script to download bacterial and fungal genomes from NCBI after they restructured their FTP a while ago. |
85 |
Pipeline |
phyloFlash |
phyloFlash is a pipeline to rapidly reconstruct the SSU rRNAs and explore phylogenetic composition of an Illumina (meta)genomic or transcriptomic dataset. |
86 |
Tool |
DUDE-Seq |
DUDE-Seq: Fast, flexible, and robust denoising of nucleotide sequences |
92, 93 |
Python tool |
RAMBL |
A tool for the assembly of full-length 16S genes in metagenomic shotgun data |
100, 101 |
Classification tool |
CAMITAX |
Taxonomic assignment workflow based on multiple approaches
105, 106 |
Docker container |
speciesprimer |
The SpeciesPrimer pipeline is intended to help researchers find specific primer pairs for the detection and quantification of bacterial species in complex ecosystems
111 |
tool |
EnaBrowserTools |
enaBrowserTools is a set of scripts that interface with the ENA web services to download data from ENA easily, without any knowledge of scripting required |
116 |
Toolkit |
NCBI Toolkit |
NCBI C++ Toolkit provides free, portable, public domain libraries with no restrictions on use, for Unix, MS Windows, and Mac OS platforms
119 |
tool |
FastANI |
Fast alignment-free computation of whole-genome Average Nucleotide Identity (ANI) |
120, 121 |
toolbox |
EzBio tools |
OrthoANI, UBCG and other useful tools for WGS analyses |
122 |
data wrangling |
Bioinformatics one-liners |
Bash one-liners useful for bioinformatics
133 |
web-workbench |
imngs |
Integrated Microbial NGS platform |
143, 144 |
Pipeline |
Roary |
Roary is a high speed stand alone pan genome pipeline, which takes annotated assemblies in GFF3 format (produced by Prokka (Seemann, 2014)) and calculates the pan genome. |
147 |
Tool |
OrthoFinder |
It finds orthogroups and orthologs, infers rooted gene trees for all orthogroups and identifies all of the gene duplication events in those gene trees.
149 |
Tool |
CarveMe |
CarveMe is a python-based tool for genome-scale metabolic model reconstruction. |
152, 153, 154 |
Tool |
SMETANA |
Species METabolic interaction ANAlysis is a python-based command line tool to analyse microbial communities |
155, 156 |
Tool |
FRAMED |
a python package for analysis and simulation of metabolic models. The main focus is to provide support for different modeling approaches |
157 158 162 |
Tool |
cobrapy |
COBRA methods are widely used for genome-scale modeling of metabolic networks in both prokaryotes and eukaryotes. cobrapy is a constraint-based modeling package that is designed to accommodate the biological complexity of the next generation of COBRA models and provides access to commonly used COBRA methods, such as flux balance analysis, flux variability analysis, and gene deletion analyses |
159 |
Tool |
GPRTransform |
It contains an implementation of the method that transforms an SBML model by integrating the GPR associations directly into the stoichiometric matrix. This enables gene-based analysis using several constraint-based methods
163 164
Tool |
eggnog-mapper |
a tool for fast functional annotation of novel sequences (genes or proteins) using precomputed eggNOG-based orthology assignments |
165 166 |
Pipeline |
miQTL-cookbook |
This is the cookbook for performing the GWAS analysis of microbial abundance based on analysis of 16S rRNA sequencing dataset |
167 |
Tool |
DuctApe |
The final purpose of the program is to combine the genomic information (encoded as KEGG pathways) with the results of phenomic experiments (Phenotype Microarrays) and highlight the genes that may be responsible for phenotypic variations
170 |
Tool |
VFFVA |
Flux variability analysis (FVA) is the workhorse of metabolic modeling. It characterizes the boundaries of the solution space of a metabolic model and delineates the bounds for reaction rates
174 175 |
Pipeline |
BACTpipe |
Automatic assembly and annotation from raw reads in a very cleanly implemented Nextflow pipeline
178 |
Pipeline |
MAG core |
Automatic assembly and annotation from raw reads of metagenomic data, implemented as a Nextflow pipeline
179 |
Pipeline |
Tychus Nextflow |
Automatic whole genome assembly and annotation of isolate strains. Uses multiple assemblers and takes the consensus
180 |
Pipeline |
IMP |
Reference-independent metagenomic and metatranscriptomic bacterial assembly |
182, 183 |
Tool |
DESMAN |
de novo extraction of strains from metagenomes, enables strain inference from frequency counts on contigs across multiple samples |
184 185 |
SOP |
MicroBiome Quality Control (MBQC) |
MBQC is a collaborative effort to comprehensively evaluate methods for measuring the human microbiome |
187 |
Pipeline |
MIDAS |
an integrated pipeline that leverages >30,000 reference genomes to estimate bacterial species abundance and strain-level genomic variation, including gene content and SNPs, from shotgun metagenomes
196 197 |
Tool |
MAGpurify |
algorithms to identify contamination in metagenome-assembled genomes (MAGs) |
198 |
Tool |
MicrobeCensus |
a fast and easy to use pipeline for estimating the average genome size (AGS) of a microbial community from metagenomic data |
199 |
Tool |
IGGsearch |
it accurately quantifies species presence-absence and species abundance by mapping reads to a database of species-specific marker genes |
200 |
Tool |
MIDAS-strains |
Estimate strains from reads mapped to pan-genomes from the MIDAS database |
201 |
Tool |
AssemblyEvaluator |
Evaluate the completeness and precision of a (meta)genomic assembly by mapping contigs to a complete reference genome
202 |
Tools |
Biobakery Workflows |
Set of tools by the Huttenhower group that can be fairly easily executed with pre-defined workflows, useful for metagenomics and metatranscriptomics
204 |
Tools |
Anvi'o |
Anvi’o is an open-source, community-driven analysis and visualization platform for ‘omics data
208 209 210 211 |
Tool |
WAFFLE |
the Workflow to Annotate Assemblies and Find Lateral Gene Transfer (LGT) Events |
212 |
Tool |
AUTOGRAPH |
AUtomatic Transfer by Orthology of Gene Reaction Associations for Pathway Heuristics, is a semi-automatic approach to accelerate the process of genome-scale metabolic network reconstruction by taking full advantage of already manually curated networks |
214 |
Tool |
pyTARG |
a library that contains functions to work with Genome Scale Metabolic Models with the goal of finding drug targets against cancer |
223 224 |
Assembler |
Unicycler |
An assembler for hybrid short- and long-read assembly; it builds on SPAdes for the short reads and uses additional tools for the long reads.
227 |
R package |
microclass |
an R-package for 16S taxonomy classification |
231 232 |
Tool |
Prodigal |
Fast, reliable protein-coding gene prediction for prokaryotic genomes |
233 234 |
Tool |
STAMP |
a graphical software package that provides statistical hypothesis tests and exploratory plots for analysing taxonomic and functional profiles |
235 236 |
Tool |
CheckM |
an automated method for assessing the quality of a genome using a broader set of marker genes specific to the position of a genome within a reference genome tree and information about the collocation of these genes |
237 238 |
R script |
consenTRAIT |
Phylogenetic conservatism of functional traits in microorganisms. A phylogenetic metric that estimates the clade depth where organisms share a trait
239 240 |
NIH Tools |
NIH Genome Informatics Section
Tools for various bioinformatic tasks, assembly, Mash, metagenomes, Krona, MUMmer alignment |
242 |
R package |
mmgenome |
Tools for extracting individual genomes from metagenomes
243 244 |
Tool |
SPAdes |
St. Petersburg genome assembler – is an assembly toolkit containing various assembly pipelines |
254,365 |
Tool |
SqueezeMeta |
a fully automated metagenomics pipeline, from reads to bins |
261 262 |
Tool |
MetaWRAP |
a flexible pipeline for genome-resolved metagenomic data analysis |
263 264 |
R Package |
HiMap |
High-resolution Microbial Analysis Pipeline to Strain level with dada2 and curated HiMapDB |
273 274 |
Research Group |
van nimwegenlab |
a range of software tools, web-services, and databases in regulatory and comparative genomics for WGS |
275 |
Tool |
Rnammer |
predicts 5s/8s, 16s/18s, and 23s/28s ribosomal RNA in full genome sequences |
278 |
Tool |
RANGER-DTL |
Rapid ANalysis of Gene family Evolution using Reconciliation-DTL; a software package for inferring gene family evolution by speciation, gene duplication, horizontal gene transfer, and gene loss
279 |
Tool |
Darkhorse |
a bioinformatic method for rapid, automated identification and ranking of phylogenetically atypical proteins on a genome-wide basis |
280 |
Tool |
ABRicate |
Mass screening of contigs for antimicrobial resistance or virulence genes. It comes bundled with multiple databases: Resfinder, CARD, ARG-ANNOT, NCBI BARRGD, NCBI, EcOH, PlasmidFinder, Ecoli_VF and VFDB |
286 334 |
Tool |
MetaCompare |
MetaCompare is a computational pipeline for prioritizing resistome risk by estimating the potential for ARGs to be disseminated into human pathogens from a given environmental sample based on metagenomic sequencing data |
287 |
Tool |
DeepARG |
DeepARG is a machine learning solution that uses deep learning to characterize and annotate antibiotic resistance genes in metagenomes |
288 |
Tool |
SSTAR |
Sequence Search Tool for Antimicrobial Resistance combines a locally executed BLASTN search against a customizable database with an intuitive graphical user interface for identifying antimicrobial resistance (AR) genes from genomic data |
289 290 |
Tool |
ProtCNN ProtENN |
Predicting the function of a protein from its raw amino acid sequence is the critical step for understanding the relationship between genotype and phenotype |
295 |
Benchmarking |
Long-read-assembler-comparison |
Benchmarking of long-read assembly tools for bacterial whole genomes |
298 |
conda |
bioconvert |
is a collaborative project to facilitate the interconversion of life science data from one format to another |
299 |
Tool |
bin3C |
Extract metagenome-assembled genomes (MAGs) from metagenomic data using Hi-C |
303 304 |
Tool |
MAGpy |
Snakemake pipeline for downstream analysis of metagenome-assembled genomes (MAGs) (pronounced mag-pie) |
305 306 |
Tool |
graftM |
a tool for scalable, phylogenetically informed classification of genes within metagenomes |
307 308 |
Tool |
GFinisher |
a tool for refinement and finalization of prokaryotic genome assemblies that uses GC skew bias to identify assembly errors and organizes the contigs/scaffolds with reference genomes
311 312 |
Tool |
Autometa |
automated extraction of microbial genomes from individual shotgun metagenomes |
314 315 |
Tool |
iMGEins |
detecting novel mobile genetic elements (MGEs) inserted in individual genomes
316 317 |
Tool |
McClintock |
an Integrated Pipeline for Detecting Transposable Element Insertions in Whole-Genome Shotgun Sequencing Data (MGEs) |
320 321 |
Webtool |
PHASTER |
a better, faster version of the PHAST phage search tool |
322 323 |
Tool |
ISQuest |
identifies bacterial ISs and their sequence elements—inverted and direct repeats—in raw read data or contigs using flexible search parameters (MGEs) |
324 325 |
Tool |
VirSorter |
mining viral signal from microbial genomic data |
326 327 |
Tool |
RAST |
(Rapid Annotation using Subsystem Technology) is a fully-automated service for annotating complete or nearly complete bacterial and archaeal genomes |
329 330 |
Tool |
ShortBRED |
Tool by Huttenhower group that identifies protein families in metagenomic samples. Useful for protein profiling?? |
336 |
Tool & R package |
GSEA |
Gene Set Enrichment Analysis (GSEA) is a computational method that determines whether an a priori defined set of genes shows statistically significant, concordant differences between two biological states (e.g. phenotypes) |
337 338 |
Tool, Database |
GMMs Omixer |
Tool with a curated database by the Raes lab that links metagenomic samples to functions and metabolic capabilities
342, 343, 344, 523 |
Tool |
GRASP2 |
fast and memory-efficient gene-centric assembly and homolog search for metagenomic sequencing data |
345, 346 |
Tool |
PICRUSt2
a software for predicting functional abundances based only on marker gene sequences |
347, 348 |
Pipeline |
Antimicrobial Resistance Finder |
Nextflow pipeline to identify antimicrobial resistance protein sequences, looks simple to use
350 |
Tool |
Geptop2 |
a gene essentiality prediction tool for complete genomes based on orthology and phylogeny
351, 352 |
Tool |
Asgan |
[As]sembly [G]raphs [An]alyzer – is a tool for analysis of assembly graphs |
353 |
Tool |
PopCOGenT |
Identifying microbial populations using networks of horizontal gene transfer |
355 |
Tool |
PhiSpy |
a novel algorithm for finding prophages in bacterial genomes that combines similarity- and composition-based strategies
356, 357 |
Tool |
MetaCurator |
Software for curating reference sequence databases used in barcoding, metabarcoding and metagenomics |
359, 360 |
Tutorial |
astrobiomike |
This site aims to be a useful resource for bioinformatics beginners |
361,362 |
Tool |
(sour)Mash |
fast genome and metagenome distance estimation using MinHash
363,364 |
Tool |
(meta)plasmidSPAdes
for plasmid assembly in metagenomic data sets that reduced the false positive rate of plasmid detection compared with the state-of-the-art approaches |
364,365 |
Tool |
IslandViewer4 |
integrates four different genomic island prediction methods: IslandPick, IslandPath-DIMOB, SIGI-HMM, and Islander |
366,367 |
Tool, Server |
specI
Web server (but also stand-alone tool) to determine species classification of whole genome based on ~40 universal single copy marker genes. |
370 |
Tool |
iRep |
is a method for determining replication rates for bacteria from single time point metagenomics sequencing and draft-quality genomes |
374,375 |
Tool |
antiSMASH |
allows the rapid genome-wide identification, annotation and analysis of secondary metabolite biosynthesis gene clusters in bacterial and fungal genomes |
376,377,378 |
Tool |
NeuRiPP |
a neural network framework designed for classifying peptide sequences as putative precursor peptide sequences for RiPP biosynthetic gene clusters |
379,380 |
Tool |
PhyloMagnet |
Pipeline for screening metagenomes, looking for arbitrary lineages, using gene-centric assembly methods and phylogenetics |
381,382 |
Tool |
KrakenUnique |
Kraken-based tool for classifying metagenomic reads with an additional algorithm that checks for unique k-mer matches; possibly similar to the CosmosID approach
383 |
Tool |
Mash |
Tool for classifying metagenomic reads, similar to Kraken, which uses MinHash to identify species
384 |
Tool |
RefSeq_mash |
Tool for checking which NCBI reference genomes the raw reads match, or which reference genome fits best overall; should be very fast.
385 |
Pipeline |
Hybrid Assembler |
Hybrid assembly pipeline in Nextflow that is coupled with plasmIDent, which identifies plasmids and resistance genes
390, 391 |
Tool |
RMI |
Comprehensive antimicrobial resistance (AMR) gene finder tool online for quick analysis of genome sequences |
392 |
Pipeline |
SqueezeMeta |
A full automatic pipeline for metagenomics/metatranscriptomics, covering all steps of the analysis |
394, 395 |
Review |
Identifying repeats and transposable elements
Nice Nature review that describes various software for finding these things, but a bit outdated
395 |
Tool |
ARDaP |
(Antimicrobial Resistance Detection and Prediction) is a genomics pipeline for the comprehensive identification of antibiotic resistance markers from whole-genome sequencing data
399 |
Tool |
Flye |
New long-read assembler from UCSD that is faster and often better than others
400 |
Tool |
Ra |
Overlap-layout-consensus based DNA assembler of long uncorrected reads (short for Rapid Assembler) |
403, 404 |
Tool |
Metagenomics-Index-Correction |
This repository contains scripts used to prepare, compare and analyse metagenomic classifications using custom index databases, either based on default NCBI or GTDB taxonomic systems |
405, 406 |
Tool |
drep |
a python program for rapidly comparing large numbers of genomes. dRep can also "de-replicate" a genome set by identifying groups of highly similar genomes and choosing the best representative genome for each genome set |
407 |
Tool |
strainProfiler |
Program to analyze strain-level diversity within a population |
408 |
Tool |
seqtk |
Seqtk is a fast and lightweight tool for processing sequences in the FASTA or FASTQ format |
409, 410 |
Tool |
anvio to bandage tools |
converts output from Anvi'o, a MAG binning tool, to the coloring scheme preferred by Bandage, an assembly visualization tool, to improve binning, especially for mobile genes (transposons, recently horizontally transferred genes, etc.)
413
Tool |
OPERA-MS |
OPERA-MS is a hybrid metagenomic assembler which combines the advantages of short- and long-read sequencing
414, 415 |
Tool |
traitar |
Traitar is a software for characterizing microbial samples from nucleotide or protein sequences. It can accurately phenotype 67 diverse traits. |
418, 419, 420 |
Tool |
PhyloRank |
PhyloRank provides functionality for calculating the relative evolutionary divergence (RED) of taxa in a tree and for finding the best placement of taxonomic labels in a tree.
421 |
Tool |
AnnoTree |
is a web tool for visualization of genome annotations across large phylogenetic trees. |
422, 423, 424 |
Tool |
AMRfinderPlus |
Antibiotic resistance gene finder from NCBI |
425, 426, 678 |
Tool |
nanotext |
This library enables the use of embedding vectors generated from a large corpus of protein domains to search for similar genomes, where similar is the cosine similarity between one genome's vector and another's. Think about protein domains as words, genomes as documents, and search as a form of document retrieval based on the notion of topic. |
427, 428, 453 |
Tool |
biomartr |
Automatically download genomes from NCBI or other databases in R by specifying a species or group name
429 |
Tool |
staramr
Bioconda tool that scans genomes against the PlasmidFinder, ResFinder, and PointFinder databases and then produces nice summary files with the results
430 |
Tool |
TRF |
Tandem Repeat Finder and Tandem Repeats Database (TRDB) |
432, 433 |
Tool |
MIST |
a tool for rapid in silico generation of molecular data from bacterial genome sequences |
434, 435 |
Tool |
mummer |
Visualization of correct alignment between genomes
436, 887, 888, 889 |
Tool |
Dot2dot |
accurate whole-genome tandem repeats discovery |
437, 438 |
Tool |
miComplete
An "easy" to use tool to quickly assess the completeness and quality of new genome assemblies, kind of like CheckM but with some tweaks
439 |
Tool, Database |
ARO |
Antibiotic resistance ontology database and webserver to quickly get phenotype information based on gene IDs
440, 441 |
Webapp |
LINbase |
a database designed for the purpose of accelerating and simplifying the description of Earth's microbial diversity at a precision that includes, but also goes beyond, named species |
447, 448 |
R package |
RbioRXN |
Facilitates retrieving and processing biochemical reaction data from sources such as Rhea, MetaCyc, KEGG and Unipathway. The package provides functions to download and parse data, instantiate generic reactions and check mass balance. It aims to construct an integrated metabolic network and genome-scale metabolic model
450 |
Tool |
Mumame |
Mutation Mapping in Metagenomes is a software tool that allows mapping of shotgun metagenomic reads to point mutations. Designed for Antibiotic Resistance mutations |
451, 452 |
Tool |
Cobra |
Constraint-based reconstruction and analysis (COBRA) provides a molecular mechanistic framework for integrative analysis of experimental molecular systems biology data and quantitative prediction of physicochemically and biochemically feasible phenotypic states |
460, 461, 462, 467 |
Tool |
METABOLIC |
(METabolic And BiogeOchemistry anaLyses In miCrobes), a scalable high-throughput metabolic and biogeochemical functional trait profiler based on microbial genomes |
463, 464 |
Tool |
PhenotypeSeeker |
Identify phenotype-specific k-mers and predict phenotype using sequenced bacterial strains |
465, 466 |
R-package |
MetaboAnalystR |
An R Package for Comprehensive Analysis of Metabolomics Data |
468, 472, 473 |
Shiny-App |
MetaboShiny |
a novel R and RShiny based metabolomics data analysis package |
469, 470, 471 |
Tool |
micom |
micom is a Python package for metabolic modeling of microbial communities |
492, 493, 494 |
Tool |
Struo |
a pipeline for building custom databases for common metagenome profilers |
498, 499 |
Tool |
µbialSim
This is µbialSim (pronounced microbialsim), a dynamic Flux-Balance-Analysis-based simulator for complex microbial communities. Batch and chemostat operation can be simulated |
500, 501 |
Tool |
ConFindr |
to find bacterial intra-species contamination in raw Illumina data. It does this by looking for multiple alleles of core, single copy genes. |
507, 508, 722 |
Tool |
MetaSanity |
a wrapper-script for genome/metagenome evaluation tasks. This script will run common evaluation and annotation programs and create a BioMetaDB project with the integrated results |
509 |
Tool |
REAPR |
From Sanger institute, it maps paired-end reads to de-novo assembly to check for assembly errors and can break up wrong scaffolds |
511 |
Tool |
Kaiju |
Metagenomic read classification based on amino acid sequences. Suggested by Gabi that it works well
512 |
Tool |
mOTU2 |
The mOTUs profiler is a computational tool that estimates relative abundance of known and currently unknown microbial community members using metagenomic shotgun sequencing data. |
513, 514 |
Tool |
fetchMG |
it extracts the 40 MGs from genomes and metagenomes in an easy and accurate manner. |
515 |
Tool |
Metage2Metabo |
is a Python3 (Python >= 3.6) tool to perform graph-based metabolic analysis starting from annotated genomes (reference genomes or metagenome-assembled genomes). It uses Pathway Tools in an automatic and parallel way to reconstruct metabolic networks for a large number of genomes
518, 519 |
R package |
AMR |
simplify the analysis and prediction of Antimicrobial Resistance (AMR) |
520, 521, 878 |
Tool |
GRASE |
Genome Relative Abundance to Sequencing Effort (GRASE) |
522 |
Tool |
FMAP |
Functional Mapping and Analysis Pipeline for metagenomics and metatranscriptomics studies |
524, 525, 526 |
Tool |
ResPipe |
A Nextflow pipeline for interrogating metagenomes for Antimicrobial Resistance Genes (CARD-based), Insertion Sequences and Enterobacteriaceae Plasmids
527, 528 |
Tool |
epa-ng |
A tool to place a sequence among an already calculated tree such as SILVA. Similar to pplacer |
535 |
Tool |
ngs-less |
A toolbox for metagenomics analyses by Peer Bork at EMBL. Has MOCAT integrated with mOTUs and functional profiling
536 |
R package |
Castor |
Useful for calculating relative evolutionary divergence (RED) in a tree with get_reds
537, 538 |
R package |
themetagenomics |
themetagenomics provides functions to explore topics generated from 16S rRNA sequencing information on both the abundance and functional levels. It also provides an R implementation of PICRUSt and wraps Tax4fun, giving users a choice for their functional prediction strategy |
543, 544 |
Tool |
prokka2kegg |
This script is used to assign KO entries (K numbers in KEGG annotation) according to UniProtKB ID in the .gbk file generated by Prokka |
546 |
Toolset |
PAGIT |
A set of tools from the Wellcome Sanger Institute to polish draft genomes and correct annotation
547 |
Tool |
DFAST |
a flexible and customizable pipeline for prokaryotic genome annotation as well as data submission to the INSDC |
552, 553 |
Tool |
DeepVariant |
is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data |
556 |
Tool |
Apollo |
Apollo is an assembly polishing algorithm that attempts to correct the errors in an assembly. It can take multiple sets of reads in a single run and polish the assemblies of genomes of any size
563, 564 |
Tool |
Minipolish |
A tool for Racon polishing of miniasm assemblies |
566 |
Tool |
AMON |
A command line tool for predicting the compounds produced by microbes and the host |
567 |
Tool |
Coinfinder |
A tool for the identification of coincident (associating and dissociating) genes in pangenomes |
568, 569, 570 |
Tool |
wtdbg2 |
Wtdbg2 is a de novo sequence assembler for long noisy reads produced by PacBio or Oxford Nanopore Technologies (ONT) |
571, 572 |
Tool |
freebayes |
a haplotype-based variant detector |
573, 574, 578 |
Tool |
qualimap |
to facilitate the quality control of alignment sequencing data and its derivatives like feature counts; like FastQC for WGS and MAGs |
579, 580 |
Tool |
picard |
A set of Java command line tools from the Broad Institute for manipulating high-throughput sequencing (HTS) data and formats
581, 582 |
Tool |
Diamond |
is a sequence aligner for protein and translated DNA searches, designed for high performance analysis of big sequence data |
583, 584 |
Tool |
vcftools |
a set of tools for working with the variant call format (VCF) and binary variant call format (BCF) |
585, 586, 587 |
Tool |
Gretel |
An algorithm for recovering haplotypes from metagenomes |
589, 590 |
Tool |
Hansel |
Computational haplotype recovery and long-read validation identifies novel isoforms of industrially relevant enzymes from natural microbial communities |
591, 592 |
Tool |
metabolisHMM
a tool for exploration of microbial phylogenies and metabolic pathways |
593, 594 |
Tool |
ConjScan |
MacSyFinder-based detection of Conjugative elements using systems modelling and similarity search |
597 |
Tool |
MacSyFinder
A Program to Mine Genomes for Molecular Systems with an Application to CRISPR-Cas Systems |
598, 599, 600 |
Tool |
LEMON |
Software that makes use of existing shotgun NGS datasets to detect HGT breakpoints, identify the transferred genome segments, and reconstruct the inserted local strain
601, 602 |
Tool |
MMseqs2 |
Many-against-Many sequence searching is a software suite to search and cluster huge protein and nucleotide sequence sets |
604, 605, 606 |
Pipeline |
MicrobiomeBestPracticeReview |
Current Challenges and Best Practice Protocols for Microbiome Analysis using Amplicon and Metagenomic Sequencing |
607, 608 |
Tool |
Medaka |
is a tool to create a consensus sequence using neural networks from nanopore sequencing data |
609, 610 |
Software |
ARB |
a graphically oriented package comprising various tools for sequence database handling and data analysis |
611 |
Tool |
Piphillin |
a software package that predicts functional metagenomic content based on the frequency of detected 16S rRNA gene sequences corresponding to genomes in regularly updated, functionally annotated genome databases |
613, 614 |
Tool |
BlastFrost |
a highly efficient method for querying 100,000s of genome assemblies. BlastFrost builds on the recently developed Bifrost, which generates a dynamic data structure for compacted and colored de Bruijn graphs from bacterial genomes |
617, 618 |
Tool |
BioNode |
Command line tool for handy NGS data procedures, searching NCBI, downloading SRA stuff or handling fasta files. |
622 |
Tool |
Biopieces |
Command line tool for a lot of NGS data procedures, fastq files, mapping, SNPs, etc. but has some dependencies... |
623 |
Tool |
GrabSeqs |
Command line tool to easily download sequence files from SRA, iMicrobe, and MG-RAST
626 |
Tool |
fARGene |
(Fragmented Antibiotic Resistance Gene iENntifiEr) is a tool that takes either fragmented metagenomic data or longer sequences as input and predicts and delivers full-length antibiotic resistance genes as output
627, 628 |
Tool |
GTDBTk-Script |
various useful scripts related to GTDB |
629 |
Tool |
Cello |
the code is parsed to generate a truth table, and logic synthesis produces a circuit diagram with the genetically available gate types to implement the truth table. The gates in the circuit are assigned using experimentally characterized genetic gates. |
633,634,635 |
Tool |
URMAP |
The Ultra-fast Read Mapper (URMAP) is a fast, accurate read mapper with highly compressed output. It is ~10x faster than BWA and Bowtie with comparable accuracy on benchmark tests
636, 637 |
Tool |
Artemis |
The Artemis Software is a set of software tools for genome browsing and annotation |
640 |
Tool |
EDGAR 2.0 |
"Efficient Database framework for comparative Genome Analyses using BLAST score Ratios" is an enhanced software platform for comparative gene content analyses |
641, 642, 643 |
Tool |
ASA3P |
an automatic and scalable assembly, annotation and analysis pipeline for closely related bacterial genomes |
644, 645, 646 |
Tool |
BIGSdb |
a software designed to store and analyse sequence data for bacterial isolates |
647, 648, 649, 650 |
Tool |
OrthoVenn2 |
is a web platform for comparison and annotation of orthologous gene clusters among multiple species |
651, 652 |
Tool |
genomeribbon |
easy to use website to assess a genome assembly with raw reads, long reads and short reads |
653 |
R package |
FindMyFriends |
Fast alignment-free pangenome creation and exploration |
654, 655 |
Pipeline |
dadasnake |
is a Snakemake workflow to process amplicon sequencing data, from raw fastq-files to taxonomically assigned "OTU" tables, based on the DADA2 method |
660, 661 |
Tool |
AMRtime |
Metagenomic AMR detection using hierarchical machine learning models |
662 |
Tool |
panaroo |
An updated pipeline for pangenome investigation |
663, 664 |
Pipeline |
TORMES |
An automated pipeline for whole bacterial genome analysis of genomes and/or raw Illumina paired-end sequencing data, regardless of the number, origin or species |
665, 666 |
Pipeline |
ASA3P |
Automatic Bacterial Isolate Assembly, Annotation and Analyses Pipeline |
667, 668 |
Pipeline |
nullarbor |
Pipeline to generate complete public health microbiology reports from sequenced isolates |
669 |
Pipeline |
Bactopia |
Bactopia is a flexible pipeline for complete analysis of bacterial genomes |
670, 671 |
Pipeline |
Common Workflow Language |
an open standard for describing analysis workflows and tools in a way that makes them portable and scalable across a variety of software and hardware environments, from workstations to cluster, cloud, and high performance computing (HPC) environments |
673 |
Metric |
bacterialEvolutionMetrics |
Consistent Metagenome-Derived Metrics Verify and Delineate Bacterial Species Boundaries |
675, 676 |
Tool |
NGSpeciesID |
is a tool for clustering and consensus forming of targeted ONT reads |
677, 678 |
Catalogue |
long-read-tools |
A catalogue of long read sequencing data analysis tools |
681 |
Tool |
fARGene |
Fragmented Antibiotic Resistance Gene iENntifiEr |
682, 683 |
Pipeline |
PathoFac |
a pipeline for the prediction of virulence factors and antimicrobial resistance genes in metagenomic data |
684, 685 |
Tool |
MFEprimer |
a functional primer quality control program for checking non-specific amplicons, dimers, hairpins and other parameters |
686, 687, 688 |
Pipeline |
STRONG |
STRONG resolves strains on assembly graphs by resolving variants on core COGs using co-occurrence across multiple samples |
689, 690, 691, 704 |
Tool |
NanoClust |
De novo clustering and consensus building for ONT 16S sequencing data |
694 |
Tool |
mVIRs |
a tool that locates integration sites of inducible prophages in bacterial genomes |
697 |
Tool |
Metagenome-Atlas |
an easy-to-use metagenomic pipeline based on Snakemake. It handles all steps from QC, assembly, and binning to annotation |
698, 699, 700, 701 |
Tool |
VIRify |
a recently developed pipeline for the detection, annotation, and taxonomic classification of viral contigs in metagenomic and metatranscriptomic assemblies |
702 |
Platform |
BioContainers |
is a community-driven project that provides the infrastructure and basic guidelines to create, manage and distribute bioinformatics packages (e.g. Conda) and containers (e.g. Docker, Singularity) |
705, 706 |
Tool |
DeepMAsED |
deep learning-based evaluation of the quality of metagenomic assemblies |
708, 709 |
Tool |
minMLST |
a machine learning-based methodology for identifying a minimal subset of genes that preserves high discrimination among bacterial strains |
713, 714 |
Tool |
hAMRonization |
CLI parser tools that combine the outputs of disparate antimicrobial resistance gene detection tools into a single unified format |
715 |
Tool |
PPanGGOLiN |
Depicting microbial species diversity via a Partitioned PanGenome Graph Of Linked Neighbors |
717, 718 |
Webtool |
OGB |
OpenGenomeBrowser is a dynamic and scalable web platform for comparative genomics |
719, 720 |
Pipeline |
Bakta |
a tool for the rapid & standardized annotation of bacterial genomes & plasmids |
721 |
Tool |
MentaLiST |
The MLST pipeline developed by the PathOGiST research group |
725, 726 |
Webapp |
TyphiNET |
The TyphiNET dashboard collates antimicrobial resistance (AMR) and genotype (lineage) information extracted from whole genome sequence (WGS) data from the bacterial pathogen Salmonella Typhi, the agent of typhoid fever. |
727 |
Webapp |
Pathogenwatch |
provides species and taxonomy prediction for over 60,000 variants of bacteria, viruses, and fungi. MLST prediction is available for over 100 species using schemes from PubMLST, Pasteur, and Enterobase |
728 |
Tool |
mlst |
Scan contig files against traditional PubMLST typing schemes |
729 |
Tool |
snippy |
Rapid haploid variant calling and core genome alignment |
733 |
Tool |
MUFFIN |
a hybrid assembly and differential binning workflow for metagenomics, transcriptomics and pathway analysis. |
734, 735, 736 |
Tool |
Pandora |
a tool for bacterial genome analysis using a pangenome reference graph (PanRG) |
738, 739, 740 |
Tool |
cgmlst |
Fork of Torsten Seemann's excellent mlst tool, modified for cgMLST |
741 |
Tool |
Phandango |
a fully interactive tool to allow visualisation of a phylogenetic tree, associated metadata and genomic information such as recombination blocks, pan-genome contents or GWAS results |
741, 742 |
R package |
EnrichedHeatmap |
is a special type of heatmap which visualizes the enrichment of genomic signals on specific target regions |
747, 748 |
R package |
Pagoo |
is an encapsulated, object-oriented class system for analyzing bacterial pangenomes |
752, 753, 754, 834 |
R package |
simurg |
Simulate a Bacterial Pangenome in R |
754, 755 |
Nextflow |
Porefile |
a Nextflow full-length 16S profiling pipeline for ONT reads |
757 |
Tool |
MLSTar |
an R package that allows you to easily determine the Multi Locus Sequence Type (MLST) of your genomes |
758, 759 |
Tool |
MOB-suite |
for clustering, reconstruction and typing of plasmids from draft assemblies |
760, 761 |
Tool |
PlasForest |
a random forest classifier of contigs to identify contigs of plasmid origin in contig and scaffold genomes |
763, 764 |
Tool |
GMGC-mapper |
Command line tool to query the Global Microbial Gene Catalog (GMGC) |
774 |
Tool |
MetaGraph |
Ultra Scalable Framework for DNA Search, Alignment, Assembly of bacterial sequences |
775, 776, 777, 778 |
Tool |
MIND |
Microbial Interaction Network Database |
786 |
Pipeline |
microPIPE |
a pipeline for high-quality bacterial genome construction using ONT and Illumina sequencing |
787 |
Tool |
giraffe |
variation graph data structures, interchange formats, alignment, genotyping, and variant calling methods |
795, 796 |
Tool |
SquiggleKit |
A toolkit for accessing and manipulating nanopore signal data |
798, 799, 800 |
Tool |
FlowerPlot |
A Python 3.9+ function that makes flower plots for pangenomics |
804 |
Tool |
PopPUNK |
A tool for clustering genomes. We refer to subclusters of strains as lineages. |
807, 931, 932 |
Tool |
PATO |
an R package designed to analyze pangenomes (sets of genomes) within or between species |
810, 811 |
Tool |
PanX |
is a software package for comprehensive analysis, interactive visualization and dynamic exploration of bacterial pan-genomes |
812 |
Tool |
3mcor |
Metabolome-Microbiome-Metadata-Correlation Analysis |
814 |
Tool |
GenAPI |
a program for gene presence absence table generation for series of closely related bacterial genomes from annotated GFF files |
829, 830 |
Tool |
bammix |
Summarise nucleotide counts at a set of positions in a BAM file to search for mixtures |
835 |
Tool |
Woltka |
(Web of Life Toolkit App) is a bioinformatics package for shotgun metagenome data analysis |
836, 837 |
Tool |
ECTyper |
is a standalone versatile serotyping module for Escherichia coli |
838, 839 |
Tool |
SerotypeFinder |
is a serotyping module for Escherichia coli |
840, 841 |
Tool |
SRST2 |
Short Read Sequence Typing for Bacterial Pathogens |
842, 843 |
Tool |
KEMET |
a Python tool for KEGG Module evaluation and microbial genome annotation expansion (Metabolic) |
844, 845 |
Tool |
SIAMCAT |
Statistical Inference of Associations between Microbial Communities And host phenoTypes |
846, 847 |
Collection |
EMBL |
Microbiome Analysis Tools Developed at EMBL |
848 |
Tool |
BacDist |
Snakemake pipeline for bacterial SNP distance, recombination and phylogenetic analysis |
849 |
Tool |
PacTyper |
Snakemake pipeline for continuous clone type prediction for WGS sequenced bacterial isolates based on their core genome |
850 |
Pipeline |
CulebrONT |
a streamlined long reads multi-assembler pipeline for prokaryotic and eukaryotic genomes |
857, 858 |
Tool |
gapseq |
Informed prediction and analysis of bacterial metabolic pathways and genome-scale networks |
859, 860 |
Tool |
MicrobiomeAnalysis |
This package provides common methods for microbiome analysis |
863, also see 852 |
Tool |
MiMiC |
proposes minimal microbial consortia from the functional potential of a given metagenomic sample |
864, 865 |
Tool |
PIRATE |
identifies and classifies orthologous gene families in bacterial pangenomes over a wide range of sequence similarity thresholds |
867, 868 |
Tool |
bacterial_strain_definition |
Contains the code and workflow for the bacterial strain definition paper with Kostas Konstantinidis |
869, 870 |
Tool |
CheckM2 |
Rapid assessment of genome bin quality using machine learning |
876 |
Tool |
Gubbins |
Genealogies Unbiased By recomBinations In Nucleotide Sequences |
879, 880 |
Tool |
SKA |
a toolkit for prokaryotic DNA sequence analysis (phylogeny) using split kmers |
881, 882 |
Tool |
Mashtree |
a rapid comparison of whole genome sequence files |
883, 884 |
Pipeline |
mGEMS |
Bacterial sequencing data binning on strain-level based on probabilistic taxonomic classification |
885, 886 |
Tool |
D-GENIES |
Dot plot large Genomes in an Interactive, Efficient and Simple way |
893, 894, 895 |
Tool |
nanotimeparse |
parses an Oxford Nanopore fastq file on read sequencing start times found in the fastq headers |
897 |
Tool |
ClonalFrameML |
package that performs efficient inference of recombination in bacterial genomes |
899, 900 |
Tool |
minidot |
Quickly produce pretty dotplots from minimap mappings using R/ggplot2 |
903 |
Pipeline |
microPIPE |
a pipeline for high-quality bacterial genome construction using ONT and Illumina sequencing |
911 |
Webapp |
Center for Genomic Epidemiology (CGE) |
provides access to various bioinformatics resources in clinical epidemiology |
917 |
Tool |
ggCaller |
a bacterial gene caller for pangenome graphs |
918, 919 |
Tool |
LEMMI |
A Live Evaluation of Computational Methods for Metagenome Investigation: an online resource and pipeline dedicated to continuous benchmarking of newly published metagenomics taxonomic classifiers |
920 |
Tool |
LEMORTHO |
is an online resource and a pipeline dedicated to continuous benchmarking of newly published methods for orthology delineation |
921 |
Tool |
KMCP |
accurate metagenomic profiling of both prokaryotic and viral populations by pseudo-mapping |
922, 923 |
Tool |
ClermonTyping |
an easy-to-use and accurate in silico method for Escherichia genus strain phylotyping |
924, 925 |
Tool |
Circlator |
A tool to circularize genome assemblies |
926, 927, 928 |
Tool |
ska2 |
A toolkit for prokaryotic DNA sequence analysis (phylogeny) using split kmers |
929, 930 |
Pipeline |
Is an amplicon sequencing pipeline for 16S |
933, 934, 935 |
|
Tool |
Pyseer |
A comprehensive tool for microbial pangenome-wide association studies |
941, 942 |
Tool |
TBProfiler |
Can rapidly and accurately predict anti-TB drug resistance profiles across large numbers of samples with WGS data |
943, 944, 945 |
Tool |
Minion_QC |
Fast and effective quality control for MinION and PromethION sequencing data |
946, 947 |
Tool |
GUNC |
package for detection of chimerism and contamination in prokaryotic genomes |
950, 951, 952 |