Artur-man / wastewater_virome

Pipelines and Scripts for Wyler, E. et al. 2022

Wyler, E., Lauber, C., Manukyan, A., Deter, A., Quedenau, C., Teixeira Alves, L. G., ... & Landthaler, M. (2022). Comprehensive profiling of wastewater viromes by genomic sequencing. bioRxiv, 2022-12.

DOI: https://doi.org/10.1101/2022.12.16.520800

This repository contains two preprocessing pipelines for RNA and DNA samples generated from wastewater:

  • Kaiju Pipeline for taxonomy classification:

    • Duplicate removal of reads with CD-HIT.
    • Taxonomy classification and annotation with Kaiju.
    • Custom R scripts for summarizing annotated reads per sample.
  • CCTyper Pipeline for predicting Cas proteins.
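
The per-sample summarization step of the Kaiju pipeline can be illustrated with a minimal sketch. The repository's own summarization scripts are written in R; the Python below is not taken from them. It assumes Kaiju's standard tab-separated output, where the first column is `C` (classified) or `U` (unclassified), the second is the read name, and the third is the NCBI taxon ID. The function name `count_taxa` and the example reads are hypothetical.

```python
from collections import Counter
import io


def count_taxa(kaiju_lines):
    """Count classified reads per NCBI taxon ID from Kaiju output lines."""
    counts = Counter()
    for line in kaiju_lines:
        fields = line.rstrip("\n").split("\t")
        # Skip malformed lines and reads Kaiju left unclassified ("U").
        if len(fields) < 3 or fields[0] != "C":
            continue
        counts[fields[2]] += 1
    return counts


# Hypothetical Kaiju output for a single sample (illustrative data only).
example = io.StringIO(
    "C\tread1\t10239\n"
    "U\tread2\t0\n"
    "C\tread3\t10239\n"
    "C\tread4\t2\n"
)
sample_counts = count_taxa(example)
```

Running `count_taxa` once per sample and collecting the resulting counters gives a taxon-by-sample count table that downstream steps could work from.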

In addition to the preprocessing pipelines, we provide custom scripts for the downstream analysis of both the taxonomy classification and the Cas protein detection results:

  • Metagenomic Analysis on taxonomically classified reads:

    • Retrieving taxonomy ranks and lineages with taxizedb.
    • Processing and PCA of taxonomy counts.
    • Visualizing heatmaps with ComplexHeatmap.
  • Downstream Analysis on reads associated with CRISPR-Cas genes:

    • Clustering open reading frames (ORFs) with CD-HIT.
    • Aligning ORFs to the NCBI NR database with rBLAST.
    • Protein embeddings of ORFs using ProtTrans.
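
As an illustration of working with the ORF clustering step, the sketch below reads a CD-HIT `.clstr` file into cluster members and representatives. It is not the repository's code. It assumes the usual `.clstr` layout: a `>Cluster N` header followed by one line per sequence, with the representative marked by a trailing `*`. The function name `parse_clstr` and the ORF names are hypothetical.

```python
import re


def parse_clstr(lines):
    """Parse CD-HIT .clstr lines into {cluster: {representative, members}}."""
    clusters = {}
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">Cluster"):
            current = line[1:]  # e.g. "Cluster 0"
            clusters[current] = {"representative": None, "members": []}
            continue
        # Member lines look like: "0\t120aa, >orf_1... *"
        match = re.search(r">(.+?)\.\.\.", line)
        if match is None or current is None:
            continue
        name = match.group(1)
        clusters[current]["members"].append(name)
        if line.endswith("*"):
            # CD-HIT marks the cluster representative with "*".
            clusters[current]["representative"] = name
    return clusters


# Hypothetical .clstr content (illustrative ORF names only).
example = [
    ">Cluster 0",
    "0\t120aa, >orf_1... *",
    "1\t118aa, >orf_2... at 98.31%",
    ">Cluster 1",
    "0\t95aa, >orf_3... *",
]
clusters = parse_clstr(example)
```

Keeping only the representative of each cluster is one way to reduce the number of ORFs passed on to alignment or embedding.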
