damioresegun / Pknowlesi_denovo_genome_assembly

An assembly pipeline that is designed to take raw input nanopore reads to assemblies ready for genome annotation


README

Overview

This wiki is designed to work as a manual that takes you through the major steps of the pipeline, which takes input data from raw nanopore reads to assemblies ready for manual or automated annotation. It is important to note that while this pipeline automates a lot of commands, it cannot automate everything. User input is needed at various points to confirm which branch of the pipeline to follow, as well as when to continue at certain 'break-points'. I'm hoping that this has been prepared in a fairly straightforward and approachable manner. However, if you want to tweak the script, the vast majority is written in Bash, with some secondary scripts in Python. Please be aware that the first portion of this script is tuned for barcoded ONT sequencing reads; however, it is possible to use non-barcoded ONT sequencing reads.

Update on Pipeline

This repository mainly focuses on the generation of the de novo assemblies and their subsequent improvement and analysis. Further analyses of duplication and substitution ratios can be found at https://github.com/damioresegun/DNDS_analyses (forked from Peter Thorpe). Additionally, analysis outputs can be found in the associated Zenodo repository.

Pipeline

[Image: pipeline diagram]

Table of Contents

Data Access

The completed genomes generated by this pipeline were uploaded to Zenodo and NCBI.

How to Cite

Acknowledgements

  • Dr Peter Thorpe helped with various parts of the pipeline development, in particular the repeatmasking process, and provided scripts to start from and adapt
  • Amir Szitenberg, for his repeatmasking cookbook, which is the backbone of the in-depth repeatmasking process utilised in this pipeline
  • Dr Janet Cox-Singh provided guidance on the biological requirements of this pipeline

Contact

About


License: GNU General Public License v3.0


Languages

Shell 64.2%, Python 29.1%, HTML 6.7%