A public repository of Monkeypox (MPXV) related resources maintained by ITER.
This is the result of a continuous collaborative effort of the following Institutions and Laboratories:
- Servicio de Microbiología, Hospital Universitario Ntra. Sra. de Candelaria, 38010 Santa Cruz de Tenerife, Spain.
- Fundación Canaria Instituto de Investigación Sanitaria de Canarias at the Research Unit, Hospital Universitario Ntra. Sra. de Candelaria, 38010 Santa Cruz de Tenerife, Spain.
- Laboratorio de Inmunología Celular y Viral, Unidad de Farmacología, Facultad de Medicina, Universidad de La Laguna, 38200 San Cristóbal de La Laguna, Spain.
- Genomics Division, Instituto Tecnológico y de Energías Renovables, 38600 Santa Cruz de Tenerife, Spain.
- Virological post: A draft of the first genome sequence of MPXV virus associated with the multi-country outbreak in May 2022 from the Canary Islands, Spain
- Bioinformatic pipelines
- Code for Illumina short-reads processing
- Code for Nanopore long-reads processing and hybrid de novo assembly
- List of bioinformatic software used in our pipelines
- Useful files for the pipelines
- Sequences
- How to download sequences and metadata from GenBank
- Other useful repositories with resources to study MPXV
- References
- Acknowledgements
- License and Attribution
- Participating
- Update logs
A technical post with the draft of the first MPXV genome sequence from the Canary Islands, Spain, associated with the multi-country outbreak in May 2022, has been shared on Virological. Keep reading here.
The first MPXV genome sequence that we described in Virological is phylogenetically related to the multiple viral genomes deposited in NCBI GenBank that correspond to the current 2022 worldwide outbreak, as shown in Figure 1.
Figure 1. A phylogenetic tree depicting the draft MPXV sequence isolated on May 31, 2022 from a patient from the Canary Islands along with NCBI GenBank publicly available sequences computed by a Nextstrain-monkeypox local instance.
The following diagram (Figure 2) represents the full pipeline used to derive the consensus FASTA sequence of MPXV by combining short- and long-read technologies (Illumina and Nanopore, respectively).
The upper part of the diagram shows a typical pipeline to process short reads, from basecalling to the final consensus FASTA sequence, followed by downstream analyses such as phylogenetic inference.
The lower part of the diagram shows a typical pipeline to process long reads. It also shows how to perform a hybrid de novo assembly combining short and long reads.
Two consensus MPXV sequences have been obtained and deposited in NCBI GenBank following the described pipeline:
- A FASTA sequence derived from the pipeline based on mapping of Illumina short-reads against a MPXV reference genome.
- A FASTA sequence resulting from the consensus of the hybrid *de novo* assembly and a MPXV reference genome to complete uncovered regions.
Figure 2. Full bioinformatic pipeline to obtain the MPXV sequences and to infer phylogenetic relationships with other MPXV viral genomes available obtained from public repositories.
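The second consensus described above completes regions left uncovered by the hybrid de novo assembly with bases from a reference genome. The idea can be illustrated with a minimal Python sketch. It is not the code used in our pipeline: the function name `fill_uncovered` and the toy sequences are hypothetical, and it assumes that the assembly and the reference have already been aligned to the same length.

```python
def fill_uncovered(assembly_aln: str, reference_aln: str, missing: str = "Nn-") -> str:
    """Return the assembly with missing positions replaced by reference bases.

    Assumption: both strings come from the same pairwise alignment, so
    position i in one corresponds to position i in the other.
    Simplification: a '-' in the assembly is treated as uncovered, although
    in real data it could also be a true deletion.
    """
    if len(assembly_aln) != len(reference_aln):
        raise ValueError("sequences must be aligned to the same length")
    return "".join(
        ref if asm in missing else asm
        for asm, ref in zip(assembly_aln, reference_aln)
    )


# Toy example (made-up sequences, not MPXV data).
assembly = "ACGTNNNNACCT--GT"
reference = "ACGTTTGAACGTACGT"
print(fill_uncovered(assembly, reference))  # ACGTTTGAACCTACGT
```

Bases called in the assembly are kept even when they differ from the reference (position 11 in the toy example), so only uncovered positions are borrowed.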
Code for Illumina short-reads processing
See a detailed pipeline with examples of command usage for Illumina short-reads.
Code for Nanopore long-reads processing and hybrid de novo assembly
See a detailed pipeline with examples of command usage for Oxford Nanopore Technology long-reads.
List of bioinformatic software used in our pipelines
- Conda manual for installation of numerous open-source tools used in these pipelines: Conda documentation
- Reformat FASTQ files to get an interleaved FASTQ file: BBMap tools v.38.96
- Remove human-mapping reads from your FASTQ files: NCBI SRA Human Scrubber v.1.0.2021_05_05
- Remove human-mapping reads from your FASTQ files: Kraken2 v.2.1.2
- Programming environment of general purpose: R v.4.1.3
- Compute the depth of coverage and other statistics: Mosdepth v.0.3.3
- Compute the number of duplicates and other statistics: Picard Tools v.2.18.7
- Perform the variant calling and consensus: iVar v.1.3.1
- Perform the variant calling: LoFreq v.2.1.5
- Get mapping statistics, manipulate BAM files, and generate mpileups for FASTA consensus: SAMtools v.1.6
- Multiple Sequence Alignment: MAFFT v.7.505
- Phylogenomic inference and tree computing: IQ-TREE v.2.2.0.3
- Mapping of short-reads: Minimap2 v.2.24-r1122
- Mapping of short-reads: Bowtie2 v.2.4.5
- Mapping of short-reads: BWA v.0.7.17-r1188
- Framework for analyses and visualization of pathogen genome data (Nextstrain-monkeypox in this case): Nextstrain
- Assembly: Unicycler v.0.5.0
- Benchmarking and quality control of assemblies: QUAST v.5.0.2
- Visualization of assemblies: Bandage v.0.9.0
- Visualization of Kraken 2 reports: Pavian v.1.0
- Annotation of genomes: SnpEff v.5.1d
- Visualization of phylogenetic trees: Figtree
- Visualization of phylogenetic trees: ggtree 3.15
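As an illustration of the kind of coverage statistics mentioned in the list above, here is a minimal Python sketch that summarizes a per-position depth table. It assumes the three-column, tab-separated layout (reference name, position, depth) written by `samtools depth`. The function name `depth_summary`, the threshold of 10 reads, and the toy input are hypothetical, and the sketch is not a replacement for Mosdepth.

```python
import io
from typing import Iterable, Tuple


def depth_summary(lines: Iterable[str], genome_length: int,
                  min_depth: int = 10) -> Tuple[float, float]:
    """Return (mean depth, fraction of positions with depth >= min_depth).

    Positions absent from the table are counted as zero depth, which is why
    the genome length is passed explicitly instead of counting lines.
    """
    total = 0
    covered = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        _name, _pos, depth = line.split("\t")[:3]
        d = int(depth)
        total += d
        if d >= min_depth:
            covered += 1
    return total / genome_length, covered / genome_length


# Toy table for a 5-base reference; position 5 has no reads and is omitted.
toy = io.StringIO("ref\t1\t12\nref\t2\t15\nref\t3\t8\nref\t4\t20\n")
mean_depth, breadth = depth_summary(toy, genome_length=5)
print(mean_depth, breadth)  # 11.0 0.6
```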
Useful files for the pipelines
- FASTA file ('multiMPXV01.fasta.zip') with multiple sequences of MPXV from NCBI GenBank to be used in the Multiple Sequence Alignment step with MAFFT or Nextstrain-monkeypox, available here (last update: June 16, 2022)
- Metadata file (TSV format) to use with a Nextstrain-monkeypox local instance, available here (last update: June 23, 2022)
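Before running a local Nextstrain-monkeypox instance with the two files above, it can help to check that every sequence in the FASTA file has a matching metadata row. The following Python sketch shows one way to do so. It assumes that the metadata TSV identifies sequences in a column named `strain` and that the FASTA identifier is the first word of each header; both are assumptions to adapt to your files, and the helper names and toy data are hypothetical.

```python
import csv
import io


def fasta_ids(handle):
    """Collect the first word of every FASTA header line."""
    ids = []
    for line in handle:
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                ids.append(parts[0])
    return ids


def metadata_ids(handle, key="strain"):
    """Collect the identifier column from a tab-separated metadata file."""
    return [row[key] for row in csv.DictReader(handle, delimiter="\t")]


def missing_metadata(fasta_handle, tsv_handle, key="strain"):
    """Return FASTA identifiers that have no row in the metadata file."""
    known = set(metadata_ids(tsv_handle, key))
    return [sid for sid in fasta_ids(fasta_handle) if sid not in known]


# Toy data (made-up identifiers and values).
fasta = io.StringIO(">seqA\nACGT\n>seqB\nACGT\n>seqC some description\nACGT\n")
tsv = io.StringIO("strain\tdate\nseqA\t2022-01-01\nseqB\t2022-01-02\n")
print(missing_metadata(fasta, tsv))  # ['seqC']
```

With real files, open them with `open(path, newline="")` and pass the handles in place of the `io.StringIO` objects.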
Consensus FASTA file obtained from a hybrid de novo Illumina-Nanopore based assembly and MT903344.1. See Virological post for more details
NCBI GenBank Accession: ON782054. Sequence available as MPXV/Spain/HUNSC_ITER_0001a/2022
Consensus FASTA file obtained from Illumina short-reads mapping to MT903344.1. See Virological post for more details
NCBI GenBank Accession: ON782055. Sequence available as MPXV/Spain/HUNSC_ITER_0001b/2022
Manual download
- Browse to GenBank.
- Select 'Nucleotide' from the combo box.
- Fill in the accession code of the sequence you want to download (e.g. ON782054) or just write the name of the species (e.g. Monkeypox, and then click on the accession code you are interested in).
- Click on the 'FASTA' link.
- Click on 'Send to' on the upper right part of the screen.
- Select the option 'file'.
- Select 'FASTA' as download format.
- Click on the 'Generate' button.
Programmatic download
We provide a complete Python script that retrieves all sequences larger than 190,000 bases from GenBank as an example. See the code.
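For readers who only need the general idea, the sketch below builds the two NCBI E-utilities requests (ESearch, then EFetch) that such a download involves, using only the Python standard library. It is an independent, simplified illustration and not the script linked above. The search term, the upper length bound, the `retmax` value, and all function names are assumptions chosen for the example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_term(organism="Monkeypox virus", min_len=190001, max_len=1000000):
    """Nucleotide query: organism plus a sequence-length range ([SLEN]).

    min_len=190001 means "larger than 190,000 bases"; max_len is an
    arbitrary upper bound, needed only because [SLEN] takes a range.
    """
    return f'"{organism}"[Organism] AND {min_len}:{max_len}[SLEN]'


def esearch_url(term, retmax=500):
    """URL that returns the matching record identifiers as JSON."""
    params = {"db": "nuccore", "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS}/esearch.fcgi?{urlencode(params)}"


def efetch_url(ids):
    """URL that returns the given records in FASTA format."""
    params = {"db": "nuccore", "id": ",".join(ids),
              "rettype": "fasta", "retmode": "text"}
    return f"{EUTILS}/efetch.fcgi?{urlencode(params)}"


def download_fasta(term, retmax=500):
    """Run the search and fetch the hits (requires network access)."""
    with urlopen(esearch_url(term, retmax)) as resp:
        ids = json.load(resp)["esearchresult"]["idlist"]
    if not ids:
        return ""
    with urlopen(efetch_url(ids)) as resp:
        return resp.read().decode()


# Building the query needs no network access:
print(build_term())
# To actually download: fasta_text = download_fasta(build_term())
```

For large result sets, consider fetching identifiers in batches and following the NCBI usage guidelines on request rate.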
Kudos to all research teams behind the scenes in all these repositories:
- Virological.org posts on MPXV
- CADDE-CENTRE GitHub repository for MPXV
- Mike Honey GitHub repository for MPXV
- Mpox-Spectrum from the Computational Evolution group of ETH Zürich in Switzerland
- Global.health, geospatial data visualisations to explore MPXV GitHub repository and visualization
- Our World in Data for MPXV
- Nextstrain build for MPXV in GitHub and visualization
- FIND offers a searchable directory of monkeypox tests
Published papers
Berthet N, Descorps-Declère S, Besombes C, et al. Genomic history of human monkey pox infections in the Central African Republic between 2001 and 2018. Sci Rep. 2021;11(1):13085. Published 2021 Jun 22. doi: https://doi.org/10.1038/s41598-021-92315-8
Cervantes-Gracia K, Gramalla-Schmitz A, Weischedel J, Chahwan R. APOBECs orchestrate genomic and epigenomic editing across health and disease. Trends Genet. 2021;37(11):1028-1043. doi: https://doi.org/10.1016/j.tig.2021.07.003
Cohen J. Global outbreak puts spotlight on neglected virus. Science. 2022;376(6597):1032-1033. doi: https://doi.org/10.1126/science.add2701
Cohen-Gihon I, Israeli O, Shifman O, et al. Identification and Whole-Genome Sequencing of a Monkeypox Virus Strain Isolated in Israel. Microbiol Resour Announc. 2020;9(10):e01524-19. Published 2020 Mar 5. doi: https://doi.org/10.1128/MRA.01524-19
Erez N, Achdout H, Milrot E, et al. Diagnosis of Imported Monkeypox, Israel, 2018. Emerg Infect Dis. 2019;25(5):980-983. doi: https://doi.org/10.3201/eid2505.190076
Faye O, Pratt CB, Faye M, et al. Genomic characterisation of human monkeypox virus in Nigeria [published correction appears in Lancet Infect Dis. 2018 Mar;18(3):244]. Lancet Infect Dis. 2018;18(3):246. doi: https://doi.org/10.1016/S1473-3099(18)30043-4
Iizuka I, Saijo M, Shiota T, et al. Loop-mediated isothermal amplification-based diagnostic assay for monkeypox virus infections. J Med Virol. 2009;81(6):1102-1108. doi: https://doi.org/10.1002/jmv.21494
Kraemer MUG, Tegally H, Pigott DM, et al. Tracking the 2022 monkeypox outbreak with epidemiological data in real-time [published online ahead of print, 2022 Jun 8]. Lancet Infect Dis. 2022;S1473-3099(22)00359-0. doi: https://doi.org/10.1016/S1473-3099(22)00359-0
Kugelman JR, Johnston SC, Mulembakani PM, et al. Genomic variability of monkeypox virus among humans, Democratic Republic of the Congo. Emerg Infect Dis. 2014;20(2):232-239. doi: https://doi.org/10.3201/eid2002.130118
Kulesh DA, Loveless BM, Norwood D, et al. Monkeypox virus detection in rodents using real-time 3'-minor groove binder TaqMan assays on the Roche LightCycler. Lab Invest. 2004;84(9):1200-1208. doi: https://doi.org/10.1038/labinvest.3700143
Li D, Wilkins K, McCollum AM, et al. Evaluation of the GeneXpert for Human Monkeypox Diagnosis. Am J Trop Med Hyg. 2017;96(2):405-410. doi: https://doi.org/10.4269/ajtmh.16-0567
Li Y, Olson VA, Laue T, Laker MT, Damon IK. Detection of monkeypox virus with real-time PCR assays. J Clin Virol. 2006;36(3):194-203. doi: https://doi.org/10.1016/j.jcv.2006.03.012
Li Y, Zhao H, Wilkins K, Hughes C, Damon IK. Real-time PCR assays for the specific detection of monkeypox virus West African and Congo Basin strain DNA. J Virol Methods. 2010;169(1):223-227. doi: https://doi.org/10.1016/j.jviromet.2010.07.012
Luciani L, Inchauste L, Ferraris O, et al. A novel and sensitive real-time PCR system for universal detection of poxviruses [published correction appears in Sci Rep. 2022 Apr 8;12(1):5961]. Sci Rep. 2021;11(1):1798. Published 2021 Jan 19. doi: https://doi.org/10.1038/s41598-021-81376-4
Maksyutov RA, Gavrilova EV, Shchelkunov SN. Species-specific differentiation of variola, monkeypox, and varicella-zoster viruses by multiplex real-time PCR assay. J Virol Methods. 2016;236:215-220. doi: https://doi.org/10.1016/j.jviromet.2016.07.024
Mucker EM, Hartmann C, Hering D, et al. Validation of a pan-orthopox real-time PCR assay for the detection and quantification of viral genomes from nonhuman primate blood. Virol J. 2017;14(1):210. Published 2017 Nov 3. doi: https://doi.org/10.1186/s12985-017-0880-8
Patrono LV, Pléh K, Samuni L, et al. Monkeypox virus emergence in wild chimpanzees reveals distinct clinical outcomes and viral diversity. Nat Microbiol. 2020;5(7):955-965. doi: https://doi.org/10.1038/s41564-020-0706-0
Pecori R, Di Giorgio S, Paulo Lorenzo J, Nina Papavasiliou F. Functions and consequences of AID/APOBEC-mediated DNA and RNA deamination [published online ahead of print, 2022 Mar 7]. Nat Rev Genet. 2022;1-14. doi: https://doi.org/10.1038/s41576-022-00459-8
Shchelkunov SN, Shcherbakov DN, Maksyutov RA, Gavrilova EV. Species-specific identification of variola, monkeypox, cowpox, and vaccinia viruses by multiplex real-time PCR assay. J Virol Methods. 2011;175(2):163-169. doi: https://doi.org/10.1016/j.jviromet.2011.05.002
Tumewu J, Wardiana M, Ervianty E, et al. An adult patient with suspected of monkeypox infection differential diagnosed to chickenpox. Infect Dis Rep. 2020;12(Suppl 1):8724. Published 2020 Jul 6. doi: https://doi.org/10.4081/idr.2020.8724
Yong SEF, Ng OT, Ho ZJM, et al. Imported Monkeypox, Singapore. Emerg Infect Dis. 2020;26(8):1826-1830. doi: https://doi.org/10.3201/eid2608.191387
Zhao K, Wohlhueter RM, Li Y. Finishing monkeypox genomes from short reads: assembly analysis and a neural network method. BMC Genomics. 2016;17 Suppl 5(Suppl 5):497. Published 2016 Aug 31. doi: https://doi.org/10.1186/s12864-016-2826-8
Preprint papers
Enhanced surveillance of monkeypox in Bas-Uélé, Democratic Republic of Congo: the limitations of symptom-based case definitions. Gaspard Mande, Innocent Akonda, Anja De Weggheleire, Isabel Brosius, Laurens Liesenborghs, Emmanuel Bottieau, Noam Ross, Guy-Crispin Gembu, Robert Colebunders, Erik Verheyen, Ngonda Dauly, Herwig Leirs, Anne Laudisoit. medRxiv 2022.06.03.22275815; doi: https://doi.org/10.1101/2022.06.03.22275815
This study has been funded by Cabildo Insular de Tenerife (CGIEU0000219140 and "Apuestas científicas del ITER para colaborar en la lucha contra la COVID-19"), Instituto de Salud Carlos III (FI18/00230) cofunded by European Union (ERDF) "A way of making Europe", and by the agreement with Instituto Tecnológico y de Energías Renovables (ITER) to strengthen scientific and technological education, training, research, development and innovation in Genomics, Personalized Medicine and Biotechnology (OA17/008).
We acknowledge in Table 1 (EXCEL file) the researchers and their institutions who released the MPXV sequences through NCBI GenBank that are being used in our studies.
We also thank the authors, the laboratories that originated and submitted the genetic sequences and the metadata for sharing their work, as shown on Nextstrain, and:
- Hadfield et al, Nextstrain: real-time tracking of pathogen evolution, Bioinformatics (2018).
- Sagulenko et al, TreeTime: Maximum-likelihood phylodynamic analysis, Virus Evolution (2017).
We would like to acknowledge the contributions of several researchers and laboratories who share their preliminary results through the Virological website.
This repository and data exports are released under the CC BY 4.0 license. Please acknowledge the authors, the originating and submitting laboratories for the genetic sequences and metadata, and the open source software used in this work (third-party copyrights and licenses may apply).
Please cite as: "Monkeypox repository of the Reference Laboratory for Epidemiological Surveillance of Pathogens in the Canary Islands (accessed on YYYY-MM-DD)".
Want to share your relevant links? Send a direct message to @labcflores, @adrmunozb or @resocios on Twitter (see below).
By AMB @adrmunozb and JMLS @resocios
Follow us on Twitter @labcflores
June 13, 2022. Created the public version of this repository. Enjoy the reading! ;=)
June 16, 2022. The MultiSample FASTA file now holds 137 public MPXV sequences.
June 21, 2022. Added a section with other useful external repositories for MPXV.
June 23, 2022. Added a metadata file to use with a Nextstrain-monkeypox local instance in the useful-files section; bioinformatic codes completed.
June 28, 2022. Added the code to illustrate How-to-download seqs and metadata from GenBank.