yesdavid / phylogenetics-class

A course in the theory and practice of phylogenetic inference from DNA sequence data

Botany 563 Phylogenetic Analysis of Molecular Data (UW-Madison)

A course in the theory and practice of phylogenetic inference from DNA sequence data. Students will learn all the necessary components of state-of-the-art phylogenomic analyses and apply that knowledge to the analysis of data from their own study organisms.

Learning outcomes

By the end of the course, you will be able to:

  1. Explain in detail all the steps in the pipeline for phylogenetic inference and how different data and model choices affect the inference outcomes
  2. Plan and produce reproducible scripts for the analysis of your own biological data
  3. Justify the data and model choices in your own data analysis
  4. Interpret the results of the most widely used phylogenetic methods in biological terms
  5. Orally present the results of your own phylogenomic data analyses following the best scientific and reproducibility practices

Textbooks

  • Phylogenetics in the Genomic Era (open access book) by Celine Scornavacca, Frederic Delsuc and Nicolas Galtier (denoted HAL in the schedule)
  • Tree thinking: an introduction to phylogenetic biology by David Baum and Stacey Smith (optional: denoted Baum in the schedule)
  • The Phylogenetic Handbook by Philippe Lemey, Marco Salemi and Anne-Mieke Vandamme (optional: denoted HB in the schedule)

Schedule

| Session | Topic | Reading before class | At the end of the session | Notes |
|---|---|---|---|---|
| 01/26 | Introduction | Syllabus | You will know the structure of the class, the learning outcomes and the grading | lecture1.md |
| 01/28 | Motivation: why learn phylogenomics? | HAL 2.1 | You will identify the different components in phylogenomic analyses | lecture2.md |
| 02/02 | Reproducibility crash course | Notes on mindful programming | You will prioritize reproducibility and good computing practices throughout the semester (and beyond) | lecture3.md |
| 02/04 | Continue with reproducibility | Have git installed | | |
| 02/09 | Introduction to sequences | Watch video1, video2, and read Zhang et al, 2019 | You will be able to describe the next-generation sequencing pipeline (and UCE pipeline) as well as the post-processing bioinformatics steps for quality control | lecture4.md |
| 02/11 | Alignment | HAL 2.2 | You will be able to explain the most widely used algorithms for multiple sequence alignment | lecture5.md |
| 02/16 | Continue with alignment | | | |
| 02/18 | Continue with alignment | | | |
| 02/23 | Orthology detection | HAL 2.4 and the OrthoFinder paper | You will know about the different orthology inference methods and will be able to explain the OrthoFinder algorithm | lecture6.md |
| 02/25 | Overview of phylogenetic inference | | You will be able to explain the overall methodology of phylogenetic inference as well as the main weaknesses | lecture7.pdf |
| 03/02 | Distance and parsimony methods | Install R and optional readings: HB Ch 5-6, Baum Ch 7-8 | You will be able to explain both algorithms to reconstruct trees: 1) based on distances and 2) based on parsimony | lecture8.md |
| 03/04 | Continue with distance and parsimony methods | | | |
| 03/09 | Models of evolution | HAL 1.1 | You will be able to explain the main characteristics and assumptions of the substitution models | lecture9.pdf |
| 03/11 | Continue with models of evolution | | | |
| 03/16 | Maximum likelihood | HAL 1.2 and optional: install RAxML-NG (HAL 1.3) or IQ-Tree | You will be able to explain the main steps in maximum likelihood inference and the strengths/weaknesses of the approach | lecture10.pdf |
| 03/18 | Comparison of distances, parsimony and likelihood | Investigate the pros/cons of the method of your team | You will be able to assess the strengths and weaknesses of distances, parsimony and likelihood methods in phylogenetic inference | lecture11.md and discussion slides |
| 03/23 | Bayesian inference | HAL 1.4 and Nascimento et al, 2017 | You will be able to explain the main components of Bayesian inference and their effect on the inference performance | lecture12.pdf |
| 03/25 | Model selection: Guest lecture by Rob Lanfear | | | |
| 03/30 | Continue with Bayesian inference | | | |
| 04/01 | The coalescent | HAL 3.1, 3.3 | You will be able to explain the coalescent model for species trees and networks | lecture14.pdf |
| 04/06 | Continue with the coalescent (and 1-min presentation of students' project data/goal) | Create a slide describing your data here | | |
| 04/08 | Continue with the coalescent | | | |
| 04/09 | Deadline: Draft of final report | | | |
| 04/13 | Co-estimation methods | Optional reading: HB 18 | You will be able to explain the main components of co-estimation methods and follow the BEAST tutorial | lecture15.md |
| 04/15 | Discussion: Measures of support | One per group: 1) Stenz2015, 2) Lemoine2018, 3) Anisimova2006, 4) Sayyari2016 | You will be able to compare and contrast the different ways in which we can measure confidence in our phylogenetic estimates | Slides |
| 04/16 | Deadline: Peer evaluation of another student's report | | | |
| 04/20 | Discussion: Coalescent vs concatenation | All: HAL 3.4. One per group: 1) Springer2018, 2) Mendes2018, 3) Philippe2017, 4) Springer2016, 5) Edwards2016 | You will be able to justify the choice of concatenation vs coalescent in specific scenarios | Slides |
| 04/22 | Project check-up (extra OH before/after class for one-on-one meetings; check calendly link in slack) | | | |
| 04/27 | Discussion: Phylogenomics pitfalls | One per group: 1) Bravo2019, 2) Shen2017, 3) Young2020, 4) Steel2005 | You will be able to describe and analyze some of the main pitfalls of phylogenomic analysis of big data | Slides |
| 04/29 | What else is out there? | | You will hear a brief overview of topics not covered in this class and will have access to resources to learn more | |
| 04/30 | Deadline: Final report with reproducible script | | | |
| 05/04 | Project presentations | | | |
| 05/06 | Project presentations | | | |

More details

See the list of topics, grading and academic policies in the syllabus.
