A course in the theory and practice of phylogenetic inference from DNA sequence data. Students will learn all the necessary components of state-of-the-art phylogenomic analyses and apply this knowledge to analyzing data from their own organisms.
- Spring 2021: Tuesday and Thursday 1:00-2:15pm (online via Zoom)
- Instructor: Claudia Solis-Lemus, PhD
- Email: solislemus@wisc.edu
- Website: https://solislemuslab.github.io/
- Office hours: Tuesday 2:30-3:30pm, or by appointment
By the end of the course, you will be able to:
- Explain in detail all the steps in the pipeline for phylogenetic inference and how different data and model choices affect the inference outcomes
- Plan and produce reproducible scripts with the analysis of your own biological data
- Justify the data and model choices in your own data analysis
- Interpret the results of the most widely used phylogenetic methods in biological terms
- Orally present the results of your own phylogenomic data analyses based on the best scientific and reproducibility practices
- Phylogenetics in the Genomic Era (open access book) by Celine Scornavacca, Frederic Delsuc and Nicolas Galtier (denoted HAL in the schedule)
- Tree thinking: an introduction to phylogenetic biology by David Baum and Stacey Smith (optional: denoted Baum in the schedule)
- The Phylogenetic Handbook by Philippe Lemey, Marco Salemi and Anne-Mieke Vandamme (optional: denoted HB in the schedule)
Session | Topic | Reading before class | At the end of the session | Notes |
---|---|---|---|---|
01/26 | Introduction | Syllabus | You will know the structure of the class, the learning outcomes and the grading | lecture1.md |
01/28 | Motivation: why learn phylogenomics? | HAL 2.1 | You will identify the different components in phylogenomic analyses | lecture2.md |
02/02 | Reproducibility crash course | Notes on mindful programming | You will prioritize reproducibility and good computing practices throughout the semester (and beyond) | lecture3.md |
02/04 | Continue with reproducibility | Have git installed | ||
02/09 | Introduction to sequences | Watch video1, video2, and read Zhang et al., 2019 | You will be able to describe the next-generation sequencing pipeline (and UCE pipeline) as well as the post-processing bioinformatics steps for quality control | lecture4.md |
02/11 | Alignment | HAL 2.2 | You will be able to explain the most widely used algorithms for multiple sequence alignment | lecture5.md |
02/16 | Continue with alignment | |||
02/18 | Continue with alignment | |||
02/23 | Orthology detection | HAL 2.4 and the OrthoFinder paper | You will know about the different orthology inference methods and will be able to explain the OrthoFinder algorithm | lecture6.md |
02/25 | Overview of phylogenetic inference | | You will be able to explain the overall methodology of phylogenetic inference and its main weaknesses | lecture7.pdf |
03/02 | Distance and parsimony methods | Install R and optional readings: HB Ch 5-6, Baum Ch 7-8 | You will be able to explain both algorithms to reconstruct trees: 1) based on distances and 2) based on parsimony | lecture8.md |
03/04 | Continue with distance and parsimony methods | |||
03/09 | Models of evolution | HAL 1.1 | You will be able to explain the main characteristics and assumptions of the substitution models | lecture9.pdf |
03/11 | Continue with models of evolution | |||
03/16 | Maximum likelihood | HAL 1.2 and optional: install RAxML-NG (HAL 1.3) or IQ-Tree | You will be able to explain the main steps in maximum likelihood inference and the strengths/weaknesses of the approach | lecture10.pdf |
03/18 | Comparison of distances, parsimony and likelihood | Investigate the pros/cons of the method of your team | You will be able to assess the strengths and weaknesses of distance, parsimony and likelihood methods in phylogenetic inference | lecture11.md and discussion slides |
03/23 | Bayesian inference | HAL 1.4 and Nascimento et al., 2017 | You will be able to explain the main components of Bayesian inference and their effect on the inference performance | lecture12.pdf |
03/25 | Model selection: Guest lecture by Rob Lanfear | |||
03/30 | Continue with Bayesian inference | |||
04/01 | The coalescent | HAL 3.1, 3.3 | You will be able to explain the coalescent model for species trees and networks | lecture14.pdf |
04/06 | Continue with the coalescent (and 1-min presentation of students' project data/goal) | Create a slide describing your data here | ||
04/08 | Continue with the coalescent | |||
04/09 | Deadline: Draft of final report | |||
04/13 | Co-estimation methods | Optional reading: HB 18 | You will be able to explain the main components of co-estimation methods and follow the BEAST tutorial | lecture15.md |
04/15 | Discussion: Measures of support | One per group: 1) Stenz2015, 2) Lemoine2018, 3) Anisimova2006, 4) Sayyari2016 | You will be able to compare and contrast the different ways in which we can measure confidence in our phylogenetic estimates | Slides |
04/16 | Deadline: Peer evaluation of another student's report | |||
04/20 | Discussion: Coalescent vs concatenation | All: HAL 3.4. One per group: 1) Springer2018, 2) Mendes2018, 3) Philippe2017, 4) Springer2016, 5) Edwards2016 | You will be able to justify the choice of concatenation vs coalescent in specific scenarios | Slides |
04/22 | Project check-up (extra office hours before/after class for one-on-one meetings; check the Calendly link in Slack) | |||
04/27 | Discussion: Phylogenomics pitfalls | One per group: 1) Bravo2019, 2) Shen2017, 3) Young2020, 4) Steel2005 | You will be able to describe and analyze some of the main pitfalls of phylogenomic analysis of big data | Slides |
04/29 | What else is out there? | | You will hear a brief overview of topics not covered in this class and will have access to resources to learn more | |
04/30 | Deadline: Final report with reproducible script | |||
05/04 | Project presentations | |||
05/06 | Project presentations |
See the list of topics, grading and academic policies in the syllabus.