Special emphasis is placed on methods that employ explicit models of the evolutionary process (maximum likelihood and Bayesian approaches), and we will explore the role of statistical modeling in molecular evolution and in science more generally. A mathematical (statistical) model of a biological system can be regarded as a stringently phrased hypothesis about that system, and this way of thinking about models is often helpful. In addition to model-based methods, you will also learn about other approaches, such as those based on parsimony and genetic distance (e.g., neighbor joining).
Often, the evolutionary tree is itself the result we are interested in: knowing how a set of sequences (or organisms) is related can provide important information about the biological problem we are investigating. For instance, knowing which organisms are most closely related to a newly identified, uncharacterized, pathogenic bacterium will allow you to infer many aspects of its lifestyle, thereby giving you important clues about how to fight it. In other cases, inferring the structure of the tree is not the goal; our main focus may instead be the detection of positions in a protein undergoing positive selection (indicating adaptation) or negative selection (indicating conserved functional importance). Even in these cases, however, the underlying phylogenetic tree is an important part of our hypothesis about (model of) how the proteins have been evolving, and taking it into account helps us reach the correct answer.
Although the study of molecular evolution does require a certain level of mathematical understanding, this course has been designed to be accessible to students with a limited computational background as well (e.g., students of biology).
Topics covered:
- Brief introduction to evolutionary theory and population genetics.
- Mechanisms of molecular evolution.
- Models of substitution.
- Reconstruction of phylogenetic trees using parsimony, distance-based methods, maximum likelihood, and Bayesian techniques.
- Advanced models of nucleotide substitution (gamma-distributed substitution rates across sites, codon models, and analysis of selective pressure).
- Statistical analysis of biological hypotheses (likelihood ratio tests, Akaike Information Criterion, Bayesian statistics).
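
To give a flavor of the last topic, here is a minimal sketch of comparing two nested substitution models with a likelihood ratio test and the Akaike Information Criterion. It assumes the maximized log-likelihoods have already been obtained from a phylogenetics program. The numerical values are invented for illustration, the function names are hypothetical, and the test is written only for models that differ by a single free parameter (here, a model with equal substitution rates versus one that adds a transition/transversion rate ratio).

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def lrt_one_df(lnl_null, lnl_alt):
    """Likelihood ratio test for nested models differing by ONE parameter.

    Returns the test statistic 2 * (lnL_alt - lnL_null) and its p-value
    under a chi-square distribution with 1 degree of freedom.
    """
    stat = max(2 * (lnl_alt - lnl_null), 0.0)
    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Invented log-likelihoods (assumption, not real data).
lnl_simple = -2345.6   # simpler model, 0 free substitution parameters
lnl_complex = -2338.1  # richer model, 1 extra free parameter

stat, p_value = lrt_one_df(lnl_simple, lnl_complex)

# Branch lengths are shared by both models, so only the substitution
# parameters are counted here; this does not change the AIC difference.
aic_simple = aic(lnl_simple, 0)
aic_complex = aic(lnl_complex, 1)

print(f"LRT statistic = {stat:.2f}, p = {p_value:.2g}")
print(f"AIC simple = {aic_simple:.1f}, AIC complex = {aic_complex:.1f}")
```

A small p-value, or a lower AIC for the richer model, suggests that the extra parameter is justified by the data. In this sense each model acts as a stringently phrased hypothesis that can be tested against another.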