Error and Uncertainty in Computational Phylogenetics
Victor Hanson-Smith
Committee: John Conery (chair), Daniel Lowd, Sarah Douglas, Joseph Thornton
Dissertation Defense (May 2024)
Keywords:

The evolutionary history of protein families can be difficult to study because the necessary ancestral molecules are often unavailable for direct observation. As an alternative, the field of computational phylogenetics has developed statistical methods to infer the evolutionary relationships among extant molecular sequences and their ancestral sequences. Typically, computational phylogenetic inference and ancestral sequence reconstruction are combined with non-computational techniques in a larger analysis pipeline to study the inferred forms and functions of ancient molecules. Two major problems affecting this analysis pipeline are computational error and statistical uncertainty. In this dissertation, I use simulations and analyses of empirical systems to show that phylogenetic error can be reduced by using an alternative search heuristic. I then use similar methods to reveal the relationship between phylogenetic uncertainty and the accuracy of ancestral sequence reconstruction. Finally, I provide a case study of a molecular machine in yeast to demonstrate all stages of the analysis pipeline.

This dissertation includes previously published co-authored material.