On Phylogenetic Uncertainty and Ancestral Sequence Reconstruction
Victor Hanson-Smith
Committee: John Conery (chair), Joseph Thornton, Dejing Dou
Directed Research Project (Mar 2009)
Keywords: phylogenetics; phylogeny; bioinformatics; evolution; biology; Bayesian; maximum likelihood; Markov models; simulation; proteins; DNA; genetics

In this project, I consider the problem of inferring the ancient evolutionary history of molecular gene sequences. Because molecular fossils are extremely scarce, the history of genes can be difficult to study directly. However, computational methods of ancestral sequence reconstruction (ASR) can be used to statistically infer the sequences of extinct genes; some of these reconstructions have been chemically synthesized and experimentally tested. Although ASR allows us to answer previously unanswerable questions about the molecular mechanisms of evolution, the results of ASR-based experiments depend on the accuracy of the underlying computational reconstruction. In this project, I investigate one aspect of the ASR algorithm that may affect accuracy: phylogenetic uncertainty. Most reconstruction algorithms assume that the phylogeny is known with certainty; in practice, this assumption is rarely valid. Does ignoring phylogenetic uncertainty affect ASR accuracy?

To answer this question, I proposed an empirical Bayesian algorithm for integrating phylogenetic uncertainty into ASR, and I examined this method under simulated and real conditions. My results are surprising and nonintuitive: phylogenetic uncertainty is not correlated with the accuracy of reconstructed ancestral states. The conditions that produce phylogenetic uncertainty yield ancestral states on alternate trees that are similar, if not identical, to the ancestral states on the maximum likelihood tree. Ultimately, integrating phylogenetic uncertainty does not significantly affect the accuracy of reconstructed ancestral sequences.
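
As an illustration of what integrating phylogenetic uncertainty can mean at a single ancestral site, the sketch below averages the per-tree posterior probabilities of each ancestral state, weighting each tree by its posterior probability. This is a minimal sketch under stated assumptions, not necessarily the algorithm used in this project: the function name integrate_over_trees, the tree labels, and all numbers are hypothetical, and it assumes the ancestral node of interest exists on every tree considered.

```python
from collections import defaultdict


def integrate_over_trees(tree_posteriors, state_posteriors_by_tree):
    """Average per-tree ancestral state posteriors, weighted by tree posterior.

    tree_posteriors: dict mapping a tree label to its posterior probability.
    state_posteriors_by_tree: dict mapping a tree label to a dict of
        {state: posterior probability of that state on that tree}.
    Both inputs are hypothetical stand-ins for the output of a real
    phylogenetic analysis.
    """
    total = sum(tree_posteriors.values())
    if total <= 0:
        raise ValueError("tree posterior probabilities must sum to a positive value")

    integrated = defaultdict(float)
    for tree, weight in tree_posteriors.items():
        # Renormalize so that a truncated sample of trees still sums to 1.
        normalized_weight = weight / total
        for state, prob in state_posteriors_by_tree[tree].items():
            integrated[state] += normalized_weight * prob
    return dict(integrated)


# Made-up example: two candidate trees for one ancestral site.
tree_posteriors = {"T1": 0.6, "T2": 0.4}
state_posteriors_by_tree = {
    "T1": {"A": 0.9, "G": 0.1},
    "T2": {"A": 0.8, "G": 0.2},
}
result = integrate_over_trees(tree_posteriors, state_posteriors_by_tree)
```

In this made-up example both trees favor the same state, so the integrated distribution favors it as well, which mirrors the situation described above in which alternate trees yield similar ancestral states.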