Automated Methods to Infer Ancient Homology and Synteny
Julian M. Catchen
Committee: John Conery (chair), John Postlethwait (chair), Virginia Lo, Arthur Farley, William Cresko
Dissertation Defense (Jun 2009)
Keywords: whole genome duplication; conserved synteny; bioinformatics

Establishing homologous (evolutionary) relationships among a set of genes allows us to hypothesize about their histories: how are they related, how have they changed over time, and are those changes the source of novel features? Likewise, aggregating related genes into larger, structurally conserved regions of the genome allows us to infer the evolutionary history of the genome itself: how have the chromosomes changed in number, gene content, and gene order over time? Establishing homology between genes is also important for constructing human disease models in other organisms, such as the zebrafish, where researchers identify and manipulate the zebrafish copies of genes involved in the human disease. To make such inferences, researchers compare the genomes of extant species. However, the dynamic nature of genomes, in both gene content and chromosomal architecture, makes correctly identifying homologous genes a major technical challenge. This thesis presents a system to infer ancient homology between genes that takes into account a major but previously overlooked source of architectural change in genomes: whole-genome duplication. Additionally, the system integrates genomic conservation of synteny (gene order on chromosomes), providing a new source of evidence in homology assignment that complements existing methods. In a set of case studies, these algorithms were applied to several genomes to infer the evolutionary history of genes, gene families, and chromosomes, and to study unique architectural features of post-duplication genomes, such as Ohnologs gone missing.