2008 Poster Sessions : Dispensability of Mammalian DNA

Student Name : Cory McLean
Advisor : Gill Bejerano
Research Areas: Artificial Intelligence
Comparison with other mammals shows that roughly 5% of the human genome is evolving under purifying selection. Two-thirds of this genomic mass does not code for protein. These conserved non-exonic sequences (CNEs) cluster near genes involved in transcription and development, and hundreds of experiments have validated at least half as cis-regulatory elements. In the lab, the cis-regulatory network seems to exhibit great functional redundancy. Many experiments that test the enhancer activity of multiple cis-regulatory elements controlling a single gene show largely overlapping expression domains. Of recent interest, mice in which cis-regulatory ultraconserved elements were knocked out showed no measurable phenotypes, further suggesting functional redundancy.

Here we present an analysis of the mammalian evolution of CNEs and find strong evidence to the contrary. Given a set of CNEs conserved between several mammals, we characterize functional dispensability as the propensity for the ancestral element to be lost in a mammalian species internal to the spanned species tree. We show that ultraconserved-like elements are over 350-fold less likely than neutral DNA to have been lost during rodent evolution. In fact, many thousands of non-coding loci under purifying selection (under ten per gene on average) display near-uniform indispensability during mammalian evolution, largely irrespective of nucleotide conservation level. By mapping all CNEs to the genomes of other vertebrates, we trace each element's ancestry as far back as the base of the vertebrate lineage. We observe that a CNE's ancestry is a better predictor of its indispensability during mammalian evolution than its percent identity conservation. Overall, we show that when mammalian DNA under purifying selection can be pinpointed, most loci in this set (of many thousands) are equally indispensable over millions of years of species evolution. Population genetics theory therefore suggests that each of these cis-regulatory elements confers at least a small fitness gain, potentially undetectable by lab measurements but clearly noticeable over evolutionary time.
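
To make the dispensability measure concrete, the following is a minimal sketch, not the analysis pipeline used for the poster. It assumes each ancestral element has already been scored as retained or lost in the ingroup species, and it compares the loss fraction of conserved elements with that of neutral DNA. The function names `loss_fraction` and `fold_depletion` and the toy counts are hypothetical, chosen only for illustration.

```python
from fractions import Fraction


def loss_fraction(retained_flags):
    """Fraction of ancestral elements that are absent from the ingroup species.

    retained_flags: iterable of booleans, True if the element is still present.
    """
    flags = list(retained_flags)
    if not flags:
        raise ValueError("need at least one element")
    lost = sum(1 for kept in flags if not kept)
    return Fraction(lost, len(flags))


def fold_depletion(neutral_flags, conserved_flags):
    """How many times less often conserved elements are lost than neutral DNA."""
    conserved_loss = loss_fraction(conserved_flags)
    if conserved_loss == 0:
        # No conserved element was lost, so the ratio is unbounded.
        raise ZeroDivisionError("no conserved element was lost")
    return loss_fraction(neutral_flags) / conserved_loss


# Hypothetical toy data, not results from the study.
neutral = [True] * 60 + [False] * 40      # 40 of 100 neutral segments lost
conserved = [True] * 998 + [False] * 2    # 2 of 1000 conserved elements lost

print(float(fold_depletion(neutral, conserved)))  # prints 200.0
```

Under this framing, a larger fold depletion means the conserved set is lost less often than neutral DNA, which is the sense in which an element class is called indispensable above.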

I am a second-year CS student interested in applying computational methods to biological problems. I am particularly interested in understanding mammalian evolutionary processes, as the mechanisms of cis-regulation of gene expression are poorly understood. I have uncovered a number of interesting non-coding genomic regions within vertebrates using tools from high-performance computing, statistics, and natural language processing. I am also investigating roles for machine learning in the discovery of a genomic signature of cis-regulatory elements. Transgenic experiments performed in collaboration with the Kingsley laboratory may help build a fuller understanding of vertebrate cis-regulation and its role in development and evolution.