Peter Andolfatto

Associated Faculty, Ecology and Evolutionary Biology and the Lewis-Sigler Institute for Integrative Genomics
Icahn Laboratory, 144

Research Area

Genetics & Genomics

Research Focus

Population genetics and genomics

Adaptive evolution in non-coding DNA and the evolution of gene expression

A growing body of evidence supports the view that regulatory evolution – the evolution of where and when a gene is expressed – is the primary genetic mechanism behind the modular organization, functional diversification and origin of novel traits in higher organisms. Most elements regulating gene expression in eukaryotic genomes reside in non-coding DNA (i.e. DNA that does not encode protein). Recent studies suggest that much of the non-coding portion of the Drosophila genome is evolutionarily constrained, implying that these regions are important for an organism's fitness and may be the target of substantial adaptive evolution. In a pair of recent papers (Andolfatto Nature 2005; Haddrill et al. MBE 2008), we have shown that this signature of adaptive evolution is concentrated in untranslated transcribed regions (UTRs). This past year, we began an NIH-funded project that combines novel computational and experimental approaches to i) identify UTRs and cis-regulatory modules (CRMs) that may have been targets of recurrent adaptive evolution and ii) experimentally test the effects of putatively functional substitutions on levels of gene expression divergence between species. This research will identify new cis-regulatory elements, develop novel methodologies for mapping such elements and provide important insights into how gene regulatory changes have led to the evolution of new species and diversity in animal forms. We hope the computational methods and biological intuitions developed will become widely applicable to other model systems.

Recombination rate variation and its impact on patterns of genome evolution

Recombination is an important process in evolutionary genetic models. It determines the scale of linkage disequilibrium (the extent to which pairs of mutations are associated) and is predicted to influence how efficiently natural selection can eliminate deleterious mutations (the "Hill-Robertson" effect and the "Muller's Ratchet" model) and incorporate beneficial ones (the "Fisher-Muller" effect and the "Ruby in the Rubbish" model). The local recombination rate also determines the size of the expected footprint left by adaptive substitutions and thus is an important parameter in attempts to quantify the strength and frequency of positive selection throughout the genome (Andolfatto 2007; Sella et al. 2009).
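As an illustration of what "association between pairs of mutations" means in practice, the short sketch below computes two standard summaries of linkage disequilibrium, D and r², from a sample of two-site haplotypes. The function name and the 0/1 allele coding are assumptions made for this example; it is not the analysis pipeline used in the lab.

```python
from collections import Counter


def linkage_disequilibrium(haplotypes):
    """Return (D, r_squared) for two biallelic sites.

    haplotypes: list of (a, b) tuples, one per sampled chromosome,
    with the allele at each site coded 0 or 1 (an illustrative coding).
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")
    counts = Counter(haplotypes)
    # Frequencies of the "1" allele at each site, and of the 1-1 haplotype.
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = counts[(1, 1)] / n
    # D is the excess of the 1-1 haplotype over its expectation under independence.
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic in the sample")
    return d, d * d / denom


# Two sites in complete association versus two sites that are independent.
linked = [(1, 1)] * 5 + [(0, 0)] * 5
unlinked = [(1, 1), (1, 0), (0, 1), (0, 0)]
print(linkage_disequilibrium(linked))
print(linkage_disequilibrium(unlinked))
```

Recombination between the two sites breaks up haplotypes over generations, so r² is expected to fall as the recombination rate between the sites increases.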

The genetic map of D. melanogaster is currently poorly resolved compared to those of other model organisms, and we have little idea of how recombination rates vary across the genome at the physical (and genetic) scales relevant to individual selective signatures (<100 kb). We are remedying this limitation by quantifying recombination rate variation at three different genomic scales in D. melanogaster with state-of-the-art genotyping methods based on next-generation sequencing. In one effort, we are collecting data on genome-wide recombination rates to improve the accuracy of the genetic map. We are also using a combination of transgenic and targeted-sequencing approaches to investigate recombination rate variation at much smaller scales (<100 kb), and extending these approaches to survey recombination rate variation among individuals and in close relatives of D. melanogaster. We plan to integrate these new recombination rate data with population genomic data being produced by various labs to gain better insights into the extent to which recombination rate shapes patterns of genomic variation and evolution.
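To make the unit of measurement concrete, a local recombination rate is commonly reported in centimorgans per megabase (cM/Mb). The sketch below converts a count of recombinant offspring between two markers into that unit. It assumes an interval short enough that double crossovers are negligible, so that 1% recombinants corresponds to about 1 cM; the function name and the numbers in the usage example are hypothetical, not data from these projects.

```python
def recombination_rate_cm_per_mb(n_recombinant, n_total, start_bp, end_bp):
    """Estimate the recombination rate (cM/Mb) between two markers.

    Assumes a short interval in which double crossovers can be ignored,
    so the recombinant fraction maps linearly onto genetic distance.
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if not 0 <= n_recombinant <= n_total:
        raise ValueError("n_recombinant must lie between 0 and n_total")
    if end_bp <= start_bp:
        raise ValueError("end_bp must be greater than start_bp")
    recombinant_fraction = n_recombinant / n_total
    centimorgans = 100.0 * recombinant_fraction  # 1% recombinants ~ 1 cM
    megabases = (end_bp - start_bp) / 1e6
    return centimorgans / megabases


# Hypothetical example: 2 recombinants among 1,000 offspring across 100 kb.
rate = recombination_rate_cm_per_mb(2, 1000, 5_000_000, 5_100_000)
print(f"{rate:.2f} cM/Mb")
```

Because recombinants are rare over intervals this short, the estimate is only as precise as the number of offspring scored, which is one reason high-throughput genotyping matters at these scales.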

Comparative population genomics of Drosophila species

Considerable advances have been made in the comparative genomics of Drosophila species through the "12 Drosophila genomes" sequencing project. Anticipating the release of genome sequences for a large panel of D. melanogaster individuals, we have begun a project in collaboration with Kevin Thornton of UC Irvine to collect 40X coverage of 20 strains each of D. simulans and D. yakuba, two close relatives of D. melanogaster. These data will give us unprecedented insights, in a comparative framework, into the genetic factors that shape genome variation in Drosophila species, including recombination (Andolfatto and Wall 2003), positive and negative selection (Andolfatto 2005; Haddrill et al. 2008) and genetic hitchhiking (Andolfatto 2007).

Developing new approaches to high-throughput genotyping

While array-based genotyping methods are ideal for surveying a very large number of markers in few individuals and SNP-based methods (RT-PCR, Sequenom, ligation-mediated methods, etc.) are well-suited to typing a small number of markers in a large number of individuals, there is an awkward gap in applications that require typing an intermediate number of markers (hundreds to thousands) in an intermediate number of individuals (hundreds to thousands). In particular, the construction of genetic maps and the mapping of quantitative trait loci currently lack efficient and cost-effective genotyping at this scale. In addition, many genotyping methods require knowing the genome sequence and the SNPs beforehand, making them unsuitable for most non-model species.

To fill this gap, we are collaborating with David Stern's lab (Princeton University) to develop a new genotyping method we call "MSG" (Multiplexed Shotgun Genotyping), based on next-generation sequencing technology. We are currently piloting this method on projects ranging from mapping genes underlying behavioural traits that differ between Drosophila species, to constructing genetic maps and mapping ecologically relevant traits in Lepidopteran species, to investigating small-scale recombination rate variation in Drosophila. We hope that this method will meet the need for cost-effective genotyping at this scale and provide a valuable resource of particular interest to the evolutionary and ecological genetics communities.