The genes that have survived to the present day are like evolving fossils that allow us to understand the genomes of long-extinct organisms. We are studying the birth and death of genes in the deep history of plant evolution by developing ways to simultaneously analyse data from gene trees, species phylogenies, and gene orders. We are particularly fascinated by ancient polyploidy and the subsequent large-scale loss of duplicated genes. The dozens of polyploidy events in plant evolutionary history provide a wonderful testbed for ideas about how the properties of the few genes that are retained post-polyploidy can help explain the unique nature and diversity of the contemporary biota.
Evolutionary population and quantitative genetics
A recent interest is the application of population genomic sequencing technologies to better understand the dynamics of structural genetic variation in nature. For instance, postdoc Lex Flagel has been conducting a study of the roles of mutation, selection and drift in shaping structural variation in wild populations of the monkeyflower, Mimulus guttatus. Earlier work in the lab looked at the genetic architecture of domestication traits in crop plants, such as tomatillo, Physalis philadelphica.
We love genetic maps. In recent years, we have developed tools for de novo generation of dense genetic maps, genetic-map-guided assembly of fragmented whole-genome shotgun sequences, detection of map intervals that share common ancestry between highly divergent genomes, use of comparative genomic data to estimate the probability that a gene is present in an incompletely sampled genomic interval, and construction of high-resolution genetic maps from optimized sampling of large mapping populations.
Data -> Information -> Knowledge
Through NESCent, we have become involved in a number of collaborative projects that seek, in various ways, to take better advantage of the vast stores of data, information, and knowledge being collected, processed, and published in evolutionary biology. Through the Dryad Digital Repository, we are helping to promote the preservation and reuse of “small-science” datasets associated with the published literature. Dryad is itself a member of the DataONE network, a collection of many diverse repositories that together provide a stable and persistent framework for data management, preservation, discovery, and reuse. Through the Phenoscape project, we are helping to develop tools that will allow machines to reason over descriptive biological data, such as phenotype descriptions, with the rigor of formal logic, enabling scientists to make connections between disparate pieces of information buried in remote corners of the published literature. We also help to provide user support and outreach to the Generic Model Organism Database community, a virtual organization that develops tools for the management of community genome data resources.