Poster Presentation Society for Molecular Biology and Evolution Conference 2016

Estimating Mutation Parameters and Population History Simultaneously from Temporally-Spaced Genome Data (#539)

Arman Bilge 1 , Tanja Stadler 2 , Alexei Drummond 1
  1. Centre for Computational Evolution, The University of Auckland, Auckland, New Zealand
  2. Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland

Next generation methods have made it increasingly feasible to sequence entire genomes. Furthermore, the possibility to recover ancient genomes also augments the amount of available data in another dimension. The analysis of temporally- spaced sequences dramatically increases our statistical power to make inference about evolutionary dynamics, as frequently demonstrated in epidemiology. However, current methods that handle sampling-through-time do not scale well to the large genomes of diploid organisms due to the computational intractability of searching tree space. Instead, we use a closed-form expression that analytically integrates over all coalescent phylogenies at a constant or biallelic site. By assuming the genealogical independence of sites, we can express the likelihood in a form that is computationally tractable. Additional speedups are attained by pre-processing the genome alignment and an efficient parallel implementation. This approach utilises similar mathematical techniques to the SNAPP software for inferring species trees from genetic markers, but is extended to account for serially-sampled data. Notably, our method considers both biological processes, such as the transition–transversion ratio and site rate heterogeneity, as well as practical problems, including unphased genomes and sequencing error. After placing appropriate priors, we use Markov chain Monte Carlo to sample from the posterior distribution on our parameters. We demonstrate the usefulness of our method for inference on both simulated and real datasets.

  1. David Bryant, Remco Bouckaert, Joseph Felsenstein, Noah A. Rosenberg, and Arindam RoyChoudhury. Inferring Species Trees Directly from Biallelic Genetic Markers: Bypassing Gene Trees in a Full Coalescent Analysis. Mol Biol Evol (2012) 29 (8): 1917-1932. doi:10.1093/molbev/mss086