Oral Presentation Society for Molecular Biology and Evolution Conference 2016

RADpainter and fineRADstructure: population inference from RAD-seq data (#183)

Milan Malinsky 1 2 , Emiliano Trucchi 3 , Daniel J Lawson 4 , Daniel Falush 5
  1. University of Cambridge, Cambridge, United Kingdom
  2. Wellcome Trust Sanger Institute, Cambridge, United Kingdom
  3. University of Vienna, Vienna, Austria
  4. University of Bristol, Bristol, United Kingdom
  5. University of Swansea, Swansea, Wales, United Kingdom

Understanding shared ancestry in genetic datasets is almost always key to their interpretation. The fineSTRUCTURE package (Lawson et al., 2012) is a powerful model-based approach to investigating population structure from genetic data. It offers especially high resolution in inferring recent shared ancestry, as shown, for example, by its application to the genetic structure of the British population (Leslie et al., 2015). This resolution derives from using haplotype linkage information and from focusing on the most recent coalescence (common ancestry) among the sampled individuals to derive a "co-ancestry matrix" - a summary of nearest-neighbor haplotype relationships in the dataset. Further advantages over other model-based methods (e.g. STRUCTURE and ADMIXTURE) include the ability to handle a very large number of populations, to explore relationships between them, and to quantify ancestry sources in each population.
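The idea of a co-ancestry matrix as a summary of nearest-neighbor haplotype relationships can be illustrated with a toy sketch. This is not the fineSTRUCTURE or RADpainter implementation. The function names (`hamming`, `coancestry`), the input layout, the use of a single haplotype per individual per locus, and the equal splitting of ties among donors are all assumptions made for illustration.

```python
def hamming(a, b):
    """Count the positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def coancestry(loci):
    """Toy nearest-neighbor co-ancestry matrix (illustrative assumption only).

    loci: list of dicts, one per locus, mapping individual name -> haplotype
          string (one haplotype per individual, for simplicity).
    Returns matrix[recipient][donor]: the summed share of loci at which
    the donor carries the recipient's closest haplotype.
    """
    names = sorted({name for locus in loci for name in locus})
    matrix = {r: {d: 0.0 for d in names if d != r} for r in names}
    for locus in loci:
        for recipient, hap in locus.items():
            # Distance from the recipient to every other individual at this locus.
            dists = {d: hamming(hap, h) for d, h in locus.items() if d != recipient}
            if not dists:
                continue
            best = min(dists.values())
            nearest = [d for d, dist in dists.items() if dist == best]
            # Split one unit of co-ancestry equally among tied nearest donors.
            for d in nearest:
                matrix[recipient][d] += 1.0 / len(nearest)
    return matrix


# Hypothetical data: three individuals typed at two short loci.
loci = [
    {"A": "ACGT", "B": "ACGT", "C": "TCGA"},
    {"A": "GGCC", "B": "GGCT", "C": "GGCT"},
]
result = coancestry(loci)
for recipient, row in result.items():
    print(recipient, row)
```

In this sketch each row sums to the number of loci at which the recipient was typed, so larger entries mark the donors that are most often the recipient's nearest neighbor.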

The existing pipeline for co-ancestry matrix inference was designed for analyzing large-scale human SNP datasets, where the chromosomal locations of the markers are known and haplotypes are typically assumed to be correctly phased. These methods have therefore so far been inaccessible to users without high-quality genome-wide haplotypes. With the boom in non-model organism genomics, there is a pressing need to bring these approaches to communities without access to such data.

Here we present RADpainter, a program designed specifically to infer the co-ancestry matrix from RAD-seq data, taking full advantage of the unique features of such data. We package this new program together with the fineSTRUCTURE MCMC clustering algorithm into fineRADstructure - a complete, easy-to-use, and fast population inference package for RAD-seq data (https://github.com/millanek/fineRADstructure).

  1. Lawson, D. J., Hellenthal, G., Myers, S. & Falush, D. Inference of population structure using dense haplotype data. PLoS Genet. (2012).
  2. Leslie, S. et al. The fine-scale genetic structure of the British population. Nature 519, 309–314 (2015).