Understanding shared ancestry in genetic datasets is almost always key to their interpretation. The fineSTRUCTURE package (Lawson et al., 2012) is a powerful model-based approach to investigating population structure using genetic data. It offers especially high resolution in inferring recent shared ancestry, as shown, for example, by its application to the genetic structure of the British population (Leslie et al., 2015). This high resolution derives from using haplotype linkage information and from focusing on the most recent coalescence (common ancestry) among the sampled individuals to derive a "co-ancestry matrix", a summary of nearest-neighbor haplotype relationships in the dataset. Further advantages over other model-based methods (e.g. STRUCTURE and ADMIXTURE) include the ability to handle a very large number of populations, to explore relationships between them, and to quantify ancestry sources in each population.
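To make the idea of a co-ancestry matrix as a summary of nearest-neighbor haplotype relationships more concrete, the following toy sketch may help. It is not the fineSTRUCTURE or RADpainter implementation. It assumes haploid individuals, unlinked loci, Hamming distance as the measure of haplotype similarity, and an equal split of each locus's contribution among tied nearest neighbors. The function names and the example data are hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length haplotypes differ."""
    return sum(x != y for x, y in zip(a, b))


def coancestry_matrix(loci):
    """Toy nearest-neighbor co-ancestry matrix (illustrative assumption only).

    loci[l][i] is the haplotype string of individual i at locus l.
    Entry [r][d] accumulates how often donor d is the nearest neighbor of
    recipient r; ties share one unit of co-ancestry equally.
    """
    n = len(loci[0])
    coanc = [[0.0] * n for _ in range(n)]
    for haps in loci:
        for r in range(n):
            # Distance from the recipient to every other individual (donor).
            dists = {d: hamming(haps[r], haps[d]) for d in range(n) if d != r}
            best = min(dists.values())
            donors = [d for d, dist in dists.items() if dist == best]
            for d in donors:
                coanc[r][d] += 1.0 / len(donors)
    return coanc


# Hypothetical data: two loci, four individuals forming two pairs.
toy_loci = [
    ["ACGT", "ACGT", "TCGA", "TCGA"],
    ["GGCA", "GGCA", "GGTT", "GGTA"],
]
matrix = coancestry_matrix(toy_loci)
for row in matrix:
    print(["%.2f" % value for value in row])
```

In this sketch each row sums to the number of loci, because every locus contributes exactly one unit of co-ancestry per recipient. Individuals that repeatedly share their closest haplotype accumulate high pairwise values, which is the signal a clustering step can then exploit.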
The existing pipeline for co-ancestry matrix inference was designed for analyzing large-scale human SNP datasets, in which the chromosomal locations of the markers are known and haplotypes are typically assumed to be correctly phased. These methods have therefore so far been inaccessible to users without high-quality genome-wide haplotypes. With the boom in non-model organism genomics, there is a pressing need to bring these approaches to communities without access to such data.
Here we present RADpainter, a program designed specifically to infer the co-ancestry matrix from RAD-seq data, taking full advantage of the unique features of such data. We package this new program together with the fineSTRUCTURE MCMC clustering algorithm into fineRADstructure, a complete, easy-to-use, and fast population inference package for RAD-seq data (https://github.com/millanek/fineRADstructure).