The two strains of Human herpesvirus 6 (HHV-6A and HHV-6B) infect >90% of the adult human population worldwide and are linked to an increasing number of central nervous system and blood pathologies. These associations remain debated because the viruses are difficult to sequence, which has led to very few complete published genomes and a lack of information on their intra-strain variation. Possible geographic stratification is of particular relevance, since many of the diseases proposed to be associated with the virus themselves show geographical patterns.
Starting from the 1000 Genomes Project data, we scanned for HHV-6 in its genome-integrated form and found nine infected individuals. We retrieved biological samples from those individuals and performed kit-based target enrichment and sequencing. This approach allowed us to obtain deep-coverage sequences with which to analyse variability and stratification.
The two viral strains show significantly different genome-wide variability, as well as different variability patterns along the genomes and among the different genomic features. HHV-6A sequences show clear separation of an Asian subgroup from the virus found in individuals of other geographical origins. HHV-6B shows little stratification unless recombination is taken into account, in which case an African and a European subgroup become detectable.
Overall, the results show that HHV-6A and HHV-6B may, like their sister taxa, be geographically stratified. However, the resolution is low because of the small sample size. A larger number of samples, and thus a higher resolution, will allow us to design better disease association studies for these viruses and shed light on this controversial field. We have therefore developed an in-house protocol for low-cost target enrichment, which will make it possible to substantially enlarge our data set.