Oral Presentation Society for Molecular Biology and Evolution Conference 2016

A novel approach for assessing genetic burden and constraint in protein structures leveraging large sequencing cohorts: insights from MYH7 and hypertrophic cardiomyopathy (#90)

Julian R Homburger 1, Eric M Green 2, Colleen Caleshu 1, Margaret S Sunitha 3, Rebecca Taylor 1, Kathleen M Ruppel 1, Raghu Metpally 4, Steven D Colan 5, Michelle Michels 6, Sharlene Day 7, Iacopo Olivotto 8, Carlos D Bustamante 1, Frederick Dewey 9, Carolyn Y Ho 10, James A Spudich 1, Euan A Ashley 1
  1. Stanford University, Stanford, CA, United States
  2. MyoKardia, Inc., South San Francisco, CA, USA
  3. Institute for Stem Cell Biology and Regenerative Medicine, Bangalore, India
  4. Geisinger Health System, Danville, PA, USA
  5. Boston Children's Hospital, Boston
  6. Erasmus MC, Rotterdam
  7. University of Michigan, Ann Arbor
  8. Careggi University Hospital, Florence, Italy
  9. Regeneron, Inc., Tarrytown, NY, USA
  10. Brigham and Women's Hospital, Boston, MA, USA

Myosin motors are the fundamental force-generating elements of muscle contraction. Variation in the β-cardiac myosin gene (MYH7) can lead to hypertrophic cardiomyopathy (HCM), a heritable disease characterized by cardiac hypertrophy, heart failure, and sudden cardiac death. A key debate is whether hotspots of pathogenic variation exist within the myosin structure. Previous studies have reported conflicting results and have suffered from small sample sizes and a lack of reference cohorts. Furthermore, how specific myosin variants alter motor function or clinical expression of disease remains incompletely understood. To address these questions, we developed a statistical method for analyzing disease burden and constraint in three-dimensional protein structures and surfaces, which we applied to β-cardiac myosin. We combined structural models of myosin from multiple stages of its chemomechanical cycle, exome sequencing data from two population cohorts of 60,706 and 42,930 individuals, and genetic and phenotypic data from 2,913 HCM patients to identify regions of disease-variant enrichment within β-cardiac myosin. We first built computational models of the human β-cardiac myosin protein structure before and after the myosin power stroke. Then, using a spatial scan statistic modified to analyze genetic variation in three-dimensional protein space, we found a significant enrichment of disease-associated variants in the converter (p=0.002), a kinetic domain that transduces force from the catalytic domain to the lever arm during the power stroke. Restricting the analysis to surface-exposed residues, we identified a larger region significantly enriched for disease-associated variants that contains both the converter domain and residues on a single flat surface of the myosin head described as the myosin mesa (p=0.002). Notably, HCM patients with variants in the enriched regions have earlier disease onset than those with variants elsewhere.
Our study provides a model for integrating protein structure, large-scale genetic sequencing, and detailed phenotypic data to reveal insight into time-shifted protein structures and genetic disease.
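
The abstract does not spell out how the spatial scan statistic was modified, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a Kulldorff-style Bernoulli scan with spherical windows centered on each residue, one 3D coordinate per residue, and a label-permutation test for significance. All function names, the window radius, and the input layout are hypothetical choices made for illustration.

```python
import math
import random


def _xlogy(x, y):
    """x * log(y), with the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def bernoulli_llr(c_in, n_in, c_tot, n_tot):
    """Bernoulli log-likelihood ratio for one window (assumed model).

    Returns 0 unless the case fraction inside the window exceeds the
    fraction outside, so only enrichment is scored.
    """
    c_out, n_out = c_tot - c_in, n_tot - n_in
    if n_in == 0 or n_out == 0:
        return 0.0
    p_in, p_out, p_all = c_in / n_in, c_out / n_out, c_tot / n_tot
    if p_in <= p_out:
        return 0.0

    def loglik(c, n, p):
        return _xlogy(c, p) + _xlogy(n - c, 1.0 - p)

    return (loglik(c_in, n_in, p_in) + loglik(c_out, n_out, p_out)
            - loglik(c_tot, n_tot, p_all))


def spherical_windows(coords, radius):
    """For each residue, the indices of residues within `radius` of it."""
    return [[j for j, q in enumerate(coords) if math.dist(p, q) <= radius]
            for p in coords]


def max_llr(windows, positions, is_case):
    """Highest-scoring window for one labelling of the variants.

    positions[i] is the residue index of variant i; is_case[i] is True
    for a patient variant and False for a reference-cohort variant.
    """
    cases = [0] * len(windows)
    totals = [0] * len(windows)
    for pos, flag in zip(positions, is_case):
        totals[pos] += 1
        cases[pos] += int(flag)
    c_tot, n_tot = sum(cases), sum(totals)
    best_llr, best_center = 0.0, None
    for center, members in enumerate(windows):
        c_in = sum(cases[j] for j in members)
        n_in = sum(totals[j] for j in members)
        llr = bernoulli_llr(c_in, n_in, c_tot, n_tot)
        if llr > best_llr:
            best_llr, best_center = llr, center
    return best_llr, best_center


def spatial_scan(coords, positions, is_case, radius, n_perm=999, seed=0):
    """Scan all windows, then estimate a p-value by shuffling case labels."""
    windows = spherical_windows(coords, radius)
    observed, center = max_llr(windows, positions, is_case)
    rng = random.Random(seed)
    labels = list(is_case)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if max_llr(windows, positions, labels)[0] >= observed:
            exceed += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return center, observed, (exceed + 1) / (n_perm + 1)


# Toy usage: 30 residues on a line, 4 units apart (synthetic, not myosin).
toy_coords = [(4.0 * i, 0.0, 0.0) for i in range(30)]
toy_positions = list(range(30)) + [14, 15, 16] * 10
toy_labels = [False] * 30 + [True] * 30
toy_result = spatial_scan(toy_coords, toy_positions, toy_labels,
                          radius=6.0, n_perm=199)
```

Taking the maximum over all windows and comparing it with the maximum under shuffled labels accounts for testing many overlapping regions at once, which is the main reason to prefer a scan statistic over testing each domain separately. A surface-restricted analysis like the one described above could reuse the same functions by passing only surface-exposed residues in `coords`.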