Diploid selection coefficients for specific human variants and genes remain largely unquantified. Unlike in model organisms, dominance (h) and selection (s) coefficients in humans must be inferred from natural population data. We present a method to estimate coarse average selection and dominance coefficients per gene by comparing Exome Aggregation Consortium1 population genetic data from ~35,000 Europeans to simulated diploid alleles under a realistic demography2. We match putatively deleterious variants (nonsense and damaging missense) to the simulations via informative summary statistics of the per-gene frequency spectrum. We classify genes as candidate strongly selected recessives (h<0.1), strongly selected “non-recessives” (h≥0.1), weakly selected, nearly neutral, or sub-drift.
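As a rough illustration of how h and s jointly shape allele frequencies, the sketch below iterates the standard deterministic recursion for a deleterious allele with genotype fitnesses 1, 1 - hs, and 1 - s, plus recurrent mutation. It is not the simulation used in this work, which models finite populations under a realistic demography; the function names, the mutation rate `mu`, and the parameter values are assumptions chosen only for illustration.

```python
def next_freq(q, s, h, mu):
    """One generation of viability selection followed by one-way mutation.

    q  : current frequency of the deleterious allele
    s  : selection coefficient against the deleterious homozygote
    h  : dominance coefficient (heterozygote fitness is 1 - h*s)
    mu : mutation rate toward the deleterious allele (illustrative value)
    """
    p = 1.0 - q
    # Mean fitness with genotype fitnesses 1, 1 - h*s, 1 - s
    w_bar = p * p + 2 * p * q * (1 - h * s) + q * q * (1 - s)
    # Frequency of the deleterious allele after selection
    q_sel = (p * q * (1 - h * s) + q * q * (1 - s)) / w_bar
    # Recurrent mutation converts a fraction mu of the remaining alleles
    return q_sel + mu * (1 - q_sel)


def equilibrium_freq(s, h, mu, generations=50_000):
    """Iterate the recursion from a low starting frequency until it settles."""
    q = mu
    for _ in range(generations):
        q = next_freq(q, s, h, mu)
    return q


mu = 1e-6  # hypothetical mutation rate, for illustration only
q_recessive = equilibrium_freq(s=1.0, h=0.0, mu=mu)  # close to sqrt(mu / s)
q_additive = equilibrium_freq(s=0.1, h=0.5, mu=mu)   # close to mu / (h * s)
print(f"recessive lethal: q = {q_recessive:.2e}")
print(f"additive, s=0.1:  q = {q_additive:.2e}")
```

In this deterministic setting a fully recessive lethal allele settles at a much higher frequency than a partially dominant allele under far weaker selection, which is why summary statistics of the frequency spectrum carry information about h as well as s.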
To validate our candidate recessive and non-recessive gene sets, we demonstrate significant enrichment of candidate recessive genes (and/or depletion of non-recessives) among genes for autosomal recessive diseases and hearing loss, and among genes depleted of homozygous loss-of-function (LOF) variants in consanguineous individuals3. We replicate classical predictions of recessivity in large metabolic pathways (e.g. TCA), consistent with Wright’s theory of the physiological origin of dominance4,5, and in genes with GO-annotated extracellular localization, as well as predictions of dominance in GO transcription factors6. We find significant enrichment of GO infertility, meiosis, and spermatogenesis genes in the strongly selected recessive class, but no enrichment for oogenesis, suggesting a large autosomal recessive component to male-specific infertility, consistent with mammalian studies in cattle7.
To our knowledge, this is the first large set of human candidate recessive genes (~1,500) identified from panmictic population data. This is qualitatively consistent with the recessivity observed for most lethal variants in fly and yeast8,9. Notably, a large recessive component in many human genes is inconsistent with the simplifying assumption of additivity in previous estimates of selection against non-synonymous variants10,11: under that assumption, recessive genes under strong selection map to weak selection, because their prevalent heterozygotes are nearly neutral. Thus, a dominance-aware marginal distribution of fitness effects (DFE) substantially increases the estimated average selection against deleterious human variants.
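The mapping of strongly selected recessives to weak selection can be made concrete with textbook mutation-selection balance approximations. The sketch below is a deterministic simplification and not the inference procedure of this work; the function names and the mutation rate are hypothetical. It takes the equilibrium frequency of a fully recessive allele, roughly sqrt(mu / s), and asks what selection coefficient an additive model (h = 0.5, equilibrium roughly mu / (h s)) would need to explain the same frequency.

```python
import math


def recessive_equilibrium_freq(s, mu):
    """Approximate equilibrium frequency of a fully recessive (h = 0) allele."""
    return math.sqrt(mu / s)


def apparent_additive_s(q, mu, h_assumed=0.5):
    """Selection coefficient implied by frequency q if dominance h_assumed
    is (wrongly) assumed, using the approximation q = mu / (h * s)."""
    return mu / (h_assumed * q)


mu = 1e-6      # hypothetical mutation rate, for illustration only
true_s = 1.0   # recessive lethal
q = recessive_equilibrium_freq(true_s, mu)
s_apparent = apparent_additive_s(q, mu)
print(f"true s = {true_s}, apparent additive s = {s_apparent:.1e}")
```

Under these assumptions the apparent additive coefficient equals 2 sqrt(mu s), so a recessive lethal looks several hundred times more weakly selected than it is, which is the direction of bias described above.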