Oral Presentation Society for Molecular Biology and Evolution Conference 2016

Envision: A computational tool for predicting mutational effect magnitudes (#184)

Vanessa Gray 1, Ronald J Hause 1, Douglas M Fowler 1, Jay Shendure
  1. University of Washington, Seattle, Washington, United States

Current computational predictors of mutational effects focus on the binary consequences of mutations, e.g., deleterious or not. Recently, technological advances have enabled high-throughput methods that quantify mutational effects on protein function. Here, we leverage large-scale mutagenesis data sets comprising tens of thousands of quantitative mutational effect scores for several proteins and protein domains to train a computational tool for predicting mutational effect scores. Our tool, Envision, was trained using gradient boosting machine learning and uses evolutionary conservation, biochemical, and structural annotations to predict both categorical and quantitative effects of single amino acid mutations. Envision is highly accurate for both classification and regression in 10-fold cross-validation. We validated Envision in several ways, including on large-scale mutagenesis data not included in model training and on other mutational databases such as the Protein Mutant Database. In all cases, we find that Envision outperforms other predictors, except when those predictors were trained on the testing data in question.
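
For readers unfamiliar with the approach, the following is a minimal sketch of gradient boosting regression evaluated by 10-fold cross-validation. It is not Envision's implementation: the data are synthetic, the three features (conservation, burial, hydrophobicity change) are hypothetical stand-ins for the annotation classes named above, and the weak learner is a simple decision stump chosen only to keep the example self-contained.

```python
import random

def fit_stump(X, residuals):
    """Return (feature, threshold, left_mean, right_mean) minimizing squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # midpoint between distinct values, so both sides are non-empty
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    if best is None:  # every feature is constant: fall back to a constant stump
        m = sum(residuals) / len(residuals)
        return 0, float("inf"), m, m
    return best[1:]

def fit_gbm(X, y, n_rounds=30, lr=0.3):
    """Gradient boosting for squared loss: each stump is fit to the current residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        j, t, lm, rm = fit_stump(X, residuals)
        stumps.append((j, t, lm, rm))
        preds = [p + lr * (lm if row[j] <= t else rm) for p, row in zip(preds, X)]
    return base, lr, stumps

def predict(model, row):
    base, lr, stumps = model
    return base + lr * sum(lm if row[j] <= t else rm for j, t, lm, rm in stumps)

def cross_val_mse(X, y, k=10, seed=0):
    """Mean squared error on each of k held-out folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    scores = []
    for held_out in (idx[i::k] for i in range(k)):
        held = set(held_out)
        train = [i for i in idx if i not in held]
        model = fit_gbm([X[i] for i in train], [y[i] for i in train])
        errs = [(predict(model, X[i]) - y[i]) ** 2 for i in held_out]
        scores.append(sum(errs) / len(errs))
    return scores

# Synthetic data (assumption): columns are conservation, burial, hydrophobicity change.
rng = random.Random(42)
X = [[rng.randint(0, 9) / 9 for _ in range(3)] for _ in range(200)]
# Hypothetical effect score: lower for conserved, buried positions, plus noise.
y = [1.0 - 0.6 * cons - 0.3 * burial + rng.gauss(0, 0.05) for cons, burial, _ in X]

scores = cross_val_mse(X, y, k=10)
mean_y = sum(y) / len(y)
baseline = sum((v - mean_y) ** 2 for v in y) / len(y)  # error of always predicting the mean
print("mean 10-fold MSE:", sum(scores) / len(scores))
print("baseline variance:", baseline)
```

Comparing the cross-validated error against the variance of the scores gives a quick check that the model has learned more than the mean. A production model would typically use deeper trees and a tuned learning rate.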