The mutational constraint spectrum quantified from variation in 141,456 humans
Author:
Karczewski Konrad J., Francioli Laurent C., Tiao Grace, Cummings Beryl B., Alföldi Jessica, Wang Qingbo, Collins Ryan L., Laricchia Kristen M., Ganna Andrea, Birnbaum Daniel P., Gauthier Laura D., Brand Harrison, Solomonson Matthew, Watts Nicholas A., Rhodes Daniel, Singer-Berk Moriel, England Eleina M., Seaby Eleanor G., Kosmicki Jack A., Walters Raymond K., Tashman Katherine, Farjoun Yossi, Banks Eric, Poterba Timothy, Wang Arcturus, Seed Cotton, Whiffin Nicola, Chong Jessica X., Samocha Kaitlin E., Pierce-Hoffman Emma, Zappala Zachary, O’Donnell-Luria Anne H., Minikel Eric Vallabh, Weisburd Ben, Lek Monkol, Ware James S., Vittal Christopher, Armean Irina M., Bergelson Louis, Cibulskis Kristian, Connolly Kristen M., Covarrubias Miguel, Donnelly Stacey, Ferriera Steven, Gabriel Stacey, Gentry Jeff, Gupta Namrata, Jeandet Thibault, Kaplan Diane, Llanwarne Christopher, Munshi Ruchi, Novod Sam, Petrillo Nikelle, Roazen David, Ruano-Rubio Valentin, Saltzman Andrea, Schleicher Molly, Soto Jose, Tibbetts Kathleen, Tolonen Charlotte, Wade Gordon, Talkowski Michael E., Aguilar Salinas Carlos A., Ahmad Tariq, Albert Christine M., Ardissino Diego, Atzmon Gil, Barnard John, Beaugerie Laurent, Benjamin Emelia J., Boehnke Michael, Bonnycastle Lori L., Bottinger Erwin P., Bowden Donald W., Bown Matthew J., Chambers John C., Chan Juliana C., Chasman Daniel, Cho Judy, Chung Mina K., Cohen Bruce, Correa Adolfo, Dabelea Dana, Daly Mark J., Darbar Dawood, Duggirala Ravindranath, Dupuis Josée, Ellinor Patrick T., Elosua Roberto, Erdmann Jeanette, Esko Tõnu, Färkkilä Martti, Florez Jose, Franke Andre, Getz Gad, Glaser Benjamin, Glatt Stephen J., Goldstein David, Gonzalez Clicerio, Groop Leif, Haiman Christopher, Hanis Craig, Harms Matthew, Hiltunen Mikko, Holi Matti M., Hultman Christina M., Kallela Mikko, Kaprio Jaakko, Kathiresan Sekar, Kim Bong-Jo, Kim Young Jin, Kirov George, Kooner Jaspal, Koskinen Seppo, Krumholz Harlan M., Kugathasan Subra, Kwak Soo Heon, Laakso Markku, Lehtimäki Terho, Loos Ruth J. F., Lubitz Steven A., Ma Ronald C. W., MacArthur Daniel G., Marrugat Jaume, Mattila Kari M., McCarroll Steven, McCarthy Mark I., McGovern Dermot, McPherson Ruth, Meigs James B., Melander Olle, Metspalu Andres, Neale Benjamin M., Nilsson Peter M., O’Donovan Michael C., Ongur Dost, Orozco Lorena, Owen Michael J., Palmer Colin N. A., Palotie Aarno, Park Kyong Soo, Pato Carlos, Pulver Ann E., Rahman Nazneen, Remes Anne M., Rioux John D., Ripatti Samuli, Roden Dan M., Saleheen Danish, Salomaa Veikko, Samani Nilesh J., Scharf Jeremiah, Schunkert Heribert, Shoemaker Moore B., Sklar Pamela, Soininen Hilkka, Sokol Harry, Spector Tim, Sullivan Patrick F., Suvisaari Jaana, Tai E. Shyong, Teo Yik Ying, Tiinamaija Tuomi, Tsuang Ming, Turner Dan, Tusie-Luna Teresa, Vartiainen Erkki, Vawter Marquis P., Ware James S., Watkins Hugh, Weersma Rinse K., Wessman Maija, Wilson James G., Xavier Ramnik J., Neale Benjamin M., Daly Mark J., MacArthur Daniel G.
Abstract
Genetic variants that inactivate protein-coding genes are a powerful source of information about the phenotypic consequences of gene disruption: genes that are crucial for the function of an organism will be depleted of such variants in natural populations, whereas non-essential genes will tolerate their accumulation. However, predicted loss-of-function variants are enriched for annotation errors, and tend to be found at extremely low frequencies, so their analysis requires careful variant annotation and very large sample sizes [1]. Here we describe the aggregation of 125,748 exomes and 15,708 genomes from human sequencing studies into the Genome Aggregation Database (gnomAD). We identify 443,769 high-confidence predicted loss-of-function variants in this cohort after filtering for artefacts caused by sequencing and annotation errors. Using an improved model of human mutation rates, we classify human protein-coding genes along a spectrum that represents tolerance to inactivation, validate this classification using data from model organisms and engineered human cells, and show that it can be used to improve the power of gene discovery for both common and rare diseases.
Publisher
Springer Science and Business Media LLC
Subject
Multidisciplinary
Reference
52 articles.
1. MacArthur, D. G. et al. A systematic survey of loss-of-function variants in human protein-coding genes. Science 335, 823–828 (2012).
2. Schneeberger, K. Using next-generation sequencing to isolate mutant genes from forward genetic screens. Nat. Rev. Genet. 15, 662–676 (2014).
3. Zambrowicz, B. P. & Sands, A. T. Knockouts model the 100 best-selling drugs—will they model the next 100? Nat. Rev. Drug Discov. 2, 38–51 (2003).
4. Lek, M. et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature 536, 285–291 (2016).
5. Chong, J. X. et al. The genetic basis of mendelian phenotypes: discoveries, challenges, and opportunities. Am. J. Hum. Genet. 97, 199–215 (2015).
Cited by
7243 articles.