Affiliations
1. Department of Biological Sciences, Columbia University
2. Department of Evolution and Ecology and Center for Population Biology, University of California
3. Department of Systems Biology, Columbia University
Abstract
A major focus of human genetics is to map severe disease mutations. Increasingly, that goal is understood as requiring huge numbers of people to be sequenced from every broadly defined genetic ancestry group, so as not to miss “ancestry-specific variants.” Here, we consider whether this focus is warranted. We start from first-principles considerations, based on models of mutation-drift-selection balance, which suggest that since severe disease mutations tend to be strongly deleterious, and thus evolutionarily young, they will be kept at relatively constant frequency through recurrent mutation. Therefore, highly pathogenic alleles should be shared identically by descent within extended families, not broad ancestry groups, and sequencing more people should yield similar numbers of such alleles regardless of ancestry. We test the model predictions using gnomAD genetic ancestry groupings and show that they provide a good fit to the classes of variants most likely to be highly pathogenic, notably sets of loss-of-function alleles at strongly constrained genes. These findings clarify that strongly deleterious alleles will be found at comparable rates in people of all ancestries, and that the information they provide about human biology is shared across ancestries.
Publisher
Proceedings of the National Academy of Sciences