A joint use of pooling and imputation for genotyping SNPs-Reference-Cited by-同舟云学术

A joint use of pooling and imputation for genotyping SNPs

Published:2022-10-13 Issue:1 Volume:23 Page:
ISSN:1471-2105
Container-title:BMC Bioinformatics
language:en
Short-container-title:BMC Bioinformatics

Author:

Clouard Camille,Ausmees Kristiina,Nettelblad Carl

Abstract

Abstract Background Despite continuing technological advances, the cost for large-scale genotyping of a high number of samples can be prohibitive. The purpose of this study is to design a cost-saving strategy for SNP genotyping. We suggest making use of pooling, a group testing technique, to drop the amount of SNP arrays needed. We believe that this will be of the greatest importance for non-model organisms with more limited resources in terms of cost-efficient large-scale chips and high-quality reference genomes, such as application in wildlife monitoring, plant and animal breeding, but it is in essence species-agnostic. The proposed approach consists in grouping and mixing individual DNA samples into pools before testing these pools on bead-chips, such that the number of pools is less than the number of individual samples. We present a statistical estimation algorithm, based on the pooling outcomes, for inferring marker-wise the most likely genotype of every sample in each pool. Finally, we input these estimated genotypes into existing imputation algorithms. We compare the imputation performance from pooled data with the Beagle algorithm, and a local likelihood-aware phasing algorithm closely modeled on MaCH that we implemented. Results We conduct simulations based on human data from the 1000 Genomes Project, to aid comparison with other imputation studies. Based on the simulated data, we find that pooling impacts the genotype frequencies of the directly identifiable markers, without imputation. We also demonstrate how a combinatorial estimation of the genotype probabilities from the pooling design can improve the prediction performance of imputation models. Our algorithm achieves 93% concordance in predicting unassayed markers from pooled data, thus it outperforms the Beagle imputation model which reaches 80% concordance. We observe that the pooling design gives higher concordance for the rare variants than traditional low-density to high-density imputation commonly used for cost-effective genotyping of large cohorts. Conclusions We present promising results for combining a pooling scheme for SNP genotyping with computational genotype imputation on human data. These results could find potential applications in any context where the genotyping costs form a limiting factor on the study size, such as in marker-assisted selection in plant breeding.

Funder

Svenska Forskningsrådet Formas

Uppsala University

Publisher

Springer Science and Business Media LLC

Subject

Applied Mathematics,Computer Science Applications,Molecular Biology,Biochemistry,Structural Biology

Link

https://link.springer.com/content/pdf/10.1186/s12859-022-04974-7.pdf

Reference50 articles.

1. Fernández ME, Goszczynski DE, Lirón JP, Villegas-Castagnasso EE, Carino MH, Rogberg-Muñoz MVRA, Posik DM, Peral-García P, Giovambattista G. Comparison of the effectiveness of microsatellites and snp panels for genetic identification, traceability and assessment of parentage in an inbred angus herd. Genet Mol Biol. 2013;36(2):185–91.

2. Cao C, Li C, Huang Z, Ma X, Sun X. Identifying rare variants with optimal depth of coverage and cost-effective overlapping pool sequencing. Genet Epidemiol. 2013;37(8):820–30.

3. Howie B, Marchini J. Genotype imputation for genome-wide association studies. Nat Rev Genet. 2010;11:66.

4. Sung YJ, Gu CC, Tiwari HK, Arnett DK, Broeckel U, Rao DC. Genotype imputation for African Americans using data from hapmap phase ii versus 1000 genomes projects. Genet Epidemiol. 2012;36(5):508–16.

5. Chanda P, Li NYM, et al. Haplotype variation and genotype imputation in African populations. Hum Genet. 2012;57:411–21.

Cited by 2 articles. 订阅此论文施引文献订阅此论文施引文献，注册后可以免费订阅5篇论文的施引文献，订阅后可以查看论文全部施引文献

1. Genotyping of SNPs in bread wheat at reduced cost from pooled experiments and imputation;Theoretical and Applied Genetics;2024-01

2. Half-cost array-based genotyping of SNPs in bread wheat from pooled experiments and imputation;2023-05-22