Split-Transformer Impute (STI): Genotype Imputation Using a Transformer-Based Model

Published: 2023-03-06

Author:

Mowlaei Mohammad Erfan, Li Chong, Chen Junjie, Jamialahmadi Benyamin, Kumar Sudhir, Rebbeck Timothy Richard, Shi Xinghua

Abstract

With recent advances in DNA sequencing technologies, researchers are able to acquire increasingly large volumes of genomic data, enabling the training of powerful models for downstream genomic tasks. However, genome-scale datasets often contain many missing values, which reduces the accuracy and statistical power of conclusions drawn from genomic analyses. Consequently, imputation of missing information by statistical and machine learning methods has become important. We show that the current state of the art can be advanced significantly by applying a novel variation of the Transformer architecture, called Split-Transformer Impute (STI), coupled with improved preprocessing of the data input into deep learning models. We performed extensive experiments to benchmark STI against existing methods using resequencing datasets from the human 1000 Genomes Project and yeast genomes. The results establish the superior performance of our new method over competing genotype imputation methods in terms of accuracy and imputation quality score on the benchmark datasets.

Publisher

Cold Spring Harbor Laboratory

Cited by 2 articles.

1. Genotype imputation methods for whole and complex genomic regions utilizing deep learning technology;Journal of Human Genetics;2024-01-15

2. Deep Learning Methods for Omics Data Imputation;Biology;2023-10-07