Exome-wide benchmark of difficult-to-sequence regions using short-read next-generation DNA sequencing

Author:

Hijikata Atsushi (1), Suyama Mikita (2), Kikugawa Shingo (3), Matoba Ryo (3), Naruto Takuya (4), Enomoto Yumi (4), Kurosawa Kenji (4, 5), Harada Naoki (6), Yanagi Kumiko (7), Kaname Tadashi (7), Miyako Keisuke (8), Takazawa Masaki (8), Sasai Hideo (8, 9), Hosokawa Junichi (8), Itoga Sakae (8), Yamaguchi Tomomi (10, 11, 12), Kosho Tomoki (10, 11, 12), Matsubara Keiko (13, 14), Kuroki Yoko (7, 13), Fukami Maki (14), Adachi Kaori (15), Nanba Eiji (15), Tsuchida Naomi (16, 17), Uchiyama Yuri (16, 17), Matsumoto Naomichi (16), Nishimura Kunihiro (18), Ohara Osamu (8, 12)

Affiliation:

1. Laboratory of Computational Genomics, School of Life Sciences, Tokyo University of Pharmacy and Life Sciences, Hachioji, Tokyo 192-0392, Japan

2. Division of Bioinformatics, Medical Institute of Bioregulation, Kyushu University, Higashi-ku, Fukuoka 812-8582, Japan

3. DNA Chip Research Inc., Minato-ku, Tokyo 105-0022, Japan

4. Clinical Research Institute, Kanagawa Children's Medical Center, Minami-ku, Yokohama, Kanagawa 232-0066, Japan

5. Division of Medical Genetics, Kanagawa Children's Medical Center, Minami-ku, Yokohama, Kanagawa 232-0066, Japan

6. Department of Fundamental Cell Technology, Center for iPS Cell Research and Application (CiRA), Kyoto University, Sakyo-ku, Kyoto 606-8507, Japan

7. Department of Genome Medicine, National Center for Child Health and Development, Setagaya-ku, Tokyo 157-8535, Japan

8. Department of Applied Genomics, Kazusa DNA Research Institute, Kisarazu, Chiba 292-0818, Japan

9. Department of Pediatrics, Graduate School of Medicine, Gifu University, Gifu, Gifu 501-1194, Japan

10. Department of Medical Genetics, Shinshu University School of Medicine, Matsumoto, Nagano 390-8621, Japan

11. Center for Medical Genetics, Shinshu University Hospital, Matsumoto, Nagano 390-8621, Japan

12. Division of Clinical Sequencing, Shinshu University School of Medicine, Matsumoto, Nagano 390-8621, Japan

13. Division of Collaborative Research, National Research Institute for Child Health and Development, Setagaya-ku, Tokyo 157-8535, Japan

14. Department of Molecular Endocrinology, National Research Institute for Child Health and Development, Setagaya-ku, Tokyo 157-8535, Japan

15. Organization for Research Initiative and Promotion, Tottori University, Yonago, Tottori 680-8550, Japan

16. Department of Human Genetics, Yokohama City University Graduate School of Medicine, Kanazawa-ku, Yokohama, Kanagawa 236-0027, Japan

17. Department of Rare Disease Genomics, Yokohama City University Hospital, Yokohama, Kanagawa 236-0027, Japan

18. Xcoo, Inc., Bunkyo-ku, Tokyo 113-0033, Japan

Abstract

Next-generation DNA sequencing (NGS) in short-read mode is now used for genetic testing in various clinical settings. Because NGS data accuracy is crucial in these settings, several reports on quality control of NGS data have been published, focusing primarily on establishing the accuracy of sequence reads. Variant calling is another critical source of NGS errors, yet despite its established significance it remains unexplored at the single-nucleotide level. In this study, we used a machine-learning-based method to establish an exome-wide benchmark of difficult-to-sequence regions at single-nucleotide resolution, using 10 genome sequence features and based on real-world NGS data accumulated in the Genome Aggregation Database (gnomAD) for the human reference genome sequence (GRCh38/hg38). The resulting metric, designated the 'UNMET score', together with additional structural information from the human genome, allowed us to assess how difficult an exonic region of interest is to sequence with conventional short-read NGS. The UNMET score could therefore provide a basis for addressing potential sequencing errors in protein-coding exons of the human reference genome sequence GRCh38/hg38 in clinical sequencing.
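
The abstract does not name the 10 genome sequence features or the model used, so the following is only a minimal sketch of what computing sequence features per nucleotide can look like. The two features shown (local GC fraction and homopolymer run length) and the function names are assumptions chosen for illustration; they are not taken from the paper and do not reproduce the UNMET score.

```python
# Illustrative per-nucleotide sequence features.
# ASSUMPTION: GC fraction and homopolymer length are stand-in features;
# the abstract does not list the 10 features actually used.

def gc_fraction(seq, pos, half_window=5):
    """Fraction of G/C bases in a window centred on pos (clipped at the ends)."""
    start = max(0, pos - half_window)
    end = min(len(seq), pos + half_window + 1)
    window = seq[start:end].upper()
    return sum(base in "GC" for base in window) / len(window)

def homopolymer_run(seq, pos):
    """Length of the run of identical bases that contains position pos."""
    base = seq[pos].upper()
    left = pos
    while left > 0 and seq[left - 1].upper() == base:
        left -= 1
    right = pos
    while right < len(seq) - 1 and seq[right + 1].upper() == base:
        right += 1
    return right - left + 1

def per_base_features(seq, half_window=5):
    """One (gc_fraction, homopolymer_run) tuple for every position in seq."""
    return [
        (gc_fraction(seq, i, half_window), homopolymer_run(seq, i))
        for i in range(len(seq))
    ]

if __name__ == "__main__":
    example = "ACGTAAAAAGC"  # hypothetical sequence, not real exome data
    for i, feats in enumerate(per_base_features(example)):
        print(i, example[i], feats)
```

In a pipeline of the kind the abstract describes, a feature vector like this would be built for every exonic position and passed to a trained model; the choice of model and training labels is not stated in the abstract and is left out here.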

Funder

Kazusa DNA Research Institute

Medical Research Centre Initiative for High Depth Omics at Kyushu University

Publisher

Oxford University Press (OUP)

Subject

Genetics
