Expanded methylome and quantitative trait loci detection by long-read profiling of personal DNA

Author:

Groza Cristian, Ge Bing, Cheung Warren, Pastinen Tomi, Bourque Guillaume

Abstract

Structural variants (SVs) are omnipresent in human DNA, yet their genotype and methylation status are rarely characterized due to previous limitations in genome assembly and in the detection of modified nucleotides. Because of this, the extent to which these regions act as quantitative trait loci is also largely unknown.

Here, we generated a pangenome graph summarizing the SVs in 782 de novo assembled genomes obtained from the Genomic Answers for Kids rare disease cohort. The graph captures 14.6 million CpGs in DNA segments that are absent from the CHM13v2 assembly (SV-CpGs), expanding their number by 43.6%. Next, using 435 methylomes from the same samples, we genotyped a total of 7.99 million SV-CpGs, of which 5.18 million (64.8%) were found to be methylated (SV-5mCpGs) in at least one sample.

To understand the provenance and impact of these novel SV-CpGs, we noted that non-repeat sequences were the leading contributor of SV-CpGs (3.3 × 10^6), followed by centromeric satellites (1.58 × 10^6), simple repeats (1.19 × 10^6), Alus (0.67 × 10^6), satellites (0.39 × 10^6), L1s (0.27 × 10^6), and SVAs (0.19 × 10^6). Meanwhile, the methylation rate of SV-CpGs was highest in repeat sequences. Moreover, in contrast to Alus and L1s, centromeric satellites, simple repeats and SVA sequences were overrepresented in SV-5mCpGs compared to reference CpGs. Similarly, we established that non-reference CpGs were more than twice as likely (37% vs. 15%) to be variable, showing intermediate methylation levels in the population.

Lastly, to explore whether SVs detected in this pangenome are potentially causal for functional variation in the population, we measured methylation quantitative trait loci (SV-mQTLs) using CHM13v2 as a backbone. This revealed 230,464 methylation bins within 100 kbp of a common SV (>5% MAF) showing significant association (at 5% FDR) with methylation variation. Finally, we assessed how many of these SV-mQTLs were the leading QTL variant compared to SNVs and identified 65,659 methylation bins (28.5%) where the leading variant was an SV.

In conclusion, our results demonstrate that graph genome references providing full SV structures, in combination with the associated methylation variation, reveal tens of thousands of QTLs that are more accurately mapped in personal genomes, underscoring the importance of assembly-based analyses of human traits.
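The proportions quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is only a consistency check on the reported counts, not part of the authors' analysis. The variable names are illustrative. The last quantity, the implied number of reference CpGs, rests on the assumption that "expanding their number by 43.6%" is measured relative to the CpGs already present in CHM13v2.

```python
# Counts as quoted in the abstract.
sv_cpgs_in_graph = 14.6e6      # SV-CpGs captured by the pangenome graph
expansion_fraction = 0.436     # reported expansion of the CpG count
genotyped_sv_cpgs = 7.99e6     # SV-CpGs genotyped in 435 methylomes
methylated_sv_cpgs = 5.18e6    # SV-5mCpGs (methylated in at least one sample)
sv_mqtl_bins = 230_464         # methylation bins associated with a common SV
sv_lead_bins = 65_659          # bins where the leading variant is an SV


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


methylated_pct = percent(methylated_sv_cpgs, genotyped_sv_cpgs)
sv_lead_pct = percent(sv_lead_bins, sv_mqtl_bins)

# Assumption: the 43.6% expansion is relative to CpGs in CHM13v2, so the
# implied reference CpG count is the SV-CpG count divided by that fraction.
implied_reference_cpgs = sv_cpgs_in_graph / expansion_fraction

print(f"Methylated SV-CpGs: {methylated_pct}%")
print(f"Bins led by an SV:  {sv_lead_pct}%")
print(f"Implied reference CpGs: {implied_reference_cpgs / 1e6:.1f} million")
```

Both recomputed percentages agree with the figures stated in the abstract (64.8% and 28.5%). The implied reference count is a derived estimate under the stated assumption, not a number reported by the authors.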

Publisher

Cold Spring Harbor Laboratory
