mEthAE: an Explainable AutoEncoder for methylation data

Author:

Katz Sonja, Martins dos Santos Vitor A.P., Saccenti Edoardo, Roshchupkin Gennady V.

Abstract

Despite the wealth of knowledge generated through epigenome-wide association studies (EWAS), our understanding of the relationships between CpG sites is still limited, as analysis of DNA methylation data remains difficult due to its high dimensionality. To combat this problem, deep learning algorithms such as autoencoders are increasingly applied to capture complex patterns and reduce dimensionality into a latent space. We believe that the way an autoencoder groups CpGs together in its latent dimensions has biological meaning and might reveal novel insights into the relationships between CpGs. Therefore, in this work, we propose a chromosome-wise autoencoder for interpretable dimensionality reduction of methylation data (mEthAE). Our framework reduces dimensionality by up to 400-fold relative to the input without compromising reconstruction accuracy or predictive power in the latent space. Through our perturbation-based interpretability approach, we revealed groups of CpGs that are highly connected across all latent dimensions (global CpGs) and were significantly more often reported in EWAS, indicating that our interpretability method can successfully identify CpGs with biological relevance. In an attempt to gain a deeper understanding of the relationships between individual CpG sites, we focused on interpreting individual latent features and found that CpGs connected to a common feature do not share biological associations or correlation patterns, nor are they located in close proximity on the chromosome. We conclude that, while there is evidence that the autoencoder does not group CpGs randomly, the logic behind the observed CpG relationships cannot be delineated easily. Based on the analyses done in this work, we believe that the autoencoder groups CpGs according to long-range non-linear interaction patterns that lack characterisation in the current epigenetic research landscape.
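The abstract does not spell out how the perturbation-based interpretability step works, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a toy, hand-weighted encoder with `tanh` units in place of a trained chromosome-wise autoencoder, random beta values in place of real methylation data, and a hypothetical rule that calls a CpG "global" when shifting it changes every latent feature by more than a threshold. All function names, weights, and thresholds are illustrative assumptions.

```python
import math
import random


def encode(x, weights):
    """Toy encoder (assumption): one tanh unit per latent feature."""
    return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in weights]


def perturbation_scores(samples, weights, delta=0.1):
    """Mean absolute change of each latent feature when one input is shifted.

    Returns scores[j][k]: effect of input (CpG) j on latent feature k.
    """
    n_inputs = len(samples[0])
    n_latent = len(weights)
    scores = [[0.0] * n_latent for _ in range(n_inputs)]
    for x in samples:
        base = encode(x, weights)
        for j in range(n_inputs):
            perturbed = list(x)
            # Keep the perturbed beta value inside [0, 1].
            perturbed[j] = min(1.0, max(0.0, perturbed[j] + delta))
            shifted = encode(perturbed, weights)
            for k in range(n_latent):
                scores[j][k] += abs(shifted[k] - base[k]) / len(samples)
    return scores


def global_inputs(scores, threshold):
    """Inputs whose score exceeds the threshold in every latent feature."""
    return [j for j, row in enumerate(scores) if all(s > threshold for s in row)]


random.seed(0)
# 20 hypothetical samples with 6 CpG beta values each (made-up data).
samples = [[random.uniform(0.0, 0.8) for _ in range(6)] for _ in range(20)]
# Hypothetical hand-set weights (2 latent features x 6 CpGs);
# only CpG 0 feeds both latent features.
weights = [
    [1.0, 0.8, 0.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.9, 0.0, 0.0],
]
scores = perturbation_scores(samples, weights)
print(global_inputs(scores, threshold=1e-3))
```

In this toy setup only the CpG wired to both latent features is flagged as global, while CpGs with zero weights receive a score of exactly zero. A real analysis would replace `encode` with the trained encoder and choose the perturbation size and threshold empirically.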

Publisher

Cold Spring Harbor Laboratory

