MGnify: the microbiome sequence data analysis resource in 2023

Author:

Richardson Lorna (1), Allen Ben (2), Baldi Germana (1), Beracochea Martin (1), Bileschi Maxwell L (3), Burdett Tony (1), Burgin Josephine (1), Caballero-Pérez Juan (1), Cochrane Guy (1), Colwell Lucy J (3, 4), Curtis Tom (2), Escobar-Zepeda Alejandra (1), Gurbich Tatiana A (1), Kale Varsha (1), Korobeynikov Anton (5), Raj Shriya (1), Rogers Alexander B (1), Sakharova Ekaterina (1), Sanchez Santiago (1), Wilkinson Darren J (6), Finn Robert D (1)

Affiliation:

1. European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK

2. School of Engineering, Newcastle University, Newcastle upon Tyne, UK

3. Google Research, Brain Team, Mountain View, CA, USA

4. Department of Chemistry, University of Cambridge, Cambridge, UK

5. Center for Algorithmic Biotechnology, St Petersburg State University, St Petersburg, Russia

6. Department of Mathematical Sciences, Durham University, Durham, UK

Abstract

The MGnify platform (https://www.ebi.ac.uk/metagenomics) facilitates the assembly, analysis and archiving of microbiome-derived nucleic acid sequences. The platform provides access to taxonomic assignments and functional annotations for nearly half a million analyses covering metabarcoding, metatranscriptomic and metagenomic datasets derived from a wide range of environments. Over the past 3 years, MGnify has grown not only in the number of datasets it contains but also in the breadth of analyses it provides, such as the analysis of long-read sequences. The MGnify protein database now exceeds 2.4 billion non-redundant sequences predicted from metagenomic assemblies. This collection is now organised into a relational database, a major improvement that makes it possible to understand the genomic context of a protein by navigating back to the source assembly and sample metadata. To extend beyond the functional annotations already provided in MGnify, we have applied deep learning-based annotation methods. The technology underlying MGnify's Application Programming Interface (API) and website has been upgraded, and a coupled Jupyter Lab environment now enables downstream analysis of MGnify data.

Funder

European Union's Horizon 2020 Research and Innovation programme

Biotechnology and Biological Sciences Research Council

ELIXIR

Russian Science Foundation

European Molecular Biology Laboratory

UK Research and Innovation

Publisher

Oxford University Press (OUP)

Subject

Genetics

Cited by 56 articles.