scNAME: neighborhood contrastive clustering with ancillary mask estimation for scRNA-seq data

Published:2022-01-06 Issue:6 Volume:38 Page:1575-1583
ISSN:1367-4803
Container-title:Bioinformatics
language:en

Author:

Wan Hui¹, Chen Liang¹, Deng Minghua¹,²,³

Affiliation:

1. School of Mathematical Sciences, Peking University, Beijing 100871, China

2. Center for Quantitative Biology, Peking University, Beijing 100871, China

3. Center for Statistical Science, Peking University, Beijing 100871, China

Abstract

Motivation: The rapid development of single-cell RNA sequencing (scRNA-seq) makes it possible to study the heterogeneity of individual cell characteristics. Cell clustering is a vital procedure in scRNA-seq analysis, providing insight into complex biological phenomena. However, the noisy, high-dimensional and large-scale nature of scRNA-seq data introduces challenges in clustering analysis. Many deep learning-based methods have emerged that learn underlying feature representations while clustering. However, these methods are inefficient at identifying rare cell types and make little integrated use of gene dependencies and cell similarity. As a result, they cannot detect the clear cell type structure required for clustering accuracy and downstream analysis.

Results: Here, we propose a novel scRNA-seq clustering algorithm called scNAME, which incorporates a mask estimation task for gene pertinence mining and a neighborhood contrastive learning framework for cell intrinsic structure exploitation. The pattern learned through mask estimation helps reveal uncorrupted data structure and denoise the original single-cell data. In addition, the randomly created augmented data introduced in contrastive learning not only helps improve the robustness of clustering, but also increases the sample size in each cluster for better data capacity. Beyond this, we also introduce a neighborhood contrastive paradigm with an offline memory bank, global in scope, which encourages discriminative feature representations and achieves intra-cluster compactness together with inter-cluster separation. The combination of the mask estimation task, neighborhood contrastive learning and the global memory bank designed in scNAME is conducive to rare cell type detection. The experimental results on both simulated and real data confirm that our method is accurate, robust and scalable. We also perform biological analysis, including marker gene identification, gene ontology and pathway enrichment analysis, to validate the biological significance of our method. To the best of our knowledge, we are among the first to introduce a gene relationship exploration strategy, as well as a global cellular similarity repository, in the single-cell field.

Availability and implementation: An implementation of scNAME is available from https://github.com/aster-ww/scNAME.

Supplementary information: Supplementary data are available at Bioinformatics online.
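The abstract names a mask estimation task but does not spell out how the input is corrupted. As a rough illustration only, the sketch below assumes one common scheme: each entry of a cells-by-genes matrix is selected with some probability, and selected entries are replaced by a value drawn from the same gene in another cell. The binary mask then serves as the target a network would be trained to predict. The function name `corrupt_with_mask` and the parameters `mask_prob` and `seed` are hypothetical and are not taken from the scNAME repository.

```python
import numpy as np


def corrupt_with_mask(x, mask_prob=0.3, seed=0):
    """Return a corrupted copy of x and the binary mask marking selected entries.

    x is a (cells, genes) expression matrix. This is an assumed corruption
    scheme for illustration, not necessarily the one used by scNAME.
    """
    rng = np.random.default_rng(seed)
    # Select each entry independently with probability mask_prob.
    mask = rng.random(x.shape) < mask_prob
    # Shuffle every gene column independently across cells, so replacement
    # values follow each gene's empirical distribution.
    shuffled = rng.permuted(x, axis=0)
    # Selected entries take the shuffled value; all others are left unchanged.
    x_corrupt = np.where(mask, shuffled, x)
    return x_corrupt, mask.astype(np.float32)


if __name__ == "__main__":
    # Toy count matrix: 6 cells, 4 genes.
    counts = np.random.default_rng(42).poisson(2.0, size=(6, 4)).astype(float)
    corrupted, mask = corrupt_with_mask(counts, mask_prob=0.5, seed=1)
    print(corrupted)
    print(mask)
```

In a setup of this kind, an encoder would receive the corrupted matrix and a small head would be trained with a binary cross-entropy loss to recover the mask. Predicting which entries were replaced requires the network to learn which gene values are plausible given the other genes in the same cell.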

Funder

National Key Research and Development Program of China

National Natural Science Foundation of China

Publisher

Oxford University Press (OUP)

Subject

Computational Mathematics, Computational Theory and Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Statistics and Probability

Link

https://academic.oup.com/bioinformatics/advance-article-pdf/doi/10.1093/bioinformatics/btac011/42257405/btac011.pdf

References: 28 articles.

1. Cell adhesion molecules in the normal and cancerous mammary gland;Alford;J. Mammary Gland Biol. Neoplasia,1996

2. Single-cell transcriptome data clustering via multinomial modeling and adaptive fuzzy k-means algorithm;Chen;Front. Genet,2020

3. Deep soft k-means clustering with self-training for single-cell RNA sequence data;Chen;NAR Genomics Bioinf,2020

4. Integrating deep supervised, self-supervised and unsupervised learning for single-cell RNA-seq clustering and annotation;Chen;Genes,2020

5. Single-cell RNA-seq data semi-supervised clustering and annotation via structural regularized domain adaptation;Chen;Bioinformatics,2021

Cited by 16 articles.

1. Improving cell type identification with Gaussian noise-augmented single-cell RNA-seq contrastive learning;Briefings in Functional Genomics;2024-01-18

2. scMAE: a masked autoencoder for single-cell RNA-seq clustering;Bioinformatics;2024-01-01

3. scGCC: Graph Contrastive Clustering With Neighborhood Augmentations for scRNA-Seq Data Analysis;IEEE Journal of Biomedical and Health Informatics;2023-12

4. Multi-View Clustering With Graph Learning for scRNA-Seq Data;IEEE/ACM Transactions on Computational Biology and Bioinformatics;2023-11

5. Deep enhanced constraint clustering based on contrastive learning for scRNA-seq data;Briefings in Bioinformatics;2023-06-13