The increasing number of single-cell transcriptomics and single-cell RNA sequencing studies is allowing for a deeper understanding of the molecular processes underlying the normal development of an organism, as well as the onset of pathologies. In this context, cell type annotation is a crucial step in the analysis of single-cell RNA sequencing data. It is usually performed manually by expert biologists, a process that is time-consuming and possibly biased. Recently, alternative computational tools have been proposed to perform automatic cell identification, based on either supervised or unsupervised Machine Learning approaches. These methods typically exploit gene expression data and curated marker gene databases to associate the gene expression profiles of single cells with a cell type. In this paper, we propose a novel fully automatic computational pipeline, named single-cell Automatic Labeling of cell POpulations (scALPO), which leverages a Long Short-Term Memory Neural Network to assign the cell types. Specifically, scALPO can label the provided clusters by relying only on marker genes rather than on gene expression values. Our results, obtained on two different datasets, show that scALPO outperforms the most promising state-of-the-art approaches (i.e., SCSA and scType), achieving a cell type annotation more similar to the manually created ground truth.
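
To make the idea of labeling clusters from marker genes alone more concrete, the following is a minimal sketch, not scALPO itself: it replaces the Long Short-Term Memory network with a plain Jaccard overlap between each cluster's marker genes and a reference table. The table `REFERENCE_MARKERS`, the function names, and the example clusters are all hypothetical and chosen only for illustration; no expression values are used.

```python
from typing import Dict, Set

# Hypothetical reference table: cell type -> known marker genes.
# Illustrative entries only; they are not taken from the paper.
REFERENCE_MARKERS: Dict[str, Set[str]] = {
    "T cell": {"CD3D", "CD3E", "CD2", "IL7R"},
    "B cell": {"CD79A", "CD79B", "MS4A1", "CD19"},
    "Monocyte": {"CD14", "LYZ", "S100A8", "S100A9"},
}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity between two gene sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def label_cluster(cluster_markers: Set[str],
                  reference: Dict[str, Set[str]] = REFERENCE_MARKERS) -> str:
    """Return the reference cell type whose markers best overlap the cluster's.

    Assumption: a simple overlap score stands in for the learned model.
    Clusters sharing no marker with any reference type are left "Unknown".
    """
    scores = {cell_type: jaccard(cluster_markers, genes)
              for cell_type, genes in reference.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "Unknown"


# Hypothetical clusters, each described only by its marker gene names.
clusters = {
    0: {"CD3D", "CD3E", "IL7R", "CCR7"},
    1: {"MS4A1", "CD79A", "HLA-DRA"},
    2: {"GAPDH", "ACTB"},
}
labels = {cluster_id: label_cluster(genes) for cluster_id, genes in clusters.items()}
print(labels)
```

The input to each call is only a set of gene names per cluster, which is the property the abstract highlights. A learned model such as the one proposed in the paper would presumably replace the overlap score, but its architecture and training data are not described here.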

(2022). A Deep Learning Pipeline for the Automatic cell type Assignment of scRNA-seq Data. Retrieved from http://hdl.handle.net/10446/229830

A Deep Learning Pipeline for the Automatic cell type Assignment of scRNA-seq Data

Cazzaniga, Paolo; Tangherloni, Andrea
2022

Riva, Simone G.; Myers, Brynelle; Cazzaniga, Paolo; Tangherloni, Andrea

Use this identifier to cite or link to this document: https://hdl.handle.net/10446/229830