Journal articles on the topic 'Cell Annotation'

Consult the top 50 journal articles for your research on the topic 'Cell Annotation.'

1

Huang, Xiaoqian, Ruiqi Liu, Shiwei Yang, Xiaozhou Chen, and Huamei Li. "scAnnoX: an R package integrating multiple public tools for single-cell annotation." PeerJ 12 (March 28, 2024): e17184. http://dx.doi.org/10.7717/peerj.17184.

Abstract:
Background: Single-cell annotation plays a crucial role in the analysis of single-cell genomics data. Despite the existence of numerous single-cell annotation algorithms, a comprehensive tool for integrating and comparing these algorithms is still lacking. Methods: This study investigated a range of widely adopted single-cell annotation algorithms. Ten single-cell annotation algorithms were selected, each classified as either a reference dataset-dependent or a marker gene-dependent approach. These algorithms included SingleR, Seurat, sciBet, scmap, CHETAH, scSorter, sc.type, cellID, scCATCH, and SCINA. Building upon these algorithms, we developed an R package named scAnnoX for the integration and comparative analysis of single-cell annotation algorithms. Results: The scAnnoX software package provides a cohesive framework for annotating cells in scRNA-seq data, enabling researchers to more efficiently perform comparative analyses among the cell type annotations contained in scRNA-seq datasets. The integrated environment of scAnnoX streamlines the testing, evaluation, and comparison processes among various algorithms. Among the ten annotation tools evaluated, SingleR, Seurat, sciBet, and scSorter emerged as top-performing algorithms in terms of prediction accuracy, with SingleR and sciBet performing best, offering guidance for users. Interested parties can access the scAnnoX package at https://github.com/XQ-hub/scAnnoX.
2

Vădineanu, Serban, Daniël M. Pelt, Oleh Dzyubachyk, and Kees Joost Batenburg. "Reducing Manual Annotation Costs for Cell Segmentation by Upgrading Low-Quality Annotations." Journal of Imaging 10, no. 7 (July 17, 2024): 172. http://dx.doi.org/10.3390/jimaging10070172.

Abstract:
Deep-learning algorithms for cell segmentation typically require large data sets with high-quality annotations to be trained with. However, the annotation cost for obtaining such sets may prove to be prohibitively expensive. Our work aims to reduce the time necessary to create high-quality annotations of cell images by using a relatively small well-annotated data set for training a convolutional neural network to upgrade lower-quality annotations, produced at lower annotation costs. We investigate the performance of our solution when upgrading the annotation quality for labels affected by three types of annotation error: omission, inclusion, and bias. We observe that our method can upgrade annotations affected by high error levels from 0.3 to 0.9 Dice similarity with the ground-truth annotations. We also show that a relatively small well-annotated set enlarged with samples with upgraded annotations can be used to train better-performing cell segmentation networks compared to training only on the well-annotated set. Moreover, we present a use case where our solution can be successfully employed to increase the quality of the predictions of a segmentation network trained on just 10 annotated samples.
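The Dice similarity used as the quality measure in this abstract can be illustrated with a minimal sketch. The function name `dice_similarity` and the toy masks are assumptions made here for illustration; they are not taken from the paper, which works on full cell images.

```python
def dice_similarity(pred, truth):
    """Dice coefficient between two binary masks given as equal-length 0/1 sequences."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    # Pixels marked as cell in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total number of foreground pixels across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical flattened masks: the low-quality label omits half of the cell.
truth = [1, 1, 1, 1, 0, 0, 0, 0]
omitted = [1, 1, 0, 0, 0, 0, 0, 0]
print(dice_similarity(omitted, truth))
```

A value of 1.0 means the annotation matches the ground truth exactly, and values near 0 mean almost no overlap.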
3

Hia, Nazifa Tasnim, and Sumon Ahmed. "Automatic cell type annotation using supervised classification: A systematic literature review." Systematic Literature Review and Meta-Analysis Journal 3, no. 3 (October 21, 2022): 99–108. http://dx.doi.org/10.54480/slrm.v3i3.45.

Abstract:
Single-cell sequencing gives us the opportunity to analyze cells on an individual level rather than at a population level. There are different types of sequencing based on the stage and portion of the cell from which the data are collected. Among these, single-cell RNA-seq (scRNA-seq) is the most widely used, and most applications of cell type annotation have been on scRNA-seq data. Tools have been developed for automatic cell type annotation because manual annotation of cell types is time-consuming and partially subjective. There are three main strategies for associating cell types with the gene expression profiles of single cells: using marker gene databases, correlating expression data, and transferring labels by supervised classification. In this SLR, we present a comprehensive evaluation of the available tools and the underlying approaches to perform automated cell type annotations on scRNA-seq data.
4

Xu, Yang, Simon J. Baumgart, Christian M. Stegmann, and Sikander Hayat. "MACA: marker-based automatic cell-type annotation for single-cell expression data." Bioinformatics 38, no. 6 (December 22, 2021): 1756–60. http://dx.doi.org/10.1093/bioinformatics/btab840.

Abstract:
Summary: Accurately identifying cell types is a critical step in single-cell sequencing analyses. Here, we present marker-based automatic cell-type annotation (MACA), a new tool for annotating single-cell transcriptomics datasets. We developed MACA by testing four cell-type scoring methods with two public cell-marker databases as reference in six single-cell studies. MACA compares favorably to four existing marker-based cell-type annotation methods in terms of accuracy and speed. We show that MACA can annotate a large single-nuclei RNA-seq study of human hearts with ∼290K cells in minutes. MACA scales easily to large datasets and can broadly help experts to annotate cell types in single-cell transcriptomics datasets, and we envision that MACA provides a new opportunity for integration and standardization of cell-type annotation across multiple datasets. Availability and implementation: MACA is written in Python and released under GNU General Public License v3.0. The source code is available at https://github.com/ImXman/MACA. Supplementary information: Supplementary data are available at Bioinformatics online.
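As a rough illustration of what marker-based annotation means in general, the sketch below scores one cell against a few marker sets and picks the best-scoring type. This is not MACA's scoring method, which the abstract does not describe in detail; the function name `annotate_by_markers`, the mean-expression score, the marker lists, and the expression values are all assumptions chosen for illustration.

```python
def annotate_by_markers(expression, markers):
    """Return (best cell type, all scores) for one cell.

    expression: dict mapping gene symbol -> normalized expression value.
    markers: dict mapping cell type -> list of marker gene symbols.
    The score is the mean expression of a type's markers (an assumed,
    deliberately simple scoring rule).
    """
    scores = {
        cell_type: sum(expression.get(gene, 0.0) for gene in genes) / len(genes)
        for cell_type, genes in markers.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical marker sets and a hypothetical cell profile.
markers = {
    "T cell": ["CD3D", "CD3E"],
    "B cell": ["CD79A", "MS4A1"],
    "Monocyte": ["CD14", "LYZ"],
}
cell = {"CD3D": 2.1, "CD3E": 1.7, "LYZ": 0.2}

label, scores = annotate_by_markers(cell, markers)
print(label, scores)
```

Real tools typically normalize scores across cells and handle ties or low-confidence calls, which this sketch omits.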
5

Gill, Jaidip, Abhijit Dasgupta, Brychan Manry, and Natasha Markuzon. "Abstract 4927: Combining single-cell ATAC and RNA sequencing for supervised cell annotation." Cancer Research 84, no. 6_Supplement (March 22, 2024): 4927. http://dx.doi.org/10.1158/1538-7445.am2024-4927.

Abstract:
Background: Analysis of samples at the single-cell level offers insights into cellular heterogeneity and cell function. Cell type annotation is the first critical step for performing such an analysis. While current methods primarily utilize single-cell RNA sequencing (scRNA-seq) for annotation, several studies have demonstrated improved classification accuracy by combining scRNA-seq with transposase-accessible chromatin sequencing (ATAC-seq) using unsupervised methods. However, the utility of ATAC-seq features for supervised cell-type annotation has not been explored. Aims/Objectives: The objective of this study was to evaluate the relative performance of supervised cell-type classification using scRNA-seq alone vs. in multimodal combination with ATAC-seq; and how these data interplay with choice of classification and dimensionality reduction methods. Methods: A peripheral-blood mononuclear cell multi-omic dataset from a single, healthy female donor was analysed in this study. Ground truth annotations were generated using unsupervised annotation with the weighted nearest neighbour clustering method. Two dimensionality reduction methods (principal component analysis (PCA), single-cell Variational Inference (scVI) autoencoder) and four classification models (logistic regression, random forest, support vector machine (SVM)) were implemented and performance metrics (F1 score, precision, and recall) were compared over 10 bootstrap samples. Results: ATAC-seq features improved annotation quality and prediction confidence when using scVI embeddings, independent of the classifier. The best-performing model (SVM with scVI embeddings) showed an increase from a median macro F1 score of 0.907 (IQR = [0.902, 0.910]) using scRNA-seq alone to 0.946 (IQR = [0.940, 0.949], p <0.05) with ATAC-seq added. For PCA embeddings, improvements in macro F1 score were insignificant. 
All cell types (B, T, monocytes, natural killer and dendritic cells) showed significant improvements when using ATAC-seq with scVI embeddings. CD4 T effector memory cells showed the largest gain in F1 score (0.112, p <0.01), whilst type-2 conventional dendritic cells showed the smallest improvement (0.006, p <0.05). Prediction confidence was improved in B cells, monocytes, natural killer cells, CD4 and CD8 naïve cells, CD4 T central memory cells and CD8 T effector memory cells. Improvements in F1 scores were lost when only classifying major cell types rather than subtypes. Conclusions: Employing ATAC-seq embeddings with scVI autoencoder enhances supervised annotation quality over scRNA-only methods. Further studies should explore the use of ATAC to improve the annotation of highly heterogeneous tissues such as tumours. Citation Format: Jaidip Gill, Abhijit Dasgupta, Brychan Manry, Natasha Markuzon. Combining single-cell ATAC and RNA sequencing for supervised cell annotation [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2024; Part 1 (Regular Abstracts); 2024 Apr 5-10; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2024;84(6_Suppl):Abstract nr 4927.
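The macro F1 score reported in this abstract averages the per-class F1 scores with equal weight, so rare cell types count as much as common ones. The sketch below shows the computation in plain Python; the function name `macro_f1` and the toy labels are illustrative assumptions, not data from the study.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores over all labels seen."""
    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        per_class.append(2 * tp / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


# Hypothetical labels for six cells.
y_true = ["B", "B", "T", "T", "NK", "NK"]
y_pred = ["B", "B", "T", "NK", "NK", "NK"]
print(round(macro_f1(y_true, y_pred), 4))
```

Here the B class is predicted perfectly, one T cell is mislabeled as NK, and the macro average reflects both errors equally regardless of class size.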
6

Zhou, Xiao, Miao Gu, and Zhen Cheng. "Local Integral Regression Network for Cell Nuclei Detection." Entropy 23, no. 10 (October 14, 2021): 1336. http://dx.doi.org/10.3390/e23101336.

Abstract:
Nuclei detection is a fundamental task in the field of histopathology image analysis and remains challenging due to cellular heterogeneity. Recent studies explore convolutional neural networks to either isolate them with sophisticated boundaries (segmentation-based methods) or locate the centroids of the nuclei (counting-based approaches). Although these two methods have demonstrated superior success, their fully supervised training demands considerable and laborious pixel-wise annotations manually labeled by pathology experts. To alleviate such tedious effort and reduce the annotation cost, we propose a novel local integral regression network (LIRNet) that allows both fully and weakly supervised learning (FSL/WSL) frameworks for nuclei detection. Furthermore, the LIRNet can output an exquisite density map of nuclei, in which the localization of each nucleus is barely affected by the post-processing algorithms. The quantitative experimental results demonstrate that the FSL version of the LIRNet achieves a state-of-the-art performance compared to other counterparts. In addition, the WSL version has exhibited a competitive detection performance and an effortless data annotation that requires only 17.5% of the annotation effort.
8

Cheng, Changde, Wenan Chen, Hongjian Jin, and Xiang Chen. "A Review of Single-Cell RNA-Seq Annotation, Integration, and Cell–Cell Communication." Cells 12, no. 15 (July 30, 2023): 1970. http://dx.doi.org/10.3390/cells12151970.

Abstract:
Single-cell RNA sequencing (scRNA-seq) has emerged as a powerful tool for investigating cellular biology at an unprecedented resolution, enabling the characterization of cellular heterogeneity, identification of rare but significant cell types, and exploration of cell–cell communications and interactions. Its broad applications span both basic and clinical research domains. In this comprehensive review, we survey the current landscape of scRNA-seq analysis methods and tools, focusing on count modeling, cell-type annotation, data integration, including spatial transcriptomics, and the inference of cell–cell communication. We review the challenges encountered in scRNA-seq analysis, including issues of sparsity or low expression, reliability of cell annotation, and assumptions in data integration, and discuss the potential impact of suboptimal clustering and differential expression analysis tools on downstream analyses, particularly in identifying cell subpopulations. Finally, we discuss recent advancements and future directions for enhancing scRNA-seq analysis. Specifically, we highlight the development of novel tools for annotating single-cell data, integrating and interpreting multimodal datasets covering transcriptomics, epigenomics, and proteomics, and inferring cellular communication networks. By elucidating the latest progress and innovation, we provide a comprehensive overview of the rapidly advancing field of scRNA-seq analysis.
9

Long, Helen, Richard Reeves, and Michelle M. Simon. "Mouse genomic and cellular annotations." Mammalian Genome 33, no. 1 (February 5, 2022): 19–30. http://dx.doi.org/10.1007/s00335-021-09936-7.

Abstract:
Mice have emerged as one of the most popular and valuable model organisms in the research of human biology. This is due to their genetic and physiological similarity to humans, short generation times, availability of genetically homologous inbred strains, and relatively easy laboratory maintenance. Therefore, following the release of the initial human reference genome, the generation of the mouse reference genome was prioritised and represented an important scientific resource for the mouse genetics community. In 2002, the Mouse Genome Sequencing Consortium published an initial draft of the mouse reference genome which contained ~96% of the euchromatic genome of female C57BL/6J mice. Almost two decades on from the publication of the initial draft, sequencing efforts have continued to increase the completeness and accuracy of the C57BL/6J reference genome alongside advances in genome annotation. Additionally, new sequencing technologies have provided a wealth of data that has added to the repertoire of annotations associated with traditional genomic annotations, including but not limited to advances in regulatory elements, the 3D genome and individual cellular states. In this review we focus on the reference genome C57BL/6J and summarise the different aspects of genomic and cellular annotations, as well as their relevance to mouse genetic research. We denote a genomic annotation as a functional unit of the genome. Cellular annotations are annotations of cell type or state, defined by the transcriptomic expression profile of a cell. Due to the wide-ranging number and diversity of annotations describing the mouse genome, we focus on gene, repeat and regulatory element annotation as well as two relatively new technologies, 3D genome architecture and single-cell sequencing, outlining their utility in genetic research and their current challenges.
10

Wei, Ziyang, and Shuqin Zhang. "CALLR: a semi-supervised cell-type annotation method for single-cell RNA sequencing data." Bioinformatics 37, Supplement_1 (July 1, 2021): i51–i58. http://dx.doi.org/10.1093/bioinformatics/btab286.

Abstract:
Motivation: Single-cell RNA sequencing (scRNA-seq) technology has been widely applied to capture the heterogeneity of different cell types within complex tissues. An essential step in scRNA-seq data analysis is the annotation of cell types. Traditional cell-type annotation mainly clusters the cells first, and then uses the aggregated cluster-level expression profiles and the marker genes to label each cluster. Such methods are greatly dependent on the clustering results, which are insufficient for accurate annotation. Results: In this article, we propose a semi-supervised learning method for cell-type annotation called CALLR. It combines unsupervised learning represented by the graph Laplacian matrix constructed from all the cells and supervised learning using sparse logistic regression. By alternately updating the cell clusters and annotation labels, high annotation accuracy can be achieved. The model is formulated as an optimization problem, and a computationally efficient algorithm is developed to solve it. Experiments on 10 real datasets show that CALLR outperforms the compared (semi-)supervised learning methods, and the popular clustering methods. Availability and implementation: The implementation of CALLR is available at https://github.com/MathSZhang/CALLR. Supplementary information: Supplementary data are available at Bioinformatics online.
11

Yuan, Musu, Liang Chen, and Minghua Deng. "scMRA: a robust deep learning method to annotate scRNA-seq data with multiple reference datasets." Bioinformatics 38, no. 3 (October 8, 2021): 738–45. http://dx.doi.org/10.1093/bioinformatics/btab700.

Abstract:
Motivation: Single-cell RNA-seq (scRNA-seq) has been widely used to resolve cellular heterogeneity. After collecting scRNA-seq data, the natural next step is to integrate the accumulated data to achieve a common ontology of cell types and states. Thus, an effective and efficient cell-type identification method is urgently needed. Meanwhile, high-quality reference data remain a necessity for precise annotation. However, such tailored reference data are always lacking in practice. To address this, we aggregated multiple datasets into a meta-dataset on which annotation is conducted. Existing supervised or semi-supervised annotation methods suffer from batch effects caused by different sequencing platforms, the effect of which increases in severity with multiple reference datasets. Results: Herein, a robust deep learning-based single-cell Multiple Reference Annotator (scMRA) is introduced. In scMRA, a knowledge graph is constructed to represent the characteristics of cell types in different datasets, and a graphic convolutional network serves as a discriminator based on this graph. scMRA keeps intra-cell-type closeness and the relative position of cell types across datasets. scMRA is remarkably powerful at transferring knowledge from multiple reference datasets, to the unlabeled target domain, thereby gaining an advantage over other state-of-the-art annotation methods in multi-reference data experiments. Furthermore, scMRA can remove batch effects. To the best of our knowledge, this is the first attempt to use multiple insufficient reference datasets to annotate target data, and it is, comparatively, the best annotation method for multiple scRNA-seq datasets. Availability and implementation: An implementation of scMRA is available from https://github.com/ddb-qiwang/scMRA-torch. Supplementary information: Supplementary data are available at Bioinformatics online.
12

Zhao, Zipei, Fengqian Pang, Yaou Liu, Zhiwen Liu, and Chuyang Ye. "Positive-unlabeled learning for binary and multi-class cell detection in histopathology images with incomplete annotations." Machine Learning for Biomedical Imaging 1, December 2022 (February 17, 2023): 1–30. http://dx.doi.org/10.59275/j.melba.2022-8g31.

Abstract:
Cell detection in histopathology images is of great interest to clinical practice and research, and convolutional neural networks (CNNs) have achieved remarkable cell detection results. Typically, to train CNN-based cell detection models, every positive instance in the training images needs to be annotated, and instances that are not labeled as positive are considered negative samples. However, manual cell annotation is complicated due to the large number and diversity of cells, and it can be difficult to ensure the annotation of every positive instance. In many cases, only incomplete annotations are available, where some of the positive instances are annotated and the others are not, and the classification loss term for negative samples in typical network training becomes incorrect. In this work, to address this problem of incomplete annotations, we propose to reformulate the training of the detection network as a positive-unlabeled learning problem. Since the instances in unannotated regions can be either positive or negative, they have unknown labels. Using the samples with unknown labels and the positively labeled samples, we first derive an approximation of the classification loss term corresponding to negative samples for binary cell detection, and based on this approximation we further extend the proposed framework to multi-class cell detection. For evaluation, experiments were performed on four publicly available datasets. The experimental results show that our method improves the performance of cell detection in histopathology images given incomplete annotations for network training.
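The idea of approximating the negative-sample loss term from positive and unlabeled samples can be sketched with the standard risk estimator from the positive-unlabeled learning literature: if unlabeled samples follow the overall data distribution and the positive class prior is pi, then the negative-class risk (weighted by 1 - pi) equals the mean loss of unlabeled samples treated as negative minus pi times the mean loss of positive samples treated as negative. The paper's exact derivation may differ; the function name, the clamping option, and the numbers below are assumptions for illustration.

```python
def pu_negative_risk(unlabeled_losses, positive_losses, prior, non_negative=True):
    """Estimate the (1 - prior)-weighted negative-class risk without negative labels.

    unlabeled_losses: losses of unlabeled samples scored against the negative label.
    positive_losses: losses of labeled positive samples scored against the negative label.
    prior: assumed fraction of positives in the unlabeled data (hypothetical input).
    """
    mean_unlabeled = sum(unlabeled_losses) / len(unlabeled_losses)
    mean_positive = sum(positive_losses) / len(positive_losses)
    # Subtract the contribution of hidden positives inside the unlabeled set.
    risk = mean_unlabeled - prior * mean_positive
    # Clamping at zero is a common safeguard, since a true risk cannot be negative.
    return max(0.0, risk) if non_negative else risk


# Hypothetical per-sample losses.
unlabeled = [0.2, 0.4, 0.9, 0.5]
positive = [1.0, 0.8]
print(pu_negative_risk(unlabeled, positive, prior=0.3))
```

In a detection network this quantity would replace the usual negative-sample classification loss, with the prior either known or estimated separately.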
13

Doddahonnaiah, Deeksha, Patrick J. Lenehan, Travis K. Hughes, David Zemmour, Enrique Garcia-Rivera, A. J. Venkatakrishnan, Ramakrishna Chilaka, et al. "A Literature-Derived Knowledge Graph Augments the Interpretation of Single Cell RNA-seq Datasets." Genes 12, no. 6 (June 10, 2021): 898. http://dx.doi.org/10.3390/genes12060898.

Abstract:
Technology to generate single cell RNA-sequencing (scRNA-seq) datasets and tools to annotate them have advanced rapidly in the past several years. Such tools generally rely on existing transcriptomic datasets or curated databases of cell type defining genes, while the application of scalable natural language processing (NLP) methods to enhance analysis workflows has not been adequately explored. Here we deployed an NLP framework to objectively quantify associations between a comprehensive set of over 20,000 human protein-coding genes and over 500 cell type terms across over 26 million biomedical documents. The resultant gene-cell type associations (GCAs) are significantly stronger between a curated set of matched cell type-marker pairs than the complementary set of mismatched pairs (Mann–Whitney p = 6.15 × 10^−76, r = 0.24; Cohen’s d = 2.6). Building on this, we developed an augmented annotation algorithm (single cell Annotation via Literature Encoding, or scALE) that leverages GCAs to categorize cell clusters identified in scRNA-seq datasets, and we tested its ability to predict the cellular identity of 133 clusters from nine datasets of human breast, colon, heart, joint, ovary, prostate, skin, and small intestine tissues. With the optimized settings, the true cellular identity matched the top prediction in 59% of tested clusters and was present among the top five predictions for 91% of clusters. scALE slightly outperformed an existing method for reference data driven automated cluster annotation, and we demonstrate that integration of scALE can meaningfully improve the annotations derived from such methods. Further, contextualization of differential expression analyses with these GCAs highlights poorly characterized markers of well-studied cell types, such as CLIC6 and DNASE1L3 in retinal pigment epithelial cells and endothelial cells, respectively.
Taken together, this study illustrates for the first time how the systematic application of a literature-derived knowledge graph can expedite and enhance the annotation and interpretation of scRNA-seq data.
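The "top prediction" and "top five predictions" figures in this abstract are top-k accuracies over clusters. A minimal sketch of that metric follows; the function name `top_k_accuracy` and the three example clusters are hypothetical and are not drawn from the study's data.

```python
def top_k_accuracy(ranked_predictions, true_labels, k):
    """Fraction of clusters whose true label appears among the first k ranked predictions."""
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


# Hypothetical ranked predictions for three clusters, best guess first.
ranked = [
    ["T cell", "NK cell", "B cell"],
    ["fibroblast", "pericyte", "endothelial cell"],
    ["B cell", "plasma cell", "T cell"],
]
truth = ["T cell", "pericyte", "macrophage"]

print(top_k_accuracy(ranked, truth, k=1))
print(top_k_accuracy(ranked, truth, k=2))
```

Top-k accuracy can only stay the same or increase as k grows, which is why the top-five figure in the abstract is higher than the top-one figure.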
14

Barrett, John, and Richard Childs. "Non-myeloablative stem cell transplants. Annotation." British Journal of Haematology 111, no. 1 (October 2000): 6–17. http://dx.doi.org/10.1046/j.1365-2141.2000.02405.x.

15

Liu, Huaitian, Alexandra Harris, Brittany Jenkins-Lord, Tiffany H. Dorsey, Francis Makokha, Shahin Sayed, Gretchen Gierach, and Stefan Ambs. "Abstract LB240: Cell type annotation using singleR with custom reference for single-nucleus multiome data derived from frozen human breast tumors." Cancer Research 84, no. 7_Supplement (April 5, 2024): LB240. http://dx.doi.org/10.1158/1538-7445.am2024-lb240.

Abstract:
Single-nucleus joint ATAC- and RNA-sequencing (snMultiome) can be used to identify functionally divergent cell subpopulations based on their transcriptomic and epigenetic profiles within complex samples. Accurate cell type annotation is critical to successful snMultiome data analysis. Several computational methods have been developed for automatic annotation. Traditional cell type annotation methods initially cluster cells using unsupervised learning methods based on the gene expression profiles, then label the clusters using aggregated cluster-level expression profiles and marker genes. These methods rely heavily on the clustering results. As the purity of clusters cannot be guaranteed, false detection of cluster features may lead to incorrect annotations. Further, canonical cell surface markers may not always be suitable to be applied in single-nucleus RNA-seq studies because single-nucleus RNA-seq generally yields lower detected transcript numbers compared to typical single-cell RNA-seq. Moreover, cell type marker genes in the snRNA-seq data may differ from the ones obtained with scRNA-seq data, reflecting biological differences in the cytoplasmic and nuclear RNA pools. Lastly, the data obtained from malignant cells are best left out in establishing cell type reference data because they are too heterogeneous and patient-specific. Reference-based automated algorithms such as SingleR enable quick and unbiased classifications by leveraging a collection of built-in reference data sets for human (e.g. Human Primary Cell Atlas (microarray-based) and the combined Blueprint Epigenomics and Encode data set (RNA-seq-based)). Still, SingleR may return erroneous cell type classifications. Our dataset was generated using the 10x Genomics snMultiome platform to yield 296,557 nuclei from 82 frozen breast tumors, representing patients from diverse genetic ancestral backgrounds. Using these data, we sought to improve the accuracy of cell type annotation by SingleR. 
To achieve this, we first separated malignant and non-malignant cells based on DNA copy number aberrations (aneuploidy) through CopyKAT. For cells determined to be non-malignant, we built the custom reference from snRNA-seq data set, recently made available by The Human Breast Cell Atlas, and then applied singleR with a custom reference where each cell type is represented by single-cells of that type, allowing a well-founded estimate of the confidence with which a cell type call can be made. Using this approach, we successfully identified 11 distinct cell types for non-malignant cells, including fibroblast, adipocyte, pericyte, basal, luminal-secretory, luminal-HR, myeloid, mast, vascular, lymphatic, and T-cells, which can then be further subclassified. Furthermore, we interrogated each cluster using known canonical markers and transferred the cell type labels to snATAC-seq. This approach enabled us to link peaks to genes in each cell type. We believe this new approach that refines SingleR can greatly improve accuracy and minimize misclassification when annotating cell types in breast tumors using snMultiome data. Citation Format: Huaitian Liu, Alexandra Harris, Brittany Jenkins-Lord, Tiffany H. Dorsey, Francis Makokha, Shahin Sayed, Gretchen Gierach, Stefan Ambs. Cell type annotation using singleR with custom reference for single-nucleus multiome data derived from frozen human breast tumors [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2024; Part 2 (Late-Breaking, Clinical Trial, and Invited Abstracts); 2024 Apr 5-10; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2024;84(7_Suppl):Abstract nr LB240.
16

Feng, Zhanying, Xianwen Ren, Yuan Fang, Yining Yin, Chutian Huang, Yimin Zhao, and Yong Wang. "scTIM: seeking cell-type-indicative marker from single cell RNA-seq data by consensus optimization." Bioinformatics 36, no. 8 (December 17, 2019): 2474–85. http://dx.doi.org/10.1093/bioinformatics/btz936.

Abstract:
Motivation: Single cell RNA-seq data offers a new resource and resolution to study cell type identity and its conversion. However, data analyses are challenging in dealing with noise, sparsity and poor annotation at single cell resolution. Detecting cell-type-indicative markers is a promising way to help denoising, clustering and cell type annotation. Results: We developed a new method, scTIM, to reveal cell-type-indicative markers. scTIM is based on a multi-objective optimization framework to simultaneously maximize gene specificity by considering gene–cell relationship, maximize gene’s ability to reconstruct cell–cell relationship and minimize gene redundancy by considering gene–gene relationship. Furthermore, consensus optimization is introduced for robust solution. Experimental results on three diverse single cell RNA-seq datasets show scTIM’s advantages in identifying cell types (clustering), annotating cell types and reconstructing cell development trajectory. Applying scTIM to the large-scale mouse cell atlas data identifies critical markers for 15 tissues as ‘mouse cell marker atlas’, which allows us to investigate identities of different tissues and subtle cell types within a tissue. scTIM will serve as a useful method for single cell RNA-seq data mining. Availability and implementation: scTIM is freely available at https://github.com/Frank-Orwell/scTIM. Supplementary information: Supplementary data are available at Bioinformatics online.
17

Sun, Hao, Danqi Guo, and Zhao Chen. "Mixed-Supervised Learning for Cell Classification." Sensors 25, no. 4 (February 16, 2025): 1207. https://doi.org/10.3390/s25041207.

Abstract:
Cell classification based on histopathology images is crucial for tumor recognition and cancer diagnosis. Using deep learning, classification accuracy is hugely improved. Semi-supervised learning is an advanced deep learning approach that uses both labeled and unlabeled data. However, complex datasets that comprise diverse patterns may drive models towards learning harmful features. Therefore, it is useful to involve human guidance during training. Hence, we propose a mixed-supervised method incorporating semi-supervision and “human-in-the-loop” for cell classification. We design a sample selection mechanism that assigns highly confident unlabeled samples to automatic semi-supervised optimization and unreliable ones for online annotation correction. We use prior human annotations to pretrain the backbone and trustworthy pseudo labels and online human annotations to fine-tune the model for accurate cell classification. Experimental results show that the mixed-supervised model reaches overall accuracies as high as 86.56%, 99.33% and 74.12% on LUSC, BloodCell, and PanNuke datasets, respectively.
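The sample selection mechanism described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `split_by_confidence`, the use of the maximum class probability as the confidence score, and the 0.9 cutoff are all assumptions made for illustration.

```python
def split_by_confidence(probs, threshold=0.9):
    """Route unlabeled samples by prediction confidence.

    probs: list of per-sample class-probability lists.
    threshold: hypothetical cutoff on the highest class probability.
    Returns (confident, review), where confident holds
    (sample_index, pseudo_label) pairs and review holds the indices
    of samples that would be sent for human annotation.
    """
    confident, review = [], []
    for i, row in enumerate(probs):
        # Index of the most probable class for this sample.
        label = max(range(len(row)), key=row.__getitem__)
        if row[label] >= threshold:
            confident.append((i, label))   # keep as pseudo label
        else:
            review.append(i)               # unreliable: ask a human
    return confident, review


probs = [
    [0.97, 0.02, 0.01],
    [0.40, 0.35, 0.25],
    [0.05, 0.92, 0.03],
]
confident, review = split_by_confidence(probs)
print(confident, review)
```

In this toy input, the first and third samples pass the assumed cutoff and the second is routed to review. A real system would likely use a calibrated or learned reliability score in place of the raw maximum probability.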
18

Tang, Dachao, Cheng Han, Shaofeng Lin, Xiaodan Tan, Weizhi Zhang, Di Peng, Chenwei Wang, and Yu Xue. "iPCD: A Comprehensive Data Resource of Regulatory Proteins in Programmed Cell Death." Cells 11, no. 13 (June 24, 2022): 2018. http://dx.doi.org/10.3390/cells11132018.

Abstract:
Programmed cell death (PCD) is an essential biological process involved in many human pathologies. With the continuous discovery of new PCD forms, a large number of proteins have been found to regulate PCD. Notably, post-translational modifications play critical roles in the PCD process, and the rapid advances in proteomics have facilitated the discovery of new PCD proteins. However, an integrative resource has yet to be established for maintaining these regulatory proteins. Here, we briefly summarize the mainstream PCD forms, as well as the current progress in the development of public databases to collect, curate and annotate PCD proteins. Further, we developed a comprehensive database, with integrated annotations for programmed cell death (iPCD), which contained 1,091,014 regulatory proteins involved in 30 PCD forms across 562 eukaryotic species. From the scientific literature, we manually collected 6493 experimentally identified PCD proteins, and an orthologous search was then conducted to computationally identify more potential PCD proteins. Additionally, we provided an in-depth annotation of PCD proteins in eight model organisms, by integrating the knowledge from 102 additional resources that covered 16 aspects, including post-translational modification, protein expression/proteomics, genetic variation and mutation, functional annotation, structural annotation, physicochemical property, functional domain, disease-associated information, protein–protein interaction, drug–target relation, orthologous information, biological pathway, transcriptional regulator, mRNA expression, subcellular localization and DNA and RNA element. With a data volume of 125 GB, we anticipate that iPCD can serve as a highly useful resource for further analysis of PCD in eukaryotes.
19

Lagier, Michael J., Brittany Bowman, Kelsey Brend, Katherine Hobbs, Michael Foggia, and Mark McDaniel. "Improved Functional Prediction of Hypothetical Proteins from Listeria monocytogenes 08-5578." Journal of the Iowa Academy of Science 121, no. 1-4 (January 1, 2014): 16–27. http://dx.doi.org/10.17833/121-03.1.

Abstract:
Listeria monocytogenes is a foodborne human pathogen responsible for listeriosis. The genomes of several L. monocytogenes strains have been recently sequenced. The genome of L. monocytogenes 08-5578, which was in part responsible for a significant listeriosis outbreak in 2008, contains an unexpectedly high percentage of protein-encoding genes (1,927 out of 3,161; 60.96%) autonomously annotated as hypothetical proteins. The aim of this study was to test whether a manual annotation strategy could be used to assign more meaningful functional names to the hypothetical proteins of 08-5578. A holistic, manual gene annotation strategy that utilized sequence homology, cellular localization predictions, structure-based evidence, phylogeny, and protein-protein interaction data was used to assign potential cellular roles to 79 out of 100 hypothetical proteins randomly selected from the genome of 08-5578. Of significance, 5 of the 79 hypothetical proteins assigned a more meaningful name may contribute to the virulence of L. monocytogenes 08-5578, by contributing to chemotaxis, cell surface protein sorting, cell wall biosynthesis, and cold adaptation. The findings here support the notion that manual annotations, using a combination of diverse bioinformatics tools, can improve the quality of genomic information provided by automated genome annotation methods alone.
20

Lachmann, Alexander, Kaeli A. Rizzo, Alon Bartal, Minji Jeon, Daniel J. B. Clarke, and Avi Ma’ayan. "PrismEXP: gene annotation prediction from stratified gene-gene co-expression matrices." PeerJ 11 (February 27, 2023): e14927. http://dx.doi.org/10.7717/peerj.14927.

Abstract:
Background: Gene-gene co-expression correlations measured by mRNA-sequencing (RNA-seq) can be used to predict gene annotations based on the co-variance structure within these data. In our prior work, we showed that uniformly aligned RNA-seq co-expression data from thousands of diverse studies is highly predictive of both gene annotations and protein-protein interactions. However, the performance of the predictions varies depending on whether the gene annotations and interactions are cell type and tissue specific or agnostic. Tissue and cell type-specific gene-gene co-expression data can be useful for making more accurate predictions because many genes perform their functions in unique ways in different cellular contexts. However, identifying the optimal tissues and cell types to partition the global gene-gene co-expression matrix is challenging. Results: Here we introduce and validate an approach called PRediction of gene Insights from Stratified Mammalian gene co-EXPression (PrismEXP) for improved gene annotation predictions based on RNA-seq gene-gene co-expression data. Using uniformly aligned data from ARCHS4, we apply PrismEXP to predict a wide variety of gene annotations including pathway membership, Gene Ontology terms, as well as human and mouse phenotypes. Predictions made with PrismEXP outperform predictions made with the global cross-tissue co-expression correlation matrix approach on all tested domains, and training using one annotation domain can be used to predict annotations in other domains. Conclusions: By demonstrating the utility of PrismEXP predictions in multiple use cases we show how PrismEXP can be used to enhance unsupervised machine learning methods to better understand the roles of understudied genes and proteins. To make PrismEXP accessible, it is provided via a user-friendly web interface, a Python package, and an Appyter. Availability: The PrismEXP web-based application, with pre-computed PrismEXP predictions, is available from: https://maayanlab.cloud/prismexp; PrismEXP is also available as an Appyter: https://appyters.maayanlab.cloud/PrismEXP/; and as a Python package: https://github.com/maayanlab/prismexp.
21

Zhang, Yuexin, Chao Song, Yimeng Zhang, Yuezhu Wang, Chenchen Feng, Jiaxin Chen, Ling Wei, et al. "TcoFBase: a comprehensive database for decoding the regulatory transcription co-factors in human and mouse." Nucleic Acids Research 50, no. D1 (October 30, 2021): D391–D401. http://dx.doi.org/10.1093/nar/gkab950.

Abstract:
Transcription co-factors (TcoFs) play crucial roles in gene expression regulation by communicating regulatory cues from enhancers to promoters. With the rapid accumulation of TcoF-associated chromatin immunoprecipitation sequencing (ChIP-seq) data, the comprehensive collection and integrative analyses of these data are urgently required. Here, we developed the TcoFBase database (http://tcof.liclab.net/TcoFbase), which aimed to document a large number of available resources for mammalian TcoFs and provided annotations and enrichment analyses of TcoFs. TcoFBase curated 2322 TcoFs and 6759 TcoF-associated ChIP-seq datasets from over 500 tissues/cell types in human and mouse. Importantly, TcoFBase provided detailed and abundant (epi)genetic annotations of ChIP-seq based TcoF binding regions. Furthermore, TcoFBase supported regulatory annotation information and various functional annotations for TcoFs. Meanwhile, TcoFBase embedded five types of TcoF regulatory analyses for users, including TcoF gene set enrichment, TcoF binding genomic region annotation, TcoF regulatory network analysis, TcoF-TF co-occupancy analysis and TcoF regulatory axis analysis. TcoFBase was designed to be a useful resource that will help reveal the potential biological effects of TcoFs and elucidate TcoF-related regulatory mechanisms.
22

Li, Jia, Quanhu Sheng, Yu Shyr, and Qi Liu. "scMRMA: single cell multiresolution marker-based annotation." Nucleic Acids Research 50, no. 2 (October 14, 2021): e7-e7. http://dx.doi.org/10.1093/nar/gkab931.

Abstract:
Single-cell RNA sequencing has become a powerful tool for identifying and characterizing cellular heterogeneity. One essential step to understanding cellular heterogeneity is determining cell identities. The widely used strategy predicts identities by projecting cells or cell clusters unidirectionally against a reference to find the best match. Here, we develop a bidirectional method, scMRMA, where a hierarchical reference guides iterative clustering and deep annotation with enhanced resolutions. Taking full advantage of the reference, scMRMA greatly improves the annotation accuracy. scMRMA achieved better performance than existing methods in four benchmark datasets and successfully revealed the expansion of CD8 T cell populations in squamous cell carcinoma after anti-PD-1 treatment.
23

Xiong, Yi-Xuan, Meng-Guo Wang, Luonan Chen, and Xiao-Fei Zhang. "Cell-type annotation with accurate unseen cell-type identification using multiple references." PLOS Computational Biology 19, no. 6 (June 28, 2023): e1011261. http://dx.doi.org/10.1371/journal.pcbi.1011261.

Abstract:
The recent advances in single-cell RNA sequencing (scRNA-seq) techniques have stimulated efforts to identify and characterize the cellular composition of complex tissues. With the advent of various sequencing techniques, automated cell-type annotation using a well-annotated scRNA-seq reference becomes popular. But it relies on the diversity of cell types in the reference, which may not capture all the cell types present in the query data of interest. There are generally unseen cell types in the query data of interest because most data atlases are obtained for different purposes and techniques. Identifying previously unseen cell types is essential for improving annotation accuracy and uncovering novel biological discoveries. To address this challenge, we propose mtANN (multiple-reference-based scRNA-seq data annotation), a new method to automatically annotate query data while accurately identifying unseen cell types with the aid of multiple references. Key innovations of mtANN include the integration of deep learning and ensemble learning to improve prediction accuracy, and the introduction of a new metric that considers three complementary aspects to distinguish between unseen cell types and shared cell types. Additionally, we provide a data-driven method to adaptively select a threshold for identifying previously unseen cell types. We demonstrate the advantages of mtANN over state-of-the-art methods for unseen cell-type identification and cell-type annotation on two benchmark dataset collections, as well as its predictive power on a collection of COVID-19 datasets. The source code and tutorial are available at https://github.com/Zhangxf-ccnu/mtANN.
24

Zubair, Asif, Rich Chapple, Sivaraman Natarajan, William C. Wright, Min Pan, Hyeong-Min Lee, Heather Tillman, John Easton, and Paul Geeleher. "Abstract 456: Jointly leveraging spatial transcriptomics and deep learning models for image annotation achieves better-than-pathologist performance in cell type identification in tumors." Cancer Research 82, no. 12_Supplement (June 15, 2022): 456. http://dx.doi.org/10.1158/1538-7445.am2022-456.

Abstract:
For over 100 years, the traditional tools of pathology, such as tissue-marking dyes (e.g. the H&E stain) have been used to study the disorganization and dysfunction of cells within tissues. This has represented a principal diagnostic and prognostic tool in cancer. However, in the last 5 years, new technologies have promised to revolutionize histopathology, with Spatial Transcriptomics technologies allowing us to measure gene expression directly in pathology-stained tissue sections. In parallel with these developments, Artificial Intelligence (AI) applied to histopathology tissue images now approaches pathologist level performance in cell type identification. However, these new technologies still have severe limitations, with Spatial Transcriptomics suffering difficulties distinguishing transcriptionally similar cell types, and AI-based pathology tools often performing poorly on real world out-of-batch test datasets. Thus, century-old techniques still represent standard-of-care in most areas of clinical cancer diagnostics and prognostics. Here, we present a new frontier in digital pathology: describing a conceptually novel computational methodology, based on Bayesian probabilistic modelling, that allows Spatial Transcriptomics data to be leveraged together with the output of deep learning-based AI used to computationally annotate H&E-stained sections of the same tumor. By leveraging cell-type annotations from multiple independent pathologists, we show that this integrated methodology achieves better performance than any given pathologist’s manual tissue annotation in the task of identifying regions of immune cell infiltration in breast cancer, and easily outperforms either technology alone. We also show that on a subset of histopathology slides examined, the methodology can identify regions of clinically relevant immune cell infiltration that were missed entirely by an initial pathologist’s manual annotation.
While this use case has clear diagnostic and prognostic value in cancer (e.g. predicting response to immunotherapy), our methodology is generalizable to any type of pathology images and also has broad applications in spatial transcriptomics data analytics, where most applications (such as identifying cell-cell interactions) rely on correct cell type annotations having been established a priori. We anticipate that this work will spur many follow-up studies, including new computational innovations building on the approach. The work sets the stage for better-than-pathologist performance in other cell-type annotation tasks, with relevant applications in diagnostics and prognostics across almost all cancers. Citation Format: Asif Zubair, Rich Chapple, Sivaraman Natarajan, William C. Wright, Min Pan, Hyeong-Min Lee, Heather Tillman, John Easton, Paul Geeleher. Jointly leveraging spatial transcriptomics and deep learning models for image annotation achieves better-than-pathologist performance in cell type identification in tumors [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2022; 2022 Apr 8-13. Philadelphia (PA): AACR; Cancer Res 2022;82(12_Suppl):Abstract nr 456.
25

Tickotsky, Nili, and Moti Moskovitz. "Protein Activation in Periapical Reaction to Iodoform Containing Root Canal Sealer." Journal of Clinical Pediatric Dentistry 41, no. 6 (January 1, 2017): 450–55. http://dx.doi.org/10.17796/1053-4628-41.6.6.

Abstract:
Objectives: An association between root canal sealers and periapical lesions in primary dentition has been suggested, yet the chemical-protein interactions that may be involved in it have not been studied. The present study explored root sealer components' effect on periapical tissue proteins using bioinformatics tools. Study design: For each chemical component of Endoflas F.S. root sealing material we identified the known and predicted target proteins, using STITCH (search tool for interactions of chemicals http://stitch.embl.de/). Identified target proteins were grouped into functional categories using the annotation clustering tool from DAVID, the Database for Annotation, Visualization and Integrated Discovery (http://david.abcc.ncifcrf.gov/). STRING Protein-Protein Interaction network database identified associations between the proteins. Results: Sixteen proteins identified with STITCH served as input to DAVID annotation clustering tool. Only ZnO and Eugenol targeted proteins had statistically significant annotations. Gene Ontology terms of ZnO and Eugenol targeted proteins demonstrated that these proteins respond to mechanical stimulus and to oxidative stress. They highlight these proteins' role in the positive regulation of transcription, gene expression, cell proliferation and apoptosis, and their complementary role in the negative regulation of cell death. Conclusion: When stimulated by Zinc Oxide, Eugenol and Calcium hydroxide, chemical-protein and subsequent protein-protein interactions result in cell proliferation in the periapical area. Our findings indicate that certain root sealers components may cause enlargement of the permanent tooth follicle. Dentists should be aware of this phenomenon and radiographically monitor root canal treated teeth until shedding.
26

Englbrecht, Fabian, Iris E. Ruider, and Andreas R. Bausch. "Automatic image annotation for fluorescent cell nuclei segmentation." PLOS ONE 16, no. 4 (April 16, 2021): e0250093. http://dx.doi.org/10.1371/journal.pone.0250093.

Abstract:
Dataset annotation is a time and labor-intensive task and an integral requirement for training and testing deep learning models. The segmentation of images in life science microscopy requires annotated image datasets for object detection tasks such as instance segmentation. Although the amount of annotated image data has been steadily reduced due to methods such as data augmentation, the process of manual or semi-automated data annotation is the most labor and cost intensive task in the process of cell nuclei segmentation with deep neural networks. In this work we propose a system to fully automate the annotation process of a custom fluorescent cell nuclei image dataset. By that we are able to reduce nuclei labelling time by up to 99.5%. The output of our system provides high quality training data for machine learning applications to identify the position of cell nuclei in microscopy images. Our experiments have shown that the automatically annotated dataset provides coequal segmentation performance compared to manual data annotation. In addition, we show that our system enables a single workflow from raw data input to desired nuclei segmentation and tracking results without relying on pre-trained models or third-party training datasets for neural networks.
27

Xu, Congmin, Huyun Lu, and Peng Qiu. "Comparison of cell type annotation algorithms for revealing immune response of COVID-19." Frontiers in Systems Biology 2 (October 24, 2022). http://dx.doi.org/10.3389/fsysb.2022.1026686.

Abstract:
When analyzing scRNA-seq data with clustering algorithms, annotating the clusters with cell types is an essential step toward biological interpretation of the data. Annotations can be performed manually using known cell type marker genes. Annotations can also be automated using knowledge-driven or data-driven machine learning algorithms. The majority of cell type annotation algorithms are designed to predict cell types for individual cells in a new dataset. Since biological interpretation of scRNA-seq data is often made on cell clusters rather than individual cells, several algorithms have been developed to annotate cell clusters. In this study, we compared five cell type annotation algorithms, Azimuth, SingleR, Garnett, scCATCH, and SCSA, which cover the spectrum of knowledge-driven and data-driven approaches to annotate either individual cells or cell clusters. We applied these five algorithms to two scRNA-seq datasets of peripheral blood mononuclear cell (PBMC) samples from COVID-19 patients and healthy controls, and evaluated their annotation performance. From this comparison, we observed that methods for annotating individual cells outperformed methods for annotating cell clusters. We applied the cell-based annotation algorithm Azimuth to the two scRNA-seq datasets to examine the immune response during COVID-19 infection. Both datasets presented significant depletion of plasmacytoid dendritic cells (pDCs), where differential expression in this cell type and pathway analysis revealed strong activation of the type I interferon signaling pathway in response to the infection.
28

Hou, Wenpin, and Zhicheng Ji. "Assessing GPT-4 for cell type annotation in single-cell RNA-seq analysis." Nature Methods, March 25, 2024. http://dx.doi.org/10.1038/s41592-024-02235-4.

Abstract:
Here we demonstrate that the large language model GPT-4 can accurately annotate cell types using marker gene information in single-cell RNA sequencing analysis. When evaluated across hundreds of tissue and cell types, GPT-4 generates cell type annotations exhibiting strong concordance with manual annotations. This capability can considerably reduce the effort and expertise required for cell type annotation. Additionally, we have developed an R software package GPTCelltype for GPT-4’s automated cell type annotation.
29

Guo, Qirui, Musu Yuan, Lei Zhang, and Minghua Deng. "scPLAN: a hierarchical computational framework for single transcriptomics data annotation, integration and cell-type label refinement." Briefings in Bioinformatics 25, no. 4 (May 23, 2024). http://dx.doi.org/10.1093/bib/bbae305.

Abstract:
Motivation: In the past decade, single-cell RNA sequencing (scRNA-seq) has emerged as a pivotal method for transcriptomic profiling in biomedical research. Precise cell-type identification is crucial for subsequent analysis of single-cell data. The integration and refinement of annotated data are essential for building comprehensive databases. However, prevailing annotation techniques often overlook the hierarchical organization of cell types, resulting in inconsistent annotations. Meanwhile, most existing integration approaches fail to integrate datasets with different annotation depths, and none of them can enhance the labels of outdated data with lower annotation resolutions using more intricately annotated datasets or novel biological findings. Results: Here, we introduce scPLAN, a hierarchical computational framework designed for scRNA-seq data analysis. scPLAN excels in annotating unlabeled scRNA-seq data using a reference dataset structured along a hierarchical cell-type tree. It identifies potential novel cell types in a systematic, layer-by-layer manner. Additionally, scPLAN effectively integrates annotated scRNA-seq datasets with varying levels of annotation depth, ensuring consistent refinement of cell-type labels across datasets with lower resolutions. Through extensive annotation and novel cell detection experiments, scPLAN has demonstrated its efficacy. Two case studies have been conducted to showcase how scPLAN integrates datasets with diverse cell-type label resolutions and refines their cell-type labels. Availability: https://github.com/michaelGuo1204/scPLAN
30

Dong, Sherry, Kaiwen Deng, and Xiuzhen Huang. "Single-Cell Type Annotation With Deep Learning in 265 Cell Types For Humans." Bioinformatics Advances, April 8, 2024. http://dx.doi.org/10.1093/bioadv/vbae054.

Abstract:
Motivation: Annotating cell types is a challenging yet essential task in analyzing single-cell RNA sequencing data. However, due to the lack of a gold standard, it is difficult to evaluate the algorithms fairly and an overfitting algorithm may be favored in benchmarks. To address this challenge, we developed a deep learning-based single-cell type prediction tool that assigns the cell type to 265 different cell types for humans, based on data from approximately five million cells. Results: We achieved a median AUC of 0.93 when evaluated across datasets. We found that inconsistent labeling in the existing database generated by different labs contributed to the mistakes of the model. Therefore, we used cell ontology to correct the annotations and retrained the model, which resulted in 0.971 median AUC. Our study reveals a limiting factor of the accuracy one may achieve with the current database annotation and points to the solutions towards an algorithm-based correction of the gold standard for future automated cell annotation approaches. Availability: The code is available at: https://github.com/SherrySDong/Hierarchical-Correction-Improves-Automated-Single-cell-Type-Annotation. Data used in this study is listed in Table S1 and is retrievable at the CZI database. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
31

Altay, Aybuge, and Martin Vingron. "scATAcat: cell-type annotation for scATAC-seq data." NAR Genomics and Bioinformatics 6, no. 4 (July 2, 2024). http://dx.doi.org/10.1093/nargab/lqae135.

Abstract:
Cells whose accessibility landscape has been profiled with scATAC-seq cannot readily be annotated to a particular cell type. In fact, annotating cell types in scATAC-seq data is a challenging task since, unlike in scRNA-seq data, we lack knowledge of ‘marker regions’ which could be used for cell-type annotation. Current annotation methods typically translate accessibility to expression space and rely on gene expression patterns. We propose a novel approach, scATAcat, that leverages characterized bulk ATAC-seq data as prototypes to annotate scATAC-seq data. To mitigate the inherent sparsity of single-cell data, we aggregate cells that belong to the same cluster and create pseudobulk profiles. To demonstrate the feasibility of our approach we collected a number of datasets with respective annotations to quantify the results and evaluate performance for scATAcat. scATAcat is available as a Python package at https://github.com/aybugealtay/scATAcat.
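The pseudobulk aggregation mentioned in this abstract can be sketched in a few lines. This is an illustrative assumption about what such a step might look like, not code from the scATAcat package: the function name `pseudobulk`, the list-of-lists count matrix, and the toy cluster labels are all hypothetical.

```python
def pseudobulk(counts, clusters):
    """Sum per-cell count vectors within each cluster.

    counts: list of per-cell count lists (cells x regions).
    clusters: list of cluster labels, one per cell.
    Returns a dict mapping each cluster label to its summed profile.
    """
    totals = {}
    for cell_counts, cluster in zip(counts, clusters):
        if cluster not in totals:
            totals[cluster] = [0] * len(cell_counts)
        for j, value in enumerate(cell_counts):
            totals[cluster][j] += value
    return totals


# Toy data: 4 cells x 4 accessible regions, two clusters.
counts = [
    [1, 0, 0, 2],
    [0, 0, 1, 1],
    [0, 3, 0, 0],
    [1, 1, 0, 0],
]
clusters = ["c0", "c0", "c1", "c1"]
profiles = pseudobulk(counts, clusters)
print(profiles)
```

Summing within clusters trades single-cell resolution for denser profiles, which is what would make a comparison against bulk prototypes feasible. Real data would normally be held in a sparse matrix and normalized before any comparison.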
32

Vu, Ha, and Jason Ernst. "Universal annotation of the human genome through integration of over a thousand epigenomic datasets." Genome Biology 23, no. 1 (January 6, 2022). http://dx.doi.org/10.1186/s13059-021-02572-z.

Abstract:
Background: Genome-wide maps of chromatin marks such as histone modifications and open chromatin sites provide valuable information for annotating the non-coding genome, including identifying regulatory elements. Computational approaches such as ChromHMM have been applied to discover and annotate chromatin states defined by combinatorial and spatial patterns of chromatin marks within the same cell type. An alternative “stacked modeling” approach was previously suggested, where chromatin states are defined jointly from datasets of multiple cell types to produce a single universal genome annotation based on all datasets. Despite its potential benefits for applications that are not specific to one cell type, such an approach was previously applied only for small-scale specialized purposes. Large-scale applications of stacked modeling have previously posed scalability challenges. Results: Using a version of ChromHMM enhanced for large-scale applications, we apply the stacked modeling approach to produce a universal chromatin state annotation of the human genome using over 1000 datasets from more than 100 cell types, with the learned model denoted as the full-stack model. The full-stack model states show distinct enrichments for external genomic annotations, which we use in characterizing each state. Compared to per-cell-type annotations, the full-stack annotations directly differentiate constitutive from cell type-specific activity and are more predictive of locations of external genomic annotations. Conclusions: The full-stack ChromHMM model provides a universal chromatin state annotation of the genome and a unified global view of over 1000 datasets. We expect this to be a useful resource that complements existing per-cell-type annotations for studying the non-coding human genome.
33

Lawson, Nathan D., Rui Li, Masahiro Shin, Ann Grosse, Onur Yukselen, Oliver A. Stone, Alper Kucukural, and Lihua Zhu. "An improved zebrafish transcriptome annotation for sensitive and comprehensive detection of cell type-specific genes." eLife 9 (August 24, 2020). http://dx.doi.org/10.7554/elife.55792.

Abstract:
The zebrafish is ideal for studying embryogenesis and is increasingly applied to model human disease. In these contexts, RNA-sequencing (RNA-seq) provides mechanistic insights by identifying transcriptome changes between experimental conditions. Application of RNA-seq relies on accurate transcript annotation for a genome of interest. Here, we find discrepancies in analysis from RNA-seq datasets quantified using Ensembl and RefSeq zebrafish annotations. These issues were due, in part, to variably annotated 3' untranslated regions and thousands of gene models missing from each annotation. Since these discrepancies could compromise downstream analyses and biological reproducibility, we built a more comprehensive zebrafish transcriptome annotation that addresses these deficiencies. Our annotation improves detection of cell type-specific genes in both bulk and single cell RNA-seq datasets, where it also improves resolution of cell clustering. Thus, we demonstrate that our new transcriptome annotation can outperform existing annotations, providing an important resource for zebrafish researchers.
34

Kimmel, Jacob C., and David R. Kelley. "Semisupervised adversarial neural networks for single-cell classification." Genome Research, February 24, 2021. http://dx.doi.org/10.1101/gr.268581.120.

Abstract:
Annotating cell identities is a common bottleneck in the analysis of single-cell genomics experiments. Here, we present scNym, a semisupervised, adversarial neural network that learns to transfer cell identity annotations from one experiment to another. scNym takes advantage of information in both labeled data sets and new, unlabeled data sets to learn rich representations of cell identity that enable effective annotation transfer. We show that scNym effectively transfers annotations across experiments despite biological and technical differences, achieving performance superior to existing methods. We also show that scNym models can synthesize information from multiple training and target data sets to improve performance. We show that in addition to high accuracy, scNym models are well calibrated and interpretable with saliency methods.
35

Michielsen, Lieke, Mohammad Lotfollahi, Daniel Strobl, Lisa Sikkema, Marcel J. T. Reinders, Fabian J. Theis, and Ahmed Mahfouz. "Single-cell reference mapping to construct and extend cell-type hierarchies." NAR Genomics and Bioinformatics 5, no. 3 (July 5, 2023). http://dx.doi.org/10.1093/nargab/lqad070.

Full text
Abstract:
Single-cell genomics is now producing an ever-increasing amount of datasets that, when integrated, could provide large-scale reference atlases of tissue in health and disease. Such large-scale atlases increase the scale and generalizability of analyses and enable combining knowledge generated by individual studies. Specifically, individual studies often differ regarding cell annotation terminology and depth, with different groups specializing in different cell type compartments, often using distinct terminology. Understanding how these distinct sets of annotations are related and complement each other would mark a major step towards a consensus-based cell-type annotation reflecting the latest knowledge in the field. Whereas recent computational techniques, referred to as ‘reference mapping’ methods, facilitate the usage and expansion of existing reference atlases by mapping new datasets (i.e. queries) onto an atlas, a systematic approach towards harmonizing dataset-specific cell-type terminology and annotation depth is still lacking. Here, we present ‘treeArches’, a framework to automatically build and extend reference atlases while enriching them with an updatable hierarchy of cell-type annotations across different datasets. We demonstrate various use cases for treeArches, from automatically resolving relations between reference and query cell types to identifying unseen cell types absent in the reference, such as disease-associated cell states. We envision treeArches enabling data-driven construction of consensus atlas-level cell-type hierarchies and facilitating efficient usage of reference atlases.
APA, Harvard, Vancouver, ISO, and other styles
36

Liu, Yan, Guo Wei, Chen Li, Long-Chen Shen, Robin B. Gasser, Jiangning Song, Dijun Chen, and Dong-Jun Yu. "TripletCell: a deep metric learning framework for accurate annotation of cell types at the single-cell level." Briefings in Bioinformatics, April 20, 2023. http://dx.doi.org/10.1093/bib/bbad132.

Full text
Abstract:
Single-cell RNA sequencing (scRNA-seq) has significantly accelerated the experimental characterization of distinct cell lineages and types in complex tissues and organisms. Cell-type annotation is of great importance in most of the scRNA-seq analysis pipelines. However, manual cell-type annotation heavily relies on the quality of scRNA-seq data and marker genes, and therefore can be laborious and time-consuming. Furthermore, the heterogeneity of scRNA-seq datasets poses another challenge for accurate cell-type annotation, such as the batch effect induced by different scRNA-seq protocols and samples. To overcome these limitations, here we propose a novel pipeline, termed TripletCell, for cross-species, cross-protocol and cross-sample cell-type annotation. We developed a cell embedding and dimension-reduction module for the feature extraction (FE) in TripletCell, namely TripletCell-FE, to leverage the deep metric learning-based algorithm for the relationships between the reference gene expression matrix and the query cells. Our experimental studies on 21 datasets (covering nine scRNA-seq protocols, two species and three tissues) demonstrate that TripletCell outperformed state-of-the-art approaches for cell-type annotation. More importantly, regardless of protocols or species, TripletCell can deliver outstanding and robust performance in annotating different types of cells. TripletCell is freely available at https://github.com/liuyan3056/TripletCell. We believe that TripletCell is a reliable computational tool for accurately annotating various cell types using scRNA-seq data and will be instrumental in assisting the generation of novel biological hypotheses in cell biology.
APA, Harvard, Vancouver, ISO, and other styles
37

Li, Ziyi, and Hao Feng. "A neural network-based method for exhaustive cell label assignment using single cell RNA-seq data." Scientific Reports 12, no. 1 (January 18, 2022). http://dx.doi.org/10.1038/s41598-021-04473-4.

Full text
Abstract:
The fast-advancing single cell RNA sequencing (scRNA-seq) technology enables researchers to study the transcriptome of heterogeneous tissues at a single cell level. The initial important step of analyzing scRNA-seq data is usually to accurately annotate cells. The traditional approach of annotating cell types based on unsupervised clustering and marker genes is time-consuming and laborious. Taking advantage of the numerous existing scRNA-seq databases, many supervised label assignment methods have been developed. One feature that many label assignment methods share is to label cells with low confidence as “unassigned.” These unassigned cells can be the result of assignment difficulties due to highly similar cell types or caused by the presence of unknown cell types. However, when unknown cell types are not expected, existing methods still label a considerable number of cells as unassigned, which is not desirable. In this work, we develop a neural network-based cell annotation method called NeuCA (Neural network-based Cell Annotation) for scRNA-seq data obtained from well-studied tissues. NeuCA can utilize the hierarchical structure information of the cell types to improve the annotation accuracy, which is especially helpful when data contain closely correlated cell types. We show that NeuCA can achieve more accurate cell annotation results compared with existing methods. Additionally, the applications on eight real datasets show that NeuCA has stable performance for intra- and inter-study annotation, as well as cross-condition annotation. NeuCA is freely available as an R/Bioconductor package at https://bioconductor.org/packages/NeuCA.
APA, Harvard, Vancouver, ISO, and other styles
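The "unassigned" behaviour mentioned in the abstract above can be pictured with a minimal sketch. This is not NeuCA's implementation; it only illustrates the generic rejection step that many supervised label-assignment methods are described as applying. The function name `assign_with_rejection`, the `min_confidence` parameter, and the example probabilities are all hypothetical.

```python
import numpy as np

def assign_with_rejection(probs, cell_types, min_confidence=0.5):
    """Pick the most probable cell type per cell, or "unassigned" when unsure.

    probs: array of shape (n_cells, n_types) holding per-cell class
           probabilities from any classifier (hypothetical input).
    """
    probs = np.asarray(probs, dtype=float)
    best = probs.argmax(axis=1)      # index of the top-scoring type per cell
    confidence = probs.max(axis=1)   # probability of that top-scoring type
    return [
        cell_types[i] if c >= min_confidence else "unassigned"
        for i, c in zip(best, confidence)
    ]

# Three illustrative cells: confident, ambiguous, confident.
example_probs = [[0.90, 0.10], [0.55, 0.45], [0.20, 0.80]]
labels = assign_with_rejection(example_probs, ["T cell", "B cell"], min_confidence=0.6)
print(labels)
```

In this sketch, setting `min_confidence=0.0` forces every cell to receive a label, which corresponds to the exhaustive assignment the abstract argues for when unknown cell types are not expected; how NeuCA itself reaches that goal is not shown here.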
38

Zhang, Weihang, Yang Cui, Bowen Liu, Martin Loza, Sung-Joon Park, and Kenta Nakai. "HyGAnno: hybrid graph neural network–based cell type annotation for single-cell ATAC sequencing data." Briefings in Bioinformatics 25, no. 3 (March 27, 2024). http://dx.doi.org/10.1093/bib/bbae152.

Full text
Abstract:
Reliable cell type annotations are crucial for investigating cellular heterogeneity in single-cell omics data. Although various computational approaches have been proposed for single-cell RNA sequencing (scRNA-seq) annotation, high-quality cell labels are still lacking in single-cell sequencing assay for transposase-accessible chromatin (scATAC-seq) data, because of extreme sparsity and inconsistent chromatin accessibility between datasets. Here, we present a novel automated cell annotation method that transfers cell type information from a well-labeled scRNA-seq reference to an unlabeled scATAC-seq target, via a parallel graph neural network, in a semi-supervised manner. Unlike existing methods that utilize only gene expression or gene activity features, HyGAnno leverages genome-wide accessibility peak features to facilitate the training process. In addition, HyGAnno reconstructs a reference–target cell graph to detect cells with low prediction reliability, according to their specific graph connectivity patterns. HyGAnno was assessed across various datasets, showcasing its strengths in precise cell annotation, generating interpretable cell embeddings, robustness to noisy reference data and adaptability to tumor tissues.
APA, Harvard, Vancouver, ISO, and other styles
39

Vu, Ha, and Jason Ernst. "Universal chromatin state annotation of the mouse genome." Genome Biology 24, no. 1 (June 27, 2023). http://dx.doi.org/10.1186/s13059-023-02994-x.

Full text
Abstract:
A large-scale application of the “stacked modeling” approach for chromatin state discovery previously provided a single “universal” chromatin state annotation of the human genome based jointly on data from many cell and tissue types. Here, we produce an analogous chromatin state annotation for mouse based on 901 datasets assaying 14 chromatin marks in 26 cell or tissue types. To characterize each chromatin state, we relate the states to external annotations and compare them to analogously defined human states. We expect the universal chromatin state annotation for mouse to be a useful resource for studying this key model organism’s genome.
APA, Harvard, Vancouver, ISO, and other styles
40

Ford, Michael K. B., Ananth Hari, Qinghui Zhou, Ibrahim Numanagić, and S. Cenk Sahinalp. "Biologically-informed Killer cell immunoglobulin-like receptor (KIR) gene annotation tool." Bioinformatics, October 21, 2024. http://dx.doi.org/10.1093/bioinformatics/btae622.

Full text
Abstract:
Summary: Natural killer (NK) cells are essential components of the innate immune system, with their activity significantly regulated by Killer cell Immunoglobulin-like Receptors (KIRs). The diversity and structural complexity of KIR genes present significant challenges for accurate genotyping, essential for understanding NK cell functions and their implications in health and disease. Traditional genotyping methods struggle with the variable nature of KIR genes, leading to inaccuracies that can impede immunogenetic research. These challenges extend to high-quality phased assemblies, which have been recently popularized by the Human Pangenome Consortium. This paper introduces BAKIR (Biologically-informed Annotator for KIR locus), a tailored computational tool designed to overcome the challenges of KIR genotyping and annotation on high-quality, phased genome assemblies. BAKIR aims to enhance the accuracy of KIR gene annotations by structuring its annotation pipeline around identifying key functional mutations, thereby improving the identification and subsequent relevance of gene and allele calls. It uses a multi-stage mapping, alignment, and variant calling process to ensure high-precision gene and allele identification, while also maintaining high recall for sequences that are significantly mutated or truncated relative to the known allele database. BAKIR has been evaluated on a subset of the HPRC assemblies, where BAKIR was able to improve many of the associated annotations and call novel variants. BAKIR is freely available on GitHub, offering ease of access and use through multiple installation methods, including pip, conda, and singularity container, and is equipped with a user-friendly command-line interface, thereby promoting its adoption in the scientific community. Availability and Implementation: BAKIR is available at github.com/algo-cancer/bakir. Supplementary information: Supplementary data are available at Bioinformatics online.
APA, Harvard, Vancouver, ISO, and other styles
41

Shrestha, Prem, Nicholas Kuang, and Ji Yu. "Efficient end-to-end learning for cell segmentation with machine generated weak annotations." Communications Biology 6, no. 1 (March 2, 2023). http://dx.doi.org/10.1038/s42003-023-04608-5.

Full text
Abstract:
Automated cell segmentation from optical microscopy images is usually the first step in the pipeline of single-cell analysis. Recently, deep-learning based algorithms have shown superior performances for the cell segmentation tasks. However, a disadvantage of deep-learning is the requirement for a large amount of fully annotated training data, which is costly to generate. Weakly-supervised and self-supervised learning is an active research area, but often the model accuracy is inversely correlated with the amount of annotation information provided. Here we focus on a specific subtype of weak annotations, which can be generated programmably from experimental data, thus allowing for more annotation information content without sacrificing the annotation speed. We designed a new model architecture for end-to-end training using such incomplete annotations. We have benchmarked our method on a variety of publicly available datasets, covering both fluorescence and bright-field imaging modalities. We additionally tested our method on a microscopy dataset generated by us, using machine-generated annotations. The results demonstrated that our models trained under weak supervision can achieve segmentation accuracy competitive to, and in some cases, surpassing, state-of-the-art models trained under full supervision. Therefore, our method can be a practical alternative to the established full-supervision methods.
APA, Harvard, Vancouver, ISO, and other styles
42

Geuenich, Michael J., Dae-won Gong, and Kieran R. Campbell. "The impacts of active and self-supervised learning on efficient annotation of single-cell expression data." Nature Communications 15, no. 1 (February 3, 2024). http://dx.doi.org/10.1038/s41467-024-45198-y.

Full text
Abstract:
A crucial step in the analysis of single-cell data is annotating cells to cell types and states. While a myriad of approaches has been proposed, manual labeling of cells to create training datasets remains tedious and time-consuming. In the field of machine learning, active and self-supervised learning methods have been proposed to improve the performance of a classifier while reducing both annotation time and label budget. However, the benefits of such strategies for single-cell annotation have yet to be evaluated in realistic settings. Here, we perform a comprehensive benchmarking of active and self-supervised labeling strategies across a range of single-cell technologies and cell type annotation algorithms. We quantify the benefits of active learning and self-supervised strategies in the presence of cell type imbalance and variable similarity. We introduce adaptive reweighting, a heuristic procedure tailored to single-cell data (including a marker-aware version) that shows competitive performance with existing approaches. In addition, we demonstrate that having prior knowledge of cell type markers improves annotation accuracy. Finally, we summarize our findings into a set of recommendations for those implementing cell type annotation procedures or platforms. An R package implementing the heuristic approaches introduced in this work may be found at https://github.com/camlab-bioml/leader.
APA, Harvard, Vancouver, ISO, and other styles
43

Shi, Yongle, Yibing Ma, Xiang Chen, and Jie Gao. "scADCA: An Anomaly Detection-Based scRNA-seq Dataset Cell Type Annotation Method for Identifying Novel Cells." Current Bioinformatics 20 (October 10, 2024). http://dx.doi.org/10.2174/0115748936334071240903064630.

Full text
Abstract:
Background: With the rapid evolution of single-cell RNA sequencing technology, the study of cellular heterogeneity in complex tissues has reached an unprecedented resolution. One critical task of the technology is cell-type annotation. However, challenges persist, particularly in annotating novel cell types. Objective: Current methods rely heavily on well-annotated reference data, using correlation comparisons to determine cell types. However, identifying novel cells remains unstable due to the inherent complexity and heterogeneity of scRNA-seq data and cell types. To address this problem, we propose scADCA, a method based on anomaly detection, for identifying novel cell types and annotating the entire dataset. Methods: The convolutional modules and fully connected networks are integrated into an autoencoder, and the reference dataset is trained to obtain the reconstruction errors. The threshold based on these errors can distinguish between novel and known cells in the query dataset. After novel cells are identified, a multinomial logistic regression model fully annotates the dataset. Results: Using a simulation dataset, three real scRNA-seq pancreatic datasets, and a real scRNA-seq lung cancer cell line dataset, we compare scADCA with six other cell-type annotation methods, demonstrating competitive performance in terms of distinguished accuracy, full accuracy, F1-score, and confusion matrix. Conclusion: In conclusion, the scADCA method can be further improved and expanded to achieve better performance and application effects in cell type annotation, which is helpful to improve the accuracy and reliability of cytology research and promote the development of single-cell omics.
APA, Harvard, Vancouver, ISO, and other styles
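The thresholding idea in the Methods part of the abstract above (fit on the reference, derive a cut-off from the reference reconstruction errors, flag query cells above it) can be sketched as follows. This is an illustration under stated assumptions, not scADCA itself: a linear PCA reconstruction is swapped in for the convolutional autoencoder, the cut-off is taken as a quantile of the reference errors (the abstract does not say how the threshold is chosen), and the later multinomial logistic regression step is omitted. The name `flag_novel_cells`, its parameters, and the synthetic data are hypothetical.

```python
import numpy as np

def flag_novel_cells(reference, query, n_components=2, quantile=0.95):
    """Flag query cells whose reconstruction error exceeds a reference-derived cut-off.

    A PCA reconstruction stands in for the autoencoder (assumption).
    """
    reference = np.asarray(reference, dtype=float)
    query = np.asarray(query, dtype=float)
    mean = reference.mean(axis=0)
    # Principal axes of the centred reference data (rows of vt).
    _, _, vt = np.linalg.svd(reference - mean, full_matrices=False)
    components = vt[:n_components]

    def reconstruction_error(x):
        centered = x - mean
        reconstructed = centered @ components.T @ components
        return np.linalg.norm(centered - reconstructed, axis=1)

    # Cut-off chosen as a quantile of the reference errors (assumption).
    threshold = np.quantile(reconstruction_error(reference), quantile)
    return reconstruction_error(query) > threshold

# Synthetic example: reference cells lie near a 2-D subspace of a 20-gene space.
rng = np.random.default_rng(0)
basis = rng.normal(size=(2, 20))
reference = rng.normal(size=(200, 2)) @ basis + 0.01 * rng.normal(size=(200, 20))
known = rng.normal(size=(5, 2)) @ basis + 0.01 * rng.normal(size=(5, 20))
novel = 5.0 * rng.normal(size=(5, 20))  # cells far from the reference subspace
flags = flag_novel_cells(reference, np.vstack([known, novel]))
print(flags)
```

With a quantile-based cut-off, a small share of known cells is expected to be flagged by construction, so the quantile trades missed novel cells against false alarms.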
44

Xiong, Yi-Xuan, and Xiao-Fei Zhang. "scDOT: enhancing single-cell RNA-Seq data annotation and uncovering novel cell types through multi-reference integration." Briefings in Bioinformatics 25, no. 2 (January 22, 2024). http://dx.doi.org/10.1093/bib/bbae072.

Full text
Abstract:
The proliferation of single-cell RNA-seq data has greatly enhanced our ability to comprehend the intricate nature of diverse tissues. However, accurately annotating cell types in such data, especially when handling multiple reference datasets and identifying novel cell types, remains a significant challenge. To address these issues, we introduce Single Cell annotation based on Distance metric learning and Optimal Transport (scDOT), an innovative cell-type annotation method adept at integrating multiple reference datasets and uncovering previously unseen cell types. scDOT introduces two key innovations. First, by incorporating distance metric learning and optimal transport, it presents a novel optimization framework. This framework effectively learns the predictive power of each reference dataset for new query data and simultaneously establishes a probabilistic mapping between cells in the query data and reference-defined cell types. Second, scDOT develops an interpretable scoring system based on the acquired probabilistic mapping, enabling the precise identification of previously unseen cell types within the data. To rigorously assess scDOT’s capabilities, we systematically evaluate its performance using two diverse collections of benchmark datasets encompassing various tissues, sequencing technologies and diverse cell types. Our experimental results consistently affirm the superior performance of scDOT in cell-type annotation and the identification of previously unseen cell types. These advancements provide researchers with a potent tool for precise cell-type annotation, ultimately enriching our understanding of complex biological tissues.
APA, Harvard, Vancouver, ISO, and other styles
45

Michielsen, Lieke, Marcel J. T. Reinders, and Ahmed Mahfouz. "Hierarchical progressive learning of cell identities in single-cell data." Nature Communications 12, no. 1 (May 14, 2021). http://dx.doi.org/10.1038/s41467-021-23196-8.

Full text
Abstract:
Supervised methods are increasingly used to identify cell populations in single-cell data. Yet, current methods are limited in their ability to learn from multiple datasets simultaneously, are hampered by the annotation of datasets at different resolutions, and do not preserve annotations when retrained on new datasets. The latter point is especially important as researchers cannot rely on downstream analysis performed using earlier versions of the dataset. Here, we present scHPL, a hierarchical progressive learning method which allows continuous learning from single-cell data by leveraging the different resolutions of annotations across multiple datasets to learn and continuously update a classification tree. We evaluate the classification and tree learning performance using simulated as well as real datasets and show that scHPL can successfully learn known cellular hierarchies from multiple datasets while preserving the original annotations. scHPL is available at https://github.com/lcmmichielsen/scHPL.
APA, Harvard, Vancouver, ISO, and other styles
46

Zhang, Ying, Huaicheng Sun, Wei Zhang, Tingting Fu, Shijie Huang, Minjie Mou, Jinsong Zhang, et al. "CellSTAR: a comprehensive resource for single-cell transcriptomic annotation." Nucleic Acids Research, October 19, 2023. http://dx.doi.org/10.1093/nar/gkad874.

Full text
Abstract:
Large-scale studies of single-cell sequencing and biological experiments have successfully revealed expression patterns that distinguish different cell types in tissues, emphasizing the importance of studying cellular heterogeneity and accurately annotating cell types. Analysis of gene expression profiles in these experiments provides two essential types of data for cell type annotation: annotated references and canonical markers. In this study, the first comprehensive database of single-cell transcriptomic annotation resource (CellSTAR) was thus developed. It is unique in (a) offering the comprehensive expertly annotated reference data for annotating hundreds of cell types for the first time and (b) enabling the collective consideration of reference data and marker genes by incorporating tens of thousands of markers. Given its unique features, CellSTAR is expected to attract broad research interests from the technological innovations in single-cell transcriptomics, the studies of cellular heterogeneity & dynamics, and so on. It is now publicly accessible without any login requirement at: https://idrblab.org/cellstar.
APA, Harvard, Vancouver, ISO, and other styles
47

Shao, Xin, Haihong Yang, Xiang Zhuang, Jie Liao, Penghui Yang, Junyun Cheng, Xiaoyan Lu, Huajun Chen, and Xiaohui Fan. "scDeepSort: a pre-trained cell-type annotation method for single-cell transcriptomics using deep learning with a weighted graph neural network." Nucleic Acids Research, September 9, 2021. http://dx.doi.org/10.1093/nar/gkab775.

Full text
Abstract:
Advances in single-cell RNA sequencing (scRNA-seq) have furthered the simultaneous classification of thousands of cells in a single assay based on transcriptome profiling. In most analysis protocols, single-cell type annotation relies on marker genes or RNA-seq profiles, resulting in poor extrapolation. Still, the accurate cell-type annotation for single-cell transcriptomic data remains a great challenge. Here, we introduce scDeepSort (https://github.com/ZJUFanLab/scDeepSort), a pre-trained cell-type annotation tool for single-cell transcriptomics that uses a deep learning model with a weighted graph neural network (GNN). Using human and mouse scRNA-seq data resources, we demonstrate the high performance and robustness of scDeepSort in labeling 764 741 cells involving 56 human and 32 mouse tissues. Significantly, scDeepSort outperformed other known methods in annotating 76 external test datasets, reaching an 83.79% accuracy across 265 489 cells in humans and mice. Moreover, we demonstrate the universality of scDeepSort using more challenging datasets and using references from different scRNA-seq technology. Above all, scDeepSort is the first attempt to annotate cell types of scRNA-seq data with a pre-trained GNN model, which can realize the accurate cell-type annotation without additional references, i.e. markers or RNA-seq profiles.
APA, Harvard, Vancouver, ISO, and other styles
48

Lee, Sarada M. W., Andrew Shaw, Jodie L. Simpson, David Uminsky, and Luke W. Garratt. "Differential cell counts using center-point networks achieves human-level accuracy and efficiency over segmentation." Scientific Reports 11, no. 1 (August 19, 2021). http://dx.doi.org/10.1038/s41598-021-96067-3.

Full text
Abstract:
Differential cell counts is a challenging task when applying computer vision algorithms to pathology. Existing approaches to train cell recognition require high availability of multi-class segmentation and/or bounding box annotations and suffer in performance when objects are tightly clustered. We present differential count network (“DCNet”), an annotation efficient modality that utilises keypoint detection to locate in brightfield images the centre points of cells (not nuclei) and their cell class. The single centre point annotation for DCNet lowered burden for experts to generate ground truth data by 77.1% compared to bounding box labeling. Yet centre point annotation still enabled high accuracy when training DCNet on a multi-class algorithm on whole cell features, matching human experts in all 5 object classes in average precision and outperforming humans in consistency. The efficacy and efficiency of the DCNet end-to-end system represents significant progress toward an open source, fully computational approach to differential cell count based diagnosis that can be adapted to any pathology need.
APA, Harvard, Vancouver, ISO, and other styles
49

Wang, Yuge, Xingzhi Sun, and Hongyu Zhao. "Benchmarking automated cell type annotation tools for single-cell ATAC-seq data." Frontiers in Genetics 13 (December 13, 2022). http://dx.doi.org/10.3389/fgene.2022.1063233.

Full text
Abstract:
As single-cell chromatin accessibility profiling methods advance, scATAC-seq has become ever more important in the study of candidate regulatory genomic regions and their roles underlying developmental, evolutionary, and disease processes. At the same time, cell type annotation is critical in understanding the cellular composition of complex tissues and identifying potential novel cell types. However, most existing methods that can perform automated cell type annotation are designed to transfer labels from an annotated scRNA-seq data set to another scRNA-seq data set, and it is not clear whether these methods are adaptable to annotate scATAC-seq data. Several methods have been recently proposed for label transfer from scRNA-seq data to scATAC-seq data, but there is a lack of benchmarking study on the performance of these methods. Here, we evaluated the performance of five scATAC-seq annotation methods on both their classification accuracy and scalability using publicly available single-cell datasets from mouse and human tissues including brain, lung, kidney, PBMC, and BMMC. Using the BMMC data as basis, we further investigated the performance of these methods across different data sizes, mislabeling rates, sequencing depths and the number of cell types unique to scATAC-seq. Bridge integration, which is the only method that requires additional multimodal data and does not need gene activity calculation, was overall the best method and robust to changes in data size, mislabeling rate and sequencing depth. Conos was the most time and memory efficient method but performed the worst in terms of prediction accuracy. scJoint tended to assign cells to similar cell types and performed relatively poorly for complex datasets with deep annotations but performed better for datasets only with major label annotations. The performance of scGCN and Seurat v3 was moderate, but scGCN was the most time-consuming method and had the most similar performance to random classifiers for cell types unique to scATAC-seq.
APA, Harvard, Vancouver, ISO, and other styles
50

Quan, Fei, Xin Liang, Mingjiang Cheng, Huan Yang, Kun Liu, Shengyuan He, Shangqin Sun, et al. "Annotation of cell types (ACT): a convenient web server for cell type annotation." Genome Medicine 15, no. 1 (November 3, 2023). http://dx.doi.org/10.1186/s13073-023-01249-5.

Full text
Abstract:
Background: The advancement of single-cell sequencing has progressed our ability to solve biological questions. Cell type annotation is of vital importance to this process, allowing for the analysis and interpretation of enormous single-cell datasets. At present, however, manual cell annotation, which is the predominant approach, remains limited by both speed and the requirement of expert knowledge. Methods: To address these challenges, we constructed a hierarchically organized marker map through manually curating over 26,000 cell marker entries from about 7000 publications. We then developed WISE, a weighted and integrated gene set enrichment method, to integrate the prevalence of canonical markers and ordered differentially expressed genes of specific cell types in the marker map. Benchmarking analysis suggested that our method outperformed state-of-the-art methods. Results: By integrating the marker map and WISE, we developed a user-friendly and convenient web server, ACT (http://xteam.xbio.top/ACT/ or http://biocc.hrbmu.edu.cn/ACT/), which only takes a simple list of upregulated genes as input and provides interactive hierarchy maps, together with well-designed charts and statistical information, to accelerate the assignment of cell identities and make the results comparable to expert manual annotation. Besides, a pan-tissue marker map was constructed to assist in cell assignments in less-studied tissues. Applying ACT to three case studies showed that all cell clusters were quickly and accurately annotated, and multi-level and more refined cell types were identified. Conclusions: We developed a knowledge-based resource and a corresponding method, together with an intuitive graphical web interface, for cell type annotation. We believe that ACT, emerging as a powerful tool for cell type annotation, would be widely used in single-cell research and considerably accelerate the process of cell type identification.
APA, Harvard, Vancouver, ISO, and other styles