Journal articles on the topic 'Model annotation'

Consult the top 50 journal articles for your research on the topic 'Model annotation.'

1

Liu, Zheng. "LDA-Based Automatic Image Annotation Model." Advanced Materials Research 108-111 (May 2010): 88–94. http://dx.doi.org/10.4028/www.scientific.net/amr.108-111.88.

Abstract:
This paper presents LDA-based automatic image annotation by visual topic learning and related annotation extension. We introduce the Latent Dirichlet Allocation (LDA) model into the visual application domain. First, the visual topic most relevant to the unlabeled image is obtained. According to this visual topic, the annotations with the highest likelihood serve as seed annotations. Next, the seed annotations are extended by analyzing the relationship between them and related Flickr tags. Finally, we combine the seed annotations and the extended annotations to construct the final annotation set. Experiments conducted on the Corel5K dataset demonstrate the effectiveness of the proposed model.
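As a purely illustrative aside (not the authors' code), the following Python sketch shows the first step described above: fitting an LDA model to bag-of-visual-words counts with scikit-learn and reading off the most relevant visual topic for an unlabeled image. The visual-word histograms and the tag/topic co-occurrence table that stands in for the Flickr-tag statistics are randomly generated.

import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

rng = np.random.default_rng(0)
n_images, vocab_size, n_topics = 200, 500, 10
bovw_counts = rng.poisson(1.0, size=(n_images, vocab_size))      # bag-of-visual-words histograms

lda = LatentDirichletAllocation(n_components=n_topics, random_state=0)
lda.fit(bovw_counts)

unlabeled = rng.poisson(1.0, size=(1, vocab_size))
best_topic = int(np.argmax(lda.transform(unlabeled)[0]))         # most relevant visual topic

tags = [f"tag_{i}" for i in range(50)]
tag_topic_counts = rng.poisson(2.0, size=(len(tags), n_topics))  # invented tag/topic statistics
seed_annotations = [tags[i] for i in np.argsort(-tag_topic_counts[:, best_topic])[:5]]
print(best_topic, seed_annotations)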
2

Paun, Silviu, Bob Carpenter, Jon Chamberlain, Dirk Hovy, Udo Kruschwitz, and Massimo Poesio. "Comparing Bayesian Models of Annotation." Transactions of the Association for Computational Linguistics 6 (December 2018): 571–85. http://dx.doi.org/10.1162/tacl_a_00040.

Abstract:
The analysis of crowdsourced annotations in natural language processing is concerned with identifying (1) gold standard labels, (2) annotator accuracies and biases, and (3) item difficulties and error patterns. Traditionally, majority voting was used for (1), and coefficients of agreement for (2) and (3). Lately, model-based analysis of corpus annotations has proven better at all three tasks. But there has been relatively little work comparing them on the same datasets. This paper aims to fill this gap by analyzing six models of annotation, covering different approaches to annotator ability, item difficulty, and parameter pooling (tying) across annotators and items. We evaluate these models along four aspects: comparison to gold labels, predictive accuracy for new annotations, annotator characterization, and item difficulty, using four datasets with varying degrees of noise in the form of random (spammy) annotators. We conclude with guidelines for model selection, application, and implementation.
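For orientation, here is a minimal Python sketch of the baseline these annotation models improve on: majority voting for gold labels plus a crude per-annotator accuracy estimated against the majority vote. The (item, annotator, label) triples are toy data, not from the paper's datasets.

from collections import Counter, defaultdict

annotations = [
    ("i1", "a1", "POS"), ("i1", "a2", "POS"), ("i1", "a3", "NEG"),
    ("i2", "a1", "NEG"), ("i2", "a2", "NEG"), ("i2", "a3", "NEG"),
]

by_item = defaultdict(list)
for item, annotator, label in annotations:
    by_item[item].append((annotator, label))

# Majority vote as the estimated gold label for each item.
majority = {item: Counter(l for _, l in votes).most_common(1)[0][0]
            for item, votes in by_item.items()}

# Crude annotator accuracy measured against the majority vote.
correct, total = defaultdict(int), defaultdict(int)
for item, votes in by_item.items():
    for annotator, label in votes:
        total[annotator] += 1
        correct[annotator] += int(label == majority[item])

accuracy = {a: correct[a] / total[a] for a in total}
print(majority, accuracy)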
3

Misirli, Goksel, Matteo Cavaliere, William Waites, Matthew Pocock, Curtis Madsen, Owen Gilfellon, Ricardo Honorato-Zimmer, Paolo Zuliani, Vincent Danos, and Anil Wipat. "Annotation of rule-based models with formal semantics to enable creation, analysis, reuse and visualization." Bioinformatics 32, no. 6 (November 11, 2015): 908–17. http://dx.doi.org/10.1093/bioinformatics/btv660.

Abstract:
Motivation: Biological systems are complex and challenging to model and therefore model reuse is highly desirable. To promote model reuse, models should include both information about the specifics of simulations and the underlying biology in the form of metadata. The availability of computationally tractable metadata is especially important for the effective automated interpretation and processing of models. Metadata are typically represented as machine-readable annotations which enhance programmatic access to information about models. Rule-based languages have emerged as a modelling framework to represent the complexity of biological systems. Annotation approaches have been widely used for reaction-based formalisms such as SBML. However, rule-based languages still lack a rich annotation framework to add semantic information, such as machine-readable descriptions, to the components of a model. Results: We present an annotation framework and guidelines for annotating rule-based models, encoded in the commonly used Kappa and BioNetGen languages. We adapt widely adopted annotation approaches to rule-based models. We initially propose a syntax to store machine-readable annotations and describe a mapping between rule-based modelling entities, such as agents and rules, and their annotations. We then describe an ontology to both annotate these models and capture the information contained therein, and demonstrate annotating these models using examples. Finally, we present a proof of concept tool for extracting annotations from a model that can be queried and analyzed in a uniform way. The uniform representation of the annotations can be used to facilitate the creation, analysis, reuse and visualization of rule-based models. Although examples are given using specific implementations, the proposed techniques can be applied to rule-based models in general. Availability and implementation: The annotation ontology for rule-based models can be found at http://purl.org/rbm/rbmo. The krdf tool and associated executable examples are available at http://purl.org/rbm/rbmo/krdf. Contact: anil.wipat@newcastle.ac.uk or vdanos@inf.ed.ac.uk
4

Li, Huadong, Ying Wei, Han Peng, and Wei Zhang. "DiffuPrompter: Pixel-Level Automatic Annotation for High-Resolution Remote Sensing Images with Foundation Models." Remote Sensing 16, no. 11 (June 2, 2024): 2004. http://dx.doi.org/10.3390/rs16112004.

Abstract:
Instance segmentation is pivotal in remote sensing image (RSI) analysis, aiding in many downstream tasks. However, annotating images with pixel-wise annotations is time-consuming and laborious. Despite some progress in automatic annotation, the performance of existing methods still needs improvement due to the high precision requirements for pixel-level annotation and the complexity of RSIs. With the support of large-scale data, some foundational models have made significant progress in semantic understanding and generalization capabilities. In this paper, we delve deep into the potential of the foundational models in automatic annotation and propose a training-free automatic annotation method called DiffuPrompter, achieving pixel-level automatic annotation of RSIs. Extensive experimental results indicate that the proposed method can provide reliable pseudo-labels, significantly reducing the annotation costs of the segmentation task. Additionally, the cross-domain validation experiments confirm the powerful effectiveness of large-scale pseudo-data in improving model generalization performance.
5

Chu, Zhendong, Jing Ma, and Hongning Wang. "Learning from Crowds by Modeling Common Confusions." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 7 (May 18, 2021): 5832–40. http://dx.doi.org/10.1609/aaai.v35i7.16730.

Abstract:
Crowdsourcing provides a practical way to obtain large amounts of labeled data at a low cost. However, the annotation quality of annotators varies considerably, which imposes new challenges in learning a high-quality model from the crowdsourced annotations. In this work, we provide a new perspective to decompose annotation noise into common noise and individual noise and differentiate the source of confusion based on instance difficulty and annotator expertise on a per-instance-annotator basis. We realize this new crowdsourcing model by an end-to-end learning solution with two types of noise adaptation layers: one is shared across annotators to capture their commonly shared confusions, and the other one is pertaining to each annotator to realize individual confusion. To recognize the source of noise in each annotation, we use an auxiliary network to choose from the two noise adaptation layers with respect to both instances and annotators. Extensive experiments on both synthesized and real-world benchmarks demonstrate the effectiveness of our proposed common noise adaptation solution.
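A schematic numpy sketch of the idea above (not the authors' implementation): a clean class distribution is pushed through either a shared common-confusion matrix or an annotator-specific one, with a fixed mixing weight standing in for the auxiliary selection network. All matrices and weights below are invented.

import numpy as np

p_clean = np.array([0.7, 0.2, 0.1])                 # classifier's clean prediction over 3 classes

T_common = np.array([[0.8, 0.1, 0.1],               # confusion shared by all annotators
                     [0.1, 0.8, 0.1],
                     [0.1, 0.1, 0.8]])
T_annotator = np.array([[0.6, 0.3, 0.1],            # one annotator's individual confusion
                        [0.2, 0.7, 0.1],
                        [0.1, 0.2, 0.7]])

alpha = 0.5                                          # stand-in for the auxiliary network's per-instance choice
p_noisy = alpha * (T_common.T @ p_clean) + (1 - alpha) * (T_annotator.T @ p_clean)
print(p_noisy, p_noisy.sum())                        # predicted distribution over the observed (noisy) label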
6

Rotman, Guy, and Roi Reichart. "Multi-task Active Learning for Pre-trained Transformer-based Models." Transactions of the Association for Computational Linguistics 10 (2022): 1209–28. http://dx.doi.org/10.1162/tacl_a_00515.

Abstract:
Multi-task learning, in which several tasks are jointly learned by a single model, allows NLP models to share information from multiple annotations and may facilitate better predictions when the tasks are inter-related. This technique, however, requires annotating the same text with multiple annotation schemes, which may be costly and laborious. Active learning (AL) has been demonstrated to optimize annotation processes by iteratively selecting unlabeled examples whose annotation is most valuable for the NLP model. Yet, multi-task active learning (MT-AL) has not been applied to state-of-the-art pre-trained Transformer-based NLP models. This paper aims to close this gap. We explore various multi-task selection criteria in three realistic multi-task scenarios, reflecting different relations between the participating tasks, and demonstrate the effectiveness of multi-task compared to single-task selection. Our results suggest that MT-AL can be effectively used in order to minimize annotation efforts for multi-task NLP models.
7

Luo, Yan, Tianxiu Lu, Weihan Zhang, Suiqun Li, and Xuefeng Wang. "Augmenting Three-Dimensional Model Annotation System with Enhanced Reality." Journal of Computing and Electronic Information Management 12, no. 2 (March 30, 2024): 1–7. http://dx.doi.org/10.54097/uv15ws76.

Abstract:
This study proposes an augmented reality-based three-dimensional model annotation system, integrating cloud anchors, three-dimensional reconstruction, and augmented reality technology to achieve explicit three-dimensional annotations on models. Employing an improved ORB algorithm, the annotated model is persistently anchored in three-dimensional space through cloud anchors, presenting accurate spatial information and showcasing the depth of scenes and relationships between elements. The system supports multiple data types for annotations, such as text and images. Through a comparison with traditional two-dimensional annotation in a drone experiment, the system demonstrates higher experimental efficiency, providing more intuitive annotation guidance and enhancing remote guidance efficiency and user understanding of drones.
8

Filali, Jalila, Hajer Baazaoui Zghal, and Jean Martinet. "Ontology-Based Image Classification and Annotation." International Journal of Pattern Recognition and Artificial Intelligence 34, no. 11 (March 16, 2020): 2040002. http://dx.doi.org/10.1142/s0218001420400029.

Abstract:
With the rapid growth of image collections, image classification and annotation have been active areas of research with notable recent progress. The Bag-of-Visual-Words (BoVW) model, which relies on building a visual vocabulary, has been widely used in this area. Recently, attention has shifted to the use of advanced architectures which are characterized by multi-level processing. The Hierarchical Max-Pooling (HMAX) model has attracted a great deal of attention in image classification. To improve image classification and annotation, several approaches based on ontologies have been proposed. However, image classification and annotation remain a challenging problem due to many related issues, such as the problem of ambiguity between classes. This problem can affect the quality of both classification and annotation results. In this paper, we propose an ontology-based image classification and annotation approach. Our contributions consist of the following: (1) exploiting ontological relationships between classes during both the image classification and annotation processes; (2) combining the outputs of hypernym–hyponym classifiers to lead to better discrimination between classes; and (3) annotating images by combining hypernym and hyponym classification results in order to improve image annotation and to reduce ambiguous and inconsistent annotations. The aim is to improve image classification and annotation by using ontologies. Several strategies have been experimented with, and the obtained results have shown that our proposal improves image classification and annotation.
9

Wu, Xian, Wei Fan, and Yong Yu. "Sembler: Ensembling Crowd Sequential Labeling for Improved Quality." Proceedings of the AAAI Conference on Artificial Intelligence 26, no. 1 (September 20, 2021): 1713–19. http://dx.doi.org/10.1609/aaai.v26i1.8351.

Abstract:
Many natural language processing tasks, such as named entity recognition (NER), part-of-speech (POS) tagging, and word segmentation, can be formulated as sequential data labeling problems. Building a sound labeler requires a very large number of correctly labeled training examples, which may not always be possible. On the other hand, crowdsourcing provides an inexpensive yet efficient alternative to collect manual sequential labeling from non-experts. However, the quality of crowd labeling cannot be guaranteed, and three kinds of errors are typical: (1) incorrect annotations due to lack of expertise (e.g., labeling gene names from plain text requires corresponding domain knowledge); (2) ignored or omitted annotations due to carelessness or low confidence; (3) noisy annotations due to cheating or vandalism. To correct these mistakes, we present Sembler, a statistical model for ensembling crowd sequential labelings. Sembler considers three types of statistical information: (1) the majority agreement that proves the correctness of an annotation; (2) correct annotation that improves the credibility of the corresponding annotator; (3) correct annotation that enhances the correctness of other annotations which share similar linguistic or contextual features. We evaluate the proposed model on a real Twitter dataset and a synthetic biological dataset, and find that Sembler is particularly accurate when more than half of the annotators make mistakes.
10

VanBerlo, Bennett, Delaney Smith, Jared Tschirhart, Blake VanBerlo, Derek Wu, Alex Ford, Joseph McCauley, et al. "Enhancing Annotation Efficiency with Machine Learning: Automated Partitioning of a Lung Ultrasound Dataset by View." Diagnostics 12, no. 10 (September 28, 2022): 2351. http://dx.doi.org/10.3390/diagnostics12102351.

Abstract:
Background: Annotating large medical imaging datasets is an arduous and expensive task, especially when the datasets in question are not organized according to deep learning goals. Here, we propose a method that exploits the hierarchical organization of annotating tasks to optimize efficiency. Methods: We trained a machine learning model to accurately distinguish between one of two classes of lung ultrasound (LUS) views using 2908 clips from a larger dataset. Partitioning the remaining dataset by view would reduce downstream labelling efforts by enabling annotators to focus on annotating pathological features specific to each view. Results: In a sample view-specific annotation task, we found that automatically partitioning a 780-clip dataset by view saved 42 min of manual annotation time and resulted in 55±6 additional relevant labels per hour. Conclusions: Automatic partitioning of a LUS dataset by view significantly increases annotator efficiency, resulting in higher throughput relevant to the annotating task at hand. The strategy described in this work can be applied to other hierarchical annotation schemes.
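A toy Python sketch of the triage workflow described above; the view classifier and its interface are assumptions, not the paper's model, and the view names are placeholders.

from collections import defaultdict

class DummyViewModel:
    """Stand-in for the trained lung-ultrasound view classifier (assumed interface)."""
    def predict_proba(self, clip_id):
        return {"view_A": 0.95, "view_B": 0.05} if clip_id % 2 else {"view_A": 0.2, "view_B": 0.8}

def partition_by_view(clips, view_model, threshold=0.9):
    # Route confidently classified clips to view-specific annotation queues;
    # low-confidence clips fall back to manual triage.
    buckets = defaultdict(list)
    for clip in clips:
        probs = view_model.predict_proba(clip)
        view, p = max(probs.items(), key=lambda kv: kv[1])
        buckets[view if p >= threshold else "manual_review"].append(clip)
    return buckets

print(dict(partition_by_view(range(6), DummyViewModel())))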
11

Pozharkova, I. N. "Context-Dependent Annotation Method in Emergency Monitoring Information Systems." Informacionnye Tehnologii 28, no. 1 (January 17, 2022): 43–47. http://dx.doi.org/10.17587/it.28.43-47.

Abstract:
The article presents a method of context-dependent annotation used for emergency-monitoring tasks in information systems. The method is based on a spectral language model that supports various information-search problems while taking into account the specific features of the application area. A functional model of the emergency-monitoring task in IDEF0 notation is presented. The task of context-dependent annotation of operational summaries, as a basis for generating preliminary reports, is formulated. The main problems that arise when solving this task on a large volume of initial data, and that can be critical for fast-developing emergencies, are identified. The problem of context-dependent annotation under the existing constraints is stated, and the main language units used in solving it are described. A flowchart for context-dependent annotation that accounts for the specifics of the subject area is presented, and the implementation of each stage of this algorithm is described in detail. A method for determining the relevance of a text fragment to a target query based on the spectral language model is described, along with basic measures of annotation quality. A comparative analysis of the quality and speed of constructing annotations manually and with the presented method was carried out using assessor estimates. The method is shown to be effective for processing large numbers of documents under fast-developing emergency situations that require urgent decision-making.
12

Bauer, Matthias, and Angelika Zirker. "Explanatory Annotation of Literary Texts and the Reader: Seven Types of Problems." International Journal of Humanities and Arts Computing 11, no. 2 (October 2017): 212–32. http://dx.doi.org/10.3366/ijhac.2017.0193.

Abstract:
While most literary scholars wish to help readers understand literary texts by providing them with explanatory annotations, we want to go a step further and enable them, on the basis of structured information, to arrive at interpretations of their own. We therefore seek to establish a concept of explanatory annotation that is reader-oriented and combines hermeneutics with the opportunities provided by digital methods. In a first step, we are going to present a few examples of existing annotations that apparently do not take into account readerly needs. To us, they represent seven types of common problems in explanatory annotation. We then introduce a possible model of best practice which is based on categories and structured along the lines of the following questions: What kind(s) of annotations do improve text comprehension? Which contexts must be considered when annotating? Is it possible to develop a concept of the reader on the basis of annotations—and can, in turn, annotations address a particular kind of readership, i.e.: in how far can annotations be(come) individualised?
13

Wood, Valerie, Seth Carbon, Midori A. Harris, Antonia Lock, Stacia R. Engel, David P. Hill, Kimberly Van Auken, et al. "Term Matrix: a novel Gene Ontology annotation quality control system based on ontology term co-annotation patterns." Open Biology 10, no. 9 (September 2020): 200149. http://dx.doi.org/10.1098/rsob.200149.

Abstract:
Biological processes are accomplished by the coordinated action of gene products. Gene products often participate in multiple processes, and can therefore be annotated to multiple Gene Ontology (GO) terms. Nevertheless, processes that are functionally, temporally and/or spatially distant may have few gene products in common, and co-annotation to unrelated processes probably reflects errors in literature curation, ontology structure or automated annotation pipelines. We have developed an annotation quality control workflow that uses rules based on mutually exclusive processes to detect annotation errors, based on and validated by case studies including the three we present here: fission yeast protein-coding gene annotations over time; annotations for cohesin complex subunits in human and model species; and annotations using a selected set of GO biological process terms in human and five model species. For each case study, we reviewed available GO annotations, identified pairs of biological processes which are unlikely to be correctly co-annotated to the same gene products (e.g. amino acid metabolism and cytokinesis), and traced erroneous annotations to their sources. To date we have generated 107 quality control rules, and corrected 289 manual annotations in eukaryotes and over 52 700 automatically propagated annotations across all taxa.
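A minimal Python sketch of the rule idea (the GO term IDs are shown purely for illustration and the gene names are invented): flag any gene product co-annotated to both members of a mutually exclusive process pair.

mutually_exclusive = [
    ("GO:0006520", "GO:0000910"),   # e.g. amino acid metabolism vs. cytokinesis (illustrative pair)
]

annotations = {
    "geneA": {"GO:0006520", "GO:0000910"},   # suspicious co-annotation
    "geneB": {"GO:0006520"},
}

def qc_violations(gene_annotations, rules):
    # Yield (gene, rule) pairs whenever both mutually exclusive terms are annotated to one gene.
    for gene, terms in gene_annotations.items():
        for t1, t2 in rules:
            if t1 in terms and t2 in terms:
                yield gene, (t1, t2)

print(list(qc_violations(annotations, mutually_exclusive)))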
14

Hayat, Hassan, Carles Ventura, and Agata Lapedriza. "Modeling Subjective Affect Annotations with Multi-Task Learning." Sensors 22, no. 14 (July 13, 2022): 5245. http://dx.doi.org/10.3390/s22145245.

Abstract:
In supervised learning, the generalization capabilities of trained models are based on the available annotations. Usually, multiple annotators are asked to annotate the dataset samples and, then, the common practice is to aggregate the different annotations by computing average scores or majority voting, and train and test models on these aggregated annotations. However, this practice is not suitable for all types of problems, especially when the subjective information of each annotator matters for the task modeling. For example, emotions experienced while watching a video or evoked by other sources of content, such as news headlines, are subjective: different individuals might perceive or experience different emotions. The aggregated annotations in emotion modeling may lose the subjective information and actually represent an annotation bias. In this paper, we highlight the weaknesses of models that are trained on aggregated annotations for modeling tasks related to affect. More concretely, we compare two generic Deep Learning architectures: a Single-Task (ST) architecture and a Multi-Task (MT) architecture. While the ST architecture models single emotional perception each time, the MT architecture jointly models every single annotation and the aggregated annotations at once. Our results show that the MT approach can more accurately model every single annotation and the aggregated annotations when compared to methods that are directly trained on the aggregated annotations. Furthermore, the MT approach achieves state-of-the-art results on the COGNIMUSE, IEMOCAP, and SemEval_2007 benchmarks.
15

Rao, Xun, Jiasheng Wang, Wenjing Ran, Mengzhu Sun, and Zhe Zhao. "Deep-Learning-Based Annotation Extraction Method for Chinese Scanned Maps." ISPRS International Journal of Geo-Information 12, no. 10 (October 14, 2023): 422. http://dx.doi.org/10.3390/ijgi12100422.

Abstract:
One of a map’s fundamental elements is its annotations, and extracting these annotations is an important step in enabling machine intelligence to understand scanned map data. Due to the complexity of the characters and lines, extracting annotations from scanned Chinese maps is difficult, and there is currently little research in this area. A deep-learning-based framework for extracting annotations from scanned Chinese maps is presented in this paper. An improved EAST annotation detection model and a CRNN annotation recognition model based on transfer learning make up the two primary parts of this framework. Several sets of comparative tests for annotation detection and recognition were created in order to assess the efficacy of this method for extracting annotations from scanned Chinese maps. The experimental findings show the following: (i) The proposed annotation detection approach achieved precision, recall, and h-mean values of 0.8990, 0.8389, and 0.8635, respectively. These measures demonstrate improvements over the currently popular models of −0.0354 to 0.0907, 0.0131 to 0.2735, and 0.0467 to 0.1919, respectively. (ii) The proposed annotation recognition method achieved precision, recall, and h-mean values of 0.9320, 0.8956, and 0.9134, respectively. These measurements demonstrate improvements over the currently popular models of 0.0294 to 0.1049, 0.0498 to 0.1975, and 0.0402 to 0.1582, respectively.
16

Attik, Mohammed, Malik Missen, Mickaël Coustaty, Gyu Choi, Fahd Alotaibi, Nadeem Akhtar, Muhammad Jhandir, V. Prasath, Nadeem Salamat, and Mujtaba Husnain. "OpinionML—Opinion Markup Language for Sentiment Representation." Symmetry 11, no. 4 (April 15, 2019): 545. http://dx.doi.org/10.3390/sym11040545.

Abstract:
It is the age of the social web, where people express themselves by giving their opinions about various issues, from their personal lives to the world’s political issues. This process generates a lot of opinion data on the web that can be processed for valuable information, and therefore semantic annotation of opinions becomes an important task. Unfortunately, existing opinion annotation schemes have failed to satisfy annotation challenges and cannot even adhere to the basic definition of opinion. Opinion holders, topical features and temporal expressions are major components of an opinion that remain ignored in existing annotation schemes. In this work, we propose OpinionML, a new markup language that aims to compensate for the issues that existing opinion markup languages fail to resolve. We present a detailed discussion of existing annotation schemes and their associated problems. We argue that OpinionML is more robust, more flexible and easier to use for annotating opinion data. Its modular approach, implemented over a logical model, provides a flexible and simpler model of annotation. OpinionML can be considered a step towards “information symmetry” and an effort towards consistent sentiment annotations across the research community. We perform experiments to prove the robustness of the proposed OpinionML, and the results demonstrate its capability of retrieving the significant components of opinion segments. We also propose an OpinionML ontology in an effort to make OpinionML more interoperable. The proposed ontology is more complete than existing opinion ontologies like Marl and Onyx, and a comprehensive comparison with these sentiment ontologies proves its worth.
17

Li, Wei, Haiyu Song, Hongda Zhang, Houjie Li, and Pengjie Wang. "The Image Annotation Refinement in Embedding Feature Space based on Mutual Information." International Journal of Circuits, Systems and Signal Processing 16 (January 10, 2022): 191–201. http://dx.doi.org/10.46300/9106.2022.16.23.

Abstract:
The ever-increasing size of image collections has made automatic image annotation one of the most important tasks in the fields of machine learning and computer vision. Despite continuous efforts in inventing new annotation algorithms and new models, the results of state-of-the-art image annotation methods are often unsatisfactory. In this paper, to further improve annotation refinement performance, a novel approach based on weighted mutual information to automatically refine the original annotations of images is proposed. Unlike traditional refinement models that use only visual features, the proposed model uses semantic embedding to properly map labels and visual features to a meaningful semantic space. To accurately measure the relevance between a particular image and its original annotations, the proposed model utilizes all available information, including image-to-image, label-to-label and image-to-label relationships. Experimental results conducted on three typical datasets show not only the validity of the refinement, but also the superiority of the proposed algorithm over existing ones. The improvement largely benefits from our proposed mutual information method and from utilizing all available information.
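A rough Python sketch of the scoring idea only, not the paper's weighted mutual information formulation: candidate labels are ranked by a mix of image-to-label and label-to-label similarity in a shared embedding space. The image-to-image term is omitted, and the embeddings and weights are random placeholders.

import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def rank_labels(image_vec, original_labels, label_vecs, w_img=0.6, w_lbl=0.4):
    # Score every candidate label against the image and against the image's original labels.
    scores = {}
    for lbl, vec in label_vecs.items():
        others = [cosine(vec, label_vecs[o]) for o in original_labels if o != lbl]
        scores[lbl] = w_img * cosine(image_vec, vec) + w_lbl * (np.mean(others) if others else 0.0)
    return sorted(scores, key=scores.get, reverse=True)

# Toy usage with random embeddings.
rng = np.random.default_rng(1)
label_vecs = {name: rng.normal(size=8) for name in ["sky", "sea", "car", "tree"]}
image_vec = rng.normal(size=8)
print(rank_labels(image_vec, ["sky", "sea"], label_vecs))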
18

Cooling, Michael T., and Peter Hunter. "The CellML Metadata Framework 2.0 Specification." Journal of Integrative Bioinformatics 12, no. 2 (June 1, 2015): 86–103. http://dx.doi.org/10.1515/jib-2015-260.

Abstract:
Summary: The CellML Metadata Framework 2.0 is a modular framework that describes how semantic annotations should be made about mathematical models encoded in the CellML (www.cellml.org) format, and their elements. In addition to the Core specification, there are several satellite specifications, each designed to cater for model annotation in a different context. Basic Model Information, Citation, License and Biological Annotation specifications are presented.
19

Meunier, Loïc, Denis Baurain, and Luc Cornet. "AMAW: automated gene annotation for non-model eukaryotic genomes." F1000Research 12 (February 16, 2023): 186. http://dx.doi.org/10.12688/f1000research.129161.1.

Abstract:
Background: The annotation of genomes is a crucial step in the analysis of new genomic data and the insights that result from it, especially for emerging organisms, which give researchers access to unexplored lineages and expand our knowledge of poorly represented taxonomic groups. Complete pipelines for eukaryotic genome annotation have been proposed for more than a decade, but the issue is still challenging. One of the most widely used tools in the field is MAKER2, an annotation pipeline using experimental evidence (mRNA-seq and proteins) and combining different gene prediction tools. MAKER2 enables individual laboratories and small-scale projects to annotate non-model organisms for which pre-existing gene models are not available. The optimal use of MAKER2 requires gathering evidence data (by searching and assembling transcripts, and/or collecting homologous proteins from related organisms), elaborating the best annotation strategy (training of gene models) and efficiently orchestrating the different steps of the software in a grid computing environment, which is tedious, time-consuming and requires a great deal of bioinformatics skill. Methods: To address these issues, we present AMAW (Automated MAKER2 Annotation Wrapper), a wrapper pipeline for MAKER2 that automates the above-mentioned tasks. Importantly, AMAW also exists as a Singularity container recipe that is easy to deploy on a grid computer, thereby overcoming the tricky installation of MAKER2. Use case: The performance of AMAW is illustrated through the annotation of a selection of 32 protist genomes, for which we compared its annotations with those produced with gene models directly available in AUGUSTUS. Conclusions: Importantly, AMAW also exists as a Singularity container recipe that is easy to deploy on a grid computer, thereby overcoming the tricky installation of MAKER2.
20

Yeh, Eric, William Jarrold, and Joshua Jordan. "Leveraging Psycholinguistic Resources and Emotional Sequence Models for Suicide Note Emotion Annotation." Biomedical Informatics Insights 5s1 (January 2012): BII.S8979. http://dx.doi.org/10.4137/bii.s8979.

Abstract:
We describe the submission entered by SRI International and UC Davis for the I2B2 NLP Challenge Track 2. Our system is based on a machine learning approach and employs a combination of lexical, syntactic, and psycholinguistic features. In addition, we model the sequence and locations of occurrence of emotions found in the notes. We discuss the effect of these features on the emotion annotation task, as well as the nature of the notes themselves. We also explore the use of bootstrapping to help account for what appeared to be annotator fatigue in the data. We conclude with a discussion of future avenues for improving the approach to this task, and also discuss how annotations at the word-span level may be more appropriate for this task than annotations at the sentence level.
21

Nichyporuk, Brennan, Jillian Cardinell, Justin Szeto, Raghav Mehta, Jean-Pierre Falet, Douglas L. Arnold, Sotirios A. Tsaftaris, and Tal Arbel. "Rethinking Generalization: The Impact of Annotation Style on Medical Image Segmentation." Machine Learning for Biomedical Imaging 1, December 2022 (December 15, 2022): 1–37. http://dx.doi.org/10.59275/j.melba.2022-2d93.

Abstract:
Generalization is an important attribute of machine learning models, particularly for those that are to be deployed in a medical context, where unreliable predictions can have real world consequences. While the failure of models to generalize across datasets is typically attributed to a mismatch in the data distributions, performance gaps are often a consequence of biases in the "ground-truth" label annotations. This is particularly important in the context of medical image segmentation of pathological structures (e.g. lesions), where the annotation process is much more subjective and affected by a number of underlying factors, including the annotation protocol, rater education/experience, and clinical aims, among others. In this paper, we show that modeling annotation biases, rather than ignoring them, poses a promising way of accounting for differences in annotation style across datasets. To this end, we propose a generalized conditioning framework to (1) learn and account for different annotation styles across multiple datasets using a single model, (2) identify similar annotation styles across different datasets in order to permit their effective aggregation, and (3) fine-tune a fully trained model to a new annotation style with just a few samples. Next, we present an image-conditioning approach to model annotation styles that correlate with specific image features, potentially enabling detection biases to be more easily identified.
22

Ma, Qin Yi, Li Hua Song, Da Peng Xie, and Mao Jun Zhou. "Development of CAD Model Annotation System Based on Design Intent." Applied Mechanics and Materials 863 (February 2017): 368–72. http://dx.doi.org/10.4028/www.scientific.net/amm.863.368.

Abstract:
Most product design on the market is variant or adaptive design, which needs to reuse existing product design knowledge. A key aspect of reusing an existing CAD model is to correctly define and understand the design intent behind it, and this paper introduces a CAD model annotation system based on design intent. Design intents containing all the design information of the entire life cycle, from modeling and analysis to manufacturing, are marked onto the CAD model using the PMI module in UG to improve the readability of the model. Second, given problems such as management difficulties and the lack of filter and retrieval functions, the paper proposes an annotation manager system based on UG redevelopment, with filtering, retrieval, grouping and other functions that reduce clutter among the 3D annotations and make it convenient for users to view all the kinds of annotations they need. Finally, design information is represented both internally within the 3D model and externally in an XML file.
23

Mannai, Zayneb, Anis Kalboussi, and Ahmed Hadj Kacem. "Towards a Standard of Modelling Annotations in the E-Health Domain." Health Informatics - An International Journal 10, no. 04 (November 30, 2021): 1–10. http://dx.doi.org/10.5121/hiij.2021.10401.

Abstract:
A large number of annotation systems in the e-health domain have been implemented in the literature. Several factors distinguish these systems from one another. In fact, each of these systems is based on a separate paradigm, resulting in a disorganized and unstructured vision. As part of our research, we attempted to categorize them based on the functionalities provided by each system, and we also proposed a model of annotations that integrates both the health professional and the patient in the process of annotating the medical file.
24

Baley, Julien. "Leveraging graph algorithms to speed up the annotation of large rhymed corpora." Cahiers de Linguistique Asie Orientale 51, no. 1 (March 17, 2022): 46–80. http://dx.doi.org/10.1163/19606028-bja10019.

Abstract:
Rhyming patterns play a crucial role in the phonological reconstruction of earlier stages of Chinese. The past few years have seen the emergence of the use of graphs to model rhyming patterns, notably with List’s (2016) proposal to use graph community detection as a way to go beyond the limits of the link-and-bind method and test new hypotheses regarding phonological reconstruction. List’s approach requires the existence of a rhyme-annotated corpus; such corpora are rare and prohibitively expensive to produce. The present paper solves this problem by introducing several strategies to automate annotation. Among others, the main contribution is the use of graph community detection itself to build an automatic annotator. This annotator requires no previous annotation, no knowledge of phonology, and automatically adapts to corpora of different periods by learning their rhyme categories. Through a series of case studies, we demonstrate the viability of the approach in quickly annotating hundreds of thousands of poems with high accuracy.
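A minimal networkx sketch of the community-detection idea (the rhyme pairs below are invented examples, not taken from the paper's corpora): characters that rhyme together become weighted edges, and the detected communities approximate rhyme categories.

import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

rhyme_pairs = [("东", "风"), ("东", "中"), ("风", "中"),   # one putative rhyme group
               ("山", "间"), ("山", "还")]                  # another
G = nx.Graph()
for a, b in rhyme_pairs:
    # Accumulate co-occurrence counts as edge weights.
    w = G[a][b]["weight"] + 1 if G.has_edge(a, b) else 1
    G.add_edge(a, b, weight=w)

communities = greedy_modularity_communities(G, weight="weight")
print([sorted(c) for c in communities])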
25

Salek, Mahyar, Yoram Bachrach, and Peter Key. "Hotspotting — A Probabilistic Graphical Model For Image Object Localization Through Crowdsourcing." Proceedings of the AAAI Conference on Artificial Intelligence 27, no. 1 (June 29, 2013): 1156–62. http://dx.doi.org/10.1609/aaai.v27i1.8465.

Abstract:
Object localization is an image annotation task which consists of finding the location of a target object in an image. It is common to crowdsource annotation tasks and aggregate responses to estimate the true annotation. While for other kinds of annotations consensus is simple and powerful, it cannot be applied to object localization as effectively due to the task's rich answer space and inherent noise in responses. We propose a probabilistic graphical model to localize objects in images based on responses from the crowd. We improve upon natural aggregation methods such as the mean and the median by simultaneously estimating the difficulty level of each question and skill level of every participant. We empirically evaluate our model on crowdsourced data and show that our method outperforms simple aggregators both in estimating the true locations and in ranking participants by their ability. We also propose a simple adaptive sourcing scheme that works well for very sparse datasets.
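For context, a tiny numpy sketch of the simple aggregators the model is compared against, using invented crowd clicks with one outlier; the paper's graphical model additionally estimates per-worker skill and per-image difficulty.

import numpy as np

clicks = np.array([[102, 200], [98, 205], [180, 60], [100, 198]])  # (x, y) guesses; the third is an outlier
print("mean:", clicks.mean(axis=0))          # pulled toward the outlier
print("median:", np.median(clicks, axis=0))  # more robust to the outlier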
26

Zhang, Hansong, Shikun Li, Dan Zeng, Chenggang Yan, and Shiming Ge. "Coupled Confusion Correction: Learning from Crowds with Sparse Annotations." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 15 (March 24, 2024): 16732–40. http://dx.doi.org/10.1609/aaai.v38i15.29613.

Abstract:
As datasets grow larger, accurately annotating them is becoming more impractical due to the expense in both time and money. Therefore, crowd-sourcing has been widely adopted to alleviate the cost of collecting labels, which also inevitably introduces label noise and eventually degrades the performance of the model. To learn from crowd-sourcing annotations, modeling the expertise of each annotator is a common but challenging paradigm, because the annotations collected by crowd-sourcing are usually highly sparse. To alleviate this problem, we propose Coupled Confusion Correction (CCC), where two models are simultaneously trained to correct the confusion matrices learned by each other. Via bi-level optimization, the confusion matrices learned by one model can be corrected by the distilled data from the other. Moreover, we cluster the "annotator groups" who share similar expertise so that their confusion matrices can be corrected together. In this way, the expertise of the annotators, especially of those who provide few labels, can be better captured. Remarkably, we point out that annotation sparsity not only means that the average number of labels is low, but also that there are always some annotators who provide very few labels, which previous works neglect when constructing synthetic crowd-sourcing annotations. Based on that, we propose to use a Beta distribution to control the generation of the crowd-sourcing labels so that the synthetic annotations can be more consistent with real-world ones. Extensive experiments are conducted on two types of synthetic datasets and three real-world datasets, the results of which demonstrate that CCC significantly outperforms state-of-the-art approaches. Source codes are available at: https://github.com/Hansong-Zhang/CCC.
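An illustrative numpy sketch of the sparsity argument (the parameters are arbitrary, not the paper's): drawing each simulated annotator's labelling rate from a Beta distribution ensures that some annotators provide very few labels.

import numpy as np

rng = np.random.default_rng(0)
n_annotators, n_items = 30, 1000
rates = rng.beta(a=0.5, b=3.0, size=n_annotators)        # skewed so that many annotators label rarely
mask = rng.random((n_annotators, n_items)) < rates[:, None]

labels_per_annotator = mask.sum(axis=1)
print(labels_per_annotator.min(), labels_per_annotator.mean(), labels_per_annotator.max())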
27

Pimenov, I. S. "Analyzing Disagreements in Argumentation Annotation of Scientific Texts in Russian Language." NSU Vestnik. Series: Linguistics and Intercultural Communication 21, no. 2 (September 9, 2023): 89–104. http://dx.doi.org/10.25205/1818-7935-2023-21-2-89-104.

Abstract:
This paper presents the analysis of inter-annotator disagreements in modeling argumentation in scientific papers. The aim of the study is to specify annotation guidelines for the typical disagreement cases. The analysis focuses on inter-annotator disagreements at three annotation levels: theses identification, links construction between theses, specification of reasoning models for these links. The dataset contains 20 argumentation annotations for 10 scientific papers from two thematic areas, where two experts have independently annotated each text. These 20 annotations include 917 theses and 773 arguments. The annotation of each text has consisted in modelling its argumentation structure in accordance with Argument Interchange Format. The use of this model results in construction of an oriented graph with two node types (information nodes for statements, scheme nodes for links between them and reasoning models in these links) for an annotated text. Identification of reasoning models follows Walton’s classification. To identify disagreements between annotators, we perform an automatic comparison of graphs that represent an argumentation structure of the same text. This comparison includes three stages: 1) identification of theses that are present in one graph and absent in another; 2) detection of links that connect the corresponding theses between graphs in a different manner; 3) identification of different reasoning models specified for the same links. Next, an expert analysis of the automatically identified discrepancies enables specification of the typical disagreement cases based on the structural properties of argumentation graphs (positioning of theses, configuration of links across statements at different distances in the text, the ratio between the overall frequency of a reasoning model in annotations and the frequency of disagreements over its identification). The study shows that the correspondence values between argumentation graphs reach on average 78 % for theses, 55 % for links, 60 % for reasoning models. Typical disagreement cases include 1) detection of theses expressed in a text without explicit justification; 2) construction of links between theses in the same paragraph or at a distance of four and more paragraphs; 3) identification of two specific reasoning models (connected respectively to the 40 % and 33 % of disagreements); 4) confusion over functionally different schemes due to the perception of links by annotators in different aspects. The study results in formulating annotation guidelines for minimizing typical disagreement cases at each level of argumentation structures.
28

Wu, Aihua. "Ranking Biomedical Annotations with Annotator’s Semantic Relevancy." Computational and Mathematical Methods in Medicine 2014 (2014): 1–11. http://dx.doi.org/10.1155/2014/258929.

Abstract:
Biomedical annotation is a common and effective artifact for researchers to discuss, show opinions, and share discoveries. It is becoming increasingly popular in many online research communities and carries much useful information. Ranking biomedical annotations is a critical problem for data users who want to get information efficiently. As the annotator’s knowledge about the annotated entity normally determines the quality of the annotations, we evaluate this knowledge, that is, the semantic relationship between them, in two ways. The first is extracting relational information from credible websites by mining association rules between an annotator and a biomedical entity. The second is frequent pattern mining from historical annotations, which reveals common features of biomedical entities that an annotator can annotate with high quality. We propose a weighted and concept-extended RDF model to represent an annotator, a biomedical entity, and their background attributes, and merge information from the two ways as the context of an annotator. Based on that, we present a method to rank the annotations by evaluating their correctness according to users’ votes and the semantic relevancy between the annotator and the annotated entity. The experimental results show that the approach is applicable and efficient even when the data set is large.
29

Yuan, Guowen, Ben Kao, and Tien-Hsuan Wu. "CEMA – Cost-Efficient Machine-Assisted Document Annotations." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 9 (June 26, 2023): 11043–50. http://dx.doi.org/10.1609/aaai.v37i9.26308.

Abstract:
We study the problem of semantically annotating textual documents that are complex in the sense that the documents are long, feature rich, and domain specific. Due to their complexity, such annotation tasks require trained human workers, which are very expensive in both time and money. We propose CEMA, a method for deploying machine learning to assist humans in complex document annotation. CEMA estimates the human cost of annotating each document and selects the set of documents to be annotated that strike the best balance between model accuracy and human cost. We conduct experiments on complex annotation tasks in which we compare CEMA against other document selection and annotation strategies. Our results show that CEMA is the most cost-efficient solution for those tasks.
30

Braylan, Alexander, Madalyn Marabella, Omar Alonso, and Matthew Lease. "A General Model for Aggregating Annotations Across Simple, Complex, and Multi-Object Annotation Tasks." Journal of Artificial Intelligence Research 78 (December 11, 2023): 901–73. http://dx.doi.org/10.1613/jair.1.14388.

Abstract:
Human annotations are vital to supervised learning, yet annotators often disagree on the correct label, especially as annotation tasks increase in complexity. A common strategy to improve label quality is to ask multiple annotators to label the same item and then aggregate their labels. To date, many aggregation models have been proposed for simple categorical or numerical annotation tasks, but far less work has considered more complex annotation tasks, such as those involving open-ended, multivariate, or structured responses. Similarly, while a variety of bespoke models have been proposed for specific tasks, our work is the first we are aware of to introduce aggregation methods that generalize across many, diverse complex tasks, including sequence labeling, translation, syntactic parsing, ranking, bounding boxes, and keypoints. This generality is achieved by applying readily available task-specific distance functions, then devising a task-agnostic method to model these distances between labels, rather than the labels themselves. This article presents a unified treatment of our prior work on complex annotation modeling and extends that work with investigation of three new research questions. First, how do complex annotation task and dataset properties impact aggregation accuracy? Second, how should a task owner navigate the many modeling choices in order to maximize aggregation accuracy? Finally, what tests and diagnoses can verify that aggregation models are specified correctly for the given data? To understand how various factors impact accuracy and to inform model selection, we conduct large-scale simulation studies and broad experiments on real, complex datasets. Regarding testing, we introduce the concept of unit tests for aggregation models and present a suite of such tests to ensure that a given model is not mis-specified and exhibits expected behavior. Beyond investigating these research questions above, we discuss the foundational concept and nature of annotation complexity, present a new aggregation model as a conceptual bridge between traditional models and our own, and contribute a new general semisupervised learning method for complex label aggregation that outperforms prior work.
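A simple distance-based baseline in the same spirit, shown only to make the idea concrete (this is not the article's aggregation model): given a task-specific distance function, pick the crowd annotation with the smallest total distance to the others, i.e. a medoid.

def medoid(annotations, distance):
    # The annotation with the smallest summed distance to all other annotations.
    return min(annotations,
               key=lambda a: sum(distance(a, b) for b in annotations if b is not a))

def hamming(a, b):
    # Token-level disagreement count for equal-length sequence labelings.
    return sum(x != y for x, y in zip(a, b))

# Toy usage for a single sequence-labeling item annotated by three workers.
crowd = [["B-PER", "O", "O"], ["B-PER", "I-PER", "O"], ["B-PER", "O", "O"]]
print(medoid(crowd, hamming))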
31

Bilal, Mühenad, Ranadheer Podishetti, Leonid Koval, Mahmoud A. Gaafar, Daniel Grossmann, and Markus Bregulla. "The Effect of Annotation Quality on Wear Semantic Segmentation by CNN." Sensors 24, no. 15 (July 23, 2024): 4777. http://dx.doi.org/10.3390/s24154777.

Abstract:
In this work, we investigate the impact of annotation quality and domain expertise on the performance of Convolutional Neural Networks (CNNs) for semantic segmentation of wear on titanium nitride (TiN) and titanium carbonitride (TiCN) coated end mills. Using an innovative measurement system and customized CNN architecture, we found that domain expertise significantly affects model performance. Annotator 1 achieved maximum mIoU scores of 0.8153 for abnormal wear and 0.7120 for normal wear on TiN datasets, whereas Annotator 3, with the lowest expertise, achieved significantly lower scores. Sensitivity to annotation inconsistencies and model hyperparameters was examined, revealing that models for TiCN datasets showed a higher coefficient of variation (CV) of 16.32% compared to 8.6% for TiN due to the subtle wear characteristics, highlighting the need for optimized annotation policies and high-quality images to improve wear segmentation.
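For reference, a short numpy sketch of the dispersion measure quoted above, the coefficient of variation (CV) of mIoU scores expressed as a percentage; the scores below are placeholders, not the study's data.

import numpy as np

miou_scores = np.array([0.71, 0.64, 0.58, 0.69])      # hypothetical per-model mIoU scores
cv = 100.0 * miou_scores.std(ddof=1) / miou_scores.mean()
print(f"CV = {cv:.2f}%")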
32

Spetale, Flavio E., Javier Murillo, Gabriela V. Villanova, Pilar Bulacio, and Elizabeth Tapia. "FGGA-lnc: automatic gene ontology annotation of lncRNA sequences based on secondary structures." Interface Focus 11, no. 4 (June 11, 2021): 20200064. http://dx.doi.org/10.1098/rsfs.2020.0064.

Abstract:
The study of long non-coding RNAs (lncRNAs), greater than 200 nucleotides, is central to understanding the development and progression of many complex diseases. Unlike proteins, the functionality of lncRNAs is only subtly encoded in their primary sequence. Current in-silico lncRNA annotation methods mostly rely on annotations inferred from interaction networks. But extensive experimental studies are required to build these networks. In this work, we present a graph-based machine learning method called FGGA-lnc for the automatic gene ontology (GO) annotation of lncRNAs across the three GO subdomains. We build upon FGGA (factor graph GO annotation), a computational method originally developed to annotate protein sequences from non-model organisms. In the FGGA-lnc version, a coding-based approach is introduced to fuse primary sequence and secondary structure information of lncRNA molecules. As a result, lncRNA sequences become sequences of a higher-order alphabet allowing supervised learning methods to assess individual GO-term annotations. Raw GO annotations obtained in this way are unaware of the GO structure and therefore likely to be inconsistent with it. The message-passing algorithm embodied by factor graph models overcomes this problem. Evaluations of the FGGA-lnc method on lncRNA data, from model and non-model organisms, showed promising results suggesting it as a candidate to satisfy the huge demand for functional annotations arising from high-throughput sequencing technologies.
33

Ramakrishnaiah, Yashpal, Adam P. Morris, Jasbir Dhaliwal, Melcy Philip, Levin Kuhlmann, and Sonika Tyagi. "Linc2function: A Comprehensive Pipeline and Webserver for Long Non-Coding RNA (lncRNA) Identification and Functional Predictions Using Deep Learning Approaches." Epigenomes 7, no. 3 (September 15, 2023): 22. http://dx.doi.org/10.3390/epigenomes7030022.

Abstract:
Long non-coding RNAs (lncRNAs), comprising a significant portion of the human transcriptome, serve as vital regulators of cellular processes and potential disease biomarkers. However, the function of most lncRNAs remains unknown, and furthermore, existing approaches have focused on gene-level investigation. Our work emphasizes the importance of transcript-level annotation to uncover the roles of specific transcript isoforms. We propose that understanding the mechanisms of lncRNA in pathological processes requires solving their structural motifs and interactomes. A complete lncRNA annotation first involves discriminating them from their coding counterparts and then predicting their functional motifs and target bio-molecules. Current in silico methods mainly perform primary-sequence-based discrimination using a reference model, limiting their comprehensiveness and generalizability. We demonstrate that integrating secondary structure and interactome information, in addition to using transcript sequence, enables a comprehensive functional annotation. Annotating lncRNA for newly sequenced species is challenging due to inconsistencies in functional annotations, specialized computational techniques, limited accessibility to source code, and the shortcomings of reference-based methods for cross-species predictions. To address these challenges, we developed a pipeline for identifying and annotating transcript sequences at the isoform level. We demonstrate the effectiveness of the pipeline by comprehensively annotating the lncRNA associated with two specific disease groups. The source code of our pipeline is available under the MIT license for local use by researchers to make new predictions using the pre-trained models or to re-train models on new sequence datasets. Non-technical users can access the pipeline through a web server setup.
34

Yssel, Anna E. J., Shu-Min Kao, Yves Van de Peer, and Lieven Sterck. "ORCAE-AOCC: A Centralized Portal for the Annotation of African Orphan Crop Genomes." Genes 10, no. 12 (November 20, 2019): 950. http://dx.doi.org/10.3390/genes10120950.

Abstract:
ORCAE (Online Resource for Community Annotation of Eukaryotes) is a public genome annotation curation resource. ORCAE-AOCC is a branch that is dedicated to the genomes published as part of the African Orphan Crops Consortium (AOCC). The motivation behind the development of the ORCAE platform was to create a knowledge-based website where the research community can make contributions to improve genome annotations. All changes to any given gene model or gene description are stored, and the entire annotation history can be retrieved. Genomes can either be set to “public” or “restricted” mode; anonymous users can browse public genomes but cannot make any changes. Aside from providing a user-friendly interface to view genome annotations, the platform also includes tools and information (such as gene expression evidence) that enable authorized users to edit and validate genome annotations. The ORCAE-AOCC platform will enable various stakeholders from around the world to coordinate their efforts to annotate and study underutilized crops.
35

Pham, Vinh, Dung Dinh, Eunil Seo, and Tai-Myoung Chung. "COVID-19-Associated Lung Lesion Detection by Annotating Medical Image with Semi Self-Supervised Technique." Electronics 11, no. 18 (September 13, 2022): 2893. http://dx.doi.org/10.3390/electronics11182893.

Full text
Abstract:
Diagnosing COVID-19 infection through the classification of chest images using machine learning techniques faces many controversial problems owing to the intrinsic nature of medical image data and classification architectures. The detection of lesions caused by COVID-19 in the human lung with properties such as location, size, and distribution is more practical and meaningful to medical workers for severity assessment, progress monitoring, and treatment, thus improving patients’ recovery. We proposed a COVID-19-associated lung lesion detector based on an object detection architecture. It correctly learns disease-relevant features by focusing on lung lesion annotation data of medical images. An annotated COVID-19 image dataset is currently nonexistent. We designed our semi-self-supervised method, which can extract knowledge from available annotated pneumonia image data and guide a novice in annotating lesions on COVID-19 images in the absence of a medical specialist. We prepared a sufficient dataset with nearly 8000 lung lesion annotations to train our deep learning model. We comprehensively evaluated our model on a test dataset with nearly 1500 annotations. The results demonstrated that the COVID-19 images annotated by our method significantly enhanced the model’s accuracy by as much as 1.68 times, and our model competes with commercialized solutions. Finally, all experimental data from multiple sources with different annotation data formats are standardized into a unified COCO format and publicly available to the research community to accelerate research on the detection of COVID-19 using deep learning.
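The abstract mentions standardizing annotations from multiple sources into the unified COCO format. The following sketch shows the minimal shape of a COCO-style annotation file for bounding-box lesion labels; the file names, box coordinates, and category id are invented and do not come from the authors' dataset.

    # Illustrative sketch of packing bounding-box lesion annotations into the
    # COCO JSON layout. Values are made up; the authors' conversion scripts
    # and category definitions may differ.
    import json

    images = [{"id": 1, "file_name": "covid_case_001.png", "width": 1024, "height": 1024}]
    categories = [{"id": 1, "name": "lung_lesion"}]

    # One annotation per lesion: bbox is [x, y, width, height] in pixels.
    annotations = [
        {
            "id": 1,
            "image_id": 1,
            "category_id": 1,
            "bbox": [412.0, 305.0, 96.0, 80.0],
            "area": 96.0 * 80.0,
            "iscrowd": 0,
        }
    ]

    coco = {"images": images, "annotations": annotations, "categories": categories}
    with open("lesions_coco.json", "w") as f:
        json.dump(coco, f, indent=2)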
APA, Harvard, Vancouver, ISO, and other styles
36

Braylan, Alexander, Madalyn Marabella, Omar Alonso, and Matthew Lease. "A General Model for Aggregating Annotations Across Simple, Complex, and Multi-object Annotation Tasks (Abstract Reprint)." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 20 (March 24, 2024): 22693. http://dx.doi.org/10.1609/aaai.v38i20.30593.

Full text
Abstract:
Human annotations are vital to supervised learning, yet annotators often disagree on the correct label, especially as annotation tasks increase in complexity. A common strategy to improve label quality is to ask multiple annotators to label the same item and then aggregate their labels. To date, many aggregation models have been proposed for simple categorical or numerical annotation tasks, but far less work has considered more complex annotation tasks, such as those involving open-ended, multivariate, or structured responses. Similarly, while a variety of bespoke models have been proposed for specific tasks, our work is the first we are aware of to introduce aggregation methods that generalize across many, diverse complex tasks, including sequence labeling, translation, syntactic parsing, ranking, bounding boxes, and keypoints. This generality is achieved by applying readily available task-specific distance functions, then devising a task-agnostic method to model these distances between labels, rather than the labels themselves. This article presents a unified treatment of our prior work on complex annotation modeling and extends that work with investigation of three new research questions. First, how do complex annotation task and dataset properties impact aggregation accuracy? Second, how should a task owner navigate the many modeling choices in order to maximize aggregation accuracy? Finally, what tests and diagnoses can verify that aggregation models are specified correctly for the given data? To understand how various factors impact accuracy and to inform model selection, we conduct large-scale simulation studies and broad experiments on real, complex datasets. Regarding testing, we introduce the concept of unit tests for aggregation models and present a suite of such tests to ensure that a given model is not mis-specified and exhibits expected behavior. Beyond investigating these research questions above, we discuss the foundational concept and nature of annotation complexity, present a new aggregation model as a conceptual bridge between traditional models and our own, and contribute a new general semisupervised learning method for complex label aggregation that outperforms prior work.
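To make the distance-based idea concrete, here is a small Python sketch of a simple baseline in the same spirit: given several annotators' responses and a task-specific distance function, pick the response with the smallest average distance to the others. This is only a baseline for illustration, not the article's aggregation model, and the token-level edit distance and example annotations are invented.

    # Minimal sketch of distance-based aggregation over complex labels.
    def edit_distance(a, b):
        """Token-level Levenshtein distance (a stand-in task-specific distance)."""
        m, n = len(a), len(b)
        dp = [[0] * (n + 1) for _ in range(m + 1)]
        for i in range(m + 1):
            dp[i][0] = i
        for j in range(n + 1):
            dp[0][j] = j
        for i in range(1, m + 1):
            for j in range(1, n + 1):
                cost = 0 if a[i - 1] == b[j - 1] else 1
                dp[i][j] = min(dp[i - 1][j] + 1, dp[i][j - 1] + 1, dp[i - 1][j - 1] + cost)
        return dp[m][n]

    def aggregate(responses, distance):
        """Return the response minimizing average distance to all other responses."""
        def avg_dist(r):
            others = [x for x in responses if x is not r]
            return sum(distance(r, o) for o in others) / max(len(others), 1)
        return min(responses, key=avg_dist)

    annotations = [
        "the cat sat on the mat".split(),
        "a cat sat on the mat".split(),
        "the cat sits on a mat".split(),
    ]
    print(" ".join(aggregate(annotations, edit_distance)))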
APA, Harvard, Vancouver, ISO, and other styles
37

Kojima, Ryosuke, Osamu Sugiyama, Kotaro Hoshiba, Kazuhiro Nakadai, Reiji Suzuki, and Charles E. Taylor. "Bird Song Scene Analysis Using a Spatial-Cue-Based Probabilistic Model." Journal of Robotics and Mechatronics 29, no. 1 (February 20, 2017): 236–46. http://dx.doi.org/10.20965/jrm.2017.p0236.

Full text
Abstract:
[Figure: Spatial-cue-based probabilistic model] This paper addresses bird song scene analysis based on semi-automatic annotation. Research in animal behavior, especially in birds, would be aided by automated or semi-automated systems that can localize sounds, measure their timing, and identify their sources. This is difficult to achieve in real environments, in which several birds at different locations may be singing at the same time. Analysis of recordings from the wild has usually required manual annotation. These annotations may be inaccurate or inconsistent, as they may vary within and between observers. Here we suggest a system that uses automated methods from robot audition, including sound source detection, localization, separation, and identification. In robot audition, these technologies are assessed separately, but combining them has often led to poor performance in natural settings. We propose a new Spatial-Cue-Based Probabilistic Model (SCBPM) for their integration, focusing on spatial information. A second problem has been that supervised machine learning methods usually require a pre-trained model, which may need a large training set of annotated labels. We have employed a semi-automatic annotation approach, in which a semi-supervised training method is derived for a new model. This method requires much less pre-annotation. Preliminary experiments with recordings of bird songs from the wild revealed that our system outperformed the identification accuracy of a method based on conventional robot audition. This paper is an extension of a proceeding of IROS2015.
APA, Harvard, Vancouver, ISO, and other styles
38

Lin, Tai-Pei, Chiou-Ying Yang, Ko-Jiunn Liu, Meng-Yuan Huang, and Yen-Lin Chen. "Immunohistochemical Stain-Aided Annotation Accelerates Machine Learning and Deep Learning Model Development in the Pathologic Diagnosis of Nasopharyngeal Carcinoma." Diagnostics 13, no. 24 (December 18, 2023): 3685. http://dx.doi.org/10.3390/diagnostics13243685.

Full text
Abstract:
Nasopharyngeal carcinoma (NPC) is an epithelial cancer originating in the nasopharyngeal epithelium. Annotating pathology slides, however, remains a bottleneck in the development of AI-driven pathology models and applications. In the present study, we aim to demonstrate the feasibility of using immunohistochemistry (IHC) for annotation by non-pathologists and to develop an efficient model for distinguishing NPC without the time-consuming involvement of pathologists. For this study, we gathered NPC slides from 251 different patients, comprising hematoxylin and eosin (H&E) slides, pan-cytokeratin (Pan-CK) IHC slides, and Epstein–Barr virus-encoded small RNA (EBER) slides. The annotation of NPC regions in the H&E slides was carried out by a non-pathologist trainee who had access to corresponding Pan-CK IHC slides, both with and without EBER slides. The training process utilized ResNeXt, a deep neural network featuring a residual and inception architecture. In the validation set, the model achieved an AUC of 0.896 for NPC, with a sensitivity of 0.919 and a specificity of 0.878. This study represents a significant breakthrough: the successful application of deep convolutional neural networks to identify NPC without the need for expert pathologist annotations. Our results underscore the potential of laboratory techniques to substantially reduce the workload of pathologists.
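As a reference point for the reported metrics, the short sketch below computes AUC, sensitivity, and specificity with scikit-learn on fabricated patch-level predictions. It does not reproduce the authors' ResNeXt training or their data.

    # Sketch of the evaluation metrics only (AUC, sensitivity, specificity);
    # labels and probabilities are invented.
    import numpy as np
    from sklearn.metrics import roc_auc_score, confusion_matrix

    y_true = np.array([1, 1, 1, 0, 0, 1, 0, 0, 1, 0])          # 1 = NPC region
    y_prob = np.array([0.91, 0.83, 0.42, 0.12, 0.30, 0.77, 0.25, 0.49, 0.68, 0.08])
    y_pred = (y_prob >= 0.5).astype(int)

    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    print("AUC:        ", roc_auc_score(y_true, y_prob))
    print("Sensitivity:", tp / (tp + fn))
    print("Specificity:", tn / (tn + fp))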
APA, Harvard, Vancouver, ISO, and other styles
39

Grudza, Matthew, Brandon Salinel, Sarah Zeien, Matthew Murphy, Jake Adkins, Corey T. Jensen, Curtis Bay, et al. "Methods for improving colorectal cancer annotation efficiency for artificial intelligence-observer training." World Journal of Radiology 15, no. 12 (December 28, 2023): 359–69. http://dx.doi.org/10.4329/wjr.v15.i12.359.

Full text
Abstract:
BACKGROUND Missing occult cancer lesions accounts for most diagnostic errors in retrospective radiology reviews, as early cancers can be small or subtle and therefore difficult to detect. A second observer is the most effective technique for reducing these events and can be economically implemented with the advent of artificial intelligence (AI). AIM Appropriate AI model training requires a large annotated dataset. Our goal in this research is to compare two methods for decreasing the annotation time needed to establish ground truth: skip-slice annotation and AI-initiated annotation. METHODS We developed a 2D U-Net as an AI second observer for detecting colorectal cancer (CRC) and an ensemble of five differently initialized 2D U-Nets for the ensemble technique. Each model was trained with 51 cases of annotated CRC computed tomography of the abdomen and pelvis, tested with 7 cases, and validated with 20 cases from The Cancer Imaging Archive. The sensitivity, false positives per case, and estimated Dice coefficient were obtained for each method of training. We compared the two methods of annotation and the time reduction associated with each technique. The time differences were tested using Friedman's two-way analysis of variance. RESULTS Sparse annotation significantly reduces annotation time, particularly when skipping 2 slices at a time (P < 0.001). Reducing the annotation by up to two-thirds does not reduce AI model sensitivity or increase false positives per case. Although initializing human annotation with AI reduces the annotation time, the reduction is minimal, even when using an ensemble AI to decrease false positives. CONCLUSION Our data support the sparse annotation technique as an efficient technique for reducing the time needed to establish the ground truth.
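The sketch below illustrates two pieces of the workflow described above in simplified form: selecting every third slice for manual annotation (skipping 2 slices at a time) and comparing annotation times across strategies with Friedman's test. The times are fabricated examples, not the study's data.

    # Rough sketch: (1) skip-slice selection, (2) Friedman's test on
    # hypothetical per-case annotation times.
    import numpy as np
    from scipy.stats import friedmanchisquare

    n_slices = 120
    skip = 2                                   # skip 2 slices at a time
    annotated = list(range(0, n_slices, skip + 1))
    print(f"Annotate {len(annotated)} of {n_slices} slices:", annotated[:5], "...")

    # Per-case annotation times (minutes) for three strategies (made-up data).
    full_annotation = np.array([42, 38, 55, 47, 60, 51, 44])
    skip_slice      = np.array([16, 14, 20, 18, 23, 19, 17])
    ai_initiated    = np.array([35, 33, 48, 41, 52, 45, 39])
    stat, p = friedmanchisquare(full_annotation, skip_slice, ai_initiated)
    print(f"Friedman chi-square = {stat:.2f}, p = {p:.4f}")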
APA, Harvard, Vancouver, ISO, and other styles
40

Han, Zhoupeng, Hua Zhang, Weirong He, Li Ba, and Qilong Yuan. "Automatic Annotation of Functional Semantics for 3D Product Model Based on Latent Functional Semantics." Scientific Programming 2023 (February 4, 2023): 1–10. http://dx.doi.org/10.1155/2023/9885859.

Full text
Abstract:
To effectively support function-driven 3D model retrieval in the conceptual design phase of mechanical products and to improve the efficiency of functional semantics annotation for 3D models, an approach for automatic functional semantics annotation of mechanical 3D product models based on latent functional semantics is presented. First, the design knowledge and function knowledge of the mechanical product model are analyzed, and ontology-based functional semantics for the assembly product are constructed. Then, the concept of a functional region is defined, and the 3D product model is decomposed into functional regions at different levels of granularity. The similarity of functional regions is evaluated considering multisource attribute information and geometric shape. Subsequently, a similarity-based latent functional semantics annotation model for functional regions is established, which is employed to automatically annotate latent functional semantics in the 3D product model structure. Finally, mechanical 3D models in a model library are used to verify the effectiveness and feasibility of the proposed approach.
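A conceptual Python sketch of the matching step described above follows: score the similarity between an unlabelled functional region and already-annotated regions, then transfer the best match's functional semantics. The feature vectors, weights, and labels are invented for illustration and do not represent the paper's actual similarity measure.

    # Sketch of similarity-based transfer of functional semantics.
    import numpy as np

    def region_similarity(a, b, w_attr=0.5, w_shape=0.5):
        """Weighted combination of attribute similarity and shape similarity,
        each computed here as cosine similarity of a feature vector."""
        def cos(x, y):
            return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))
        return w_attr * cos(a["attributes"], b["attributes"]) + \
               w_shape * cos(a["shape"], b["shape"])

    annotated_regions = [
        {"label": "power transmission", "attributes": np.array([0.9, 0.1, 0.3]),
         "shape": np.array([0.8, 0.2, 0.5, 0.1])},
        {"label": "support / positioning", "attributes": np.array([0.2, 0.8, 0.4]),
         "shape": np.array([0.1, 0.9, 0.3, 0.6])},
    ]
    query = {"attributes": np.array([0.85, 0.15, 0.25]),
             "shape": np.array([0.75, 0.25, 0.55, 0.15])}

    best = max(annotated_regions, key=lambda r: region_similarity(query, r))
    print("Assigned functional semantics:", best["label"])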
APA, Harvard, Vancouver, ISO, and other styles
41

Wu, Nannan, Zhaobin Sun, Zengqiang Yan, and Li Yu. "FedA3I: Annotation Quality-Aware Aggregation for Federated Medical Image Segmentation against Heterogeneous Annotation Noise." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 14 (March 24, 2024): 15943–51. http://dx.doi.org/10.1609/aaai.v38i14.29525.

Full text
Abstract:
Federated learning (FL) has emerged as a promising paradigm for training segmentation models on decentralized medical data, owing to its privacy-preserving property. However, existing research overlooks the prevalent annotation noise encountered in real-world medical datasets, which limits the performance ceilings of FL. In this paper, we, for the first time, identify and tackle this problem. For problem formulation, we propose a contour evolution for modeling non-independent and identically distributed (Non-IID) noise across pixels within each client and then extend it to the case of multi-source data to form a heterogeneous noise model (i.e., Non-IID annotation noise across clients). For robust learning from annotations with such two-level Non-IID noise, we emphasize the importance of data quality in model aggregation, allowing high-quality clients to have a greater impact on FL. To achieve this, we propose Federated learning with Annotation quAlity-aware AggregatIon, named FedA3I, by introducing a quality factor based on client-wise noise estimation. Specifically, noise estimation at each client is accomplished through the Gaussian mixture model and then incorporated into model aggregation in a layer-wise manner to up-weight high-quality clients. Extensive experiments on two real-world medical image segmentation datasets demonstrate the superior performance of FedA3I against the state-of-the-art approaches in dealing with cross-client annotation noise. The code is available at https://github.com/wnn2000/FedAAAI.
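The toy sketch below captures the spirit of quality-aware aggregation: estimate each client's annotation-noise level with a Gaussian mixture over per-sample disagreement scores, convert it into a quality weight, and aggregate client parameters with those weights. All data are synthetic, and the weighting is deliberately simplified relative to FedA3I's layer-wise scheme.

    # Toy quality-weighted federated averaging (simplified, not FedA3I itself).
    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(0)

    def estimate_noise(disagreement_scores):
        """Fit a 2-component GMM (clean vs. noisy samples) and return the
        weight of the noisier component as the client's estimated noise rate."""
        gmm = GaussianMixture(n_components=2, random_state=0)
        gmm.fit(disagreement_scores.reshape(-1, 1))
        noisy = int(np.argmax(gmm.means_.ravel()))
        return float(gmm.weights_[noisy])

    # Synthetic per-sample disagreement scores for three clients with roughly
    # 10%, 40%, and 80% noisy annotations.
    clients = [
        np.concatenate([rng.normal(0.1, 0.05, 180), rng.normal(0.6, 0.1, 20)]),
        np.concatenate([rng.normal(0.1, 0.05, 120), rng.normal(0.6, 0.1, 80)]),
        np.concatenate([rng.normal(0.1, 0.05, 40), rng.normal(0.6, 0.1, 160)]),
    ]
    noise = np.array([estimate_noise(np.abs(s)) for s in clients])
    quality = (1.0 - noise) / (1.0 - noise).sum()        # aggregation weights
    print("Estimated noise:", noise.round(2), "-> weights:", quality.round(2))

    # Quality-weighted aggregation of (toy) client model parameters.
    client_params = [rng.normal(size=4) for _ in clients]
    global_params = sum(w * p for w, p in zip(quality, client_params))
    print("Aggregated parameters:", global_params.round(3))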
APA, Harvard, Vancouver, ISO, and other styles
42

Ma, Qin Yi, Mao Jun Zhou, Ming Wei Wang, and Hui Hui Wang. "Semantic Annotation of CAD Model Based on Ontology." Advanced Materials Research 760-762 (September 2013): 1767–72. http://dx.doi.org/10.4028/www.scientific.net/amr.760-762.1767.

Full text
Abstract:
This paper proposes an ontology-based approach to annotating CAD models, with the aim of making the design intent understandable by computers and applicable in subsequent product development processes such as FEA. Semantic markup can embed engineering semantic information, such as product function and design principles, into the CAD geometry data through annotation, allowing analysts to reuse design ideas quickly and conveniently and thereby increase efficiency. The paper presents a design domain ontology and an FEA domain ontology, and applies feature technologies and the Semantic Web to complete the annotation. Finally, a case study is carried out.
APA, Harvard, Vancouver, ISO, and other styles
43

Tanaka, Fumiki, Hiroyuki Abe, Shinji Igari, and Masahiko Onosato. "Integrated Information Model for Design, Machining, and Measuring Using Annotated Features." International Journal of Automation Technology 8, no. 3 (May 5, 2014): 388–95. http://dx.doi.org/10.20965/ijat.2014.p0388.

Full text
Abstract:
Annotations on Geometric Dimensioning and Tolerancing (GD&T), surface roughness, and similar specifications are needed for machining and measuring. However, these annotations are not used in digital form in the product development process, nor is there any clear, explicit relationship between annotations, machining information, and measurement results. In this research, an integrated information model for design, machining, and measuring based on annotated features is proposed. A model for surface texture is also proposed, because surface texture parameters are closely related to machining process parameters. A modeling system for the proposed integrated model is also implemented.
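A bare-bones data-structure sketch of the integration idea follows: tying a design feature's GD&T and surface-texture annotations to the machining parameters and measurement results that refer to it. Field names and values are invented; the paper's actual model is considerably richer.

    # Minimal sketch of an "annotated feature" linking design, machining,
    # and measuring information (illustrative only).
    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Annotation:
        kind: str            # e.g. "flatness", "surface_roughness_Ra"
        value: float
        unit: str

    @dataclass
    class MachiningOperation:
        process: str         # e.g. "finish milling"
        feed_mm_per_rev: float
        spindle_rpm: int

    @dataclass
    class MeasurementResult:
        annotation_kind: str
        measured_value: float
        unit: str

    @dataclass
    class AnnotatedFeature:
        name: str
        annotations: List[Annotation] = field(default_factory=list)
        machining: List[MachiningOperation] = field(default_factory=list)
        measurements: List[MeasurementResult] = field(default_factory=list)

    face = AnnotatedFeature(
        name="sealing_face",
        annotations=[Annotation("flatness", 0.02, "mm"),
                     Annotation("surface_roughness_Ra", 0.8, "um")],
        machining=[MachiningOperation("finish milling", 0.05, 8000)],
        measurements=[MeasurementResult("flatness", 0.015, "mm")],
    )
    # Each measurement can be traced back to the annotation (design intent)
    # it verifies.
    print(face.measurements[0].annotation_kind, "verified against",
          face.annotations[0].value, face.annotations[0].unit)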
APA, Harvard, Vancouver, ISO, and other styles
44

Wei, Qiang, Yukun Chen, Mandana Salimi, Joshua C. Denny, Qiaozhu Mei, Thomas A. Lasko, Qingxia Chen, et al. "Cost-aware active learning for named entity recognition in clinical text." Journal of the American Medical Informatics Association 26, no. 11 (July 11, 2019): 1314–22. http://dx.doi.org/10.1093/jamia/ocz102.

Full text
Abstract:
Objective Active Learning (AL) attempts to reduce annotation cost (ie, time) by selecting the most informative examples for annotation. Most approaches tacitly (and unrealistically) assume that the cost for annotating each sample is identical. This study introduces a cost-aware AL method, which simultaneously models both the annotation cost and the informativeness of the samples and evaluates both via simulation and user studies. Materials and Methods We designed a novel, cost-aware AL algorithm (Cost-CAUSE) for annotating clinical named entities; we first utilized lexical and syntactic features to estimate annotation cost, then we incorporated this cost measure into an existing AL algorithm. Using the 2010 i2b2/VA data set, we then conducted a simulation study comparing Cost-CAUSE with noncost-aware AL methods, and a user study comparing Cost-CAUSE with passive learning. Results Our cost model fit empirical annotation data well, and Cost-CAUSE increased the simulation area under the learning curve (ALC) scores by up to 5.6% and 4.9%, compared with random sampling and alternate AL methods. Moreover, in a user annotation task, Cost-CAUSE outperformed passive learning on the ALC score and reduced annotation time by 20.5%–30.2%. Discussion Although AL has proven effective in simulations, our user study shows that a real-world environment is far more complex. Other factors have a noticeable effect on the AL method, such as the annotation accuracy of users, the tiredness of users, and even the physical and mental condition of users. Conclusion Cost-CAUSE saves significant annotation cost compared to random sampling.
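A simplified sketch of cost-aware sample selection follows: rank unlabelled sentences by informativeness divided by an estimated annotation cost rather than by informativeness alone. The entropy criterion, the length-based cost proxy, and the example sentences are stand-ins; this is not the Cost-CAUSE algorithm itself.

    # Cost-aware active learning selection (illustrative baseline).
    import math

    def entropy(probs):
        return -sum(p * math.log(p) for p in probs if p > 0)

    def estimated_cost(sentence):
        """Crude cost proxy: longer sentences take longer to annotate."""
        return 1.0 + 0.1 * len(sentence.split())

    pool = [
        ("Patient denies chest pain or shortness of breath.", [0.55, 0.45]),
        ("Continue lisinopril 10 mg daily.",                   [0.90, 0.10]),
        ("Possible early infiltrate versus atelectasis noted on imaging, follow "
         "up with repeat chest radiograph in 48 hours.",       [0.50, 0.50]),
    ]

    def score(item):
        sentence, class_probs = item
        return entropy(class_probs) / estimated_cost(sentence)

    batch = sorted(pool, key=score, reverse=True)[:2]   # next sentences to annotate
    for sentence, _ in batch:
        print("Select:", sentence)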
APA, Harvard, Vancouver, ISO, and other styles
45

Messer, Dolores, Michael Atchapero, Mark B. Jensen, Michelle S. Svendsen, Anders Galatius, Morten T. Olsen, Jeppe R. Frisvad, et al. "Using virtual reality for anatomical landmark annotation in geometric morphometrics." PeerJ 10 (February 7, 2022): e12869. http://dx.doi.org/10.7717/peerj.12869.

Full text
Abstract:
To study the shape of objects using geometric morphometrics, landmarks are oftentimes collected digitally from a 3D scanned model. The expert may annotate landmarks using software that visualizes the 3D model on a flat screen, and interaction is achieved with a mouse and a keyboard. However, landmark annotation of a 3D model on a 2D display is a tedious process and potentially introduces error due to the perception and interaction limitations of the flat interface. In addition, digital landmark placement can be more time-consuming than direct annotation on the physical object using a tactile digitizer arm. Since virtual reality (VR) is designed to more closely resemble the real world, we present a VR prototype for annotating landmarks on 3D models. We study the impact of VR on annotation performance by comparing our VR prototype to Stratovan Checkpoint, a commonly used commercial desktop software. We use an experimental setup, where four operators placed six landmarks on six grey seal (Halichoerus grypus) skulls in six trials for both systems. This enables us to investigate multiple sources of measurement error. We analyse both for the configuration and for single landmarks. Our analysis shows that annotation in VR is a promising alternative to desktop annotation. We find that annotation precision is comparable between the two systems, with VR being significantly more precise for one of the landmarks. We do not find evidence that annotation in VR is faster than on the desktop, but it is accurate.
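For a sense of how annotation precision can be quantified across repeated trials, the sketch below computes the mean distance of each trial's landmark to the per-landmark centroid for two systems. The coordinates are fabricated; the study's full variance-component analysis is much more detailed.

    # Landmark precision as mean distance to the trial centroid (sketch only).
    import numpy as np

    def precision(trials):
        """trials: (n_trials, 3) coordinates of one landmark on one specimen."""
        centroid = trials.mean(axis=0)
        return float(np.linalg.norm(trials - centroid, axis=1).mean())

    rng = np.random.default_rng(1)
    true_point = np.array([12.0, 4.5, 30.2])                       # mm
    vr_trials      = true_point + rng.normal(0, 0.15, size=(6, 3)) # 6 repeats in VR
    desktop_trials = true_point + rng.normal(0, 0.20, size=(6, 3)) # 6 repeats on desktop

    print(f"VR precision:      {precision(vr_trials):.3f} mm")
    print(f"Desktop precision: {precision(desktop_trials):.3f} mm")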
APA, Harvard, Vancouver, ISO, and other styles
46

Chen, Chih-Ming, and Ming-Yueh Tsay. "Applications of collaborative annotation system in digital curation, crowdsourcing, and digital humanities." Electronic Library 35, no. 6 (November 6, 2017): 1122–40. http://dx.doi.org/10.1108/el-08-2016-0172.

Full text
Abstract:
Purpose Collaboratively annotating digital texts allows users to add valued information, share ideas and create knowledge. Most importantly, annotated content can help users obtain a deeper and broader understanding of a text compared to digital content without annotations. This work proposes a novel collaborative annotation system (CAS) with four types of multimedia annotations (text, picture, voice and video) that can be embedded into any HTML Web page, enabling users to collaboratively add and manage annotations on these pages and providing a shared mechanism for discussing shared annotations among multiple users. By applying the CAS in a mashup on static HTML Web pages, this study aims to discuss the applications of CAS in digital curation, crowdsourcing and digital humanities to encourage the existing strong relations among them. Design/methodology/approach This work adopted asynchronous JavaScript (Ajax) and a model-view-controller framework to implement a CAS with reading annotation tools for knowledge creation, archiving and sharing services, and applied the implemented CAS to support digital curation, crowdsourcing and digital humanities. A questionnaire survey was used to investigate the opinions and satisfaction of visitors who attended a digital curation exhibition with CAS support, in the item dimensions of interactivity with the displayed products, attraction and content absorption. Also, to collect qualitative data that might not be revealed by the questionnaire survey, semi-structured interviews were performed at the end of the digital curation exhibition activity. The effects of CAS-supported crowdsourcing and digital humanities on collecting and organizing ideas and opinions about historical events and on promoting humanities research outcomes are left as future work, as they require long-term investigation. Findings Based on the questionnaire survey, this work found that the digital curation with CAS support received the highest rating score in the item dimension of attraction. The result shows that applying CAS to support digital curation is practicable, novel and interesting to visitors. Additionally, this work successfully applied the developed CAS to crowdsourcing and digital humanities, potentially opening new ground for these two research fields. Originality/value Based on the CAS, this work developed a novel digital curation approach that achieves a high degree of visitor satisfaction with its attraction effect, an innovative crowdsourcing platform combined with a digital archive system to efficiently gather collective intelligence for the difficult problem of identifying digital archive contents, and a promising digital humanities research mode that can assist humanities scholars in annotating texts on an ancient map with their own interpretations and viewpoints, and in discussing them with other humanities scholars to stimulate further research issues.
APA, Harvard, Vancouver, ISO, and other styles
47

Hussein, Shereen A., Howida Youssry Abd El Naby, and Aliaa A. A. Youssif. "Review: Automatic Semantic Image Annotation." INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY 15, no. 12 (September 28, 2016): 7290–97. http://dx.doi.org/10.24297/ijct.v15i12.4357.

Full text
Abstract:
There are many approaches for automatic annotation of digital images. Digital photography is now a common technology for capturing and archiving images because of the reasonable price of digital cameras and storage devices. As the number of digital images increases, annotating a specific image becomes a critical issue. Automated image annotation means creating a model capable of assigning terms to an image in order to describe its content. Many image annotation techniques seek to find correlations between words and image features such as color, shape, and texture in order to automatically assign correct annotation words to images, providing an alternative to the time-consuming work of manual image annotation. This paper reviews different models (MT, CRM, CSD-Prop, SVD-COS and CSD-SVD) for automating the process of image annotation as an intermediate step in the image retrieval process, using the Corel 5k image dataset.
APA, Harvard, Vancouver, ISO, and other styles
48

Cai, Tingting, Hongping Yan, Kun Ding, Yan Zhang, and Yueyue Zhou. "WSPolyp-SAM: Weakly Supervised and Self-Guided Fine-Tuning of SAM for Colonoscopy Polyp Segmentation." Applied Sciences 14, no. 12 (June 8, 2024): 5007. http://dx.doi.org/10.3390/app14125007.

Full text
Abstract:
Ensuring precise segmentation of colorectal polyps holds critical importance in the early diagnosis and treatment of colorectal cancer. Nevertheless, existing deep learning-based segmentation methods are fully supervised, requiring extensive, precise, manual pixel-level annotation data, which leads to high annotation costs. Additionally, it remains challenging to train large-scale segmentation models when confronted with limited colonoscopy data. To address these issues, we introduce the general segmentation foundation model—the Segment Anything Model (SAM)—into the field of medical image segmentation. Fine-tuning the foundation model is an effective approach to tackle sample scarcity. However, current SAM fine-tuning techniques still rely on precise annotations. To overcome this limitation, we propose WSPolyp-SAM, a novel weakly supervised approach for colonoscopy polyp segmentation. WSPolyp-SAM utilizes weak annotations to guide SAM in generating segmentation masks, which are then treated as pseudo-labels to guide the fine-tuning of SAM, thereby reducing the dependence on precise annotation data. To improve the reliability and accuracy of pseudo-labels, we have designed a series of enhancement strategies to improve the quality of pseudo-labels and mitigate the negative impact of low-quality pseudo-labels. Experimental results on five medical image datasets demonstrate that WSPolyp-SAM outperforms current fully supervised mainstream polyp segmentation networks on the Kvasir-SEG, ColonDB, CVC-300, and ETIS datasets. Furthermore, by using different amounts of training data in weakly supervised and fully supervised experiments, it is found that weakly supervised fine-tuning can save 70% to 73% of annotation time costs compared to fully supervised fine-tuning. This study provides a new perspective on the combination of weakly supervised learning and SAM models, significantly reducing annotation time and offering insights for further development in the field of colonoscopy polyp segmentation.
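As a hedged sketch of the pseudo-label generation step described above, the snippet below prompts SAM with a weak box annotation, keeps the mask only if its predicted quality score is high, and stores it for later fine-tuning. It uses the publicly released segment_anything package as commonly documented; the checkpoint name, file paths, box coordinates, and score threshold are placeholders, the paper's enhancement strategies are omitted, and this is not the authors' WSPolyp-SAM code.

    # Pseudo-label generation from a weak box prompt (illustrative sketch).
    import numpy as np
    from PIL import Image
    from segment_anything import sam_model_registry, SamPredictor

    sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
    predictor = SamPredictor(sam)

    image = np.array(Image.open("colonoscopy_frame.png").convert("RGB"))
    weak_box = np.array([140, 95, 310, 260])          # [x0, y0, x1, y1] rough polyp box

    predictor.set_image(image)
    masks, scores, _ = predictor.predict(box=weak_box, multimask_output=False)

    SCORE_THRESHOLD = 0.85                            # crude pseudo-label filter
    if scores[0] >= SCORE_THRESHOLD:
        pseudo_label = masks[0].astype(np.uint8)      # H x W binary mask
        np.save("pseudo_label.npy", pseudo_label)     # used later to fine-tune SAM
    else:
        print("Low-confidence mask discarded:", scores[0])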
APA, Harvard, Vancouver, ISO, and other styles
49

Davani, Aida Mostafazadeh, Mark Díaz, and Vinodkumar Prabhakaran. "Dealing with Disagreements: Looking Beyond the Majority Vote in Subjective Annotations." Transactions of the Association for Computational Linguistics 10 (2022): 92–110. http://dx.doi.org/10.1162/tacl_a_00449.

Full text
Abstract:
Majority voting and averaging are common approaches used to resolve annotator disagreements and derive single ground truth labels from multiple annotations. However, annotators may systematically disagree with one another, often reflecting their individual biases and values, especially in the case of subjective tasks such as detecting affect, aggression, and hate speech. Annotator disagreements may capture important nuances in such tasks that are often ignored while aggregating annotations to a single ground truth. In order to address this, we investigate the efficacy of multi-annotator models. In particular, our multi-task based approach treats predicting each annotator's judgements as separate subtasks, while sharing a common learned representation of the task. We show that this approach yields the same or better performance than aggregating labels in the data prior to training across seven different binary classification tasks. Our approach also provides a way to estimate uncertainty in predictions, which we demonstrate correlates better with annotation disagreements than traditional methods. Being able to model uncertainty is especially useful in deployment scenarios where knowing when not to make a prediction is important.
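The following minimal PyTorch sketch shows the shape of the multi-annotator idea: a shared encoder with one classification head per annotator, so each annotator's judgements form a separate subtask, and disagreement across heads serves as a crude uncertainty signal. The toy bag-of-embeddings encoder and the uncertainty measure are simplifications; the paper uses a pretrained transformer, and training (per-head cross-entropy against each annotator's labels) is omitted here.

    # Shared encoder + per-annotator heads (illustrative sketch).
    import torch
    import torch.nn as nn

    class MultiAnnotatorModel(nn.Module):
        def __init__(self, vocab_size=1000, emb_dim=32, n_annotators=5, n_classes=2):
            super().__init__()
            self.embed = nn.EmbeddingBag(vocab_size, emb_dim)   # shared representation
            self.heads = nn.ModuleList(
                [nn.Linear(emb_dim, n_classes) for _ in range(n_annotators)]
            )

        def forward(self, token_ids, offsets):
            shared = self.embed(token_ids, offsets)
            return torch.stack([head(shared) for head in self.heads], dim=1)  # (B, A, C)

    model = MultiAnnotatorModel()
    token_ids = torch.randint(0, 1000, (12,))      # tokens of 2 toy "documents"
    offsets = torch.tensor([0, 6])                 # document boundaries

    logits = model(token_ids, offsets)             # per-annotator predictions
    probs = logits.softmax(dim=-1)
    majority = probs.mean(dim=1).argmax(dim=-1)    # aggregate prediction
    disagreement = probs[..., 1].var(dim=1)        # spread across annotator heads
    print(majority, disagreement)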
APA, Harvard, Vancouver, ISO, and other styles
50

Yang, Haiwang, Maria Jaime, Maxi Polihronakis, Kelvin Kanegawa, Therese Markow, Kenneth Kaneshiro, and Brian Oliver. "Re-annotation of eight Drosophila genomes." Life Science Alliance 1, no. 6 (December 2018): e201800156. http://dx.doi.org/10.26508/lsa.201800156.

Full text
Abstract:
The sequenced genomes of the Drosophila phylogeny are a central resource for comparative work supporting the understanding of the Drosophila melanogaster non-mammalian model system. These have also facilitated evolutionary studies on the selected and random differences that distinguish the thousands of extant species of Drosophila. However, full utility has been hampered by uneven genome annotation. We have generated a large expression profile dataset for nine species of Drosophila and trained a transcriptome assembly approach on D. melanogaster that best matched the extensively curated annotation. We then applied this to the other species to add more than 10000 transcript models per species. We also developed new orthologs to facilitate cross-species comparisons. We validated the new annotation of the distantly related Drosophila grimshawi with an extensive collection of newly sequenced cDNAs. This re-annotation will facilitate understanding both the core commonalities and the species differences in this important group of model organisms, and suggests a strategy for annotating the many forthcoming genomes covering the tree of life.
APA, Harvard, Vancouver, ISO, and other styles