Journal articles on the topic 'Learning with noisy labels'

Consult the top 50 journal articles for your research on the topic 'Learning with noisy labels.'

You can also download the full text of each academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse journal articles on a wide variety of disciplines and organise your bibliography correctly.

1

Xie, Ming-Kun, and Sheng-Jun Huang. "Partial Multi-Label Learning with Noisy Label Identification." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 6454–61. http://dx.doi.org/10.1609/aaai.v34i04.6117.

Abstract:
Partial multi-label learning (PML) deals with problems where each instance is assigned with a candidate label set, which contains multiple relevant labels and some noisy labels. Recent studies usually solve PML problems with the disambiguation strategy, which recovers ground-truth labels from the candidate label set by simply assuming that the noisy labels are generated randomly. In real applications, however, noisy labels are usually caused by some ambiguous contents of the example. Based on this observation, we propose a partial multi-label learning approach to simultaneously recover the ground-truth information and identify the noisy labels. The two objectives are formalized in a unified framework with trace norm and ℓ1 norm regularizers. Under the supervision of the observed noise-corrupted label matrix, the multi-label classifier and noisy label identifier are jointly optimized by incorporating the label correlation exploitation and feature-induced noise model. Extensive experiments on synthetic as well as real-world data sets validate the effectiveness of the proposed approach.
2

Chen, Mingcai, Hao Cheng, Yuntao Du, Ming Xu, Wenyu Jiang, and Chongjun Wang. "Two Wrongs Don’t Make a Right: Combating Confirmation Bias in Learning with Label Noise." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 12 (June 26, 2023): 14765–73. http://dx.doi.org/10.1609/aaai.v37i12.26725.

Abstract:
Noisy labels damage the performance of deep networks. For robust learning, a prominent two-stage pipeline alternates between eliminating possible incorrect labels and semi-supervised training. However, discarding part of noisy labels could result in a loss of information, especially when the corruption has a dependency on data, e.g., class-dependent or instance-dependent. Moreover, from the training dynamics of a representative two-stage method DivideMix, we identify the domination of confirmation bias: pseudo-labels fail to correct a considerable amount of noisy labels, and consequently, the errors accumulate. To sufficiently exploit information from noisy labels and mitigate wrong corrections, we propose Robust Label Refurbishment (Robust LR)—a new hybrid method that integrates pseudo-labeling and confidence estimation techniques to refurbish noisy labels. We show that our method successfully alleviates the damage of both label noise and confirmation bias. As a result, it achieves state-of-the-art performance across datasets and noise types, namely CIFAR under different levels of synthetic noise and mini-WebVision and ANIMAL-10N with real-world noise.
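For context, the label-refurbishment family that this paper improves on replaces each noisy training target with a convex combination of the given label and the model's current prediction. The sketch below shows only that generic bootstrapping-style step, not the paper's Robust LR method; the mixing weight alpha is an assumed hyperparameter.

```python
import numpy as np

def refurbish_labels(noisy_onehot, model_probs, alpha=0.7):
    """Blend observed (possibly noisy) one-hot labels with model predictions.

    noisy_onehot: (n, k) one-hot matrix of observed labels
    model_probs:  (n, k) softmax outputs of the current model
    alpha:        trust placed in the observed labels (assumed hyperparameter)
    """
    noisy_onehot = np.asarray(noisy_onehot, dtype=float)
    model_probs = np.asarray(model_probs, dtype=float)
    return alpha * noisy_onehot + (1.0 - alpha) * model_probs
```

Refurbishment keeps every sample in the loss, in contrast to selection-based methods that discard suspect labels outright.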
3

Li, Hui, Zhaodong Niu, Quan Sun, and Yabo Li. "Co-Correcting: Combat Noisy Labels in Space Debris Detection." Remote Sensing 14, no. 20 (October 21, 2022): 5261. http://dx.doi.org/10.3390/rs14205261.

Abstract:
Space debris detection is vital to space missions and space situation awareness. Convolutional neural networks are introduced to detect space debris due to their excellent performance. However, noisy labels, caused by false alarms, exist in space debris detection, and cause ambiguous targets for the training of networks, leading to networks overfitting the noisy labels and losing the ability to detect space debris. To remedy this challenge, we introduce label-noise learning to space debris detection and propose a novel label-noise learning paradigm, termed Co-correcting, to overcome the effects of noisy labels. Co-correcting comprises two identical networks, and the predictions of these networks serve as auxiliary supervised information to mutually correct the noisy labels of their peer networks. In this manner, the effect of noisy labels can be mitigated by the mutual rectification of the two networks. Empirical experiments show that Co-correcting outperforms other state-of-the-art methods of label-noise learning, such as Co-teaching and JoCoR, in space debris detection. Even with a high label noise rate, the network trained via Co-correcting can detect space debris with high detection probability.
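Co-teaching, one of the baselines named above, keeps two networks that exchange their small-loss samples every minibatch. A minimal PyTorch-style sketch of that exchange step follows; forget_rate (an assumed estimate of the noise rate) controls how many samples are dropped, and the sketch illustrates the baseline rather than Co-correcting itself.

```python
import torch
import torch.nn.functional as F

def coteach_step(logits_a, logits_b, labels, forget_rate):
    """Small-loss exchange used in Co-teaching.

    Each network is updated only on the (1 - forget_rate) fraction of the
    minibatch that its peer network finds easiest (smallest loss).
    Returns the two losses to back-propagate into networks A and B.
    """
    loss_a = F.cross_entropy(logits_a, labels, reduction="none")
    loss_b = F.cross_entropy(logits_b, labels, reduction="none")
    n_keep = max(1, int((1.0 - forget_rate) * labels.size(0)))
    keep_a = torch.argsort(loss_a)[:n_keep]  # samples network A finds easy
    keep_b = torch.argsort(loss_b)[:n_keep]  # samples network B finds easy
    # cross update: A learns from B's picks, B learns from A's picks
    return loss_a[keep_b].mean(), loss_b[keep_a].mean()
```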
4

Tang, Xinyu, Milad Nasr, Saeed Mahloujifar, Virat Shejwalkar, Liwei Song, Amir Houmansadr, and Prateek Mittal. "Machine Learning with Differentially Private Labels: Mechanisms and Frameworks." Proceedings on Privacy Enhancing Technologies 2022, no. 4 (October 2022): 332–50. http://dx.doi.org/10.56553/popets-2022-0112.

Abstract:
Label differential privacy is a relaxation of differential privacy for machine learning scenarios where the labels are the only sensitive information that needs to be protected in the training data. For example, imagine a survey from a participant in a university class about their vaccination status. Some attributes of the students are publicly available but their vaccination status is sensitive information and must remain private. Now if we want to train a model that predicts whether a student has received vaccination using only their public information, we can use label-DP. Recent works on label-DP use different ways of adding noise to the labels in order to obtain label-DP models. In this work, we present novel techniques for training models with label-DP guarantees by leveraging unsupervised learning and semi-supervised learning, enabling us to inject less noise while obtaining the same privacy, therefore achieving a better utility-privacy trade-off. We first introduce a framework that starts with an unsupervised classifier f0 and dataset D with noisy label set Y, reduces the noise in Y using f0, and then trains a new model f using the less noisy dataset. Our noise reduction strategy uses the model f0 to remove the noisy labels that are incorrect with high probability. Then we use semi-supervised learning to train a model using the remaining labels. We instantiate this framework with multiple ways of obtaining the noisy labels and also the base classifier. As an alternative way to reduce the noise, we explore the effect of using unsupervised learning: we only add noise to a majority voting step for associating the learned clusters with a cluster label (as opposed to adding noise to individual labels); the reduced sensitivity enables us to add less noise. Our experiments show that these techniques can significantly outperform the prior works on label-DP.
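For reference, the simplest way to obtain a noisy label set with label-DP is randomized response applied to each label independently; the framework described above then reduces this noise before training. A sketch with num_classes and epsilon as assumed inputs:

```python
import numpy as np

def randomized_response(labels, num_classes, epsilon, rng=None):
    """Randomized response on labels: an epsilon label-DP mechanism.

    Each label is kept with probability e^eps / (e^eps + k - 1) and otherwise
    replaced by one of the other k - 1 classes chosen uniformly at random.
    """
    rng = rng or np.random.default_rng()
    labels = np.asarray(labels)
    k = num_classes
    p_keep = np.exp(epsilon) / (np.exp(epsilon) + k - 1)
    flip = rng.random(labels.shape) >= p_keep
    offset = rng.integers(1, k, size=labels.shape)      # 1 .. k-1, never zero
    return np.where(flip, (labels + offset) % k, labels)
```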
5

Wu, Yichen, Jun Shu, Qi Xie, Qian Zhao, and Deyu Meng. "Learning to Purify Noisy Labels via Meta Soft Label Corrector." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 12 (May 18, 2021): 10388–96. http://dx.doi.org/10.1609/aaai.v35i12.17244.

Abstract:
Recent deep neural networks (DNNs) can easily overfit to biased training data with noisy labels. Label correction strategy is commonly used to alleviate this issue by identifying suspected noisy labels and then correcting them. Current approaches to correcting corrupted labels usually need manually pre-defined label correction rules, which makes it hard to apply in practice due to the large variations of such manual strategies with respect to different problems. To address this issue, we propose a meta-learning model, aiming at attaining an automatic scheme which can estimate soft labels through meta-gradient descent step under the guidance of a small amount of noise-free meta data. By viewing the label correction procedure as a meta-process and using a meta-learner to automatically correct labels, our method can adaptively obtain rectified soft labels gradually in iteration according to current training problems. Besides, our method is model-agnostic and can be combined with any other existing classification models with ease to make it available to noisy label cases. Comprehensive experiments substantiate the superiority of our method in both synthetic and real-world problems with noisy labels compared with current state-of-the-art label correction strategies.
6

Zheng, Guoqing, Ahmed Hassan Awadallah, and Susan Dumais. "Meta Label Correction for Noisy Label Learning." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 12 (May 18, 2021): 11053–61. http://dx.doi.org/10.1609/aaai.v35i12.17319.

Abstract:
Leveraging weak or noisy supervision for building effective machine learning models has long been an important research problem. Its importance has further increased recently due to the growing need for large-scale datasets to train deep learning models. Weak or noisy supervision could originate from multiple sources including non-expert annotators or automatic labeling based on heuristics or user interaction signals. There is an extensive amount of previous work focusing on leveraging noisy labels. Most notably, recent work has shown impressive gains by using a meta-learned instance re-weighting approach where a meta-learning framework is used to assign instance weights to noisy labels. In this paper, we extend this approach by posing the problem as a label correction problem within a meta-learning framework. We view the label correction procedure as a meta-process and propose a new meta-learning based framework termed MLC (Meta Label Correction) for learning with noisy labels. Specifically, a label correction network is adopted as a meta-model to produce corrected labels for noisy labels while the main model is trained to leverage the corrected labels. Both models are jointly trained by solving a bi-level optimization problem. We run extensive experiments with different label noise levels and types on both image recognition and text classification tasks. We compare the re-weighting and correction approaches, showing that the correction framing addresses some of the limitations of re-weighting. We also show that the proposed MLC approach outperforms previous methods in both image and language tasks.
7

Shi, Jialin, Chenyi Guo, and Ji Wu. "A Hybrid Robust-Learning Architecture for Medical Image Segmentation with Noisy Labels." Future Internet 14, no. 2 (January 26, 2022): 41. http://dx.doi.org/10.3390/fi14020041.

Abstract:
Deep-learning models require large amounts of accurately labeled data. However, for medical image segmentation, high-quality labels rely on expert experience, and less-experienced operators provide noisy labels. How one might mitigate the negative effects caused by noisy labels for 3D medical image segmentation has not been fully investigated. In this paper, our purpose is to propose a novel hybrid robust-learning architecture to combat noisy labels for 3D medical image segmentation. Our method consists of three components. First, we focus on the noisy annotations of slices and propose a slice-level label-quality awareness method, which automatically generates label-quality scores for slices in a set. Second, we propose a shape-awareness regularization loss based on distance transform maps to introduce prior shape information and provide extra performance gains. Third, based on a re-weighting strategy, we propose an end-to-end hybrid robust-learning architecture to weaken the negative effects caused by noisy labels. Extensive experiments are performed on two representative datasets (i.e., liver segmentation and multi-organ segmentation). Our hybrid noise-robust architecture has shown competitive performance, compared to other methods. Ablation studies also demonstrate the effectiveness of slice-level label-quality awareness and a shape-awareness regularization loss for combating noisy labels.
8

Northcutt, Curtis, Lu Jiang, and Isaac Chuang. "Confident Learning: Estimating Uncertainty in Dataset Labels." Journal of Artificial Intelligence Research 70 (April 14, 2021): 1373–411. http://dx.doi.org/10.1613/jair.1.12125.

Abstract:
Learning exists in the context of data, yet notions of confidence typically focus on model predictions, not label quality. Confident learning (CL) is an alternative approach which focuses instead on label quality by characterizing and identifying label errors in datasets, based on the principles of pruning noisy data, counting with probabilistic thresholds to estimate noise, and ranking examples to train with confidence. Whereas numerous studies have developed these principles independently, here, we combine them, building on the assumption of a class-conditional noise process to directly estimate the joint distribution between noisy (given) labels and uncorrupted (unknown) labels. This results in a generalized CL which is provably consistent and experimentally performant. We present sufficient conditions where CL exactly finds label errors, and show CL performance exceeding seven recent competitive approaches for learning with noisy labels on the CIFAR dataset. Uniquely, the CL framework is not coupled to a specific data modality or model (e.g., we use CL to find several label errors in the presumed error-free MNIST dataset and improve sentiment classification on text data in Amazon Reviews). We also employ CL on ImageNet to quantify ontological class overlap (e.g., estimating 645 missile images are mislabeled as their parent class projectile), and moderately increase model accuracy (e.g., for ResNet) by cleaning data prior to training. These results are replicable using the open-source cleanlab release.
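The counting-with-probabilistic-thresholds principle can be sketched directly from the abstract: each example is binned by its given label and by the class whose predicted probability exceeds that class's average self-confidence, and off-diagonal mass in the resulting "confident joint" flags likely label errors. This is an illustrative sketch, not the cleanlab implementation, and it assumes every class occurs among the given labels.

```python
import numpy as np

def confident_joint(pred_probs, noisy_labels, num_classes):
    """Count the confident joint C[given_label, suggested_label].

    pred_probs:   (n, k) out-of-sample predicted probabilities
    noisy_labels: (n,) observed, possibly incorrect, integer labels
    Off-diagonal entries of C count examples whose given label disagrees
    with the class they are confidently predicted to belong to.
    """
    pred_probs = np.asarray(pred_probs)
    noisy_labels = np.asarray(noisy_labels)
    # per-class threshold: average self-confidence of examples given that label
    thresholds = np.array(
        [pred_probs[noisy_labels == j, j].mean() for j in range(num_classes)]
    )
    C = np.zeros((num_classes, num_classes), dtype=int)
    for probs, y in zip(pred_probs, noisy_labels):
        confident = np.where(probs >= thresholds)[0]
        if confident.size == 0:
            continue  # example is too uncertain to count anywhere
        suggested = confident[np.argmax(probs[confident])]
        C[y, suggested] += 1
    return C
```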
9

Silva, Amila, Ling Luo, Shanika Karunasekera, and Christopher Leckie. "Noise-Robust Learning from Multiple Unsupervised Sources of Inferred Labels." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 8 (June 28, 2022): 8315–23. http://dx.doi.org/10.1609/aaai.v36i8.20806.

Abstract:
Deep Neural Networks (DNNs) generally require large-scale datasets for training. Since manually obtaining clean labels for large datasets is extremely expensive, unsupervised models based on domain-specific heuristics can be used to efficiently infer the labels for such datasets. However, the labels from such inferred sources are typically noisy, which could easily mislead and lessen the generalizability of DNNs. Most approaches proposed in the literature to address this problem assume the label noise depends only on the true class of an instance (i.e., class-conditional noise). However, this assumption is not realistic for the inferred labels as they are typically inferred based on the features of the instances. The few recent attempts to model such instance-dependent (i.e., feature-dependent) noise require auxiliary information about the label noise (e.g., noise rates or clean samples). This work proposes a theoretically motivated framework to correct label noise in the presence of multiple labels inferred from unsupervised models. The framework consists of two modules: (1) MULTI-IDNC, a novel approach to correct label noise that is instance-dependent yet not class-conditional; (2) MULTI-CCNC, which extends an existing class-conditional noise-robust approach to yield improved class-conditional noise correction using multiple noisy label sources. We conduct experiments using nine real-world datasets for three different classification tasks (images, text and graph nodes). Our results show that our approach achieves notable improvements (e.g., 6.4% in accuracy) against state-of-the-art baselines while dealing with both instance-dependent and class-conditional noise in inferred label sources.
10

Yan, Xuguo, Xuhui Xia, Lei Wang, and Zelin Zhang. "A Progressive Deep Neural Network Training Method for Image Classification with Noisy Labels." Applied Sciences 12, no. 24 (December 12, 2022): 12754. http://dx.doi.org/10.3390/app122412754.

Abstract:
Deep neural networks (DNNs) require large amounts of labeled data for model training. However, label noise is a common problem in datasets due to the difficulty of classification and high cost of labeling processes. Introducing the concepts of curriculum learning and progressive learning, this paper presents a novel solution that is able to handle massive noisy labels and improve model generalization ability. It proposes a new network model training strategy that considers mislabeled samples directly in the network training process. The new learning curriculum is designed to measure the complexity of the data using their distribution density in a feature space. The sample data in each category are then divided into easy-to-classify (clean samples), relatively easy-to-classify, and hard-to-classify (noisy samples) subsets according to the smallest intra-class local density within each cluster. On this basis, DNNs are trained progressively in three stages, from easy to hard, i.e., from clean to noisy samples. The experimental results demonstrate that the accuracy of image classification can be improved through data augmentation, and the classification accuracy of the proposed method is clearly higher than that of standard Inception_v2 for the NEU dataset after data augmentation, when the proportion of noisy labels in the training set does not exceed 60%. With 50% noisy labels in the training set, the classification accuracy of the proposed method outperformed that of recent state-of-the-art label noise learning methods, CleanNet and MentorNet. The proposed method also performed well in practical applications, where the number of noisy labels was uncertain and unevenly distributed. In this case, the proposed method not only can alleviate the adverse effects of noisy labels, but it can also improve the generalization ability of standard deep networks and their overall capability.
11

Li, Shikun, Shiming Ge, Yingying Hua, Chunhui Zhang, Hao Wen, Tengfei Liu, and Weiqiang Wang. "Coupled-View Deep Classifier Learning from Multiple Noisy Annotators." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 4667–74. http://dx.doi.org/10.1609/aaai.v34i04.5898.

Abstract:
Typically, learning a deep classifier from massive cleanly annotated instances is effective but impractical in many real-world scenarios. An alternative is collecting and aggregating multiple noisy annotations for each instance to train the classifier. Inspired by that, this paper proposes to learn a deep classifier from multiple noisy annotators via a coupled-view learning approach, where the learning view from data is represented by deep neural networks for data classification and the learning view from labels is described by a Naive Bayes classifier for label aggregation. Such coupled-view learning is converted to a supervised learning problem under the mutual supervision of the aggregated and predicted labels, and can be solved via alternate optimization to update labels and refine the classifiers. To alleviate the propagation of incorrect labels, a small-loss metric is proposed to select reliable instances in both views. A co-teaching strategy with class-weighted loss is further leveraged in the deep classifier learning, which uses two networks with different learning abilities to teach each other, and the diverse errors introduced by noisy labels can be filtered out by peer networks. By these strategies, our approach can finally learn a robust data classifier which overfits less to label noise. Experimental results on synthetic and real data demonstrate the effectiveness and robustness of the proposed approach.
12

Chen, Pengfei, Junjie Ye, Guangyong Chen, Jingwei Zhao, and Pheng-Ann Heng. "Robustness of Accuracy Metric and its Inspirations in Learning with Noisy Labels." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 13 (May 18, 2021): 11451–61. http://dx.doi.org/10.1609/aaai.v35i13.17364.

Abstract:
For multi-class classification under class-conditional label noise, we prove that the accuracy metric itself can be robust. We concretize this finding's inspiration in two essential aspects: training and validation, with which we address critical issues in learning with noisy labels. For training, we show that maximizing training accuracy on sufficiently many noisy samples yields an approximately optimal classifier. For validation, we prove that a noisy validation set is reliable, addressing the critical demand of model selection in scenarios like hyperparameter-tuning and early stopping. Previously, model selection using noisy validation samples has not been theoretically justified. We verify our theoretical results and additional claims with extensive experiments. We show characterizations of models trained with noisy labels, motivated by our theoretical results, and verify the utility of a noisy validation set by showing the impressive performance of a framework termed noisy best teacher and student (NTS). Our code is released.
13

Yi, Rumeng, Dayan Guan, Yaping Huang, and Shijian Lu. "Class-Independent Regularization for Learning with Noisy Labels." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 3 (June 26, 2023): 3276–84. http://dx.doi.org/10.1609/aaai.v37i3.25434.

Abstract:
Training deep neural networks (DNNs) with noisy labels often leads to poorly generalized models as DNNs tend to memorize the noisy labels in training. Various strategies have been developed for improving sample selection precision and mitigating the noisy label memorization issue. However, most existing works adopt a class-dependent softmax classifier that is vulnerable to noisy labels by entangling the classification of multi-class features. This paper presents a class-independent regularization (CIR) method that can effectively alleviate the negative impact of noisy labels in DNN training. CIR regularizes the class-dependent softmax classifier by introducing multi-binary classifiers each of which takes care of one class only. Thanks to its class-independent nature, CIR is tolerant to noisy labels as misclassification by one binary classifier does not affect others. For effective training of CIR, we design a heterogeneous adaptive co-teaching strategy that forces the class-independent and class-dependent classifiers to focus on sample selection and image classification, respectively, in a cooperative manner. Extensive experiments show that CIR achieves superior performance consistently across multiple benchmarks with both synthetic and real images. Code is available at https://github.com/RumengYi/CIR.
14

Guo, Biyang, Songqiao Han, Xiao Han, Hailiang Huang, and Ting Lu. "Label Confusion Learning to Enhance Text Classification Models." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 14 (May 18, 2021): 12929–36. http://dx.doi.org/10.1609/aaai.v35i14.17529.

Abstract:
Representing the true label as a one-hot vector is the common practice in training text classification models. However, the one-hot representation may not adequately reflect the relation between the instance and labels, as labels are often not completely independent and instances may relate to multiple labels in practice. The inadequate one-hot representations tend to train the model to be over-confident, which may result in arbitrary prediction and model overfitting, especially for confused datasets (datasets with very similar labels) or noisy datasets (datasets with labeling errors). While training models with label smoothing can ease this problem to some degree, it still fails to capture the realistic relation among labels. In this paper, we propose a novel Label Confusion Model (LCM) as an enhancement component to current popular text classification models. LCM can learn label confusion to capture semantic overlap among labels by calculating the similarity between instance and labels during training and generate a better label distribution to replace the original one-hot label vector, thus improving the final classification performance. Extensive experiments on five text classification benchmark datasets reveal the effectiveness of LCM for several widely used deep learning classification models. Further experiments also verify that LCM is especially helpful for confused or noisy datasets and superior to the label smoothing method.
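For comparison, the label-smoothing baseline mentioned above redistributes a fixed mass uniformly over all classes, independent of the instance, whereas LCM learns an instance-dependent distribution. A one-function sketch of the baseline, with epsilon as an assumed smoothing parameter:

```python
import numpy as np

def smooth_labels(onehot, eps=0.1):
    """Uniform label smoothing: (1 - eps) * one-hot + eps / K on every class."""
    onehot = np.asarray(onehot, dtype=float)
    k = onehot.shape[-1]
    return (1.0 - eps) * onehot + eps / k
```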
15

Nushi, Besmira, Adish Singla, Andreas Krause, and Donald Kossmann. "Learning and Feature Selection under Budget Constraints in Crowdsourcing." Proceedings of the AAAI Conference on Human Computation and Crowdsourcing 4 (September 21, 2016): 159–68. http://dx.doi.org/10.1609/hcomp.v4i1.13278.

Abstract:
The cost of data acquisition limits the amount of labeled data available for machine learning algorithms, both at the training and the testing phase. This problem is further exacerbated in real-world crowdsourcing applications where labels are aggregated from multiple noisy answers. We tackle classification problems where the underlying feature labels are unknown to the algorithm and a (noisy) label of the desired feature can be acquired at a fixed cost. This problem has two types of budget constraints - the total cost of feature labels available for learning at the training phase, and the cost of features to use during the testing phase for classification. We propose a novel budgeted learning and feature selection algorithm, B-LEAFS, for jointly tackling this problem in the presence of noise. Experimental evaluation on synthetic and real-world crowdsourcing data demonstrate the practical applicability of our approach.
16

Ko, Jongwoo, Bongsoo Yi, and Se-Young Yun. "A Gift from Label Smoothing: Robust Training with Adaptive Label Smoothing via Auxiliary Classifier under Label Noise." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 7 (June 26, 2023): 8325–33. http://dx.doi.org/10.1609/aaai.v37i7.26004.

Abstract:
As deep neural networks can easily overfit noisy labels, robust training in the presence of noisy labels is becoming an important challenge in modern deep learning. While existing methods address this problem in various directions, they still produce unpredictable sub-optimal results since they rely on the posterior information estimated by the feature extractor corrupted by noisy labels. Lipschitz regularization successfully alleviates this problem by training a robust feature extractor, but it requires longer training time and expensive computations. Motivated by this, we propose a simple yet effective method, called ALASCA, which efficiently provides a robust feature extractor under label noise. ALASCA integrates two key ingredients: (1) adaptive label smoothing based on our theoretical analysis that label smoothing implicitly induces Lipschitz regularization, and (2) auxiliary classifiers that enable practical application of intermediate Lipschitz regularization with negligible computations. We conduct wide-ranging experiments for ALASCA and combine our proposed method with previous noise-robust methods on several synthetic and real-world datasets. Experimental results show that our framework consistently improves the robustness of feature extractors and the performance of existing baselines with efficiency.
17

Zhao, Tianna, Yuanjian Zhang, and Witold Pedrycz. "Robust Multi-Label Classification with Enhanced Global and Local Label Correlation." Mathematics 10, no. 11 (May 30, 2022): 1871. http://dx.doi.org/10.3390/math10111871.

Abstract:
Data representation is of significant importance in minimizing multi-label ambiguity. While most researchers intensively investigate label correlation, the research on enhancing model robustness is preliminary. Low-quality data is one of the main reasons that model robustness degrades. Aiming at the cases with noisy features and missing labels, we develop a novel method called robust global and local label correlation (RGLC). In this model, subspace learning reconstructs intrinsic latent features immune from feature noise. The manifold learning ensures that outputs obtained by matrix factorization are similar in the low-rank latent label if the latent features are similar. We examine the co-occurrence of global and local label correlation with the constructed latent features and the latent labels. Extensive experiments demonstrate that the classification performance with integrated information is statistically superior over a collection of state-of-the-art approaches across numerous domains. Additionally, the proposed model shows promising performance on multi-label when noisy features and missing labels occur, demonstrating the robustness of multi-label classification.
18

Nie, Binling, and Chenyang Li. "Distantly Supervised Named Entity Recognition with Self-Adaptive Label Correction." Applied Sciences 12, no. 15 (July 29, 2022): 7659. http://dx.doi.org/10.3390/app12157659.

Abstract:
Named entity recognition has achieved remarkable success on benchmarks with high-quality manual annotations. Such annotations are labor-intensive and time-consuming, thus unavailable in real-world scenarios. An emerging interest is to generate low-cost but noisy labels via distant supervision, hence noisy label learning algorithms are in demand. In this paper, a unified self-adaptive learning framework termed Self-Adaptive Label cOrrection (SALO) is proposed. SALO adaptively performs a label correction process, in both implicit and explicit manners, turning noisy labels into correct ones, thus benefiting model training. The experimental results on four benchmark datasets demonstrated the superiority of SALO over the state-of-the-art distantly supervised methods. Moreover, a better version of the noisy labels was built by ensembling several semantic matching methods. Experiments were carried out and consistent improvements were observed, validating the generalization of the proposed SALO.
19

Zhang, Minxue, Ning Xu, and Xin Geng. "Feature-Induced Label Distribution for Learning with Noisy Labels." Pattern Recognition Letters 155 (March 2022): 107–13. http://dx.doi.org/10.1016/j.patrec.2022.02.011.

20

Li, Guozheng, Peng Wang, Qiqing Luo, Yanhe Liu, and Wenjun Ke. "Online Noisy Continual Relation Learning." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 11 (June 26, 2023): 13059–66. http://dx.doi.org/10.1609/aaai.v37i11.26534.

Abstract:
Recent work for continual relation learning has achieved remarkable progress. However, most existing methods only focus on tackling catastrophic forgetting to improve performance in the existing setup, while continually learning relations in the real world must overcome many other challenges. One is that the data possibly comes in an online streaming fashion with data distributions gradually changing and without distinct task boundaries. Another is that noisy labels are inevitable in the real world, as relation samples may be contaminated by label inconsistencies or labeled with distant supervision. In this work, therefore, we propose a novel continual relation learning framework that simultaneously addresses both online and noisy relation learning challenges. Our framework contains three key modules: (i) a sample separated online purifying module that divides the online data stream into clean and noisy samples, (ii) a self-supervised online learning module that circumvents inferior training signals caused by noisy data, and (iii) a semi-supervised offline finetuning module that ensures the participation of both clean and noisy samples. Experimental results on FewRel, TACRED and NYT-H with real-world noise demonstrate that our framework greatly outperforms the combinations of the state-of-the-art online continual learning and noisy label learning methods.
21

Yan, Shaotian, Xiang Tian, Rongxin Jiang, and Yaowu Chen. "FGCM: Noisy Label Learning via Fine-Grained Confidence Modeling." Applied Sciences 12, no. 22 (November 10, 2022): 11406. http://dx.doi.org/10.3390/app122211406.

Abstract:
A small portion of mislabeled data can easily limit the performance of deep neural networks (DNNs) due to their high capacity for memorizing random labels. Thus, robust learning from noisy labels has become a key challenge for deep learning due to inadequate datasets with high-quality annotations. Most existing methods involve training models on clean sets by dividing clean samples from noisy ones, resulting in large amounts of mislabeled data being unused. To address this problem, we propose categorizing training samples into five fine-grained clusters based on the difficulty experienced by DNN models when learning them and label correctness. A novel fine-grained confidence modeling (FGCM) framework is proposed to cluster samples into these five categories; for each cluster, FGCM decides whether to accept the cluster data as they are, accept them with label correction, or accept them as unlabeled data. By applying different strategies to the fine-grained clusters, FGCM can better exploit training data than previous methods. Extensive experiments on the widely used benchmarks CIFAR-10, CIFAR-100, Clothing1M, and WebVision with different ratios and types of label noise demonstrate the superiority of our FGCM.
22

Wang, Zixiao, Junwu Weng, Chun Yuan, and Jue Wang. "Truncate-Split-Contrast: A Framework for Learning from Mislabeled Videos." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 3 (June 26, 2023): 2751–58. http://dx.doi.org/10.1609/aaai.v37i3.25375.

Abstract:
Learning with noisy labels is a classic problem that has been extensively studied for image tasks, but much less for video in the literature. A straightforward migration from images to videos without considering temporal semantics and computational cost is not a sound choice. In this paper, we propose two new strategies for video analysis with noisy labels: 1) a lightweight channel selection method dubbed Channel Truncation for feature-based label noise detection. This method selects the most discriminative channels to split clean and noisy instances in each category. 2) A novel contrastive strategy dubbed Noise Contrastive Learning, which constructs the relationship between clean and noisy instances to regularize model training. Experiments on three well-known benchmark datasets for video classification show that our proposed truNcatE-split-contrAsT (NEAT) significantly outperforms the existing baselines. By reducing the feature dimension to 10% of the original, our method achieves over 0.4 noise detection F1-score and a 5% classification accuracy improvement on the Mini-Kinetics dataset under severe noise (symmetric-80%). Thanks to Noise Contrastive Learning, the average classification accuracy improvement on Mini-Kinetics and Sth-Sth-V1 is over 1.6%.
23

Wang, Deng-Bao, Yong Wen, Lujia Pan, and Min-Ling Zhang. "Learning from Noisy Labels with Complementary Loss Functions." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 11 (May 18, 2021): 10111–19. http://dx.doi.org/10.1609/aaai.v35i11.17213.

Abstract:
Recent research reveals that deep neural networks are sensitive to label noise, leading to poor generalization performance in some tasks. Although different robust loss functions have been proposed to remedy this issue, they suffer from an underfitting problem and thus are not sufficient to learn accurate models. On the other hand, the commonly used Cross Entropy (CE) loss, which shows high performance in standard supervised learning (with clean supervision), is non-robust to label noise. In this paper, we propose a general framework to learn robust deep neural networks with complementary loss functions. In our framework, CE and robust loss play complementary roles in a joint learning objective as per their learning sufficiency and robustness properties respectively. Specifically, we find that by exploiting the memorization effect of neural networks, we can easily filter out a proportion of hard samples and generate reliable pseudo labels for easy samples, and thus reduce the label noise to a quite low level. Then, we simply learn with CE on pseudo supervision and robust loss on original noisy supervision. In this procedure, CE can guarantee the sufficiency of optimization while the robust loss can be regarded as the supplement. Experimental results on benchmark classification datasets indicate that the proposed method helps achieve robust and sufficient deep neural network training simultaneously.
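A hedged sketch of the complementary-loss idea under assumed details: samples judged easy via the memorization effect are trained with CE on pseudo labels, while the remaining samples keep their original labels under a bounded robust loss (MAE here). The mask and the specific robust loss are assumptions; the paper's exact choices may differ.

```python
import torch
import torch.nn.functional as F

def complementary_loss(logits, noisy_labels, easy_mask, pseudo_labels):
    """CE on pseudo-labelled easy samples plus MAE on the remaining noisy ones.

    easy_mask:     bool tensor marking samples judged clean/easy
    pseudo_labels: model-generated integer labels (same length as the batch)
    Assumes both subsets are non-empty in the batch.
    """
    probs = F.softmax(logits, dim=1)
    onehot = F.one_hot(noisy_labels, num_classes=logits.size(1)).float()
    ce = F.cross_entropy(logits[easy_mask], pseudo_labels[easy_mask])
    mae = (probs[~easy_mask] - onehot[~easy_mask]).abs().sum(dim=1).mean()
    return ce + mae
```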
24

Liu, Xiaoli, Baoping Tang, Qikang Li, and Qichao Yang. "Twin prototype networks with noisy label self-correction for fault diagnosis of wind turbine gearboxes." Measurement Science and Technology 34, no. 3 (December 1, 2022): 035006. http://dx.doi.org/10.1088/1361-6501/aca3c3.

Abstract:
Deep strong-supervised learning-based methods have been widely used and have made significant progress in intelligent fault diagnosis for wind turbine (WT) gearboxes. The superior performance of such methods relies on high-quality labels. However, correctly labeling the data is challenging because of the complexity of fault vibration signals and fault modes in real industrial scenarios, resulting in noisy labels in datasets, which significantly restricts the application of strong-supervised fault diagnosis models. In this study, a method based on twin prototype networks with noisy label self-correction was proposed to address fault diagnosis for WT gearboxes with noisy labels. This method introduced a collaborative learning architecture to improve the confirmation bias in the self-training of individual networks and to slow the speed of learning noisy-labeled samples. Simultaneously, the loss distribution of the samples from each network was modeled using the Gaussian mixture model to dynamically identify mislabeled samples in the training dataset. Finally, a collaborative relabeling prototype refinement module was designed to optimize the prototype learning process and enable self-correction of noisy labels. The experiments demonstrated the effectiveness and superiority of the proposed method.
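The loss-distribution-modelling step mentioned above is commonly implemented by fitting a two-component Gaussian mixture to per-sample training losses and treating the component with the larger mean as "mislabeled". The sketch below (scikit-learn) shows that generic step only, not the paper's full twin-prototype pipeline; the 0.5 posterior threshold is an assumption.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def flag_noisy_by_loss(per_sample_losses, threshold=0.5):
    """Fit a 2-component GMM to losses; flag samples likely to be mislabeled.

    Samples whose posterior probability of belonging to the higher-mean
    (large-loss) component exceeds `threshold` are returned as noisy.
    """
    losses = np.asarray(per_sample_losses, dtype=float).reshape(-1, 1)
    gmm = GaussianMixture(n_components=2, reg_covar=1e-4).fit(losses)
    noisy_component = int(np.argmax(gmm.means_.ravel()))
    posteriors = gmm.predict_proba(losses)[:, noisy_component]
    return posteriors > threshold
```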
25

Duan, Yunyan, and Ou Wu. "Learning With Auxiliary Less-Noisy Labels." IEEE Transactions on Neural Networks and Learning Systems 28, no. 7 (July 2017): 1716–21. http://dx.doi.org/10.1109/tnnls.2016.2546956.

26

Han, Bo, Ivor W. Tsang, Ling Chen, Celina P. Yu, and Sai-Fu Fung. "Progressive Stochastic Learning for Noisy Labels." IEEE Transactions on Neural Networks and Learning Systems 29, no. 10 (October 2018): 5136–48. http://dx.doi.org/10.1109/tnnls.2018.2792062.

27

Zhao, Pan, Long Tang, and Zhigeng Pan. "Zero-Shot Learning with Noisy Labels." Procedia Computer Science 221 (2023): 763–72. http://dx.doi.org/10.1016/j.procs.2023.08.049.

28

Xu, Ran, Yue Yu, Hejie Cui, Xuan Kan, Yanqiao Zhu, Joyce Ho, Chao Zhang, and Carl Yang. "Neighborhood-Regularized Self-Training for Learning with Few Labels." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 9 (June 26, 2023): 10611–19. http://dx.doi.org/10.1609/aaai.v37i9.26260.

Abstract:
Training deep neural networks (DNNs) with limited supervision has been a popular research topic as it can significantly alleviate the annotation burden. Self-training has been successfully applied in semi-supervised learning tasks, but one drawback of self-training is that it is vulnerable to the label noise from incorrect pseudo labels. Inspired by the fact that samples with similar labels tend to share similar representations, we develop a neighborhood-based sample selection approach to tackle the issue of noisy pseudo labels. We further stabilize self-training via aggregating the predictions from different rounds during sample selection. Experiments on eight tasks show that our proposed method outperforms the strongest self-training baseline with 1.83% and 2.51% performance gain for text and graph datasets on average. Our further analysis demonstrates that our proposed data selection strategy reduces the noise of pseudo labels by 36.8% and saves 57.3% of the time when compared with the best baseline. Our code and appendices will be uploaded to: https://github.com/ritaranx/NeST.
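A minimal sketch of the neighborhood idea described above: a pseudo-labelled sample is kept only if enough of its nearest neighbours in embedding space carry the same pseudo label. NeST's actual scoring differs; k and min_agreement are assumed parameters.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def neighborhood_filter(embeddings, pseudo_labels, k=10, min_agreement=0.7):
    """Keep a pseudo label only if it agrees with most of its k nearest neighbours."""
    embeddings = np.asarray(embeddings)
    pseudo_labels = np.asarray(pseudo_labels)
    nn = NearestNeighbors(n_neighbors=k + 1).fit(embeddings)
    _, idx = nn.kneighbors(embeddings)            # column 0 is the point itself
    neighbour_labels = pseudo_labels[idx[:, 1:]]  # shape (n, k)
    agreement = (neighbour_labels == pseudo_labels[:, None]).mean(axis=1)
    return agreement >= min_agreement
```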
29

Jian, Ling, Fuhao Gao, Peng Ren, Yunquan Song, and Shihua Luo. "A Noise-Resilient Online Learning Algorithm for Scene Classification." Remote Sensing 10, no. 11 (November 20, 2018): 1836. http://dx.doi.org/10.3390/rs10111836.

Abstract:
The proliferation of remote sensing imagery motivates a surge of research interest in image processing such as feature extraction and scene recognition, etc. Among them, scene recognition (classification) is a typical learning task that focuses on exploiting annotated images to infer the category of an unlabeled image. Existing scene classification algorithms predominantly focus on static data and are designed to learn discriminant information from clean data. They, however, suffer from two major shortcomings, i.e., the noisy label may negatively affect the learning procedure and learning from scratch may lead to a huge computational burden. Thus, they are not able to handle large-scale remote sensing images, in terms of both recognition accuracy and computational cost. To address this problem, in the paper, we propose a noise-resilient online classification algorithm, which is scalable and robust to noisy labels. Specifically, ramp loss is employed as loss function to alleviate the negative affect of noisy labels, and we iteratively optimize the decision function in Reproducing Kernel Hilbert Space under the framework of Online Gradient Descent (OGD). Experiments on both synthetic and real-world data sets demonstrate that the proposed noise-resilient online classification algorithm is more robust and sparser than state-of-the-art online classification algorithms.
30

Zhang, Youqiang, Jin Sun, Hao Shi, Zixian Ge, Qiqiong Yu, Guo Cao, and Xuesong Li. "Agreement and Disagreement-Based Co-Learning with Dual Network for Hyperspectral Image Classification with Noisy Labels." Remote Sensing 15, no. 10 (May 12, 2023): 2543. http://dx.doi.org/10.3390/rs15102543.

Abstract:
Deep learning-based label noise learning methods provide promising solutions for hyperspectral image (HSI) classification with noisy labels. Currently, label noise learning methods based on deep learning improve their performance by modifying one aspect, such as designing a robust loss function, revamping the network structure, or adding a noise adaptation layer. However, these methods face difficulties in coping with relatively high noise situations. To address this issue, this paper proposes a unified label noise learning framework with a dual-network structure. The goal is to enhance the model’s robustness to label noise by utilizing two networks to guide each other. Specifically, to avoid the degeneration of the dual-network training into self-training, the “disagreement” strategy is incorporated with co-learning. Then, the “agreement” strategy is introduced into the model to ensure that the model iterates in the right direction under high noise conditions. To this end, an agreement and disagreement-based co-learning (ADCL) framework is proposed for HSI classification with noisy labels. In addition, a joint loss function consisting of a supervision loss of two networks and a relative loss between two networks is designed for the dual-network structure. Extensive experiments are conducted on three public HSI datasets to demonstrate the robustness of the proposed method to label noise. Specifically, our method obtains the highest overall accuracy of 98.62%, 90.89%, and 99.02% on the three datasets, respectively, which represents an improvement of 2.58%, 2.27%, and 0.86% compared to the second-best method. In future research, the authors suggest using more networks as backbones to implement the ADCL framework.
31

Zhang, Yaojie, Huahu Xu, Junsheng Xiao, and Minjie Bian. "JoSDW: Combating Noisy Labels by Dynamic Weight." Future Internet 14, no. 2 (February 2, 2022): 50. http://dx.doi.org/10.3390/fi14020050.

Abstract:
The real world is full of noisy labels that lead neural networks to perform poorly because deep neural networks (DNNs) are prone to overfitting label noise. Noise label training is a challenging problem relating to weakly supervised learning. The most advanced existing methods mainly adopt a small loss sample selection strategy, such as selecting the small loss part of the sample for network model training. However, the previous literature stopped here, neglecting the performance of the small loss sample selection strategy while training the DNNs, as well as the performance of different stages, and the performance of the collaborative learning of the two networks from disagreement to an agreement, and making a second classification based on this. We train the network using a comparative learning method. Specifically, a small loss sample selection strategy with dynamic weight is designed. This strategy increases the proportion of agreement based on network predictions, gradually reduces the weight of the complex sample, and increases the weight of the pure sample at the same time. A large number of experiments verify the superiority of our method.
32

Chen, Mingxia, Jing Wang, Xueqing Li, and Xiaolong Sun. "Robust Semi-Supervised Manifold Learning Algorithm for Classification." Mathematical Problems in Engineering 2018 (2018): 1–8. http://dx.doi.org/10.1155/2018/2382803.

Abstract:
In recent years, manifold learning methods have been widely used in data classification to tackle the curse of dimensionality problem, since they can discover the potential intrinsic low-dimensional structures of the high-dimensional data. Given partially labeled data, semi-supervised manifold learning algorithms are proposed to predict the labels of the unlabeled points, taking into account label information. However, these semi-supervised manifold learning algorithms are not robust against noisy points, especially when the labeled data contain noise. In this paper, we propose a framework for robust semi-supervised manifold learning (RSSML) to address this problem. The noise levels of the labeled points are first predicted, and then a regularization term is constructed to reduce the impact of labeled points containing noise. A new robust semi-supervised optimization model is proposed by adding the regularization term to the traditional semi-supervised optimization model. Numerical experiments are given to show the improvement and efficiency of RSSML on noisy data sets.
33

Li, Weiwei, Yuqing Lu, Lei Chen, and Xiuyi Jia. "Label distribution learning with noisy labels via three-way decisions." International Journal of Approximate Reasoning 150 (November 2022): 19–34. http://dx.doi.org/10.1016/j.ijar.2022.08.009.

34

Long, Lingli, Yongjin Zhu, Jun Shao, Zheng Kong, Jian Li, Yanzheng Xiang, and Xu Zhang. "NL2SQL Generation with Noise Labels based on Multi-task Learning." Journal of Physics: Conference Series 2294, no. 1 (June 1, 2022): 012016. http://dx.doi.org/10.1088/1742-6596/2294/1/012016.

Abstract:
With the rapid development of artificial intelligence technology, semantic recognition technology is becoming more and more mature, providing the preconditions for the development of natural language to SQL (NL2SQL) technology. In the latest research on NL2SQL, the use of pre-trained models as feature extractors for natural language and table schema has led to a very significant improvement in the effectiveness of the models. However, the current models do not take into account the degrading effect of noisy labels on the overall SQL statement generation. It is crucial to reduce the impact of noisy labels on the overall SQL generation task and to maximize the return of accurate answers. To address this issue, we propose a restrictive constraint-based approach to mitigate the impact of noisy labels on other tasks. In addition, a parameter sharing approach is used on noiseless labels to capture each part’s correlations and improve the robustness of the model. We also propose to use Kullback-Leibler divergence to constrain the discrepancy between hard and soft constrained coding of noisy labels. Our model is compared with some recent state-of-the-art methods, and experimental results show that the approach in this paper achieves a significant improvement.
35

Yan, Yan, and Yuhong Guo. "Partial Label Learning with Batch Label Correction." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 6575–82. http://dx.doi.org/10.1609/aaai.v34i04.6132.

Abstract:
Partial label (PL) learning tackles the problem where each training instance is associated with a set of candidate labels, among which only one is the true label. In this paper, we propose a simple but effective batch-based partial label learning algorithm named PL-BLC, which tackles the partial label learning problem with batch-wise label correction (BLC). PL-BLC dynamically corrects the label confidence matrix of each training batch based on the current prediction network, and adopts a MixUp data augmentation scheme to enhance the underlying true labels against the redundant noisy labels. In addition, it introduces a teacher model through a consistency cost to ensure the stability of the batch-based prediction network update. Extensive experiments are conducted on synthesized and real-world partial label learning datasets, while the proposed approach demonstrates the state-of-the-art performance for partial label learning.
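The MixUp augmentation mentioned above is standard and easy to state: two examples and their (soft) label vectors are combined convexly with a Beta-distributed weight. A sketch, with the Beta parameter alpha assumed:

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.75, rng=None):
    """MixUp: convex combination of two inputs and their (soft) label vectors."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2
```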
36

Kong, Kyeongbo, Junggi Lee, Youngchul Kwak, Young-Rae Cho, Seong-Eun Kim, and Woo-Jin Song. "Penalty based robust learning with noisy labels." Neurocomputing 489 (June 2022): 112–27. http://dx.doi.org/10.1016/j.neucom.2022.02.030.

37

Sun, Yi, Yan Tian, Yiping Xu, and Jianxiang Li. "Limited Gradient Descent: Learning With Noisy Labels." IEEE Access 7 (2019): 168296–306. http://dx.doi.org/10.1109/access.2019.2954547.

38

Sun, Haoliang, Chenhui Guo, Qi Wei, Zhongyi Han, and Yilong Yin. "Learning to rectify for robust learning with noisy labels." Pattern Recognition 124 (April 2022): 108467. http://dx.doi.org/10.1016/j.patcog.2021.108467.

39

Lin, Chuang, Shanxin Guo, Jinsong Chen, Luyi Sun, Xiaorou Zheng, Yan Yang, and Yingfei Xiong. "Deep Learning Network Intensification for Preventing Noisy-Labeled Samples for Remote Sensing Classification." Remote Sensing 13, no. 9 (April 27, 2021): 1689. http://dx.doi.org/10.3390/rs13091689.

Abstract:
The deep-learning-network performance depends on the accuracy of the training samples. The training samples are commonly labeled by human visual investigation or inherited from historical land-cover or land-use maps, which usually contain label noise, depending on subjective knowledge and the time of the historical map. Helping the network to distinguish noisy labels during the training process is a prerequisite for applying the model for training across time and locations. This study proposes an antinoise framework, the Weight Loss Network (WLN), to achieve this goal. The WLN contains three main parts: (1) the segmentation subnetwork, which any state-of-the-art segmentation network can replace; (2) the attention subnetwork (λ); and (3) the class-balance coefficient (α). Four types of label noise (an insufficient label, redundant label, missing label and incorrect label) were simulated by dilate and erode processing to test the network’s antinoise ability. The segmentation task was set to extract buildings from the Inria Aerial Image Labeling Dataset, which includes Austin, Chicago, Kitsap County, Western Tyrol and Vienna. The network’s performance was evaluated by comparing it with the original U-Net model by adding noisy training samples with different noise rates and noise levels. The result shows that the proposed antinoise framework (WLN) can maintain high accuracy, while the accuracy of the U-Net model dropped. Specifically, after adding 50% of dilated-label samples at noise level 3, the U-Net model’s accuracy dropped by 12.7% for OA, 20.7% for the Mean Intersection over Union (MIOU) and 13.8% for Kappa scores. By contrast, the accuracy of the WLN dropped by 0.2% for OA, 0.3% for the MIOU and 0.8% for Kappa scores. For eroded-label samples at the same level, the accuracy of the U-Net model dropped by 8.4% for OA, 24.2% for the MIOU and 43.3% for Kappa scores, while the accuracy of the WLN dropped by 4.5% for OA, 4.7% for the MIOU and 0.5% for Kappa scores. This result shows that the antinoise framework proposed in this paper can help current segmentation models to avoid the impact of noisy training labels and has the potential to be trained by a larger remote sensing image set regardless of the inner label error.
40

Zhao, QiHao, Wei Hu, Yangyu Huang, and Fan Zhang. "P-DIFF+: Improving learning classifier with noisy labels by Noisy Negative Learning loss." Neural Networks 144 (December 2021): 1–10. http://dx.doi.org/10.1016/j.neunet.2021.07.024.

41

Xu, Ning, Yun-Peng Liu, and Xin Geng. "Partial Multi-Label Learning with Label Distribution." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 6510–17. http://dx.doi.org/10.1609/aaai.v34i04.6124.

Abstract:
Partial multi-label learning (PML) aims to learn from training examples each associated with a set of candidate labels, among which only a subset are valid for the training example. The common strategy to induce a predictive model is to disambiguate the candidate label set, such as identifying the ground-truth label by utilizing the confidence of each candidate label or estimating the noisy labels in the candidate label sets. Nonetheless, these strategies ignore the essential label distribution corresponding to each instance, since the label distribution is not explicitly available in the training set. In this paper, a new partial multi-label learning strategy named Pml-ld is proposed to learn from partial multi-label examples via label enhancement. Specifically, label distributions are recovered by leveraging the topological information of the feature space and the correlations among the labels. After that, a multi-class predictive model is learned by fitting a regularized multi-output regressor with the recovered label distributions. Experimental results on synthetic as well as real-world datasets clearly validate the effectiveness of Pml-ld for solving PML problems.
APA, Harvard, Vancouver, ISO, and other styles
42

Yao, Jiangchao, Hao Wu, Ya Zhang, Ivor W. Tsang, and Jun Sun. "Safeguarded Dynamic Label Regression for Noisy Supervision." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 9103–10. http://dx.doi.org/10.1609/aaai.v33i01.33019103.

Full text
Abstract:
Learning with noisy labels is imperative in the Big Data era since it reduces expensive labor on accurate annotations. A previous approach, learning with a noise transition, enjoys theoretical guarantees when applied to scenarios with class-conditional noise. However, this approach critically depends on an accurate pre-estimated noise transition, which is usually impractical. A subsequent improvement adapts the pre-estimation in the form of a Softmax layer along with the training progress. However, the parameters of this Softmax layer must be heavily tweaked to avoid fragile performance and easily get stuck in undesired local minima. To overcome this issue, we propose a Latent Class-Conditional Noise model (LCCN) that models the noise transition in a Bayesian form. By projecting the noise transition into a Dirichlet-distributed space, learning is constrained to a simplex instead of some ad hoc parametric space. Furthermore, we deduce a dynamic label regression method for LCCN to iteratively infer the latent true labels and to jointly train the classifier and model the noise. Our approach theoretically safeguards a bounded update of the noise transition, which avoids arbitrary tuning via a batch of samples. Extensive experiments have been conducted on controllable noise with the CIFAR-10 and CIFAR-100 datasets, and on agnostic noise with the Clothing1M and WebVision17 datasets. The experimental results demonstrate that the proposed model outperforms several state-of-the-art methods.
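A highly simplified sketch of the Bayesian noise-transition idea (not the authors' implementation): transition rows are Dirichlet-distributed, a latent true label is inferred from the classifier posterior and the observed noisy label, and the Dirichlet counts are updated with the inferred label; the class count and prior strength below are assumptions:

```python
# Hedged sketch: Dirichlet-parameterized noise transition with count-based updates.
import numpy as np

rng = np.random.default_rng(0)
C = 10                                                  # number of classes (assumption)
dirichlet_counts = np.ones((C, C)) + 10 * np.eye(C)     # prior: mostly-diagonal transition

def infer_true_label(clf_posterior, noisy_label):
    """clf_posterior: (C,) classifier estimate of p(true=k | x); noisy_label: int."""
    T = dirichlet_counts / dirichlet_counts.sum(axis=1, keepdims=True)  # E[transition]
    # p(true=k | x, noisy) is proportional to p(true=k | x) * p(noisy | true=k)
    post = clf_posterior * T[:, noisy_label]
    post /= post.sum()
    return rng.choice(C, p=post)

def update_transition(true_label, noisy_label):
    """Bounded, count-based update of the Dirichlet posterior."""
    dirichlet_counts[true_label, noisy_label] += 1

# Toy usage with a random classifier posterior and an observed noisy label.
p = rng.dirichlet(np.ones(C))
z = infer_true_label(p, noisy_label=3)
update_transition(z, noisy_label=3)
```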
APA, Harvard, Vancouver, ISO, and other styles
43

Sun, Lijuan, Ping Ye, Gengyu Lyu, Songhe Feng, Guojun Dai, and Hua Zhang. "Weakly-supervised multi-label learning with noisy features and incomplete labels." Neurocomputing 413 (November 2020): 61–71. http://dx.doi.org/10.1016/j.neucom.2020.06.101.

Full text
APA, Harvard, Vancouver, ISO, and other styles
44

Liu, Kun-Lin, Wu-Jun Li, and Minyi Guo. "Emoticon Smoothed Language Models for Twitter Sentiment Analysis." Proceedings of the AAAI Conference on Artificial Intelligence 26, no. 1 (September 20, 2021): 1678–84. http://dx.doi.org/10.1609/aaai.v26i1.8353.

Full text
Abstract:
Twitter sentiment analysis (TSA) has become a hot research topic in recent years. The goal of this task is to discover the attitude or opinion of the tweets, which is typically formulated as a machine-learning-based text classification problem. Some methods use manually labeled data to train fully supervised models, while others use noisy labels, such as emoticons and hashtags, for model training. In general, only a limited amount of training data can be obtained for the fully supervised models because manually labeling tweets is very labor-intensive and time-consuming. As for the models trained with noisy labels, it is hard for them to achieve satisfactory performance because of the noise in the labels, even though a large amount of training data is easy to obtain. Hence, the best strategy is to utilize both manually labeled data and noisy labeled data for training. However, how to seamlessly integrate these two different kinds of data into the same learning framework is still a challenge. In this paper, we present a novel model, called the emoticon smoothed language model (ESLAM), to handle this challenge. The basic idea is to train a language model on the manually labeled data and then use the noisy emoticon data for smoothing. Experiments on real data sets demonstrate that ESLAM can effectively integrate both kinds of data and outperform methods that use only one of them.
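As a toy illustration of the smoothing idea (not the paper's exact model), word-given-class probabilities estimated from a small manually labeled set can be interpolated with probabilities estimated from a large emoticon-labeled set; the data, vocabulary, and interpolation weight below are all assumptions:

```python
# Hedged sketch: interpolate per-class word probabilities from clean and noisy data.
from collections import Counter

def word_probs(docs, vocab):
    counts = Counter(w for doc in docs for w in doc)
    total = sum(counts.values())
    # add-one smoothing over the shared vocabulary
    return {w: (counts[w] + 1) / (total + len(vocab)) for w in vocab}

manual_pos = [["great", "movie"], ["love", "it"]]          # hand-labeled positive tweets
emoticon_pos = [["love", "this", ":)"], ["great", ":)"]]   # noisy positives via ":)"
vocab = {w for doc in manual_pos + emoticon_pos for w in doc}

p_manual = word_probs(manual_pos, vocab)
p_noisy = word_probs(emoticon_pos, vocab)

beta = 0.7   # interpolation weight, an assumed hyperparameter
p_smoothed = {w: beta * p_manual[w] + (1 - beta) * p_noisy[w] for w in vocab}
```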
APA, Harvard, Vancouver, ISO, and other styles
45

Büttner, Martha, Lisa Schneider, Aleksander Krasowski, Joachim Krois, Ben Feldberg, and Falk Schwendicke. "Impact of Noisy Labels on Dental Deep Learning—Calculus Detection on Bitewing Radiographs." Journal of Clinical Medicine 12, no. 9 (April 23, 2023): 3058. http://dx.doi.org/10.3390/jcm12093058.

Full text
Abstract:
Supervised deep learning requires labelled data. On medical images, data is often labelled inconsistently (e.g., with labels drawn too large) and with varying accuracy. We aimed to assess the impact of such label noise on dental calculus detection on bitewing radiographs. On 2584 bitewings, calculus was accurately labeled using bounding boxes (BBs), which were then artificially increased and decreased stepwise, resulting in 30 consistently and 9 inconsistently noisy datasets. An object detection network (YOLOv5) was trained on each dataset and evaluated on noisy and accurate test data. Training on accurately labeled data yielded an mAP50 of 0.77 (SD: 0.01). When trained on consistently too-small BBs, model performance significantly decreased on both accurate and noisy test data. Performance of models trained on consistently too-large BBs decreased immediately on accurate test data (e.g., 200% BBs: mAP50: 0.24; SD: 0.05; p < 0.05), but only after drastically increasing the BBs on noisy test data (e.g., 70,000%: mAP50: 0.75; SD: 0.01; p < 0.05). Models trained on inconsistent BB sizes showed a significant decrease in performance when deviating by 20% or more from the original when tested on noisy data (mAP50: 0.74; SD: 0.02; p < 0.05), or by 30% or more when tested on accurate data (mAP50: 0.76; SD: 0.01; p < 0.05). In conclusion, accurate predictions require accurately labeled training data. Testing on noisy data may disguise the effects of noisy training data. Researchers should be aware of the importance of accurately annotated data, especially when testing model performance.
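A brief sketch of how bounding-box label noise of this kind could be simulated: each box is scaled around its own center by a fixed factor (consistent noise) or by a randomly jittered factor (inconsistent noise). The function names and parameters are illustrative assumptions, not the study's preprocessing code:

```python
# Hedged sketch: scale boxes around their centers to simulate label noise.
import random

def scale_box(box, factor):
    """box = (x_min, y_min, x_max, y_max); scale width/height around the center."""
    x_min, y_min, x_max, y_max = box
    cx, cy = (x_min + x_max) / 2, (y_min + y_max) / 2
    half_w, half_h = (x_max - x_min) / 2 * factor, (y_max - y_min) / 2 * factor
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)

def add_box_noise(boxes, factor, jitter=0.0):
    """factor=2.0 -> consistently 200% boxes; jitter>0 -> inconsistent sizes."""
    return [scale_box(b, factor * (1 + random.uniform(-jitter, jitter))) for b in boxes]

clean = [(10, 10, 30, 40), (50, 60, 80, 90)]
too_large = add_box_noise(clean, factor=2.0)                  # consistent noise
inconsistent = add_box_noise(clean, factor=1.0, jitter=0.2)   # +/-20% inconsistent noise
```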
APA, Harvard, Vancouver, ISO, and other styles
46

Zhang, Qian, Feifei Lee, Ya-gang Wang, Ran Miao, Lei Chen, and Qiu Chen. "An improved noise loss correction algorithm for learning from noisy labels." Journal of Visual Communication and Image Representation 72 (October 2020): 102930. http://dx.doi.org/10.1016/j.jvcir.2020.102930.

Full text
APA, Harvard, Vancouver, ISO, and other styles
47

Zhao, Yue, Guoqing Zheng, Subhabrata Mukherjee, Robert McCann, and Ahmed Awadallah. "ADMoE: Anomaly Detection with Mixture-of-Experts from Noisy Labels." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 4 (June 26, 2023): 4937–45. http://dx.doi.org/10.1609/aaai.v37i4.25620.

Full text
Abstract:
Existing works on anomaly detection (AD) rely on clean labels from human annotators that are expensive to acquire in practice. In this work, we propose a method to leverage weak/noisy labels (e.g., risk scores generated by machine rules for detecting malware) that are cheaper to obtain for anomaly detection. Specifically, we propose ADMoE, the first framework for anomaly detection algorithms to learn from noisy labels. In a nutshell, ADMoE leverages mixture-of-experts (MoE) architecture to encourage specialized and scalable learning from multiple noisy sources. It captures the similarities among noisy labels by sharing most model parameters, while encouraging specialization by building "expert" sub-networks. To further juice out the signals from noisy labels, ADMoE uses them as input features to facilitate expert learning. Extensive results on eight datasets (including a proprietary enterprise security dataset) demonstrate the effectiveness of ADMoE, where it brings up to 34% performance improvement over not using it. Also, it outperforms a total of 13 leading baselines with equivalent network parameters and FLOPS. Notably, ADMoE is model-agnostic to enable any neural network-based detection methods to handle noisy labels, where we showcase its results on both multiple-layer perceptron (MLP) and the leading AD method DeepSAD.
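A minimal, assumed-architecture sketch of the mixture-of-experts idea described in the abstract: most parameters live in a shared backbone, small expert heads specialize per noisy source, and the noisy scores themselves are appended to the input features. Class and variable names are hypothetical and the network is deliberately tiny:

```python
# Hedged sketch: shared backbone + per-source expert heads + gating, with noisy
# labels/scores concatenated to the input features.
import torch
import torch.nn as nn

class TinyMoEDetector(nn.Module):
    def __init__(self, n_features, n_noisy_sources, hidden=32):
        super().__init__()
        in_dim = n_features + n_noisy_sources          # noisy labels as extra features
        self.backbone = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.experts = nn.ModuleList(nn.Linear(hidden, 1) for _ in range(n_noisy_sources))
        self.gate = nn.Linear(hidden, n_noisy_sources)

    def forward(self, x, noisy_scores):
        h = self.backbone(torch.cat([x, noisy_scores], dim=1))
        gate_w = torch.softmax(self.gate(h), dim=1)                    # (B, K)
        expert_out = torch.cat([e(h) for e in self.experts], dim=1)    # (B, K)
        return (gate_w * expert_out).sum(dim=1)                        # anomaly score

model = TinyMoEDetector(n_features=16, n_noisy_sources=3)
x = torch.randn(8, 16)
noisy = torch.rand(8, 3)          # e.g. rule-based risk scores used as weak labels
scores = model(x, noisy)
```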
APA, Harvard, Vancouver, ISO, and other styles
48

Luo, Yaoru, Guole Liu, Yuanhao Guo, and Ge Yang. "Deep Neural Networks Learn Meta-Structures from Noisy Labels in Semantic Segmentation." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 2 (June 28, 2022): 1908–16. http://dx.doi.org/10.1609/aaai.v36i2.20085.

Full text
Abstract:
How deep neural networks (DNNs) learn from noisy labels has been studied extensively in image classification but much less in image segmentation. So far, our understanding of the learning behavior of DNNs trained with noisy segmentation labels remains limited. In this study, we address this deficiency in both binary segmentation of biological microscopy images and multi-class segmentation of natural images. We generate extremely noisy labels by randomly sampling a small fraction (e.g., 10%) or flipping a large fraction (e.g., 90%) of the ground truth labels. When trained with these noisy labels, DNNs provide largely the same segmentation performance as when trained on the original ground truth. This indicates that, in their supervised training for semantic segmentation, DNNs learn structures hidden in the labels rather than pixel-level labels per se. We refer to these hidden structures in labels as meta-structures. When DNNs are trained with labels containing different perturbations of the meta-structure, we find consistent degradation in their segmentation performance. In contrast, incorporating meta-structure information substantially improves the performance of an unsupervised model developed for binary semantic segmentation. We define meta-structures mathematically as spatial density distributions and show both theoretically and experimentally how this formulation explains key observed learning behaviors of DNNs.
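A small sketch of the two label perturbations described above, applied to a binary ground-truth mask (keeping a random fraction of foreground pixels, or flipping a random fraction of all pixel labels); the fractions and helper names are illustrative:

```python
# Hedged sketch: generate extremely noisy binary segmentation labels.
import numpy as np

rng = np.random.default_rng(0)

def sample_foreground(mask, keep_frac=0.1):
    """Keep only a random keep_frac of the foreground pixels."""
    noisy = np.zeros_like(mask)
    ys, xs = np.nonzero(mask)
    keep = rng.random(len(ys)) < keep_frac
    noisy[ys[keep], xs[keep]] = 1
    return noisy

def flip_labels(mask, flip_frac=0.9):
    """Flip a random flip_frac of all pixel labels."""
    flip = rng.random(mask.shape) < flip_frac
    return np.where(flip, 1 - mask, mask)

gt = (rng.random((64, 64)) > 0.7).astype(np.uint8)   # stand-in binary ground truth
noisy_sampled = sample_foreground(gt, keep_frac=0.1)
noisy_flipped = flip_labels(gt, flip_frac=0.9)
```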
APA, Harvard, Vancouver, ISO, and other styles
49

Zheng, Kecheng, Cuiling Lan, Wenjun Zeng, Zhizheng Zhang, and Zheng-Jun Zha. "Exploiting Sample Uncertainty for Domain Adaptive Person Re-Identification." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 4 (May 18, 2021): 3538–46. http://dx.doi.org/10.1609/aaai.v35i4.16468.

Full text
Abstract:
Many unsupervised domain adaptive (UDA) person ReID approaches combine clustering-based pseudo-label prediction with feature fine-tuning. However, because of the domain gap, the pseudo-labels are not always reliable and noisy/incorrect labels occur. This misleads feature representation learning and deteriorates performance. In this paper, we propose to estimate and exploit the credibility of the pseudo-label assigned to each sample to alleviate the influence of noisy labels by suppressing the contribution of noisy samples. We build our baseline framework using the mean teacher method together with an additional contrastive loss. We have observed that a sample given a wrong pseudo-label through clustering generally exhibits weaker consistency between the outputs of the mean teacher model and the student model. Based on this finding, we propose to exploit the uncertainty (measured by consistency levels) to evaluate the reliability of a sample's pseudo-label and to incorporate the uncertainty to re-weight its contribution within various ReID losses, including the per-sample ID classification loss, the triplet loss and the contrastive loss. Our uncertainty-guided optimization brings significant improvement and achieves state-of-the-art performance on benchmark datasets.
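A hedged sketch of one way such uncertainty-based re-weighting can look for the ID classification loss (not the authors' code): the KL divergence between mean-teacher and student predictions serves as an inconsistency measure, and samples with high inconsistency, which likely carry wrong pseudo-labels, are down-weighted:

```python
# Hedged sketch: down-weight samples whose teacher/student predictions disagree.
import torch
import torch.nn.functional as F

def uncertainty_weighted_id_loss(student_logits, teacher_logits, pseudo_labels, tau=1.0):
    ce = F.cross_entropy(student_logits, pseudo_labels, reduction="none")   # (B,)
    log_p_s = F.log_softmax(student_logits, dim=1)
    p_t = F.softmax(teacher_logits, dim=1)
    inconsistency = F.kl_div(log_p_s, p_t, reduction="none").sum(dim=1)     # (B,)
    weights = torch.exp(-inconsistency / tau)     # high inconsistency -> low weight
    return (weights * ce).mean()

student = torch.randn(16, 100)            # logits over 100 pseudo-identity clusters
teacher = torch.randn(16, 100)            # mean-teacher logits for the same batch
pseudo = torch.randint(0, 100, (16,))
loss = uncertainty_weighted_id_loss(student, teacher, pseudo)
```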
APA, Harvard, Vancouver, ISO, and other styles
50

Wang, Ziyang, and Irina Voiculescu. "Dealing with Unreliable Annotations: A Noise-Robust Network for Semantic Segmentation through A Transformer-Improved Encoder and Convolution Decoder." Applied Sciences 13, no. 13 (July 7, 2023): 7966. http://dx.doi.org/10.3390/app13137966.

Full text
Abstract:
Conventional deep learning methods have shown promising results in the medical domain when trained on accurate ground truth data. Pragmatically, due to constraints like lack of time or annotator inexperience, the ground truth data obtained from clinical environments may not always be impeccably accurate. In this paper, we investigate whether the presence of noise in ground truth data can be mitigated. We propose an innovative and efficient approach that addresses the challenge posed by noise in segmentation labels. Our method consists of four key components within a deep learning framework. First, we introduce a Vision Transformer-based modified encoder combined with a convolution-based decoder for the segmentation network, capitalizing on the recent success of self-attention mechanisms. Second, we consider a public CT spine segmentation dataset and devise a preprocessing step to generate (and even exaggerate) noisy labels, simulating real-world clinical situations. Third, to counteract the influence of noisy labels, we incorporate an adaptive denoising learning strategy (ADL) into the network training. Finally, we demonstrate through experimental results that the proposed method achieves noise-robust performance, outperforming existing baseline segmentation methods across multiple evaluation metrics.
APA, Harvard, Vancouver, ISO, and other styles