Journal articles on the topic 'Contrastive loss'

Consult the top 50 journal articles for your research on the topic 'Contrastive loss.'

1

Vito, Valentino, and Lim Yohanes Stefanus. "An Asymmetric Contrastive Loss for Handling Imbalanced Datasets." Entropy 24, no. 9 (September 15, 2022): 1303. http://dx.doi.org/10.3390/e24091303.

Abstract:
Contrastive learning is a representation learning method performed by contrasting a sample with other similar samples so that they are brought close together, forming clusters in the feature space. The learning process is typically conducted using a two-stage training architecture, and it utilizes the contrastive loss (CL) for its feature learning. Contrastive learning has been shown to be quite successful in handling imbalanced datasets, in which some classes are overrepresented while others are underrepresented. However, previous studies have not specifically modified CL for imbalanced datasets. In this work, we introduce an asymmetric version of CL, referred to as ACL, in order to directly address the problem of class imbalance. In addition, we propose the asymmetric focal contrastive loss (AFCL) as a further generalization of both ACL and focal contrastive loss (FCL). The results on the imbalanced FMNIST and ISIC 2018 datasets show that the AFCL is capable of outperforming the CL and FCL in terms of both weighted and unweighted classification accuracies.
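
For readers unfamiliar with the focal ingredient of FCL and AFCL, the sketch below shows how a focal modulation is commonly grafted onto a contrastive term: the softmax probability p of the positive pair down-weights easy pairs via (1 - p)^gamma, echoing the focal loss idea. This is only an illustration of the focal mechanism, not the paper's exact ACL/AFCL definitions; the function name and the value of gamma are assumptions.

```python
import torch.nn.functional as F

def focal_contrastive_term(logits, gamma=2.0):
    """Illustrative focal modulation of a contrastive loss term.

    logits: (B, 1 + N) similarity logits with the positive in column 0.
    p is the softmax probability of the positive; hard pairs (small p)
    receive a larger weight (1 - p)^gamma.
    """
    log_p = F.log_softmax(logits, dim=1)[:, 0]  # log-probability of the positive
    p = log_p.exp()
    return -((1 - p).pow(gamma) * log_p).mean()
```
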
2

Hoffmann, David T., Nadine Behrmann, Juergen Gall, Thomas Brox, and Mehdi Noroozi. "Ranking Info Noise Contrastive Estimation: Boosting Contrastive Learning via Ranked Positives." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 1 (June 28, 2022): 897–905. http://dx.doi.org/10.1609/aaai.v36i1.19972.

Abstract:
This paper introduces Ranking Info Noise Contrastive Estimation (RINCE), a new member in the family of InfoNCE losses that preserves a ranked ordering of positive samples. In contrast to the standard InfoNCE loss, which requires a strict binary separation of the training pairs into similar and dissimilar samples, RINCE can exploit information about a similarity ranking for learning a corresponding embedding space. We show that the proposed loss function learns favorable embeddings compared to the standard InfoNCE whenever at least noisy ranking information can be obtained or when the definition of positives and negatives is blurry. We demonstrate this for a supervised classification task with additional superclass labels and noisy similarity scores. Furthermore, we show that RINCE can also be applied to unsupervised training with experiments on unsupervised representation learning from videos. In particular, the embedding yields higher classification accuracy and retrieval rates, and performs better on out-of-distribution detection, than the standard InfoNCE loss.
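
As background, the standard InfoNCE baseline that RINCE generalizes treats exactly one view as positive and everything else as negative. A minimal PyTorch sketch of that baseline follows; the tensor names and temperature value are illustrative, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def info_nce(query, positive, negatives, temperature=0.1):
    """Standard InfoNCE: one positive vs. a set of negatives per query.

    query:     (B, D) anchor embeddings
    positive:  (B, D) embeddings of the matching views
    negatives: (B, N, D) embeddings of N negatives per anchor
    """
    query = F.normalize(query, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negatives = F.normalize(negatives, dim=-1)

    # Cosine similarities scaled by the temperature.
    l_pos = (query * positive).sum(dim=-1, keepdim=True)   # (B, 1)
    l_neg = torch.einsum("bd,bnd->bn", query, negatives)   # (B, N)
    logits = torch.cat([l_pos, l_neg], dim=1) / temperature

    # The positive always sits at index 0, so the target is all zeros.
    targets = torch.zeros(query.size(0), dtype=torch.long, device=query.device)
    return F.cross_entropy(logits, targets)
```
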
3

Akash, Aditya Kumar, Vishnu Suresh Lokhande, Sathya N. Ravi, and Vikas Singh. "Learning Invariant Representations using Inverse Contrastive Loss." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 8 (May 18, 2021): 6582–91. http://dx.doi.org/10.1609/aaai.v35i8.16815.

Abstract:
Learning invariant representations is a critical first step in a number of machine learning tasks. A common approach is given by the so-called information bottleneck principle, in which an application-dependent function of mutual information is carefully chosen and optimized. Unfortunately, in practice, these functions are not suitable for optimization purposes since these losses are agnostic of the metric structure of the parameters of the model. In our paper, we introduce a class of losses for learning representations that are invariant to some extraneous variable of interest by inverting the class of contrastive losses, i.e., the inverse contrastive loss (ICL). We show that if the extraneous variable is binary, then optimizing ICL is equivalent to optimizing a regularized MMD divergence. More generally, we also show that if we are provided a metric on the sample space, our formulation of ICL can be decomposed into a sum of convex functions of the given distance metric. Our experimental results indicate that models obtained by optimizing ICL achieve significantly better invariance to the extraneous variable for a fixed desired level of accuracy. In a variety of experimental settings, we show the applicability of ICL for learning invariant representations for both continuous and discrete protected/extraneous variables. The project page with code is available at https://github.com/adityakumarakash/ICL
4

Ahmad, Sajjad, Zahoor Ahmad, and Jong-Myon Kim. "A Centrifugal Pump Fault Diagnosis Framework Based on Supervised Contrastive Learning." Sensors 22, no. 17 (August 26, 2022): 6448. http://dx.doi.org/10.3390/s22176448.

Abstract:
A novel intelligent centrifugal pump (CP) fault diagnosis method is proposed in this paper. The method is based on contrasting vibration data obtained from a CP under several operating conditions. The vibration signal data obtained from a CP are non-stationary because of the impulses caused by different faults; thus, traditional time-domain and frequency-domain analyses such as the fast Fourier transform and Walsh transform are not the best option for pre-processing the non-stationary signals. First, to visualize the fault-related impulses in the vibration data, we computed kurtogram images of the time-series vibration sequences. To extract fault-related discriminant features from the kurtogram images, we used a deep learning tool, the convolutional encoder (CE), with a supervised contrastive loss. The supervised contrastive loss pulls together samples belonging to the same class, while pushing apart samples belonging to different classes. The convolutional encoder was pretrained on the kurtograms with the supervised contrastive loss to infer the contrasting features of the different CP data classes. After pretraining, the learned representations of the convolutional encoder were kept fixed, and a linear classifier was trained on top of the frozen convolutional encoder to complete the fault identification. The proposed model was validated with data collected from a real industrial testbed, yielding a high classification accuracy of 99.1% and an error of less than 1%. Furthermore, to demonstrate its robustness, the model was validated on CP data with 3.0 and 3.5 bar inlet pressure.
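
The "pull together same class, push apart other classes" behaviour described above is the hallmark of the supervised contrastive loss family. Below is a generic sketch of such a loss over a batch, assuming L2-normalizable encoder features; it illustrates the family, not the authors' exact implementation.

```python
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(features, labels, temperature=0.07):
    """Generic supervised contrastive loss over a batch.

    features: (B, D) embeddings from the encoder
    labels:   (B,) integer class labels
    Samples sharing a label are pulled together; all others act as negatives.
    """
    z = F.normalize(features, dim=-1)
    sim = z @ z.t() / temperature                 # (B, B) similarity logits
    B = z.size(0)

    self_mask = torch.eye(B, dtype=torch.bool, device=z.device)
    pos_mask = labels.unsqueeze(0) == labels.unsqueeze(1)
    pos_mask = pos_mask & ~self_mask              # positives exclude the anchor itself

    # Log-softmax over every non-self pair, then average over the positives.
    sim = sim.masked_fill(self_mask, float("-inf"))
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos_log_prob = log_prob.masked_fill(~pos_mask, 0.0)   # keep only positive pairs
    pos_counts = pos_mask.sum(dim=1).clamp(min=1)         # guard anchors with no positive
    return -(pos_log_prob.sum(dim=1) / pos_counts).mean()
```
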
5

Anderson, John. "A major restructuring in the English consonant system: the de-linearization of [h] and the de-consonantization of [w] and [j]." English Language and Linguistics 5, no. 2 (September 25, 2001): 199–212. http://dx.doi.org/10.1017/s1360674301000211.

Abstract:
This article deals in outline with two interrelated aspects of the history of English phonology: aspects that are argued here to involve the loss of contrastive linearization for /h/ and of consonantal status for /w/ and /j/. It is suggested that these histories are clarified if proper attention is paid to contrastivity: it is necessary to identify those aspects of phonological representation which are contrastive, both segmentally and sequentially. Contrastive status can change. In this case, it is proposed here that the position of /h/ in a word has come to be noncontrastive, and that [w] and [j] are no longer contrastively consonantal, but sequential variants of their full-vowel congeners. The characterization of these restructurings involves the recognition of changing patterns of contrastivity. And, crucially, contrastivity involves not just paradigmatic distinctions between segments but also syntagmatic relations between them.
6

Cheng, Yixian, and Haiyang Wang. "A modified contrastive loss method for face recognition." Pattern Recognition Letters 125 (July 2019): 785–90. http://dx.doi.org/10.1016/j.patrec.2019.07.025.

7

Li, Yunfan, Peng Hu, Zitao Liu, Dezhong Peng, Joey Tianyi Zhou, and Xi Peng. "Contrastive Clustering." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 10 (May 18, 2021): 8547–55. http://dx.doi.org/10.1609/aaai.v35i10.17037.

Abstract:
In this paper, we propose an online clustering method called Contrastive Clustering (CC) which explicitly performs instance- and cluster-level contrastive learning. To be specific, for a given dataset, positive and negative instance pairs are constructed through data augmentations and then projected into a feature space. Therein, the instance- and cluster-level contrastive learning are respectively conducted in the row and column space by maximizing the similarities of positive pairs while minimizing those of negative ones. Our key observation is that the rows of the feature matrix can be regarded as soft labels of instances, and accordingly the columns can be further regarded as cluster representations. By simultaneously optimizing the instance- and cluster-level contrastive losses, the model jointly learns representations and cluster assignments in an end-to-end manner. Besides, the proposed method can compute the cluster assignment for each individual sample in a timely manner, even when the data is presented as a stream. Extensive experimental results show that CC remarkably outperforms 17 competitive clustering methods on six challenging image benchmarks. In particular, CC achieves an NMI of 0.705 (0.431) on the CIFAR-10 (CIFAR-100) dataset, which is an up to 19% (39%) performance improvement compared with the best baseline. The code is available at https://github.com/XLearning-SCU/2021-AAAI-CC.
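
The row/column duality described above can be sketched compactly: instance-level contrast runs over the rows (one row per sample) of the projected features, while cluster-level contrast runs over the columns (one column per cluster) of the soft-assignment matrix. The sketch below omits the symmetric terms and the cluster-entropy regularizer of the full method, and all names are illustrative.

```python
import torch
import torch.nn.functional as F

def nt_xent(a, b, temperature=0.5):
    """Contrast two aligned sets of vectors; row i of a matches row i of b."""
    a, b = F.normalize(a, dim=-1), F.normalize(b, dim=-1)
    logits = a @ b.t() / temperature
    targets = torch.arange(a.size(0), device=a.device)
    return F.cross_entropy(logits, targets)

def contrastive_clustering_losses(z_a, z_b, p_a, p_b):
    """z_*: (B, D) projected features of two augmented views.
    p_*: (B, K) soft cluster assignments of the same views.
    Rows of p act as soft labels; columns act as cluster representations."""
    instance_loss = nt_xent(z_a, z_b)           # contrast over rows
    cluster_loss = nt_xent(p_a.t(), p_b.t())    # contrast over columns
    return instance_loss + cluster_loss
```
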
8

Ciortan, Madalina, Romain Dupuis, and Thomas Peel. "A Framework Using Contrastive Learning for Classification with Noisy Labels." Data 6, no. 6 (June 9, 2021): 61. http://dx.doi.org/10.3390/data6060061.

Abstract:
We propose a framework using contrastive learning as a pre-training task to perform image classification in the presence of noisy labels. Recent strategies, such as pseudo-labeling, sample selection with Gaussian Mixture models, and weighted supervised contrastive learning, have been combined into a fine-tuning phase following the pre-training. In this paper, we provide an extensive empirical study showing that a preliminary contrastive learning step brings a significant gain in performance when using different loss functions: non-robust, robust, and early-learning regularized. Our experiments performed on standard benchmarks and real-world datasets demonstrate that: (i) the contrastive pre-training increases the robustness of any loss function to noisy labels and (ii) the additional fine-tuning phase can further improve accuracy, but at the cost of additional complexity.
9

Tanveer, Muhammad, Hung-Khoon Tan, Hui-Fuang Ng, Maylor Karhang Leung, and Joon Huang Chuah. "Regularization of Deep Neural Network With Batch Contrastive Loss." IEEE Access 9 (2021): 124409–18. http://dx.doi.org/10.1109/access.2021.3110286.

10

Duan, Jiayi. "Reformatted contrastive learning for image classification via attention mechanism and self-distillation." Journal of Physics: Conference Series 2284, no. 1 (June 1, 2022): 012013. http://dx.doi.org/10.1088/1742-6596/2284/1/012013.

Abstract:
Image classification, a basic task in computer vision, is widely addressed using deep learning methods. Recently proposed contrastive learning has made great progress in feature representation in a self-supervised manner, outperforming traditional contrastive losses such as the triplet loss and N-pair loss. In this paper, we propose a reformatted contrastive learning paradigm which is an extension of conventional self-supervised contrastive learning. Specifically, multiple positive and negative pairs are constructed for each anchor, thus creating a link between the positive and negative factors. In addition to this, we employ a self-attention module in the network architecture with the aim of focusing on foreground pixels and building long-range dependencies across regions. We also present a self-distillation approach for fine-tuning the classifier using a well-trained feature encoder, which improves the generalization ability of our method. Experiments on CIFAR100 and CUB200-2011 reveal that our method outperforms rival methods, confirming the efficacy of our strategy.
11

Fang, Hongchao, and Pengtao Xie. "An End-to-End Contrastive Self-Supervised Learning Framework for Language Understanding." Transactions of the Association for Computational Linguistics 10 (2022): 1324–40. http://dx.doi.org/10.1162/tacl_a_00521.

Abstract:
Self-supervised learning (SSL) methods such as Word2vec, BERT, and GPT have shown great effectiveness in language understanding. Contrastive learning, as a recent SSL approach, has attracted increasing attention in NLP. Contrastive learning learns data representations by predicting whether two augmented data instances are generated from the same original data example. Previous contrastive learning methods perform data augmentation and contrastive learning separately. As a result, the augmented data may not be optimal for contrastive learning. To address this problem, we propose a four-level optimization framework that performs data augmentation and contrastive learning end-to-end, to enable the augmented data to be tailored to the contrastive learning task. This framework consists of four learning stages, including training machine translation models for sentence augmentation, pretraining a text encoder using contrastive learning, finetuning a text classification model, and updating weights of translation data by minimizing the validation loss of the classification model, which are performed in a unified way. Experiments on datasets in the GLUE benchmark (Wang et al., 2018a) and on datasets used in Gururangan et al. (2020) demonstrate the effectiveness of our method.
12

Gómez-Silva, María J., Arturo de la Escalera, and José M. Armingol. "Deep Learning of Appearance Affinity for Multi-Object Tracking and Re-Identification: A Comparative View." Electronics 9, no. 11 (October 22, 2020): 1757. http://dx.doi.org/10.3390/electronics9111757.

Abstract:
Recognizing the identity of a query individual in a surveillance sequence is the core of Multi-Object Tracking (MOT) and Re-Identification (Re-Id) algorithms. Both tasks can be addressed by measuring the appearance affinity between people observations with a deep neural model. Nevertheless, the differences in their specifications, and consequently in the characteristics and constraints of the training data available for each of them, make it necessary to employ different learning approaches to attain each task. This article offers a comparative view of the Double-Margin-Contrastive and the Triplet loss functions, and analyzes the benefits and drawbacks of applying each of them to learn an Appearance Affinity model for Tracking and Re-Identification. A batch of experiments has been conducted, and their results support the hypothesis drawn from the presented study: the Triplet loss function is more effective than the Contrastive one when a Re-Id model is learnt, and, conversely, in the MOT domain, the Contrastive loss can better discriminate between pairs of images that do or do not render the same person.
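
The two loss functions compared in this article are classical and compact enough to state directly. A sketch of the (single-margin) pairwise contrastive loss and the triplet loss follows; the double-margin variant discussed by the authors additionally uses separate margins for positive and negative pairs, which is omitted here, and the margin values are illustrative.

```python
import torch
import torch.nn.functional as F

def contrastive_pair_loss(x1, x2, same_identity, margin=1.0):
    """Pairwise contrastive loss: matching pairs (same_identity == 1) are
    pulled together; non-matching pairs are pushed beyond the margin."""
    d = F.pairwise_distance(x1, x2)
    return (same_identity * d.pow(2)
            + (1 - same_identity) * F.relu(margin - d).pow(2)).mean()

def triplet_loss(anchor, positive, negative, margin=0.3):
    """Triplet loss: the anchor must be closer to the positive than to the
    negative by at least the margin; only relative distances matter."""
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    return F.relu(d_pos - d_neg + margin).mean()
```
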
13

Rezaeifar, Shideh, Slava Voloshynovskiy, Meisam Asgari Jirhandeh, and Vitaliy Kinakh. "Privacy-Preserving Image Template Sharing Using Contrastive Learning." Entropy 24, no. 5 (May 3, 2022): 643. http://dx.doi.org/10.3390/e24050643.

Abstract:
With the recent developments of Machine Learning as a Service (MLaaS), various privacy concerns have been raised. Having access to the user’s data, an adversary can design attacks with different objectives, namely, reconstruction or attribute inference attacks. In this paper, we propose two different training frameworks for an image classification task while preserving user data privacy against the two aforementioned attacks. In both frameworks, an encoder is trained with contrastive loss, providing a superior utility-privacy trade-off. In the reconstruction attack scenario, a supervised contrastive loss was employed to provide maximal discrimination for the targeted classification task. The encoded features are further perturbed using the obfuscator module to remove all redundant information. Moreover, the obfuscator module is jointly trained with a classifier to minimize the correlation between private feature representation and original data while retaining the model utility for the classification. For the attribute inference attack, we aim to provide a representation of data that is independent of the sensitive attribute. Therefore, the encoder is trained with supervised and private contrastive loss. Furthermore, an obfuscator module is trained in an adversarial manner to preserve the privacy of sensitive attributes while maintaining the classification performance on the target attribute. The reported results on the CelebA dataset validate the effectiveness of the proposed frameworks.
14

Zhu, He, Yang Chen, Guyue Hu, and Shan Yu. "Contrastive Learning via Local Activity." Electronics 12, no. 1 (December 29, 2022): 147. http://dx.doi.org/10.3390/electronics12010147.

Abstract:
Contrastive learning (CL) helps deep networks discriminate between positive and negative pairs in learning. As a powerful unsupervised pretraining method, CL has greatly reduced the performance gap with supervised training. However, current CL approaches mainly rely on sophisticated augmentations, a large number of negative pairs and chained gradient calculations, which are complex to use. To address these issues, in this paper, we propose the local activity contrast (LAC) algorithm, which is an unsupervised method based on two forward passes and a locally defined loss to learn meaningful representations. The learning target of each layer is to minimize the difference in activation values between the two forward passes, effectively overcoming the above-mentioned limitations of applying CL. We demonstrated that LAC can be a very useful pretraining method using reconstruction as the pretext task. Moreover, through pretraining with LAC, the networks exhibited competitive performance in various downstream tasks compared with other unsupervised learning methods.
15

Pang, Bo, Deming Zhai, Junjun Jiang, and Xianming Liu. "Fully Unsupervised Person Re-Identification via Selective Contrastive Learning." ACM Transactions on Multimedia Computing, Communications, and Applications 18, no. 2 (May 31, 2022): 1–15. http://dx.doi.org/10.1145/3485061.

Abstract:
Person re-identification (ReID) aims at searching for the same person among images captured by various cameras. Existing fully supervised person ReID methods usually suffer from poor generalization capability caused by domain gaps. Unsupervised person ReID has attracted a lot of attention recently, because it works without intensive manual annotation and thus shows great potential in adapting to new conditions. Representation learning plays a critical role in unsupervised person ReID. In this work, we propose a novel selective contrastive learning framework for fully unsupervised feature learning. Specifically, different from traditional contrastive learning strategies, we propose to use multiple positives and adaptively selected negatives for defining the contrastive loss, enabling the model to learn a feature embedding with stronger identity-discriminative representations. Moreover, we propose to jointly leverage global and local features to construct three dynamic memory banks, among which the global and local ones are used for pairwise similarity computation and the mixture memory bank is used for contrastive loss definition. Experimental results demonstrate the superiority of our method in unsupervised person ReID compared with the state of the art. Our code is available at https://github.com/pangbo1997/Unsup_ReID.git.
16

ZOU, Yuanhao, Yufei ZHANG, and Xiaodong ZHAO. "Self-Supervised Time Series Classification Based on LSTM and Contrastive Transformer." Wuhan University Journal of Natural Sciences 27, no. 6 (December 2022): 521–30. http://dx.doi.org/10.1051/wujns/2022276521.

Abstract:
Time series data has attracted extensive attention as a multi-domain data type, but it is difficult to analyze due to its high dimensionality and scarcity of labels. Self-supervised representation learning provides an effective way to process such data. Considering the frequency-domain features of the time series data itself and the contextual features in the classification task, this paper proposes an unsupervised Long Short-Term Memory (LSTM) and contrastive-transformer-based time series representation model using contrastive learning. Firstly, transforming the data with frequency-domain-based augmentation increases the ability to represent features in the frequency domain. Secondly, an encoder module with three layers of LSTM and convolution maps the augmented data to the latent space and calculates the temporal loss with a contrastive transformer module and a contextual loss. Finally, after self-supervised training, the representation vector of the original data can be obtained from the pre-trained encoder. Our model achieves satisfactory performance on the Human Activity Recognition (HAR) and sleepEDF real-life datasets.
17

Liu, Mengxin, Wenyuan Tao, Xiao Zhang, Yi Chen, Jie Li, and Chung-Ming Own. "GO Loss: A Gaussian Distribution-Based Orthogonal Decomposition Loss for Classification." Complexity 2019 (December 12, 2019): 1–10. http://dx.doi.org/10.1155/2019/9206053.

Abstract:
We present a novel loss function, namely, GO loss, for classification. Most existing methods, such as center loss and contrastive loss, dynamically determine the convergence direction of the sample features during the training process. By contrast, GO loss decomposes the convergence direction into two mutually orthogonal components, namely, the tangential and radial directions, and optimizes them separately. The two components theoretically affect the inter-class separation and the intra-class compactness of the distribution of the sample features, respectively. Thus, minimizing the losses on them separately avoids interference between their optimization processes, and a stable convergence center can be obtained for each of them. Moreover, we assume that the two components follow a Gaussian distribution, which proves to be an effective way to accurately model training features and improve classification. Experiments on multiple classification benchmarks, such as MNIST, CIFAR, and ImageNet, demonstrate the effectiveness of GO loss.
18

Zhu, Jiaqi, Shuaishi Liu, Siyang Yu, and Yihu Song. "An Extra-Contrast Affinity Network for Facial Expression Recognition in the Wild." Electronics 11, no. 15 (July 22, 2022): 2288. http://dx.doi.org/10.3390/electronics11152288.

Abstract:
Learning discriminative features for facial expression recognition (FER) in the wild is a challenging task due to the significant intra-class variations, inter-class similarities, and extreme class imbalances. In order to solve these issues, a contrastive-learning-based extra-contrast affinity network (ECAN) method is proposed. The ECAN consists of a feature processing network and two proposed loss functions, namely extra negative supervised contrastive loss (ENSC loss) and multi-view affinity loss (MVA loss). The feature processing network provides current and historical deep features to satisfy the necessary conditions for these loss functions. Specifically, the ENSC loss function simultaneously considers many positive samples and extra negative samples from other minibatches to maximize intra-class similarity and the inter-class separation of deep features, while also automatically turning the attention of the model to majority and minority classes to alleviate the class imbalance issue. The MVA loss function improves upon the center loss function by leveraging additional deep feature groups from other minibatches to dynamically learn more accurate class centers and further enhance the intra-class compactness of deep features. The numerical results obtained using two public wild FER datasets (RAFDB and FER2013) indicate that the proposed method outperforms most state-of-the-art models in FER.
19

Jain, Yash, Chi Ian Tang, Chulhong Min, Fahim Kawsar, and Akhil Mathur. "ColloSSL." Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 6, no. 1 (March 29, 2022): 1–28. http://dx.doi.org/10.1145/3517246.

Abstract:
A major bottleneck in training robust Human-Activity Recognition (HAR) models is the need for large-scale labeled sensor datasets. Because labeling large amounts of sensor data is an expensive task, unsupervised and semi-supervised learning techniques have emerged that can learn good features from the data without requiring any labels. In this paper, we extend this line of research and present a novel technique called Collaborative Self-Supervised Learning (ColloSSL) which leverages unlabeled data collected from multiple devices worn by a user to learn high-quality features of the data. A key insight that underpins the design of ColloSSL is that unlabeled sensor datasets simultaneously captured by multiple devices can be viewed as natural transformations of each other, and leveraged to generate a supervisory signal for representation learning. We present three technical innovations to extend conventional self-supervised learning algorithms to a multi-device setting: a Device Selection approach which selects positive and negative devices to enable contrastive learning, a Contrastive Sampling algorithm which samples positive and negative examples in a multi-device setting, and a loss function called Multi-view Contrastive Loss which extends the standard contrastive loss to a multi-device setting. Our experimental results on three multi-device datasets show that ColloSSL outperforms both fully-supervised and semi-supervised learning techniques in the majority of experiment settings, resulting in an absolute increase of up to 7.9% in F1 score compared to the best performing baselines. We also show that ColloSSL outperforms fully-supervised methods in a low-data regime, using just one-tenth of the available labeled data in the best case.
20

Qiao, Hezhe, Lin Chen, Zi Ye, and Fan Zhu. "Early Alzheimer’s disease diagnosis with the contrastive loss using paired structural MRIs." Computer Methods and Programs in Biomedicine 208 (September 2021): 106282. http://dx.doi.org/10.1016/j.cmpb.2021.106282.

21

Zheng, Kecheng, Cuiling Lan, Wenjun Zeng, Zhizheng Zhang, and Zheng-Jun Zha. "Exploiting Sample Uncertainty for Domain Adaptive Person Re-Identification." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 4 (May 18, 2021): 3538–46. http://dx.doi.org/10.1609/aaai.v35i4.16468.

Abstract:
Many unsupervised domain adaptive (UDA) person ReID approaches combine clustering-based pseudo-label prediction with feature fine-tuning. However, because of the domain gap, the pseudo-labels are not always reliable and there are noisy/incorrect labels. This would mislead the feature representation learning and deteriorate the performance. In this paper, we propose to estimate and exploit the credibility of the assigned pseudo-label of each sample to alleviate the influence of noisy labels, by suppressing the contribution of noisy samples. We build our baseline framework using the mean teacher method together with an additional contrastive loss. We have observed that a sample with a wrong pseudo-label through clustering in general has a weaker consistency between the outputs of the mean teacher model and the student model. Based on this finding, we propose to exploit the uncertainty (measured by consistency levels) to evaluate the reliability of the pseudo-label of a sample and incorporate the uncertainty to re-weight its contribution within various ReID losses, including the ID classification loss per sample, the triplet loss, and the contrastive loss. Our uncertainty-guided optimization brings significant improvement and achieves state-of-the-art performance on benchmark datasets.
22

Zhang, Jiayi, Xingzhi Wang, Dong Zhang, and Dah-Jye Lee. "Semi-Supervised Group Emotion Recognition Based on Contrastive Learning." Electronics 11, no. 23 (December 1, 2022): 3990. http://dx.doi.org/10.3390/electronics11233990.

Abstract:
The performance of all learning-based group emotion recognition (GER) methods depends on the number of labeled samples. Although there are many group emotion images available on the Internet, labeling them manually is a labor-intensive and costly process. For this reason, datasets for GER are usually small in size, which limits the performance of GER. Since manual labeling is challenging, using limited labeled images together with a large number of unlabeled images in network training is a potential way to improve the performance of GER. In this work, we propose a semi-supervised group emotion recognition framework based on contrastive learning to learn efficient features from both labeled and unlabeled images. In the proposed method, the unlabeled images are used to pretrain the backbone with a contrastive learning method, and the labeled images are used to fine-tune the network. The unlabeled images are then given pseudo-labels by the fine-tuned network and used for further training. To alleviate the uncertainty of the given pseudo-labels, we propose a Weight Cross-Entropy Loss (WCE-Loss) to suppress the influence of samples with unreliable pseudo-labels during training. Experimental results on three prominent benchmark datasets for GER show the effectiveness of the proposed framework and its superiority over other competitive state-of-the-art methods.
23

Tan, Xiaoyan, Yun Zou, Ziyang Guo, Ke Zhou, and Qiangqiang Yuan. "Deep Contrastive Self-Supervised Hashing for Remote Sensing Image Retrieval." Remote Sensing 14, no. 15 (July 29, 2022): 3643. http://dx.doi.org/10.3390/rs14153643.

Abstract:
Hashing has been widely used for large-scale remote sensing image retrieval due to its outstanding advantages in storage and search speed. Recently, deep hashing methods, which produce discriminative hash codes by building end-to-end deep convolutional networks, have shown promising results. However, training these networks requires numerous labeled images, which are scarce and expensive in remote sensing datasets. To solve this problem, we propose a deep unsupervised hashing method, namely deep contrastive self-supervised hashing (DCSH), which uses only unlabeled images to learn accurate hash codes. It eliminates the need for label annotation by maximizing the consistency of different views generated from the same image. More specifically, we assume that the hash codes generated from different views of the same image are similar, and those generated from different images are dissimilar. On the basis of this hypothesis, we develop a novel loss function containing the temperature-scaled cross-entropy loss and the quantization loss to train the developed deep network end-to-end, resulting in hash codes that preserve semantic similarity. Our proposed network contains four parts. First, each image is transformed into two different views using data augmentation. After that, they are fed into an encoder with shared parameters to obtain deep discriminative features. Following this, a hash layer converts the high-dimensional image representations into compact binary codes. Lastly, the novel loss function is used to train the proposed network end-to-end and thus guide the generated hash codes to preserve semantic similarity. Extensive experiments on two popular benchmark datasets, the UC Merced Land Use Database and the Aerial Image Dataset, have demonstrated that our DCSH has significant superiority in remote sensing image retrieval compared with state-of-the-art unsupervised hashing methods.
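
The loss described above combines a temperature-scaled cross-entropy term over two augmented views with a quantization term. Below is a sketch under common assumptions; in particular, the quantization penalty that pushes hash activations toward ±1 is one conventional choice, not necessarily the exact form used by DCSH, and all names are illustrative.

```python
import torch
import torch.nn.functional as F

def hashing_contrastive_loss(h1, h2, temperature=0.5, quant_weight=0.1):
    """Sketch: temperature-scaled cross-entropy between two views of the same
    images, plus a quantization term pushing hash activations toward +/-1.

    h1, h2: (B, L) real-valued outputs of the hash layer for the two views.
    """
    z1, z2 = F.normalize(h1, dim=-1), F.normalize(h2, dim=-1)
    logits = z1 @ z2.t() / temperature
    targets = torch.arange(h1.size(0), device=h1.device)
    # Symmetric contrast: view 1 -> view 2 and view 2 -> view 1.
    contrast = (F.cross_entropy(logits, targets)
                + F.cross_entropy(logits.t(), targets)) / 2

    # Quantization loss: penalize the distance of activations from binary codes.
    quant = (h1.abs() - 1).pow(2).mean() + (h2.abs() - 1).pow(2).mean()
    return contrast + quant_weight * quant
```
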
24

Hu, Shengze, Weixin Zeng, Pengfei Zhang, and Jiuyang Tang. "Neural Graph Similarity Computation with Contrastive Learning." Applied Sciences 12, no. 15 (July 29, 2022): 7668. http://dx.doi.org/10.3390/app12157668.

Abstract:
Computing the similarity between graphs is a longstanding and challenging problem with many real-world applications. Recent years have witnessed a rapid increase in neural-network-based methods, which project graphs into embedding space and devise end-to-end frameworks to learn to estimate graph similarity. Nevertheless, these solutions usually design complicated networks to capture the fine-grained interactions between graphs, and hence have low efficiency. Additionally, they rely on labeled data for training the neural networks and overlook the useful information hidden in the graphs themselves. To address the aforementioned issues, in this work, we put forward a contrastive neural graph similarity learning framework, Conga. Specifically, we utilize vanilla graph convolutional networks to generate the graph representations and capture the cross-graph interactions via a simple multilayer perceptron. We further devise an unsupervised contrastive loss to discriminate the graph embeddings and guide the training process by learning more expressive entity representations. Extensive experiment results on public datasets validate that our proposal has more robust performance and higher efficiency compared with state-of-the-art methods.
25

Mo, Yujie, Liang Peng, Jie Xu, Xiaoshuang Shi, and Xiaofeng Zhu. "Simple Unsupervised Graph Representation Learning." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 7 (June 28, 2022): 7797–805. http://dx.doi.org/10.1609/aaai.v36i7.20748.

Abstract:
In this paper, we propose a simple unsupervised graph representation learning method to conduct effective and efficient contrastive learning. Specifically, the proposed multiplet loss explores the complementary information between the structural information and neighbor information to enlarge the inter-class variation, and adds an upper bound loss to enforce a finite distance between positive embeddings and anchor embeddings, reducing the intra-class variation. As a result, both enlarging the inter-class variation and reducing the intra-class variation lead to a small generalization error and thereby an effective model. Furthermore, our method removes the data augmentation and discriminator widely used in previous graph contrastive learning methods, while still being able to output low-dimensional embeddings, leading to an efficient model. Experimental results on various real-world datasets demonstrate the effectiveness and efficiency of our method compared to state-of-the-art methods. The source code is released at https://github.com/YujieMo/SUGRL.
26

Virmani, D., P. Girdhar, P. Jain, and P. Bamdev. "FDREnet: Face Detection and Recognition Pipeline." Engineering, Technology & Applied Science Research 9, no. 2 (April 10, 2019): 3933–38. http://dx.doi.org/10.48084/etasr.2492.

Abstract:
Face detection and recognition are being studied extensively for their vast applications in security, biometrics, healthcare, and marketing. As a step towards an almost accurate solution to the problem at hand, this paper proposes a face detection and face recognition pipeline, the face detection and recognition EmbedNet (FDREnet). The proposed FDREnet performs face detection through histograms of oriented gradients and uses the Siamese technique and a contrastive loss to train a deep learning architecture (EmbedNet). The approach allows the EmbedNet to learn how to distinguish facial features apart from recognizing them. This flexibility in learning, due to the contrastive loss, accounts for better accuracy than using traditional deep learning losses. The dataset embeddings produced by the trained FDREnet result in accuracies of 98.03%, 99.57%, and 99.39% for the face94, face95, and face96 datasets, respectively, through SVM clustering. Accuracies of 97.83%, 99.57%, and 99.39% were observed for the face94, face95, and face96 datasets, respectively, through KNN clustering.
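
After Siamese training with the contrastive loss (see the pairwise form sketched under entry 12), the trained network simply maps each face to a vector, and an off-the-shelf classifier operates on those vectors. A sketch of the SVM step reported in the abstract follows, where `embed` is a hypothetical wrapper around the trained EmbedNet, not an API from the paper.

```python
import numpy as np
from sklearn.svm import SVC

def classify_face_embeddings(train_faces, train_labels, test_faces, embed):
    """embed(face) -> 1-D embedding vector from the trained network
    (hypothetical wrapper). The embeddings become plain feature vectors
    for a conventional SVM classifier."""
    X_train = np.stack([embed(f) for f in train_faces])
    X_test = np.stack([embed(f) for f in test_faces])
    clf = SVC(kernel="linear").fit(X_train, np.asarray(train_labels))
    return clf.predict(X_test)
```
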
27

Sun, Ke, Taiping Yao, Shen Chen, Shouhong Ding, Jilin Li, and Rongrong Ji. "Dual Contrastive Learning for General Face Forgery Detection." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 2 (June 28, 2022): 2316–24. http://dx.doi.org/10.1609/aaai.v36i2.20130.

Abstract:
With various facial manipulation techniques arising, face forgery detection has drawn growing attention due to security concerns. Previous works always formulate face forgery detection as a classification problem based on cross-entropy loss, which emphasizes category-level differences rather than the essential discrepancies between real and fake faces, limiting model generalization in unseen domains. To address this issue, we propose a novel face forgery detection framework, named Dual Contrastive Learning (DCL), which specially constructs positive and negative paired data and performs designed contrastive learning at different granularities to learn generalized feature representation. Concretely, combined with the hard sample selection strategy, Inter-Instance Contrastive Learning (Inter-ICL) is first proposed to promote task-related discriminative features learning by especially constructing instance pairs. Moreover, to further explore the essential discrepancies, Intra-Instance Contrastive Learning (Intra-ICL) is introduced to focus on the local content inconsistencies prevalent in the forged faces by constructing local region pairs inside instances. Extensive experiments and visualizations on several datasets demonstrate the generalization of our method against the state-of-the-art competitors. Our Code is available at https://github.com/Tencent/TFace.git.
28

Zeng, Jiaqi, and Pengtao Xie. "Contrastive Self-supervised Learning for Graph Classification." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 12 (May 18, 2021): 10824–32. http://dx.doi.org/10.1609/aaai.v35i12.17293.

Abstract:
Graph classification is a widely studied problem and has broad applications. In many real-world problems, the number of labeled graphs available for training classification models is limited, which renders these models prone to overfitting. To address this problem, we propose two approaches based on contrastive self-supervised learning (CSSL) to alleviate overfitting. In the first approach, we use CSSL to pretrain graph encoders on widely-available unlabeled graphs without relying on human-provided labels, then finetune the pretrained encoders on labeled graphs. In the second approach, we develop a regularizer based on CSSL, and solve the supervised classification task and the unsupervised CSSL task simultaneously. To perform CSSL on graphs, given a collection of original graphs, we perform data augmentation to create augmented graphs out of the original graphs. An augmented graph is created by consecutively applying a sequence of graph alteration operations. A contrastive loss is defined to learn graph encoders by judging whether two augmented graphs are from the same original graph. Experiments on various graph classification datasets demonstrate the effectiveness of our proposed methods. The code is available at https://github.com/UCSD-AI4H/GraphSSL.
29

Etebari, Zahra, Ali Alizadeh, Mehrdad Naghzguy-Kohan, and Maria Koptjevskaja Tamm. "Development of contrastive-partitive in colloquial Persian." STUF - Language Typology and Universals 73, no. 4 (November 26, 2020): 575–604. http://dx.doi.org/10.1515/stuf-2020-1019.

Abstract:
This article discusses the development of the contrastive-partitive function of the possessive =eš in colloquial Persian. Examples from colloquial Persian show that the third person singular clitic pronoun =eš in some adnominal possessive constructions does not refer to any obvious referent present either in the syntactic structure (co-text) or in the situational context. Instead, the function of =eš, namely contrastive-partitive, is to mark the host as a part and contrast it with other parts of a similar set. The same function is attested in a few languages of the Uralic and Turkic groups. We believe that the same development has occurred for the possessive =eš in Persian. To describe the development of the contrastive-partitive function, authentic colloquial examples from Internet blogs and formal examples from a historical corpus of New Persian are investigated. It is argued that this non-possessive function of =eš originated from the whole-part relation in cross-referencing possessives, where both the lexical possessor and the clitic possessor =eš are present. The presence of the lexical possessor facilitates the loss of referentiality in =eš, which then develops to denote partitivity. Furthermore, the pragmatic motivation of communicating contrast leads =eš to be further grammaticalized to denote the contrastive-partitive function.
30

Guo, Tianyu, Hong Liu, Zhan Chen, Mengyuan Liu, Tao Wang, and Runwei Ding. "Contrastive Learning from Extremely Augmented Skeleton Sequences for Self-Supervised Action Recognition." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 1 (June 28, 2022): 762–70. http://dx.doi.org/10.1609/aaai.v36i1.19957.

Abstract:
In recent years, self-supervised representation learning for skeleton-based action recognition has been developed with the advance of contrastive learning methods. The existing contrastive learning methods use normal augmentations to construct similar positive samples, which limits the ability to explore novel movement patterns. In this paper, to make better use of the movement patterns introduced by extreme augmentations, a Contrastive Learning framework utilizing Abundant Information Mining for self-supervised action Representation (AimCLR) is proposed. First, the extreme augmentations and the Energy-based Attention-guided Drop Module (EADM) are proposed to obtain diverse positive samples, which bring novel movement patterns to improve the universality of the learned representations. Second, since directly using extreme augmentations may not boost performance due to the drastic changes in original identity, the Dual Distributional Divergence Minimization Loss (D3M Loss) is proposed to minimize the distribution divergence in a gentler way. Third, Nearest Neighbors Mining (NNM) is proposed to further expand positive samples to make the abundant information mining process more reasonable. Exhaustive experiments on the NTU RGB+D 60, PKU-MMD, and NTU RGB+D 120 datasets have verified that our AimCLR performs favorably against state-of-the-art methods under a variety of evaluation protocols, with higher-quality action representations observed. Our code is available at https://github.com/Levigty/AimCLR.
31

Maheshwari, Paridhi, Ritwick Chaudhry, and Vishwa Vinay. "Scene Graph Embeddings Using Relative Similarity Supervision." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 3 (May 18, 2021): 2328–36. http://dx.doi.org/10.1609/aaai.v35i3.16333.

Abstract:
Scene graphs are a powerful structured representation of the underlying content of images, and embeddings derived from them have been shown to be useful in multiple downstream tasks. In this work, we employ a graph convolutional network to exploit structure in scene graphs and produce image embeddings useful for semantic image retrieval. Different from classification-centric supervision traditionally available for learning image representations, we address the task of learning from relative similarity labels in a ranking context. Rooted within the contrastive learning paradigm, we propose a novel loss function that operates on pairs of similar and dissimilar images and imposes relative ordering between them in embedding space. We demonstrate that this Ranking loss, coupled with an intuitive triple sampling strategy, leads to robust representations that outperform well-known contrastive losses on the retrieval task. In addition, we provide qualitative evidence of how retrieved results that utilize structured scene information capture the global context of the scene, different from visual similarity search.
32

Li, Shimin, Hang Yan, and Xipeng Qiu. "Contrast and Generation Make BART a Good Dialogue Emotion Recognizer." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 10 (June 28, 2022): 11002–10. http://dx.doi.org/10.1609/aaai.v36i10.21348.

Abstract:
In dialogue systems, utterances with similar semantics may have distinctive emotions under different contexts. Therefore, modeling long-range contextual emotional relationships with speaker dependency plays a crucial part in dialogue emotion recognition. Meanwhile, distinguishing the different emotion categories is non-trivial since they usually have semantically similar sentiments. To this end, we adopt supervised contrastive learning to make different emotions mutually exclusive to identify similar emotions better. Meanwhile, we utilize an auxiliary response generation task to enhance the model's ability to handle context information, thereby forcing the model to recognize emotions with similar semantics in diverse contexts. To achieve these objectives, we use the pre-trained encoder-decoder model BART as our backbone model since it is well suited for both understanding and generation tasks. The experiments on four datasets demonstrate that our proposed model obtains significantly more favorable results than the state-of-the-art model in dialogue emotion recognition. The ablation study further demonstrates the effectiveness of the supervised contrastive loss and generative loss.
33

Ju, Jeongwoo, Heechul Jung, and Junmo Kim. "Extending Contrastive Learning to Unsupervised Redundancy Identification." Applied Sciences 12, no. 4 (February 20, 2022): 2201. http://dx.doi.org/10.3390/app12042201.

Abstract:
Modern deep neural network (DNN)-based approaches have delivered great performance for computer vision tasks; however, they require a massive annotation cost due to their data-hungry nature. Hence, given a fixed budget and unlabeled examples, improving the quality of the examples to be annotated is a clever step towards obtaining good generalization of a DNN. One of the key issues that can hurt the quality of examples is the presence of redundancy, in which most examples exhibit a similar visual context (e.g., the same background). Redundant examples barely contribute to the performance but still require additional annotation cost. Hence, prior to the annotation process, identifying redundancy is a key step to avoiding unnecessary cost. In this work, we prove that a coreset score based on cosine similarity (cossim) is effective for identifying redundant examples. This is because the collective magnitude of the gradient over redundant examples is large compared to the others; as a result, contrastive learning first attempts to reduce the loss on the redundancy, and consequently cossim for the redundancy set exhibits a high value (low coreset score). We thus view redundancy identification through the lens of gradient magnitude. In this way, we effectively removed redundant examples from two datasets (KITTI, BDD10K), resulting in better performance in terms of detection and semantic segmentation.
34

Gupta, Devansh, Drishti Bhasin, Sarthak Bhagat, Shagun Uppal, Ponnurangam Kumaraguru, and Rajiv Ratn Shah. "Contrastive Personalization Approach to Suspect Identification (Student Abstract)." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 11 (June 28, 2022): 12961–62. http://dx.doi.org/10.1609/aaai.v36i11.21617.

Abstract:
Targeted image retrieval has long been a challenging problem since each person has a different perception of different features leading to inconsistency among users in describing the details of a particular image. Due to this, each user needs a system personalized according to the way they have structured the image in their mind. One important application of this task is suspect identification in forensic investigations where a witness needs to identify the suspect from an existing criminal database. Existing methods require the attributes for each image or suffer from poor latency during training and inference. We propose a new approach to tackle this problem through explicit relevance feedback by introducing a novel loss function and a corresponding scoring function. For this, we leverage contrastive learning on the user feedback to generate the next set of suggested images while improving the level of personalization with each user feedback iteration.
35

Paraskevopoulos, Georgios, Petros Pistofidis, Georgios Banoutsos, Efthymios Georgiou, and Vassilis Katsouros. "Multimodal Classification of Safety-Report Observations." Applied Sciences 12, no. 12 (June 7, 2022): 5781. http://dx.doi.org/10.3390/app12125781.

Abstract:
Modern businesses are obligated to conform to regulations to prevent physical injuries and ill health for anyone present on a site under their responsibility, such as customers, employees and visitors. Safety officers (SOs) are engineers who perform site audits for businesses, record observations regarding possible safety issues and make appropriate recommendations. In this work, we develop a multimodal machine-learning architecture for the analysis and categorization of safety observations, given textual descriptions and images taken from the location sites. For this, we utilize a new multimodal dataset, Safety4All, which contains 5344 safety-related observations created by 86 SOs in 486 sites. An observation consists of a short issue description written by the SOs, accompanied by images where the issue is shown, relevant metadata and a priority score. Our proposed architecture is based on the joint fine-tuning of large pretrained language and image neural network models. Specifically, we propose the use of a joint task and contrastive loss, which aligns the text and vision representations in a joint multimodal space. The contrastive loss ensures that inter-modality representation distances are maintained, so that vision and language representations for similar samples are close in the shared multimodal space. We evaluate the proposed model on three tasks, namely, priority classification of input observations, observation assessment and observation categorization. Our experiments show that inspection scene images and textual descriptions provide complementary information, signifying the importance of both modalities. Furthermore, the use of the joint contrastive loss produces strong multimodal representations and outperforms a simple baseline model in task fusion. In addition, we train and release a large transformer-based language model for the Greek language based on the Electra architecture.
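
A common way to realize the text-vision contrastive alignment described above is a symmetric cross-entropy over the text-image similarity matrix, so that matched (description, image) pairs are pulled together in the shared space and mismatched pairs are pushed apart. The sketch below is one standard instantiation under that assumption, not necessarily the authors' exact formulation.

```python
import torch
import torch.nn.functional as F

def joint_contrastive_alignment(text_emb, image_emb, temperature=0.07):
    """Symmetric contrastive alignment of text and image embeddings of the
    same observations: row i of text_emb and row i of image_emb describe
    the same sample and form the positive pair."""
    t = F.normalize(text_emb, dim=-1)
    v = F.normalize(image_emb, dim=-1)
    logits = t @ v.t() / temperature
    targets = torch.arange(t.size(0), device=t.device)
    # Average the text-to-image and image-to-text directions.
    return (F.cross_entropy(logits, targets)
            + F.cross_entropy(logits.t(), targets)) / 2
```
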
36

Pan, Zhiqiang, and Honghui Chen. "Efficient Graph Collaborative Filtering via Contrastive Learning." Sensors 21, no. 14 (July 7, 2021): 4666. http://dx.doi.org/10.3390/s21144666.

Abstract:
Collaborative filtering (CF) aims to make recommendations for users by detecting users' preferences from historical user–item interactions. Existing graph neural network (GNN)-based methods achieve satisfactory performance by exploiting the high-order connectivity between users and items; however, they suffer from poor training efficiency and easily introduce bias into information propagation. Moreover, the widely applied Bayesian personalized ranking (BPR) loss is insufficient to provide supervision signals for training due to the extremely sparse observed interactions. To deal with the above issues, we propose the Efficient Graph Collaborative Filtering (EGCF) method. Specifically, EGCF adopts merely one layer of graph convolution to model the collaborative signal for users and items from the first-order neighbors in the user–item interactions. Moreover, we introduce contrastive learning to enhance the representation learning of users and items by deriving self-supervision signals, which are jointly trained with the supervised learning. Extensive experiments are conducted on two benchmark datasets, i.e., Yelp2018 and Amazon-book, and the experimental results demonstrate that EGCF achieves state-of-the-art performance in terms of Recall and normalized discounted cumulative gain (NDCG), especially on ranking the target items at the right positions. In addition, EGCF shows obvious advantages in training efficiency compared with the competitive baselines, making it practicable for potential applications.
37

Zhou, Fan, Pengyu Wang, Xovee Xu, Wenxin Tai, and Goce Trajcevski. "Contrastive Trajectory Learning for Tour Recommendation." ACM Transactions on Intelligent Systems and Technology 13, no. 1 (February 28, 2022): 1–25. http://dx.doi.org/10.1145/3462331.

Abstract:
The main objective of Personalized Tour Recommendation (PTR) is to generate a sequence of points-of-interest (POIs) for a particular tourist, according to user-specific constraints such as duration time, start and end points, the number of attractions planned to visit, and so on. Previous PTR solutions are based on either heuristics for solving the orienteering problem to maximize a global reward with a specified budget or approaches attempting to learn user visiting preferences and transition patterns with stochastic processes or recurrent neural networks. However, existing learning methodologies rely on historical trips to train the model and use the next visited POI as the supervised signal, which may not fully capture the coherence of preferences and thus recommend similar trips to different users, primarily due to the data sparsity problem and long-tailed distribution of POI popularity. This work presents a novel tour recommendation model by distilling knowledge and supervision signals from the trips in a self-supervised manner. We propose Contrastive Trajectory Learning for Tour Recommendation (CTLTR), which utilizes the intrinsic POI dependencies and traveling intent to discover extra knowledge and augments the sparse data via pre-training auxiliary self-supervised objectives. CTLTR provides a principled way to characterize the inherent data correlations while tackling the implicit feedback and weak supervision problems by learning robust representations applicable for tour planning. We introduce a hierarchical recurrent encoder-decoder to identify tourists' intentions and use the contrastive loss to discover subsequence semantics and their sequential patterns through maximizing the mutual information. Additionally, we observe that a data augmentation step as the preliminary of contrastive learning can solve the overfitting issue resulting from data sparsity. We conduct extensive experiments on a range of real-world datasets and demonstrate that our model can significantly improve the recommendation performance over the state-of-the-art baselines in terms of both recommendation accuracy and visiting orders.
APA, Harvard, Vancouver, ISO, and other styles
38

Tang, Shixiang, Peng Su, Dapeng Chen, and Wanli Ouyang. "Gradient Regularized Contrastive Learning for Continual Domain Adaptation." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 3 (May 18, 2021): 2665–73. http://dx.doi.org/10.1609/aaai.v35i3.16370.

Full text
Abstract:
Human beings can quickly adapt to environmental changes by leveraging learning experience. However, adapting deep neural networks to dynamic environments via machine learning algorithms remains a challenge. To better understand this issue, we study the problem of continual domain adaptation, where the model is presented with a labelled source domain and a sequence of unlabelled target domains. The obstacles in this problem are both domain shift and catastrophic forgetting. We propose Gradient Regularized Contrastive Learning (GRCL) to overcome these obstacles. At the core of our method, gradient regularization plays two key roles: (1) constraining the gradient so as not to harm the discriminative ability of source features, which in turn benefits the model's ability to adapt to target domains; (2) constraining the gradient so as not to increase the classification loss on old target domains, which enables the model to preserve its performance on them when adapting to an incoming target domain. Experiments on the Digits, DomainNet and Office-Caltech benchmarks demonstrate the strong performance of our approach compared to the state of the art.
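The second gradient constraint can be illustrated with a projection step in the spirit of gradient episodic memory (GEM); the sketch below is an assumption about the mechanism, not the authors' exact constrained optimization.

import torch

def project_conflicting_gradient(g, g_ref, eps=1e-12):
    # g:     flattened gradient of the current adaptation loss
    # g_ref: flattened gradient of the classification loss on a memory
    #        of old-target-domain data
    # If the update would increase the reference loss (negative inner
    # product), remove the conflicting component before stepping.
    dot = torch.dot(g, g_ref)
    if dot < 0:
        g = g - (dot / (torch.dot(g_ref, g_ref) + eps)) * g_ref
    return g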
APA, Harvard, Vancouver, ISO, and other styles
39

Wang, Hao, Euijoon Ahn, and Jinman Kim. "Self-Supervised Representation Learning Framework for Remote Physiological Measurement Using Spatiotemporal Augmentation Loss." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 2 (June 28, 2022): 2431–39. http://dx.doi.org/10.1609/aaai.v36i2.20143.

Full text
Abstract:
Recent advances in supervised deep learning methods are enabling remote measurement of photoplethysmography-based physiological signals from facial videos. The performance of these supervised methods, however, is dependent on the availability of large labelled datasets. Contrastive learning, as a self-supervised method, has recently achieved state-of-the-art performance in learning representative data features by maximising mutual information between different augmented views. However, existing data augmentation techniques for contrastive learning are not designed for learning physiological signals from videos and often fail when there is complicated noise or only subtle, periodic colour/shape variation between video frames. To address these problems, we present a novel self-supervised spatiotemporal learning framework for remote physiological signal representation learning in the absence of labelled training data. First, we propose a landmark-based spatial augmentation that splits the face into several informative parts based on Shafer's dichromatic reflection model to characterise subtle skin colour fluctuations. We also formulate a sparsity-based temporal augmentation exploiting the Nyquist–Shannon sampling theorem to effectively capture periodic temporal changes by modelling physiological signal features. Furthermore, we introduce a constrained spatiotemporal loss which generates pseudo-labels for augmented video clips; it is used to regulate the training process and handle complicated noise. We evaluated our framework on 3 public datasets, demonstrating superior performance over other self-supervised methods and competitive accuracy compared to state-of-the-art supervised methods. Code is available at https://github.com/Dylan-H-Wang/SLF-RPM.
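One plausible form of the sparsity-based temporal augmentation is to subsample frames at varying strides, as in the hedged PyTorch sketch below; the function name and parameters are illustrative, not taken from the released code.

import torch

def temporal_sparsity_view(video, stride, clip_len=16):
    # video: (T, C, H, W) tensor of frames. Sampling every `stride`-th frame
    # yields one temporally sparse view; different strides give different
    # augmentations, provided the effective frame rate still satisfies the
    # Nyquist criterion for the physiological band of interest
    # (roughly 0.7-4 Hz for heart rate).
    idx = torch.arange(0, stride * clip_len, stride)
    idx = idx[idx < video.size(0)]
    return video[idx]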
APA, Harvard, Vancouver, ISO, and other styles
40

Li, Hang, Li Li, and Hongbing Wang. "Defect Detection for Wear Debris Based on Few-Shot Contrastive Learning." Applied Sciences 12, no. 23 (November 22, 2022): 11893. http://dx.doi.org/10.3390/app122311893.

Full text
Abstract:
In industrial defect detection tasks, the low probability of severe defects occurring under normal production conditions poses a great challenge for data-driven deep learning models, which must learn from just a few samples. Contrastive learning based on sample pairs makes it possible to obtain a large number of training pairs and learn effective features quickly. In the field of industrial defect detection, the features of some defect instances have small inter-class variance, while the scales of defect instances vary greatly. We propose a few-shot object detection network based on contrastive learning and multi-scale feature fusion. An aligned contrastive loss is adopted to increase instance-level intra-class compactness and inter-class variance, and the misalignment problem is alleviated to a certain extent. A multi-scale fusion module is designed to recognize multi-scale defects by adaptively fusing features from different resolutions, exploiting the support branch's information. The robustness and efficiency of the proposed method were evaluated on an industrial wear debris defect dataset and the MS COCO dataset.
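The paper's aligned contrastive loss is not reproduced here, but its stated goal, instance-level intra-class compactness and inter-class variance, matches the generic supervised contrastive formulation sketched below in PyTorch (all names are illustrative).

import torch
import torch.nn.functional as F

def supervised_contrastive_loss(feats, labels, tau=0.1):
    # feats: (N, D) instance embeddings, labels: (N,) class ids.
    # Pulls same-class instances together, pushes different classes apart.
    feats = F.normalize(feats, dim=-1)
    n = feats.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=feats.device)
    pos_mask = labels.unsqueeze(0).eq(labels.unsqueeze(1)) & ~self_mask
    logits = (feats @ feats.t() / tau).masked_fill(self_mask, float('-inf'))
    log_prob = F.log_softmax(logits, dim=1).masked_fill(self_mask, 0.0)
    pos_count = pos_mask.sum(1).clamp(min=1)
    return -(log_prob * pos_mask.float()).sum(1).div(pos_count).mean()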
APA, Harvard, Vancouver, ISO, and other styles
41

Chen, Qiang, and Yinong Chen. "Multi-view 3D model retrieval based on enhanced detail features with contrastive center loss." Multimedia Tools and Applications 81, no. 8 (February 15, 2022): 10407–26. http://dx.doi.org/10.1007/s11042-022-12281-9.

Full text
APA, Harvard, Vancouver, ISO, and other styles
42

Deepak, S., and P. M. Ameer. "Retrieval of brain MRI with tumor using contrastive loss based similarity on GoogLeNet encodings." Computers in Biology and Medicine 125 (October 2020): 103993. http://dx.doi.org/10.1016/j.compbiomed.2020.103993.

Full text
APA, Harvard, Vancouver, ISO, and other styles
43

Zhang, Xinyun, Binwu Zhu, Xufeng Yao, Qi Sun, Ruiyu Li, and Bei Yu. "Context-Based Contrastive Learning for Scene Text Recognition." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 3 (June 28, 2022): 3353–61. http://dx.doi.org/10.1609/aaai.v36i3.20245.

Full text
Abstract:
Pursuing accurate and robust recognizers has been a long-standing goal for scene text recognition (STR) researchers. Recently, attention-based methods have demonstrated their effectiveness and achieved impressive results on public benchmarks. The attention mechanism enables models to recognize scene text with severe visual distortions by leveraging contextual information. However, recent studies revealed that implicit over-reliance on context leads to catastrophic out-of-vocabulary performance: in contrast to their superior accuracy on seen text, models are prone to misrecognizing unseen text even with good image quality. We propose a novel framework, Context-based Contrastive Learning (ConCLR), to alleviate this issue. Our proposed method first generates characters with different contexts via simple image concatenation operations and then optimizes a contrastive loss over their embeddings. By pulling together clusters of identical characters within various contexts and pushing apart clusters of different characters in embedding space, ConCLR suppresses the side effect of overfitting to specific contexts and learns a more robust representation. Experiments show that ConCLR significantly improves out-of-vocabulary generalization and, together with attention-based recognizers, achieves state-of-the-art performance on public benchmarks.
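As a hedged sketch of the context-generation step, the image concatenation could be realized as follows; the exact construction used by ConCLR may differ.

import torch

def concat_context_views(img_a, img_b):
    # img_a, img_b: (C, H, W) text-image crops of equal height. Horizontal
    # concatenation places the same characters into two different contexts;
    # per-character embeddings decoded from the two views at the positions
    # of the same character can then act as positives for a contrastive
    # loss, with embeddings of other characters as negatives.
    view_1 = torch.cat([img_a, img_b], dim=-1)     # "a followed by b"
    view_2 = torch.cat([img_b, img_a], dim=-1)     # "b followed by a"
    return view_1, view_2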
APA, Harvard, Vancouver, ISO, and other styles
44

Kim, Daeha, and Byung Cheol Song. "Contrastive Adversarial Learning for Person Independent Facial Emotion Recognition." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 7 (May 18, 2021): 5948–56. http://dx.doi.org/10.1609/aaai.v35i7.16743.

Full text
Abstract:
Since most facial emotion recognition (FER) methods rely heavily on supervision information, they are limited in their ability to analyze emotions independently of individual persons. On the other hand, adversarial learning is a well-known approach to generalized representation learning because it requires no supervision information. This paper presents a new adversarial learning scheme for FER. In detail, the proposed scheme enables the FER network to better understand the complex emotional elements inherent in strong emotions by adversarially learning weak emotion samples based on strong emotion samples. As a result, the proposed method can recognize emotions independently of persons because it understands facial expressions more accurately. In addition, we propose a contrastive loss function for efficient adversarial learning. Finally, the proposed adversarial learning scheme was theoretically verified and experimentally shown to achieve state-of-the-art (SOTA) performance.
APA, Harvard, Vancouver, ISO, and other styles
45

Ma, Ziping, Dongxiu Feng, Jingyu Wang, and Hu Ma. "Retinal OCTA Image Segmentation Based on Global Contrastive Learning." Sensors 22, no. 24 (December 14, 2022): 9847. http://dx.doi.org/10.3390/s22249847.

Full text
Abstract:
The automatic segmentation of retinal vessels is of great significance for the analysis and diagnosis of retina-related diseases. However, the imbalanced data in retinal vascular images remains a great challenge. Current deep learning based image segmentation methods almost always focus on the local information in a single image while ignoring the global information of the entire dataset. To address the data imbalance in optical coherence tomography angiography (OCTA) datasets, this paper proposes a medical image segmentation method (contrastive OCTA segmentation net, COSNet) based on global contrastive learning. First, the feature extraction module extracts features of the input OCTA image and maps them to the segmentation head and the multilayer perceptron (MLP) head, respectively. Second, a contrastive learning module saves the pixel queue and pixel embedding of each category in the feature map into a memory bank and generates sample pairs through a mixed sampling strategy to construct a new contrastive loss function, forcing the network to learn local and global information simultaneously. Finally, the segmented image is fine-tuned to restore the positional information of deep vessels. The experimental results show that the proposed method improves accuracy (ACC), area under the curve (AUC), and other evaluation indices of image segmentation compared with existing methods. The method can accomplish segmentation under imbalanced data and extends to other segmentation tasks.
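A minimal sketch of the memory-bank-based pixel contrastive idea, assuming normalized pixel embeddings and illustrative names throughout, might look like this in PyTorch:

import torch
import torch.nn.functional as F

class PixelMemoryBank:
    # FIFO queue of pixel embeddings per class, so the contrastive loss can
    # draw positives/negatives from the whole dataset rather than one image.
    def __init__(self, n_classes, dim, size=4096):
        self.queues = [torch.zeros(0, dim) for _ in range(n_classes)]
        self.size = size

    def enqueue(self, cls, emb):                   # emb: (K, dim), detached
        q = torch.cat([self.queues[cls], emb.detach().cpu()], dim=0)
        self.queues[cls] = q[-self.size:]

def pixel_contrastive_loss(anchor, positives, negatives, tau=0.1):
    # anchor: (D,); positives: (P, D); negatives: (N, D)
    anchor = F.normalize(anchor, dim=-1)
    l_pos = F.normalize(positives, dim=-1) @ anchor / tau      # (P,)
    l_neg = F.normalize(negatives, dim=-1) @ anchor / tau      # (N,)
    denom = torch.logsumexp(torch.cat([l_pos, l_neg]), dim=0)
    return (denom - l_pos).mean()   # -log p(positive), averaged over positives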
APA, Harvard, Vancouver, ISO, and other styles
46

Chen, Liang, Yihang Lou, Jianzhong He, Tao Bai, and Minghua Deng. "Evidential Neighborhood Contrastive Learning for Universal Domain Adaptation." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 6 (June 28, 2022): 6258–67. http://dx.doi.org/10.1609/aaai.v36i6.20575.

Full text
Abstract:
Universal domain adaptation (UniDA) aims to transfer the knowledge learned from a labeled source domain to an unlabeled target domain without any constraints on the label sets. However, domain shift and category shift make UniDA extremely challenging, mainly attributed to the requirement of identifying both shared “known” samples and private “unknown” samples. Previous methods barely exploit the intrinsic manifold structure relationship between two domains for feature alignment, and they rely on the softmax-based scores with class competition nature to detect underlying “unknown” samples. Therefore, in this paper, we propose a novel evidential neighborhood contrastive learning framework called TNT to address these issues. Specifically, TNT first proposes a new domain alignment principle: semantically consistent samples should be geometrically adjacent to each other, whether within or across domains. From this criterion, a cross-domain multi-sample contrastive loss based on mutual nearest neighbors is designed to achieve common category matching and private category separation. Second, toward accurate “unknown” sample detection, TNT introduces a class competition-free uncertainty score from the perspective of evidential deep learning. Instead of setting a single threshold, TNT learns a category-aware heterogeneous threshold vector to reject diverse “unknown” samples. Extensive experiments on three benchmarks demonstrate that TNT significantly outperforms previous state-of-the-art UniDA methods.
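The mutual-nearest-neighbour pairing that drives the cross-domain contrastive loss can be sketched as follows; this is an illustration of the criterion, not the authors' implementation:

import torch
import torch.nn.functional as F

def mutual_nearest_neighbors(src, tgt):
    # src: (Ns, D), tgt: (Nt, D) feature matrices. A pair (i, j) is kept
    # when j is i's closest target sample and i is j's closest source
    # sample; such pairs can serve as cross-domain positives.
    src, tgt = F.normalize(src, dim=-1), F.normalize(tgt, dim=-1)
    sim = src @ tgt.t()                            # (Ns, Nt) cosine similarity
    nn_of_src = sim.argmax(dim=1)                  # best target for each source
    nn_of_tgt = sim.argmax(dim=0)                  # best source for each target
    i = torch.arange(src.size(0))
    mutual = nn_of_tgt[nn_of_src] == i             # agreement test
    return i[mutual], nn_of_src[mutual]            # matched index pairs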
APA, Harvard, Vancouver, ISO, and other styles
47

Chen, Haoyu, Hao Tang, Zitong Yu, Nicu Sebe, and Guoying Zhao. "Geometry-Contrastive Transformer for Generalized 3D Pose Transfer." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 1 (June 28, 2022): 258–66. http://dx.doi.org/10.1609/aaai.v36i1.19901.

Full text
Abstract:
We present a customized 3D mesh Transformer model for the pose transfer task. As 3D pose transfer is essentially a deformation procedure dependent on the given meshes, the intuition of this work is to perceive the geometric inconsistency between the given meshes with the powerful self-attention mechanism. Specifically, we propose a novel geometry-contrastive Transformer with an efficient, 3D-structured perception of the global geometric inconsistencies across the given meshes. Locally, a simple yet efficient central geodesic contrastive loss is further proposed to improve regional geometric-inconsistency learning. Finally, we present a latent isometric regularization module, together with a novel semi-synthesized dataset, for the cross-dataset 3D pose transfer task towards unknown spaces. Extensive experimental results prove the efficacy of our approach, showing state-of-the-art quantitative performance on the SMPL-NPT, FAUST and our newly proposed SMG-3D datasets, as well as promising qualitative results on the MG-cloth and SMAL datasets. These results demonstrate that our method achieves robust 3D pose transfer and generalizes to challenging meshes from unknown spaces in cross-dataset tasks. Code and dataset are available at https://github.com/mikecheninoulu/CGT.
APA, Harvard, Vancouver, ISO, and other styles
48

Cho, Jungchan. "Synthetic Source Universal Domain Adaptation through Contrastive Learning." Sensors 21, no. 22 (November 12, 2021): 7539. http://dx.doi.org/10.3390/s21227539.

Full text
Abstract:
Universal domain adaptation (UDA) is a crucial research topic for efficiently training deep learning models on data from various imaging sensors. However, its development is hampered by the lack of labels in the target data, and the absence of prior knowledge about the source and target domains makes it even more challenging for UDA to train models. I hypothesize that the degradation of trained models in the target domain is caused by the lack of a direct training loss that improves the discriminative power of the target domain data; as a result, the target data adapted to the source representations are biased toward the source domain. I found that this degradation was more pronounced when using synthetic data for the source domain and real data for the target domain. In this paper, I propose a UDA method with target domain contrastive learning. The proposed method enables models to leverage synthetic data for the source domain and to train the discriminativeness of target features in an unsupervised manner. In addition, the target domain feature extraction network is shared with the source domain classification task, preventing unnecessary computational growth. Extensive experimental results on VisDA-2017 and MNIST-to-SVHN demonstrate that the proposed method significantly outperforms the baseline, by 2.7% and 5.1%, respectively.
APA, Harvard, Vancouver, ISO, and other styles
49

Liu, Pingping, Lida Shi, Zhuang Miao, Baixin Jin, and Qiuzhan Zhou. "Relative Distribution Entropy Loss Function in CNN Image Retrieval." Entropy 22, no. 3 (March 11, 2020): 321. http://dx.doi.org/10.3390/e22030321.

Full text
Abstract:
Convolutional neural networks (CNNs) are the mainstream solution in the field of image retrieval. Deep metric learning has been introduced into this field, focusing on the construction of pair-based loss functions. However, most pair-based loss functions in metric learning merely take the common vector similarity (such as Euclidean distance) of the final image descriptors into consideration, while neglecting other distribution characteristics of these descriptors. In this work, we propose relative distribution entropy (RDE) to describe the internal distribution attributes of image descriptors. We combine relative distribution entropy with the Euclidean distance to obtain the relative distribution entropy weighted distance (RDE-distance). Moreover, the RDE-distance is fused with the contrastive loss and the triplet loss to build relative distribution entropy loss functions. The experimental results demonstrate that our method attains state-of-the-art performance on most image retrieval benchmarks.
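The precise definition of relative distribution entropy is given in the article itself; purely to illustrate the pattern of fusing an entropy-derived weight into a pairwise contrastive loss, a hedged sketch could look like this (every formula below is an assumption, not the paper's RDE):

import torch
import torch.nn.functional as F

def descriptor_entropy(x, eps=1e-12):
    # Treat the descriptor magnitudes as a distribution over dimensions
    # and compute the Shannon entropy of that distribution.
    p = x.abs() / (x.abs().sum(-1, keepdim=True) + eps)
    return -(p * (p + eps).log()).sum(-1)

def rde_weighted_distance(x, y):
    # Illustrative only: modulate the Euclidean distance by the difference
    # of the two descriptors' entropies; the paper's actual RDE formula
    # should be taken from the article itself.
    d = (x - y).norm(dim=-1)
    w = 1.0 + (descriptor_entropy(x) - descriptor_entropy(y)).abs()
    return w * d

def rde_contrastive_loss(x, y, is_match, margin=0.7):
    # Classic pairwise contrastive loss with the weighted distance plugged
    # in; is_match is 1 for matching pairs, 0 otherwise.
    d = rde_weighted_distance(x, y)
    return (is_match * d.pow(2) + (1 - is_match) * F.relu(margin - d).pow(2)).mean()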
APA, Harvard, Vancouver, ISO, and other styles
50

Zhao, Xusheng, and Jinglei Liu. "Leveraging Deep Features Enhance and Semantic-Preserving Hashing for Image Retrieval." Electronics 11, no. 15 (July 30, 2022): 2391. http://dx.doi.org/10.3390/electronics11152391.

Full text
Abstract:
Hashing methods convert high-dimensional data into simple binary codes, which offer fast retrieval and small storage cost in large-scale image retrieval and are therefore favored by an increasing number of researchers. However, traditional hashing methods have two common shortcomings that hurt retrieval accuracy. First, most traditional hashing methods extract many irrelevant image features, introducing partial information bias into the produced binary codes. Furthermore, the binary codes produced by traditional hashing methods cannot preserve the semantic similarity between images. To address these two problems, we explore a new network architecture that adds a feature enhancement layer to better extract image features and remove redundant ones, and that expresses the similarity between images through a contrastive loss, thereby constructing compact and accurate binary codes. In summary, we model the relationship between labels and image features to better preserve semantic relationships and reduce redundant features, use a contrastive loss to compare the similarity between images, and apply a balance loss so that the numbers of 0s and 1s in the resulting binary code are balanced, yielding a more compact code. Extensive experiments on three commonly used datasets (CIFAR-10, NUS-WIDE, and SVHN) show that our approach (DFEH) performs well compared with the most advanced approaches.
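The balance loss admits a simple, commonly used formulation, sketched below under the assumption of tanh-relaxed codes; the paper's exact term may differ:

import torch

def balance_loss(codes):
    # codes: (B, K) relaxed hash codes in [-1, 1] (e.g. tanh outputs).
    # Driving each bit's batch mean toward 0 balances the numbers of
    # -1s and +1s (i.e. 0s and 1s after binarisation) across the code.
    return codes.mean(dim=0).pow(2).mean()

def quantization_loss(codes):
    # Common companion term: push the relaxed codes toward exactly +/-1.
    return (codes.abs() - 1.0).pow(2).mean()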
APA, Harvard, Vancouver, ISO, and other styles