Journal articles on the topic 'Maximum Mean Discrepancy (MMD)'

Author: Grafiati

Published: 6 September 2023

Consult the top 50 journal articles for your research on the topic 'Maximum Mean Discrepancy (MMD).'

1

Huang, Qihang, Yulin He, and Zhexue Huang. "A Novel Maximum Mean Discrepancy-Based Semi-Supervised Learning Algorithm." Mathematics 10, no. 1 (December 23, 2021): 39. http://dx.doi.org/10.3390/math10010039.

Abstract:

To provide more external knowledge for training semi-supervised learning (SSL) algorithms, this paper proposes a maximum mean discrepancy-based SSL (MMD-SSL) algorithm, which trains a well-performing classifier by iteratively refining it with highly confident unlabeled samples. The MMD-SSL algorithm performs three main steps. First, a multilayer perceptron (MLP) is trained on the labeled samples and is then used to assign labels to unlabeled samples. Second, the unlabeled samples are divided into multiple groups with the k-means clustering algorithm. Third, the maximum mean discrepancy (MMD) criterion is used to measure the distribution consistency between k-means-clustered samples and MLP-classified samples. The samples having a consistent distribution are labeled as highly confident samples and used to retrain the MLP. The MMD-SSL algorithm trains iteratively until all unlabeled samples are consistently labeled. We conducted extensive experiments on 29 benchmark data sets to validate the rationality and effectiveness of the MMD-SSL algorithm. Experimental results show that the generalization capability of the MLP gradually improves as the number of labeled samples increases, and the statistical analysis demonstrates that the MMD-SSL algorithm provides better testing accuracy and kappa values than 10 other self-training and co-training SSL algorithms.
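
As a rough illustration of the third step, the following minimal sketch computes a biased estimate of the squared MMD between two sample sets with a Gaussian kernel. It is not the authors' implementation: the function names, the bandwidth value, and the synthetic data are assumptions made for this example only.

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))

def mmd2_biased(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between samples x and y."""
    k_xx = gaussian_kernel(x, x, bandwidth)
    k_yy = gaussian_kernel(y, y, bandwidth)
    k_xy = gaussian_kernel(x, y, bandwidth)
    return float(k_xx.mean() + k_yy.mean() - 2.0 * k_xy.mean())

# Hypothetical data: two groups from the same distribution and one shifted group.
rng = np.random.default_rng(0)
group_a = rng.normal(0.0, 1.0, size=(200, 2))
group_b = rng.normal(0.0, 1.0, size=(200, 2))
group_shifted = rng.normal(2.0, 1.0, size=(200, 2))

mmd_same = mmd2_biased(group_a, group_b)
mmd_shift = mmd2_biased(group_a, group_shifted)
```

A small value suggests the two groups are distributed consistently; how small counts as "consistent" is a threshold choice that this sketch does not address.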

2

Zhou, Zhaokun, Yuanhong Zhong, Xiaoming Liu, Qiang Li, and Shu Han. "DC-MMD-GAN: A New Maximum Mean Discrepancy Generative Adversarial Network Using Divide and Conquer." Applied Sciences 10, no. 18 (September 14, 2020): 6405. http://dx.doi.org/10.3390/app10186405.

Abstract:

Generative adversarial networks (GANs) have had a revolutionary influence on sample generation. Maximum mean discrepancy GANs (MMD-GANs) achieve competitive performance compared with other GANs. However, the loss function of MMD-GANs is an empirical estimate of the maximum mean discrepancy (MMD) and is not precise in measuring the distance between sample distributions, which inhibits MMD-GAN training. We propose an efficient divide-and-conquer model, called DC-MMD-GANs, which constrains the MMD loss function to a tight bound on the deviation between the empirical estimate and the expected value of MMD and accelerates the training process. DC-MMD-GANs contain a division step and a conquer step. In the division step, we learn an embedding of the training images with an auto-encoder and partition the training images into adaptive subsets through k-means clustering on the embedding. In the conquer step, sub-models are fed with the subsets separately and trained synchronously. The loss function values of all sub-models are integrated to compute a new weighted-sum loss function. The new loss function, with its tight deviation bound, provides more precise gradients for improving performance. Experimental results show that, with a fixed number of iterations, DC-MMD-GANs converge faster and achieve better performance than standard MMD-GANs on the celebA and CIFAR-10 datasets.

3

Xu, Haoji. "Generate Faces Using Ladder Variational Autoencoder with Maximum Mean Discrepancy (MMD)." Intelligent Information Management 10, no. 04 (2018): 108–13. http://dx.doi.org/10.4236/iim.2018.104009.

4

Sun, Jiancheng. "Complex Network Construction of Univariate Chaotic Time Series Based on Maximum Mean Discrepancy." Entropy 22, no. 2 (January 24, 2020): 142. http://dx.doi.org/10.3390/e22020142.

Abstract:

The analysis of chaotic time series is usually a challenging task due to their complexity. In this communication, a method of complex network construction is proposed for univariate chaotic time series, which provides a novel way to analyze time series. In the process of complex network construction, how to measure the similarity between time series is a key problem to be solved. Due to the complexity of chaotic systems, common metrics struggle to measure this similarity. Consequently, the proposed method first transforms the univariate time series into a high-dimensional phase space to increase its information, then uses a Gaussian mixture model (GMM) to represent the time series, and finally introduces the maximum mean discrepancy (MMD) to measure the similarity between GMMs. The Lorenz system is used to validate the correctness and effectiveness of the proposed method for measuring the similarity.

5

Zhang, Xiangqing, Yan Feng, Shun Zhang, Nan Wang, Shaohui Mei, and Mingyi He. "Semi-Supervised Person Detection in Aerial Images with Instance Segmentation and Maximum Mean Discrepancy Distance." Remote Sensing 15, no. 11 (June 4, 2023): 2928. http://dx.doi.org/10.3390/rs15112928.

Abstract:

Detecting sparse, small, lost persons with only a few pixels in high-resolution aerial images was, is, and remains an important and difficult mission, in which accurate monitoring and intelligent co-rescuing play a vital role for the search and rescue (SaR) system. However, many problems have not been effectively solved in existing remote-vision-based SaR systems, such as the shortage of person samples in SaR scenarios and the low tolerance of small objects for bounding boxes. To address these issues, a copy-paste mechanism (ISCP) with semi-supervised object detection (SSOD) via instance segmentation and maximum mean discrepancy distance (MMD) is proposed, which can provide highly robust, multi-task, and efficient aerial-based person detection for the prototype SaR system. Specifically, numerous pseudo-labels are obtained by accurately segmenting the instances of synthetic ISCP samples to obtain their boundaries. The SSOD trainer then uses soft weights to balance the prediction entropy of the loss function between the ground truth and unreliable labels. Moreover, a novel evaluation metric, MMD, is proposed for anchor-based detectors to elegantly compute the IoU of the bounding boxes. Extensive experiments and ablation studies on Heridal and optimized public datasets demonstrate that our approach is effective and achieves state-of-the-art person detection performance in aerial images.

6

Zhao, Ji, and Deyu Meng. "FastMMD: Ensemble of Circular Discrepancy for Efficient Two-Sample Test." Neural Computation 27, no. 6 (June 2015): 1345–72. http://dx.doi.org/10.1162/neco_a_00732.

Abstract:

The maximum mean discrepancy (MMD) is a recently proposed test statistic for the two-sample test. Its quadratic time complexity, however, greatly hampers its availability to large-scale applications. To accelerate the MMD calculation, in this study we propose an efficient method called FastMMD. The core idea of FastMMD is to equivalently transform the MMD with shift-invariant kernels into the amplitude expectation of a linear combination of sinusoid components based on Bochner's theorem and Fourier transform (Rahimi & Recht, 2007). Taking advantage of sampling the Fourier transform, FastMMD decreases the time complexity for MMD calculation from [Formula: see text] to [Formula: see text], where N and d are the size and dimension of the sample set, respectively. Here, L is the number of basis functions for approximating kernels that determines the approximation accuracy. For kernels that are spherically invariant, the computation can be further accelerated to [Formula: see text] by using the Fastfood technique (Le, Sarlós, & Smola, 2013). The uniform convergence of our method has also been theoretically proved in both unbiased and biased estimates. We also provide a geometric explanation for our method, ensemble of circular discrepancy, which helps us understand the insight of MMD and we hope will lead to more extensive metrics for assessing the two-sample test task. Experimental results substantiate that the accuracy of FastMMD is similar to that of MMD and with faster computation and lower variance than existing MMD approximation methods.
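
The general idea of replacing the quadratic kernel sums with sampled Fourier components can be sketched with plain random Fourier features in the sense of Rahimi and Recht. This is not the FastMMD algorithm itself, only a related linear-time approximation; the function names, the number of features, and the test data below are assumptions chosen for illustration.

```python
import numpy as np

def rff_features(x, freqs, phases):
    """Map samples to random Fourier features approximating a Gaussian kernel."""
    n_features = freqs.shape[1]
    return np.sqrt(2.0 / n_features) * np.cos(x @ freqs + phases)

def mmd2_rff(x, y, n_features=2000, bandwidth=1.0, seed=0):
    """Approximate the biased squared MMD in time linear in the sample size."""
    rng = np.random.default_rng(seed)
    dim = x.shape[1]
    # Frequencies drawn from the Fourier transform of the Gaussian kernel
    # exp(-||u - v||^2 / (2 * bandwidth^2)), i.e. a normal with std 1 / bandwidth.
    freqs = rng.normal(0.0, 1.0 / bandwidth, size=(dim, n_features))
    phases = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    # The squared MMD becomes the squared distance between mean feature vectors.
    mean_diff = (
        rff_features(x, freqs, phases).mean(axis=0)
        - rff_features(y, freqs, phases).mean(axis=0)
    )
    return float(mean_diff @ mean_diff)

# Hypothetical data: same-distribution samples and a shifted sample.
rng = np.random.default_rng(1)
p_samples = rng.normal(0.0, 1.0, size=(500, 2))
p_samples_again = rng.normal(0.0, 1.0, size=(500, 2))
q_shifted = rng.normal(2.0, 1.0, size=(500, 2))

approx_same = mmd2_rff(p_samples, p_samples_again)
approx_shift = mmd2_rff(p_samples, q_shifted)
```

The cost is dominated by the two matrix products, so it grows linearly with the number of samples instead of quadratically, at the price of a random approximation error that shrinks as `n_features` grows.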

7

Williamson, Sinead A., and Jette Henderson. "Understanding Collections of Related Datasets Using Dependent MMD Coresets." Information 12, no. 10 (September 23, 2021): 392. http://dx.doi.org/10.3390/info12100392.

Abstract:

Understanding how two datasets differ can help us determine whether one dataset under-represents certain sub-populations, and provides insights into how well models will generalize across datasets. Representative points selected by a maximum mean discrepancy (MMD) coreset can provide interpretable summaries of a single dataset, but are not easily compared across datasets. In this paper, we introduce dependent MMD coresets, a data summarization method for collections of datasets that facilitates comparison of distributions. We show that dependent MMD coresets are useful for understanding multiple related datasets and understanding model generalization between such datasets.

8

Li, Kangji, Borui Wei, Qianqian Tang, and Yufei Liu. "A Data-Efficient Building Electricity Load Forecasting Method Based on Maximum Mean Discrepancy and Improved TrAdaBoost Algorithm." Energies 15, no. 23 (November 22, 2022): 8780. http://dx.doi.org/10.3390/en15238780.

Abstract:

Building electricity load forecasting plays an important role in building energy management, peak demand and power grid security. In the past two decades, a large number of data-driven models have been applied to building and larger-scale energy consumption predictions. Although these models have been successful in specific cases, their performance is greatly affected by the quantity and quality of the building data. Moreover, for older buildings with sparse data, or new buildings with no historical data, accurate predictions are difficult to achieve. Aiming at such a data silos problem caused by insufficient data collection in building energy consumption prediction, this study proposes a building electricity load forecasting method based on a similarity judgement and an improved TrAdaBoost algorithm (iTrAdaBoost). The Maximum Mean Discrepancy (MMD) is used to search public datasets for building samples similar to the target building. Different from general Boosting algorithms, the proposed iTrAdaBoost algorithm iteratively updates the weights of the similar building samples and combines them with the target building samples to improve prediction accuracy. A case study of an educational building is carried out in this paper. The results show that even when the target and source samples belong to different domains, i.e., the geographical location and meteorological condition of the buildings are different, the proposed MMD-iTrAdaBoost method has a better prediction accuracy in the transfer learning process than the BP or traditional AdaBoost models. In addition, compared with other advanced deep learning models, the proposed method has a simple structure and is easy to implement in engineering practice.

9

Lee, Junghyun, Gwangsu Kim, Mahbod Olfat, Mark Hasegawa-Johnson, and Chang D. Yoo. "Fast and Efficient MMD-Based Fair PCA via Optimization over Stiefel Manifold." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 7 (June 28, 2022): 7363–71. http://dx.doi.org/10.1609/aaai.v36i7.20699.

Abstract:

This paper defines fair principal component analysis (PCA) as minimizing the maximum mean discrepancy (MMD) between the dimensionality-reduced conditional distributions of different protected classes. The incorporation of MMD naturally leads to an exact and tractable mathematical formulation of fairness with good statistical properties. We formulate the problem of fair PCA subject to MMD constraints as a non-convex optimization over the Stiefel manifold and solve it using the Riemannian Exact Penalty Method with Smoothing (REPMS). Importantly, we provide a local optimality guarantee and explicitly show the theoretical effect of each hyperparameter in practical settings, extending previous results. Experimental comparisons based on synthetic and UCI datasets show that our approach outperforms prior work in explained variance, fairness, and runtime.

10

Han, Chao, Deyun Zhou, Zhen Yang, Yu Xie, and Kai Zhang. "Discriminative Sparse Filtering for Multi-Source Image Classification." Sensors 20, no. 20 (October 16, 2020): 5868. http://dx.doi.org/10.3390/s20205868.

Abstract:

Distribution mismatch caused by various resolutions, backgrounds, etc. can be easily found in multi-sensor systems. Domain adaptation attempts to reduce such domain discrepancy by means of different measurements, e.g., maximum mean discrepancy (MMD). Despite their success, such methods often fail to guarantee the separability of the learned representation. To tackle this issue, we put forward a novel approach to jointly learn both domain-shared and discriminative representations. Specifically, we model the feature discrimination explicitly for two domains. Alternating discriminant optimization is proposed to obtain discriminative features with an l2 constraint in the labeled source domain, and sparse filtering is introduced to capture the intrinsic structures that exist in the unlabeled target domain. Finally, they are integrated in a unified framework along with MMD to align domains. Extensive experiments compared with state-of-the-art methods verify the effectiveness of our method on cross-domain tasks.

11

Song, Mengmeng, Zexiong Zhang, Shungen Xiao, Zicheng Xiong, and Mengwei Li. "Bearing fault diagnosis method using a spatio-temporal neural network based on feature transfer learning." Measurement Science and Technology 34, no. 1 (October 25, 2022): 015119. http://dx.doi.org/10.1088/1361-6501/ac9078.

Abstract:

An intelligent bearing fault diagnosis method requires a large quantity of labeled data. However, in an actual engineering environment, only a tiny amount of unlabeled data can be collected. To solve this problem, we construct a spatio-temporal neural network (STN) model by multi-layer fusion of convolutional neural network (CNN) and long-term memory network features. Then, a model based on feature migration is constructed and an STN is applied as the feature extractor of the network. Finally, the Case Western Reserve University bearing dataset is employed to verify the performance of our proposed model, and the influence of different neural network feature extractors (CNN, recurrent neural network, long- and short-term memory network, STN) and several feature transfer measures [correlation alignment, multiple kernel maximum mean discrepancy, joint maximum mean discrepancy, discriminative joint probability maximum mean discrepancy (DJP-MMD)] on the accuracy of the model is compared. The results show that the diagnostic accuracy of the proposed method is over 98%, and the diagnostic accuracy can be maintained at around 99% in most cases when the signal to noise ratio (SNR) is 10 dB. When the SNR is lower than 2 dB, the accuracy of the STN-DJPMMD model is still over 88%.

12

Wang, Jinrui, Shanshan Ji, Baokun Han, Huaiqian Bao, and Xingxing Jiang. "Deep Adaptive Adversarial Network-Based Method for Mechanical Fault Diagnosis under Different Working Conditions." Complexity 2020 (July 23, 2020): 1–11. http://dx.doi.org/10.1155/2020/6946702.

Abstract:

The demand for transfer learning methods for mechanical fault diagnosis has grown considerably in recent years. However, the existing methods always depend on the maximum mean discrepancy (MMD) to measure the domain discrepancy, and MMD cannot guarantee that the features of different domains are similar enough. Inspired by generative adversarial networks (GAN) and domain adversarial training of neural networks (DANN), this study presents a novel deep adaptive adversarial network (DAAN). The DAAN comprises a condition recognition module and a domain adversarial learning module. The condition recognition module is constructed with a generator to extract features and classify the health condition of machinery automatically. The domain adversarial learning module is achieved with a discriminator based on Wasserstein distance to learn domain-invariant features. Then spectral normalization (SN) is employed to accelerate convergence. The effectiveness of DAAN is demonstrated through three transfer fault diagnosis experiments, and the results show that the DAAN can converge to zero after approximately 15 training epochs, and all the average testing accuracies in each case can achieve over 92%. It is expected that the proposed DAAN can effectively learn domain-invariant features to bridge the discrepancy between the data from different working conditions.

13

Wang, Haoyu, Yuhu Cheng, and Xuesong Wang. "A Novel Hyperspectral Image Classification Method Using Class-Weighted Domain Adaptation Network." Remote Sensing 15, no. 4 (February 10, 2023): 999. http://dx.doi.org/10.3390/rs15040999.

Abstract:

With the development of science and technology, hyperspectral image (HSI) classification has been studied in depth by researchers as one of the important means of human cognition in living environments and the exploration of surface information. Nevertheless, the shortage of labeled samples is a major difficulty in HSI classification. To address this issue, we propose a novel HSI classification method called class-weighted domain adaptation network (CWDAN). First, the convolutional domain adaption network (ConDAN) is designed to align the marginal distributions and second-order statistics, respectively, of both domains via multi-kernel maximum mean discrepancy (MK-MMD) and CORAL loss. Then, the class-weighted MMD (CWMMD) is defined to simultaneously consider the conditional distribution discrepancy and changes of class prior distributions, and the CWMMD-based domain adaptation term is incorporated into the classical broad learning system (BLS) to construct the weighted conditional broad network (WCBN). The WCBN is applied to reduce the conditional distribution discrepancy and class weight bias across domains, while performing breadth expansion on domain-invariant features to further enhance representation ability. In comparison with several existing mainstream methods, CWDAN has excellent classification performance on eight real HSI data pairs when only using labeled source domain samples.

14

Liu, Yi, Hang Xiang, Zhansi Jiang, and Jiawei Xiang. "A Domain Adaption ResNet Model to Detect Faults in Roller Bearings Using Vibro-Acoustic Data." Sensors 23, no. 6 (March 13, 2023): 3068. http://dx.doi.org/10.3390/s23063068.

Abstract:

Intelligent fault diagnosis of roller bearings faces two important problems: one is that training and test datasets are assumed to have the same distribution, and the other is that the installation positions of accelerometer sensors are limited in industrial environments and the collected signals are often polluted by background noise. In recent years, the discrepancy between training and test datasets has been decreased by introducing the idea of transfer learning to solve the first issue. In addition, non-contact sensors will replace contact sensors. In this paper, a domain adaption residual neural network (DA-ResNet) model using maximum mean discrepancy (MMD) and a residual connection is constructed for cross-domain diagnosis of roller bearings based on acoustic and vibration data. MMD is used to minimize the distribution discrepancy between the source and target domains, thereby improving the transferability of the learned features. Acoustic and vibration signals from three directions are simultaneously sampled to provide more complete bearing information. Two experimental cases are conducted to test the ideas presented. The first is to verify the necessity of multi-source data, and the second is to demonstrate that the transfer operation can improve recognition accuracy in fault diagnosis.

15

Xiao, Li, Qi Chen, Shuping Hou, Zhi Yan, and Yiming Tian. "Detection of an Incipient Fault for Dual Three-Phase PMSMs Using a Modified Autoencoder." Electronics 11, no. 22 (November 15, 2022): 3741. http://dx.doi.org/10.3390/electronics11223741.

Abstract:

For the detection of incipient interturn short-circuit (IITSC) faults of machines without shutting them down, there are still shortcomings of insufficient incipient fault features and a high false alarm rate. This is especially the case for dual three-phase permanent magnet synchronous motors (PMSMs) with complex winding structures, and this kind of incipient fault detection is more complicated. To solve this detection difficulty, an IITSC detection method for dual three-phase PMSMs is proposed based on a modified deep autoencoder (MDAE). This autoencoder (AE) adopts an improved distribution metric combined with the maximum mean discrepancy (MMD) and the maximum covariance discrepancy (MCD) to extract the fault feature from the common features, which can improve the feature difference between the normal state and the incipient fault state. Then, the permutation entropy of the extracted features is calculated to detect the IITSC faults. The results illustrate that this method can not only detect IITSC faults online effectively and robustly, but also reduce the false alarm rate of the fault detection for dual three-phase PMSMs.

16

Futami, Futoshi, Zhenghang Cui, Issei Sato, and Masashi Sugiyama. "Bayesian Posterior Approximation via Greedy Particle Optimization." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 3606–13. http://dx.doi.org/10.1609/aaai.v33i01.33013606.

Abstract:

In Bayesian inference, the posterior distributions are difficult to obtain analytically for complex models such as neural networks. Variational inference usually uses a parametric distribution for approximation, from which we can easily draw samples. Recently discrete approximation by particles has attracted attention because of its high expression ability. An example is Stein variational gradient descent (SVGD), which iteratively optimizes particles. Although SVGD has been shown to be computationally efficient empirically, its theoretical properties have not been clarified yet and no finite sample bound of the convergence rate is known. Another example is the Stein points (SP) method, which minimizes kernelized Stein discrepancy directly. Although a finite sample bound is assured theoretically, SP is computationally inefficient empirically, especially in high-dimensional problems. In this paper, we propose a novel method named maximum mean discrepancy minimization by the Frank-Wolfe algorithm (MMD-FW), which minimizes MMD in a greedy way by the FW algorithm. Our method is computationally efficient empirically and we show that its finite sample convergence bound is in a linear order in finite dimensions.

17

Du, Yuntao, Ruiting Zhang, Xiaowen Zhang, Yirong Yao, Hengyang Lu, and Chongjun Wang. "Learning transferable and discriminative features for unsupervised domain adaptation." Intelligent Data Analysis 26, no. 2 (March 14, 2022): 407–25. http://dx.doi.org/10.3233/ida-215813.

Abstract:

Although achieving remarkable progress, it is very difficult to induce a supervised classifier without any labeled data. Unsupervised domain adaptation is able to overcome this challenge by transferring knowledge from a labeled source domain to an unlabeled target domain. Transferability and discriminability are two key criteria for characterizing the superiority of feature representations to enable successful domain adaptation. In this paper, a novel method called learning TransFerable and Discriminative Features for unsupervised domain adaptation (TFDF) is proposed to optimize these two objectives simultaneously. On the one hand, distribution alignment is performed to reduce domain discrepancy and learn more transferable representations. Instead of adopting Maximum Mean Discrepancy (MMD), which only captures the first-order statistical information to measure distribution discrepancy, we adopt a recently proposed statistic called Maximum Mean and Covariance Discrepancy (MMCD), which captures both the first-order and the second-order statistical information in the reproducing kernel Hilbert space (RKHS). On the other hand, we propose to explore both local discriminative information via manifold regularization and global discriminative information via minimizing the proposed class confusion objective to learn more discriminative features. We integrate these two objectives into the Structural Risk Minimization (SRM) framework and learn a domain-invariant classifier. Comprehensive experiments are conducted on five real-world datasets and the results verify the effectiveness of the proposed method.

18

Wang, Z., T. Li, L. Pan, and Z. Kang. "SCENE SEMANTIC SEGMENTATION FROM INDOOR RGB-D IMAGES USING ENCODE-DECODER FULLY CONVOLUTIONAL NETWORKS." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLII-2/W7 (September 12, 2017): 397–404. http://dx.doi.org/10.5194/isprs-archives-xlii-2-w7-397-2017.

Abstract:

With increasing attention to the indoor environment and the development of low-cost RGB-D sensors, indoor RGB-D images are easily acquired. However, scene semantic segmentation is still an open area, which restricts indoor applications. The depth information can help to distinguish regions in indoor scenes that are difficult to segment from the RGB images because of similar color or texture. How to utilize the depth information is the key problem of semantic segmentation for RGB-D images. In this paper, we propose an Encode-Decoder Fully Convolutional Network for RGB-D image classification. We use Multiple Kernel Maximum Mean Discrepancy (MK-MMD) as a distance measure to find common and special features of RGB and D images in the network to enhance classification performance automatically. To explore better methods of applying MMD, we designed two strategies; the first calculates MMD for each feature map, and the other calculates MMD for whole batch features. Based on the classification result, we use fully connected CRFs for the semantic segmentation. The experimental results show that our method can achieve good performance on indoor RGB-D image semantic segmentation.

19

Cheng, Xiuyuan, Alexander Cloninger, and Ronald R. Coifman. "Two-sample statistics based on anisotropic kernels." Information and Inference: A Journal of the IMA 9, no. 3 (December 10, 2019): 677–719. http://dx.doi.org/10.1093/imaiai/iaz018.

Abstract:

The paper introduces a new kernel-based Maximum Mean Discrepancy (MMD) statistic for measuring the distance between two distributions given finitely many multivariate samples. When the distributions are locally low-dimensional, the proposed test can be made more powerful to distinguish certain alternatives by incorporating local covariance matrices and constructing an anisotropic kernel. The kernel matrix is asymmetric; it computes the affinity between $n$ data points and a set of $n_R$ reference points, where $n_R$ can be drastically smaller than $n$. While the proposed statistic can be viewed as a special class of Reproducing Kernel Hilbert Space MMD, the consistency of the test is proved, under mild assumptions of the kernel, as long as $\|p-q\| \sqrt{n} \to \infty$, and a finite-sample lower bound of the testing power is obtained. Applications to flow cytometry and diffusion MRI datasets are demonstrated, which motivate the proposed approach to compare distributions.

20

Chen, Chao, Zhihang Fu, Zhihong Chen, Sheng Jin, Zhaowei Cheng, Xinyu Jin, and Xian-sheng Hua. "HoMM: Higher-Order Moment Matching for Unsupervised Domain Adaptation." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 3422–29. http://dx.doi.org/10.1609/aaai.v34i04.5745.

Abstract:

Minimizing the discrepancy of feature distributions between different domains is one of the most promising directions in unsupervised domain adaptation. From the perspective of moment matching, most existing discrepancy-based methods are designed to match the second-order or lower moments, which, however, have limited expression of statistical characteristics for non-Gaussian distributions. In this work, we propose a Higher-order Moment Matching (HoMM) method, and further extend the HoMM into reproducing kernel Hilbert spaces (RKHS). In particular, our proposed HoMM can perform arbitrary-order moment matching, and we show that the first-order HoMM is equivalent to Maximum Mean Discrepancy (MMD) and the second-order HoMM is equivalent to Correlation Alignment (CORAL). Moreover, HoMM (order ≥ 3) is expected to perform fine-grained domain alignment as higher-order statistics can approximate more complex, non-Gaussian distributions. Besides, we also exploit the pseudo-labeled target samples to learn discriminative representations in the target domain, which further improves the transfer performance. Extensive experiments are conducted, showing that our proposed HoMM consistently outperforms the existing moment matching methods by a large margin. Codes are available at https://github.com/chenchao666/HoMM-Master
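
The first-order claim can be checked numerically in its simplest form: the squared distance between sample means equals the biased squared MMD under a linear kernel. The sketch below shows that identity and, for contrast, an unnormalized CORAL-style second-order gap. It is an illustrative reading of the moment-matching view, not the HoMM code from the linked repository; all function names and the synthetic data are assumptions.

```python
import numpy as np

def first_order_gap(x, y):
    """Squared Euclidean distance between the sample means of x and y."""
    diff = x.mean(axis=0) - y.mean(axis=0)
    return float(diff @ diff)

def linear_mmd2(x, y):
    """Biased squared MMD with the linear kernel k(u, v) = u . v."""
    return float((x @ x.T).mean() + (y @ y.T).mean() - 2.0 * (x @ y.T).mean())

def second_order_gap(x, y):
    """Squared Frobenius distance between covariance matrices.

    CORAL-style, with the usual normalization constant dropped.
    """
    diff = np.cov(x, rowvar=False) - np.cov(y, rowvar=False)
    return float(np.sum(diff**2))

# Hypothetical source and target features: same mean, different spread.
rng = np.random.default_rng(0)
source = rng.normal(0.0, 1.0, size=(300, 3))
target = rng.normal(0.0, 2.0, size=(300, 3))

gap_mean = first_order_gap(source, target)
gap_linear_mmd = linear_mmd2(source, target)
gap_cov = second_order_gap(source, target)
```

Because the two domains here differ only in spread, the first-order gap stays small while the second-order gap is large, which is the kind of mismatch that motivates matching higher moments.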

21

Liu, Jian, and Liming Feng. "Diversity Evolutionary Policy Deep Reinforcement Learning." Computational Intelligence and Neuroscience 2021 (August 3, 2021): 1–11. http://dx.doi.org/10.1155/2021/5300189.

Abstract:

Reinforcement learning algorithms based on policy gradient may fall into local optima due to gradient disappearance during the update process, which in turn affects the exploration ability of the reinforcement learning agent. To solve this problem, this paper combines the cross-entropy method (CEM) in evolution policy, maximum mean discrepancy (MMD), and the twin delayed deep deterministic policy gradient algorithm (TD3) to propose a diversity evolutionary policy deep reinforcement learning (DEPRL) algorithm. By using the maximum mean discrepancy as a measure of the distance between different policies, some of the policies in the population maximize the distance between them and the previous generation of policies while maximizing the cumulative return during the gradient update. Furthermore, combining the cumulative returns and the distance between policies as the fitness of the population encourages more diversity in the offspring policies, which in turn can reduce the risk of falling into local optima due to the disappearance of the gradient. The results in the MuJoCo test environment show that DEPRL has achieved excellent performance on continuous control tasks; especially in the Ant-v2 environment, the return of DEPRL ultimately achieved a nearly 20% improvement compared to TD3.

22

Tahmoresnezhad, Jafar, and Sattar Hashemi. "An Efficient yet Effective Random Partitioning and Feature Weighting Approach for Transfer Learning." International Journal of Pattern Recognition and Artificial Intelligence 30, no. 02 (February 2016): 1651003. http://dx.doi.org/10.1142/s0218001416510034.

Abstract:

One of the serious challenges in machine learning and pattern recognition is to transfer knowledge from related but different domains to a new unlabeled domain. Feature selection with maximum mean discrepancy (f-MMD) is a novel and effective approach to transfer knowledge from a source domain (training set) into a target domain (test set) where training and test sets are drawn from different distributions. However, f-MMD faces serious challenges on datasets with large numbers of samples and features. Moreover, f-MMD ignores the feature-label relation in finding the reduced representation of the dataset. In this paper, we jointly exploit transfer learning and class discrimination to cope with the domain shift problem in which the distribution difference is considerably large. We therefore put forward a novel transfer learning and class discrimination approach, referred to as RandOm k-samplesets feature Weighting Approach (ROWA). Specifically, ROWA reduces the distribution difference across domains in an unsupervised manner where no label is available in the test set. Moreover, ROWA exploits the feature-label relation to separate various classes alongside the domain transfer, and augments the relation of selected features and source domain labels. In this work, we employ disjoint/overlapping small-sized samplesets to iteratively converge to the final solution. Employment of local sets along with a novel optimization problem constructs a robust and effective reduced representation for adaptation across domains. Extensive experiments on real and synthetic datasets verify that ROWA can significantly outperform state-of-the-art transfer learning approaches.

23

Tay, Sebastian Shenghong, Xinyi Xu, Chuan Sheng Foo, and Bryan Kian Hsiang Low. "Incentivizing Collaboration in Machine Learning via Synthetic Data Rewards." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 9 (June 28, 2022): 9448–56. http://dx.doi.org/10.1609/aaai.v36i9.21177.

Abstract:

This paper presents a novel collaborative generative modeling (CGM) framework that incentivizes collaboration among self-interested parties to contribute data to a pool for training a generative model (e.g., GAN), from which synthetic data are drawn and distributed to the parties as rewards commensurate to their contributions. Distributing synthetic data as rewards (instead of trained models or money) offers task- and model-agnostic benefits for downstream learning tasks and is less likely to violate data privacy regulation. To realize the framework, we firstly propose a data valuation function using maximum mean discrepancy (MMD) that values data based on its quantity and quality in terms of its closeness to the true data distribution and provide theoretical results guiding the kernel choice in our MMD-based data valuation function. Then, we formulate the reward scheme as a linear optimization problem that when solved, guarantees certain incentives such as fairness in the CGM framework. We devise a weighted sampling algorithm for generating synthetic data to be distributed to each party as reward such that the value of its data and the synthetic data combined matches its assigned reward value by the reward scheme. We empirically show using simulated and real-world datasets that the parties' synthetic data rewards are commensurate to their contributions.

24

Zhang, Quanling, Ningze Tang, Xing Fu, Hao Peng, Cuimei Bo, and Cunsong Wang. "A Multi-Scale Attention Mechanism Based Domain Adversarial Neural Network Strategy for Bearing Fault Diagnosis." Actuators 12, no. 5 (April 27, 2023): 188. http://dx.doi.org/10.3390/act12050188.

Abstract:

There are a large number of bearings in aircraft engines that are subjected to extreme operating conditions, such as high temperature, high speed, and heavy load, and their fatigue, wear, and other failure problems seriously affect the reliability of the engine. The complex and variable bearing operating conditions can lead to differences in the distribution of data between the source and target operating conditions, as well as insufficient labels. To solve the above challenges, a multi-scale attention mechanism-based domain adversarial neural network strategy for bearing fault diagnosis (MADANN) is proposed and verified using Case Western Reserve University bearing data and PT500mini mechanical bearing data in this paper. First, a multi-scale feature extractor with an attention mechanism is proposed to extract more discriminative multi-scale features of the input signal. Subsequently, the maximum mean discrepancy (MMD) is introduced to measure the difference between the distribution of the target domain and the source domain. Finally, the fault diagnosis process of the rolling bearing is realized by minimizing the loss of the feature classifier, the loss of the MMD distance, and maximizing the loss of the domain discriminator. The verification results indicate that the proposed strategy has stronger learning ability and better diagnosis performance than shallow network, deep network, and commonly used domain adaptive models.

25

Xu, Kun, Shunming Li, Ranran Li, Jiantao Lu, Xianglian Li, and Mengjie Zeng. "Domain Adaptation Network with Double Adversarial Mechanism for Intelligent Fault Diagnosis." Applied Sciences 11, no. 17 (August 28, 2021): 7983. http://dx.doi.org/10.3390/app11177983.

Abstract:

Due to the mechanical equipment working under variable speed and load for a long time, the distribution of samples is different (domain shift). The general intelligent fault diagnosis method has a good diagnostic effect only on samples with the same sample distribution, but cannot correctly predict the faults of samples with domain shift in a real situation. To settle this problem, a new intelligent fault diagnosis method, domain adaptation network with double adversarial mechanism (DAN-DAM), is proposed. The DAN-DAM model is mainly composed of a feature extractor, two label classifiers and a domain discriminator. The feature extractor and the two label classifiers form the first adversarial mechanism to achieve class-level alignment. Moreover, the discrepancy between the two classifiers is measured by Wasserstein distance. Meanwhile, the feature extractor and the domain discriminator form the second adversarial mechanism to realize domain-level alignment. In addition, maximum mean discrepancy (MMD) is used to reduce the distance between the extracted features of two domains. The DAN-DAM model is verified by multiple transfer experiments on some datasets. According to the transfer experiment results, the DAN-DAM model has a good diagnosis effect for the domain shift samples. Moreover, the diagnostic accuracy is generally higher than other mainstream diagnostic methods.

26

Wang, Li, Guoqiang Liu, Chao Zhang, Yu Yang, and Jinhao Qiu. "FEM Simulation-Based Adversarial Domain Adaptation for Fatigue Crack Detection Using Lamb Wave." Sensors 23, no. 4 (February 9, 2023): 1943. http://dx.doi.org/10.3390/s23041943.

Abstract:

Lamb wave-based damage detection technology shows great potential for structural integrity assessment. However, conventional damage features based damage detection methods and data-driven intelligent damage detection methods highly rely on expert knowledge and sufficient labeled data for training, for which collecting is usually expensive and time-consuming. Therefore, this paper proposes an automated fatigue crack detection method using Lamb wave based on finite element method (FEM) and adversarial domain adaptation. FEM-simulation was used to obtain simulated response signals under various conditions to solve the problem of the insufficient labeled data in practice. Due to the distribution discrepancy between simulated signals and experimental signals, the detection performance of classifier just trained with simulated signals will drop sharply on the experimental signals. Then, Domain-adversarial neural network (DANN) with maximum mean discrepancy (MMD) was used to achieve discriminative and domain-invariant feature extraction between simulation source domain and experiment target domain, and the unlabeled experimental signals samples will be accurately classified. The proposed method is validated by fatigue tests on center-hole metal specimens. The results show that the proposed method presents superior detection ability compared to other methods and can be used as an effective tool for cross-domain damage detection.

27

Yang, Bingru, Qi Li, Liang Chen, Changqing Shen, and Sundararajan Natarajan. "Bearing Fault Diagnosis Based on Multilayer Domain Adaptation." Shock and Vibration 2020 (September 29, 2020): 1–11. http://dx.doi.org/10.1155/2020/8873960.

Abstract:

Bearing fault diagnosis plays a vitally important role in practical industrial scenarios. Deep learning-based fault diagnosis methods are usually performed on the hypothesis that the training set and test set obey the same probability distribution, which is hard to satisfy under the actual working conditions. This paper proposes a novel multilayer domain adaptation (MLDA) method, which can diagnose the compound fault and single fault of multiple sizes simultaneously. A specially designed residual network for the fault diagnosis task is pretrained to extract domain-invariant features. The multikernel maximum mean discrepancy (MK-MMD) and pseudo-label learning are adopted in multiple layers to take both marginal distributions and conditional distributions into consideration. A total of 12 transfer tasks in the fault diagnosis problem are conducted to verify the performance of MLDA. Through the comparisons of different signal processing methods, different parameter settings, and different models, it is proved that the proposed MLDA model can effectively extract domain-invariant features and achieve satisfying results.

28

Banerjee, Subhankar, and Shayok Chakraborty. "Deterministic Mini-batch Sequencing for Training Deep Neural Networks." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 8 (May 18, 2021): 6723–31. http://dx.doi.org/10.1609/aaai.v35i8.16831.

Abstract:

Recent advancements in the field of deep learning have dramatically improved the performance of machine learning models in a variety of applications, including computer vision, text mining, speech processing and fraud detection among others. Mini-batch gradient descent is the standard algorithm to train deep models, where mini-batches of a fixed size are sampled randomly from the training data and passed through the network sequentially. In this paper, we present a novel algorithm to generate a deterministic sequence of mini-batches to train a deep neural network (rather than a random sequence). Our rationale is to select a mini-batch by minimizing the Maximum Mean Discrepancy (MMD) between the already selected mini-batches and the unselected training samples. We pose the mini-batch selection as a constrained optimization problem and derive a linear programming relaxation to determine the sequence of mini-batches. To the best of our knowledge, this is the first research effort that uses the MMD criterion to determine a sequence of mini-batches to train a deep neural network. The proposed mini-batch sequencing strategy is deterministic and independent of the underlying network architecture and prediction task. Our extensive empirical analyses on three challenging datasets corroborate the merit of our framework over competing baselines. We further study the performance of our framework on two other applications besides classification (regression and semantic segmentation) to validate its generalizability.

29

Tian, Jinghui, Dongying Han, Lifeng Xiao, and Peiming Shi. "Multi-scale deep coupling convolutional neural network with heterogeneous sensor data for intelligent fault diagnosis." Journal of Intelligent & Fuzzy Systems 41, no. 1 (August 11, 2021): 2225–38. http://dx.doi.org/10.3233/jifs-210932.

Abstract:

With the innovation and development of detection technology, various types of sensors are installed to monitor the operating status of equipment in modern industry. Compared with the same type of sensors for monitoring, heterogeneous sensors can collect more comprehensive complementary fault information. Due to the large distribution differences and serious noise pollution of heterogeneous sensor data collected in industrial sites, this brings certain challenges to the development of heterogeneous data fusion strategies. In view of the large distribution difference in the feature space of heterogeneous data and the difficulty of effective fusion of fault information, this paper presents a multi-scale deep coupling convolutional neural network (MDCN), which is used to map the heterogeneous fault information from different feature spaces to the common spaces for full fusion. Specifically, a multi-scale convolution module (MSC) with multiple filters of different sizes is adopted to extract multi-scale fault features of heterogeneous sensor data. Then, the maximum mean discrepancy (MMD) is applied to measure the distance between different spatial features in the coupling layer, and the common failure information in the heterogeneous data is mined by minimizing MMD to fuse effectively in order to identify the failure state of the device. The validity of this method is verified by the data collected on a first-level parallel gearbox mixed fault experiment platform.

30

Zang, Shaofei, Xinghai Li, Jianwei Ma, Yongyi Yan, Jiwei Gao, and Yuan Wei. "TSTELM: Two-Stage Transfer Extreme Learning Machine for Unsupervised Domain Adaptation." Computational Intelligence and Neuroscience 2022 (July 18, 2022): 1–18. http://dx.doi.org/10.1155/2022/1582624.

Abstract:

As a single-layer feedforward network (SLFN), extreme learning machine (ELM) has been successfully applied for classification and regression in machine learning due to its faster training speed and better generalization. However, it will perform poorly for domain adaptation in which the distributions between training data and testing data are inconsistent. In this article, we propose a novel ELM called two-stage transfer extreme learning machine (TSTELM) to solve this problem. At the statistical matching stage, we adopt maximum mean discrepancy (MMD) to narrow the distribution difference of the output layer between domains. In addition, at the subspace alignment stage, we align the source and target model parameters, design target cross-domain mean approximation, and add the output weight approximation to further promote the knowledge transferring across domains. Moreover, the prediction of test sample is jointly determined by the ELM parameters generated at the two stages. Finally, we investigate the proposed approach in classification task and conduct experiments on four public domain adaptation datasets. The result indicates that TSTELM could effectively enhance the knowledge transfer ability of ELM with higher accuracy than other existing transfer and non-transfer classifiers.

31

Sun, Wei, Jie Zhou, Bintao Sun, Yuqing Zhou, and Yongying Jiang. "Markov Transition Field Enhanced Deep Domain Adaptation Network for Milling Tool Condition Monitoring." Micromachines 13, no. 6 (May 31, 2022): 873. http://dx.doi.org/10.3390/mi13060873.

Abstract:

Tool condition monitoring (TCM) is of great importance for improving the manufacturing efficiency and surface quality of workpieces. Data-driven machine learning methods are widely used in TCM and have achieved many good results. However, in actual industrial scenes, labeled data are not available in time in the target domain, which significantly affects the performance of data-driven methods. To overcome this problem, a new TCM method combining the Markov transition field (MTF) and the deep domain adaptation network (DDAN) is proposed. A few vibration signals collected in the TCM experiments were represented in 2D images through MTF to enrich the features of the raw signals. The transferred ResNet50 was used to extract deep features of these 2D images. DDAN was employed to extract deep domain-invariant features between the source and target domains, in which the maximum mean discrepancy (MMD) is applied to measure the distance between two different distributions. TCM experiments show that the proposed method significantly outperforms the other three benchmark methods and is more robust under varying working conditions.

32

Ding, Renjie, Xue Li, Lanshun Nie, Jiazhen Li, Xiandong Si, Dianhui Chu, Guozhong Liu, and Dechen Zhan. "Empirical Study and Improvement on Deep Transfer Learning for Human Activity Recognition." Sensors 19, no. 1 (December 24, 2018): 57. http://dx.doi.org/10.3390/s19010057.

Abstract:

Human activity recognition (HAR) based on sensor data is a significant problem in pervasive computing. In recent years, deep learning has become the dominating approach in this field, due to its high accuracy. However, it is difficult to accurately identify the activities of one individual using a model trained on data from other users. The decline in recognition accuracy restricts activity recognition in practice. At present, there is little research on transferring deep learning models in this field. To our knowledge, this is the first empirical study of deep transfer learning between users with unlabeled target data. We compared several widely used algorithms and found that the Maximum Mean Discrepancy (MMD) method is the most suitable for HAR. We studied the distribution of features generated from sensor data. We improved the existing method from the aspect of feature distribution with center loss and obtained better results. The observations and insights in this study have deepened the understanding of transfer learning in the activity recognition field and provided guidance for further research.

33

Wang, Kai, Wei Zhao, Aidong Xu, Peng Zeng, and Shunkun Yang. "One-Dimensional Multi-Scale Domain Adaptive Network for Bearing-Fault Diagnosis under Varying Working Conditions." Sensors 20, no. 21 (October 23, 2020): 6039. http://dx.doi.org/10.3390/s20216039.

Abstract:

Data-driven bearing-fault diagnosis methods have become a research hotspot recently. These methods have to meet two premises: (1) the distributions of the data to be tested and the training data are the same; (2) there are a large number of high-quality labeled data. However, machines usually work under different working conditions in practice, which challenges these prerequisites due to the fact that the data distributions under different working conditions are different. In this paper, the one-dimensional Multi-Scale Domain Adaptive Network (1D-MSDAN) is proposed to address this issue. The 1D-MSDAN is a kind of deep transfer model, which uses both feature adaptation and classifier adaptation to guide the multi-scale convolutional neural network to perform bearing-fault diagnosis under varying working conditions. Feature adaptation is performed by both multi-scale feature adaptation and multi-level feature adaptation, which helps in finding domain-invariant features by minimizing the distribution discrepancy between different working conditions by using the Multi-kernel Maximum Mean Discrepancy (MK-MMD). Furthermore, classifier adaptation is performed by entropy minimization in the target domain to bridge the source classifier and target classifier to further eliminate domain discrepancy. The Case Western Reserve University (CWRU) bearing database is used to validate the proposed 1D-MSDAN. The experimental results show that the diagnostic accuracy for the 12 transfer tasks performed by 1D-MSDAN was superior to that of the mainstream transfer learning models for bearing-fault diagnosis under variable working conditions. In addition, the transfer learning performance of 1D-MSDAN for multi-target domain adaptation and real industrial scenarios was also verified.

34

Park, Hyo-Seok, Seong-Joong Kim, Andrew L. Stewart, Seok-Woo Son, and Kyong-Hwan Seo. "Mid-Holocene Northern Hemisphere warming driven by Arctic amplification." Science Advances 5, no. 12 (December 2019): eaax8203. http://dx.doi.org/10.1126/sciadv.aax8203.

Abstract:

The Holocene thermal maximum was characterized by strong summer solar heating that substantially increased the summertime temperature relative to preindustrial climate. However, the summer warming was compensated by weaker winter insolation, and the annual mean temperature of the Holocene thermal maximum remains ambiguous. Using multimodel mid-Holocene simulations, we show that the annual mean Northern Hemisphere temperature is strongly correlated with the degree of Arctic amplification and sea ice loss. Additional model experiments show that the summer Arctic sea ice loss persists into winter and increases the mid- and high-latitude temperatures. These results are evaluated against four proxy datasets to verify that the annual mean northern high-latitude temperature during the mid-Holocene was warmer than the preindustrial climate, because of the seasonally rectified temperature increase driven by the Arctic amplification. This study offers a resolution to the “Holocene temperature conundrum”, a well-known discrepancy between paleo-proxies and climate model simulations of Holocene thermal maximum.

35

Sun, Han, Xinyi Chen, Ling Wang, Dong Liang, Ningzhong Liu, and Huiyu Zhou. "C2DAN: An Improved Deep Adaptation Network with Domain Confusion and Classifier Adaptation." Sensors 20, no. 12 (June 26, 2020): 3606. http://dx.doi.org/10.3390/s20123606.

Abstract:

Deep neural networks have been successfully applied in domain adaptation, which uses the labeled data of the source domain to supplement useful information for the target domain. Deep Adaptation Network (DAN) is one of these efficient frameworks; it utilizes Multi-Kernel Maximum Mean Discrepancy (MK-MMD) to align the feature distribution in a reproducing kernel Hilbert space. However, DAN does not perform very well in feature level transfer, and the assumption that source and target domain share classifiers is too strict in different adaptation scenarios. In this paper, we further improve the adaptability of DAN by incorporating Domain Confusion (DC) and Classifier Adaptation (CA). To achieve this, we propose a novel domain adaptation method named C2DAN. Our approach first enables Domain Confusion (DC) by using a domain discriminator for adversarial training. For Classifier Adaptation (CA), a residual block is added to the source domain classifier in order to learn the difference between source classifier and target classifier. Beyond validating our framework on the standard domain adaptation dataset Office-31, we also introduce and evaluate on the Comprehensive Cars (CompCars) dataset, and the experiment results demonstrate the effectiveness of the proposed framework C2DAN.
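
As a rough illustration of the multi-kernel MMD mentioned in this abstract, the following NumPy sketch computes a biased squared-MMD estimate under an equally weighted mixture of Gaussian kernels. The equal weights, the bandwidth values and the names `pairwise_sq_dists`, `multi_kernel` and `mk_mmd2` are assumptions made for illustration; the sketch does not reproduce how DAN or C2DAN choose or learn kernel weights.

```python
import numpy as np

def pairwise_sq_dists(a, b):
    """Squared Euclidean distances between rows of a and rows of b."""
    a2 = np.sum(a ** 2, axis=1)[:, np.newaxis]
    b2 = np.sum(b ** 2, axis=1)[np.newaxis, :]
    # Clamp tiny negative values caused by floating-point cancellation.
    return np.maximum(a2 + b2 - 2.0 * (a @ b.T), 0.0)

def multi_kernel(a, b, bandwidths):
    """Equally weighted average of Gaussian kernels (weights are an assumption)."""
    d2 = pairwise_sq_dists(a, b)
    return sum(np.exp(-d2 / (2.0 * s ** 2)) for s in bandwidths) / len(bandwidths)

def mk_mmd2(source, target, bandwidths=(0.5, 1.0, 2.0, 4.0)):
    """Biased estimate of squared MMD under the kernel mixture."""
    k_ss = multi_kernel(source, source, bandwidths)
    k_tt = multi_kernel(target, target, bandwidths)
    k_st = multi_kernel(source, target, bandwidths)
    return float(k_ss.mean() + k_tt.mean() - 2.0 * k_st.mean())

rng = np.random.default_rng(0)
src = rng.normal(size=(200, 5))                 # stand-in source features
tgt_same = rng.normal(size=(200, 5))            # same distribution as src
tgt_shifted = rng.normal(loc=2.0, size=(200, 5))  # mean-shifted "target domain"
print(mk_mmd2(src, tgt_same), mk_mmd2(src, tgt_shifted))
```

In a deep adaptation setting a quantity of this kind would be computed on hidden-layer features of source and target mini-batches and added to the classification loss; mixing several bandwidths avoids committing to a single kernel scale.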

36

Zhang, Yongchao, Zhaohui Ren, and Shihua Zhou. "A New Deep Convolutional Domain Adaptation Network for Bearing Fault Diagnosis under Different Working Conditions." Shock and Vibration 2020 (July 24, 2020): 1–14. http://dx.doi.org/10.1155/2020/8850976.

Abstract:

Effective fault diagnosis methods can ensure the safe and reliable operation of the machines. In recent years, deep learning technology has been applied to diagnose various mechanical equipment faults. However, in real industries, the data distribution under different working conditions is often different, which leads to serious degradation of diagnostic performance. In order to solve the issue, this study proposes a new deep convolutional domain adaptation network (DCDAN) method for bearing fault diagnosis. This method implements cross-domain fault diagnosis by using the labeled source domain data and the unlabeled target domain data as training data. In DCDAN, firstly, a convolutional neural network is applied to extract features of source domain data and target domain data. Then, the domain distribution discrepancy is reduced through minimizing probability distribution distance of multiple kernel maximum mean discrepancies (MK-MMD) and maximizing the domain recognition error of domain classifier. Finally, the source domain classification error is minimized. Extensive experiments on two rolling bearing datasets verify that the proposed method can implement accurate cross-domain fault diagnosis under different working conditions. The study may provide a promising tool for bearing fault diagnosis under different working conditions.

37

Zhang, Jun, Wen Yao, Xiaoqian Chen, and Ling Feng. "Transferable Post-hoc Calibration on Pretrained Transformers in Noisy Text Classification." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 11 (June 26, 2023): 13940–48. http://dx.doi.org/10.1609/aaai.v37i11.26632.

Abstract:

Recent work has demonstrated that pretrained transformers are overconfident in text classification tasks, which can be calibrated by the famous post-hoc calibration method temperature scaling (TS). Character or word spelling mistakes are frequently encountered in real applications and greatly threaten transformer model safety. Research on calibration under noisy settings is rare, and we focus on this direction. Based on a toy experiment, we discover that TS performs poorly when the datasets are perturbed by slight noise, such as swapping the characters, which results in distribution shift. We further utilize two metrics, predictive uncertainty and maximum mean discrepancy (MMD), to measure the distribution shift between clean and noisy datasets, based on which we propose a simple yet effective transferable TS method for calibrating models dynamically. To evaluate the performance of the proposed methods under noisy settings, we construct a benchmark consisting of four noise types and five shift intensities based on the QNLI, AG-News, and Emotion tasks. Experimental results on the noisy benchmark show that (1) the metrics are effective in measuring distribution shift and (2) transferable TS can significantly decrease the expected calibration error (ECE) compared with the competitive baseline ensemble TS by approximately 46.09%.
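
The two standard ingredients named in this abstract, temperature scaling and expected calibration error, can be sketched as follows. This is only the textbook form of each, not the transferable TS method the paper proposes; the function names and the choice of 10 equal-width confidence bins are assumptions.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Row-wise softmax of logits / temperature (temperature > 1 softens)."""
    z = logits / temperature
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def expected_calibration_error(probs, labels, n_bins=10):
    """Binned ECE: bin-weighted mean of |accuracy - confidence|."""
    conf = probs.max(axis=1)
    pred = probs.argmax(axis=1)
    correct = (pred == labels).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lower, upper in zip(edges[:-1], edges[1:]):
        in_bin = (conf > lower) & (conf <= upper)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - conf[in_bin].mean())
            ece += in_bin.mean() * gap  # weight by fraction of samples in bin
    return float(ece)

logits = np.array([[4.0, 0.0], [3.0, 1.0], [0.5, 2.5], [2.0, 0.0]])
labels = np.array([0, 1, 1, 0])
for t in (1.0, 2.0):
    print(t, expected_calibration_error(softmax(logits, t), labels))
```

In practice the temperature is a single scalar fitted on held-out data after training, so it changes confidences but not the predicted classes.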

38

Paul, A., K. Vogt, F. Rottensteiner, J. Ostermann, and C. Heipke. "A COMPARISON OF TWO STRATEGIES FOR AVOIDING NEGATIVE TRANSFER IN DOMAIN ADAPTATION BASED ON LOGISTIC REGRESSION." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLII-2 (May 30, 2018): 845–52. http://dx.doi.org/10.5194/isprs-archives-xlii-2-845-2018.

Abstract:

In this paper we deal with the problem of measuring the similarity between training and test datasets in the context of transfer learning (TL) for image classification. TL tries to transfer knowledge from a source domain, where labelled training samples are abundant but the data may follow a different distribution, to a target domain, where labelled training samples are scarce or even unavailable, assuming that the domains are related. Thus, the requirements w.r.t. the availability of labelled training samples in the target domain are reduced. In particular, if no labelled target data are available, it is inherently difficult to find a robust measure of relatedness between the source and target domains. This is of crucial importance for the performance of TL, because the knowledge transfer between unrelated data may lead to negative transfer, i.e. to a decrease of classification performance after transfer. We address the problem of measuring the relatedness between source and target datasets and investigate three different strategies to predict and, consequently, to avoid negative transfer in this paper. The first strategy is based on circular validation. The second strategy relies on the Maximum Mean Discrepancy (MMD) similarity metric, whereas the third one is an extension of MMD which incorporates the knowledge about the class labels in the source domain. Our method is evaluated using two different benchmark datasets. The experiments highlight the strengths and weaknesses of the investigated methods. We also show that it is possible to reduce the amount of negative transfer using these strategies for a TL method and to generate a consistent performance improvement over the whole dataset.

39

Li, Zhaokui, Xiangyi Tang, Wei Li, Chuanyun Wang, Cuiwei Liu, and Jinrong He. "A Two-stage Deep Domain Adaptation Method for Hyperspectral Image Classification." Remote Sensing 12, no. 7 (March 25, 2020): 1054. http://dx.doi.org/10.3390/rs12071054.

Abstract:

Deep learning has attracted extensive attention in the field of hyperspectral image (HSI) classification. However, supervised deep learning methods heavily rely on a large amount of label information. To address this problem, in this paper, we propose a two-stage deep domain adaptation method for hyperspectral image classification, which can minimize the data shift between two domains and learn a more discriminative deep embedding space with very few labeled target samples. A deep embedding space is first learned by minimizing the distance between the source domain and the target domain based on the Maximum Mean Discrepancy (MMD) criterion. The Spatial–Spectral Siamese Network is then exploited to reduce the data shift and learn a more discriminative deep embedding space by minimizing the distance between samples from different domains but the same class label and maximizing the distance between samples from different domains and class labels based on pairwise loss. For the classification task, the softmax layer is replaced with a linear support vector machine, in which learning minimizes a margin-based loss instead of the cross-entropy loss. The experimental results on two sets of hyperspectral remote sensing images show that the proposed method can outperform several state-of-the-art methods.

40

Chen, Zhihong, Taiping Yao, Kekai Sheng, Shouhong Ding, Ying Tai, Jilin Li, Feiyue Huang, and Xinyu Jin. "Generalizable Representation Learning for Mixture Domain Face Anti-Spoofing." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 2 (May 18, 2021): 1132–39. http://dx.doi.org/10.1609/aaai.v35i2.16199.

Abstract:

Face anti-spoofing approaches based on domain generalization (DG) have drawn growing attention due to their robustness for unseen scenarios. Existing DG methods assume that the domain label is known. However, in real-world applications, the collected dataset always contains mixture domains, where the domain label is unknown. In this case, most existing methods may not work. Further, even if we can obtain the domain label as existing methods do, we think this is just a sub-optimal partition. To overcome the limitation, we propose domain dynamic adjustment meta-learning (D²AM) without using domain labels, which iteratively divides mixture domains via discriminative domain representation and trains a generalizable face anti-spoofing model with meta-learning. Specifically, we design a domain feature based on Instance Normalization (IN) and propose a domain representation learning module (DRLM) to extract discriminative domain features for clustering. Moreover, to reduce the side effect of outliers on clustering performance, we additionally utilize maximum mean discrepancy (MMD) to align the distribution of sample features to a prior distribution, which improves the reliability of clustering. Extensive experiments show that the proposed method outperforms conventional DG-based face anti-spoofing methods, including those utilizing domain labels. Furthermore, we enhance the interpretability through visualization.


41

Nguyen-Tang, Thanh, Sunil Gupta, and Svetha Venkatesh. "Distributional Reinforcement Learning via Moment Matching." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 10 (May 18, 2021): 9144–52. http://dx.doi.org/10.1609/aaai.v35i10.17104.


Abstract:

We consider the problem of learning a set of probability distributions from the empirical Bellman dynamics in distributional reinforcement learning (RL), a class of state-of-the-art methods that estimate the distribution, as opposed to only the expectation, of the total return. We formulate a method that learns a finite set of statistics from each return distribution via neural networks, as in the distributional RL literature. Existing distributional RL methods, however, constrain the learned statistics to predefined functional forms of the return distribution, which is both restrictive in representation and makes the predefined statistics difficult to maintain. Instead, we learn unrestricted statistics, i.e., deterministic (pseudo-)samples, of the return distribution by leveraging a technique from hypothesis testing known as maximum mean discrepancy (MMD), which leads to a simpler objective amenable to backpropagation. Our method can be interpreted as implicitly matching all orders of moments between a return distribution and its Bellman target. We establish sufficient conditions for the contraction of the distributional Bellman operator and provide finite-sample analysis for the deterministic samples in distribution approximation. Experiments on the suite of Atari games show that our method outperforms the standard distributional RL baselines and sets a new record in the Atari games for non-distributed agents.


42

Zhang, Qiyang, Zhibin Zhao, Xingwu Zhang, Yilong Liu, Xiaolei Yu, and Xuefeng Chen. "Short-time consistent domain adaptation for rolling bearing fault diagnosis under varying working conditions." Measurement Science and Technology 33, no. 7 (April 6, 2022): 075105. http://dx.doi.org/10.1088/1361-6501/ac5874.


Abstract:

Although traditional deep learning improves the accuracy of intelligent fault diagnosis, it suffers from the problem that a change in working conditions may reduce the diagnostic accuracy. The reason for this phenomenon is that a change of working conditions influences the probability distributions. To solve this problem, domain adaptation is adopted to perform intelligent fault diagnosis. However, the design of regularization methods, such as maximum mean discrepancy (MMD), neglects the phenomenon of fault extension. Considering the property of fault extension, the paper sums up a concept named short-time consistency, which means that ‘during stable operation, a failure does not expand over a short time period.’ Moreover, short-time consistent regularization is proposed to ensure that the output of the model meets the requirement for short-time consistency, and closed-set regularization is proposed to further solve the problem of ‘types of label drop’ when short-time consistent regularization is used. When this problem occurs, the number of predicted label types in the target domain is smaller than that in the source domain in closed-set domain adaptation. Two types of regularization, namely entropy-based regularization and regularization based on the L2 norm, are easily adopted in the final loss function. The proposed method is verified by experiments.


43

Hussein, Amir, and Hazem Hajj. "Domain Adaptation with Representation Learning and Nonlinear Relation for Time Series." ACM Transactions on Internet of Things 3, no. 2 (May 31, 2022): 1–26. http://dx.doi.org/10.1145/3502905.


Abstract:

In many real-world scenarios, machine learning models fall short in prediction performance due to data characteristics changing from training on one source domain to testing on a target domain. There has been extensive research to address this problem with Domain Adaptation (DA) for learning domain-invariant features. However, when considering advances for time series, those methods remain limited to the use of hard parameter sharing (HPS) between source and target models, and the use of a domain adaptation objective function. To address these challenges, we propose a soft parameter sharing (SPS) DA architecture with representation learning, while modeling the relation between parameters of source and target models as non-linear and modeling the adaptation loss function as the squared Maximum Mean Discrepancy (MMD). The proposed architecture advances the state-of-the-art for time series in the context of activity recognition and in fields with other modalities, where SPS has been limited to a linear relation. An additional contribution of our work is to provide a study that demonstrates the strengths and limitations of HPS versus SPS. Experiment results showed the success of the method in three domain adaptation cases of multivariate time series activity recognition with different users and sensors.
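As a reader's aid, the squared MMD used as an adaptation loss in this and several other entries can be sketched as follows. This is a minimal illustration, not the authors' implementation: the Gaussian kernel, the fixed `bandwidth`, the biased (V-statistic) estimator, and the function names `gaussian_kernel` and `squared_mmd` are all assumptions chosen for clarity.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    # Pairwise squared Euclidean distances via the expansion |x|^2 + |y|^2 - 2 x.y
    sq_dists = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def squared_mmd(source, target, bandwidth=1.0):
    """Biased estimate of MMD^2: mean k(s, s') + mean k(t, t') - 2 * mean k(s, t)."""
    k_ss = gaussian_kernel(source, source, bandwidth)
    k_tt = gaussian_kernel(target, target, bandwidth)
    k_st = gaussian_kernel(source, target, bandwidth)
    return float(k_ss.mean() + k_tt.mean() - 2.0 * k_st.mean())
```

Here `source` and `target` are assumed to be 2-D arrays of feature vectors (one row per sample). The estimate is close to zero when both sets come from the same distribution and grows as the distributions move apart, which is why it can serve as a domain-alignment penalty.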


44

He, Yiwei, Yingjie Tian, Jingjing Tang, and Yue Ma. "Unsupervised Domain Adaptation Using Exemplar-SVMs with Adaptation Regularization." Complexity 2018 (2018): 1–13. http://dx.doi.org/10.1155/2018/8425821.


Abstract:

Domain adaptation has recently attracted attention for visual recognition. It assumes that source and target domain data are drawn from the same feature space but different marginal distributions, and its motivation is to utilize the source domain instances to assist in training a robust classifier for target domain tasks. Previous studies always focus on reducing the distribution mismatch across domains. However, in many real-world applications, there also exist problems of sample selection bias among instances in a domain; this would reduce the generalization performance of learners. To address this issue, we propose a novel model named Domain Adaptation Exemplar Support Vector Machines (DAESVMs) based on exemplar support vector machines (exemplar-SVMs). Our approach aims to address the problems of sample selection bias and domain adaptation simultaneously. Compared with usual domain adaptation problems, we go a step further in relaxing the i.i.d. assumption. First, we formulate the DAESVMs, training classifiers while reducing the Maximum Mean Discrepancy (MMD) among domains by mapping data into a latent space and preserving properties of the original data, and then we integrate the classifiers to make a prediction for target domain instances. Our experiments were conducted on the Office and Caltech10 datasets and verify the effectiveness of the proposed model.


45

Zhu, Qiuyu, Liheng Hu, and Rui Wang. "Image Clustering Algorithm Based on Predefined Evenly-Distributed Class Centroids and Composite Cosine Distance." Entropy 24, no. 11 (October 26, 2022): 1533. http://dx.doi.org/10.3390/e24111533.


Abstract:

Clustering algorithms based on deep neural networks perform clustering by obtaining the optimal feature representation. However, in the face of complex natural images, the cluster accuracy of existing clustering algorithms is still relatively low. This paper presents an image clustering algorithm based on predefined evenly-distributed class centroids (PEDCC) and composite cosine distance. Compared with the currently popular auto-encoder structure, we design an encoder-only network structure with normalized latent features, and two effective loss functions in latent feature space by replacing the Euclidean distance with a composite cosine distance. We find that (1) contrastive learning plays a key role in the clustering algorithm and greatly improves the quality of learning latent features; (2) compared with the Euclidean distance, the composite cosine distance can be more suitable for the normalized latent features and PEDCC-based Maximum Mean Discrepancy (MMD) loss function; and (3) for complex natural images, a self-supervised pretrained model can be used to effectively improve clustering performance. Several experiments have been carried out on six common data sets: MNIST, Fashion-MNIST, COIL20, CIFAR-10, STL-10 and ImageNet-10. Experimental results show that our method achieves the best clustering performance compared with other recent clustering algorithms.


46

Ye, Fei, and Adrian G. Bors. "Lifelong Compression Mixture Model via Knowledge Relationship Graph." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 9 (June 26, 2023): 10900–10908. http://dx.doi.org/10.1609/aaai.v37i9.26292.


Abstract:

Task-Free Continual Learning (TFCL) represents a challenging scenario for lifelong learning because the model, under this paradigm, does not access any task information. The Dynamic Expansion Model (DEM) has shown promising results in this scenario due to its scalability and generalisation power. However, DEM focuses only on addressing forgetting and ignores minimizing the model size, which limits its deployment in practical systems. In this work, we aim to simultaneously address network forgetting and model size optimization by developing the Lifelong Compression Mixture Model (LGMM) equipped with the Maximum Mean Discrepancy (MMD) based expansion criterion for model expansion. A diversity-aware sample selection approach is proposed to selectively store a variety of samples to promote information diversity among the components of the LGMM, which allows more knowledge to be captured with an appropriate model size. In order to avoid having multiple components with similar knowledge in the LGMM, we propose a data-free component discarding mechanism that evaluates a knowledge relation graph matrix describing the relevance between each pair of components. A greedy selection procedure is proposed to identify and remove the redundant components from the LGMM. The proposed discarding mechanism can be performed during or after the training. Experiments on different datasets show that LGMM achieves the best performance for TFCL.


47

Zhang, Yizong, Shaobo Li, Qiuchen He, Ansi Zhang, Chuanjiang Li, and Zihao Liao. "An Intelligent Fault Detection Framework for FW-UAV Based on Hybrid Deep Domain Adaptation Networks and the Hampel Filter." International Journal of Intelligent Systems 2023 (June 7, 2023): 1–19. http://dx.doi.org/10.1155/2023/6608967.


Abstract:

Fixed-wing unmanned aerial vehicles (FW-UAVs) play an essential role in many fields, but faults in FW-UAV components frequently lead to severe accidents, so there is a need to continuously explore more intelligent fault detection methods to improve the safety and reliability of FW-UAVs. Deep learning provides advanced solution ideas for future UAV fault detection, but the current lack of UAV monitoring data limits the advantages of deep learning in UAV fault detection, which is both a challenge and an opportunity. In this paper, we mainly consider the data availability of deep learning under various practical flight conditions of FW-UAVs and propose a fault detection framework based on hybrid deep domain adaptation BiLSTM networks and the Hampel filter (HDBNH), the main purpose of which is to learn the knowledge of acquired data for detecting FW-UAV faults in other unknown operating conditions. HDBNH consists of three modules: feature extractor, domain adaptor, and fault detector. The feature extractor consists of two BiLSTM networks constructed to extract the past and future state features from the time-series flight data. The discrepancy of feature distribution between different domains is effectively reduced in the domain adaptor by a hybrid adversarial and maximum mean discrepancy (MMD) domain adaptation method. The fault detector consists of a fault classification module and a Hampel filter. According to the continuous and dynamic characteristics of FW-UAV state changes, the Hampel filter is used to detect and correct the predicted values of the fault classification module. Meanwhile, a new state sample preparation strategy is proposed to better support the work of HDBNH. Finally, the effectiveness of HDBNH is confirmed by conducting extensive experiments on real FW-UAV flight data.
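For readers unfamiliar with the Hampel filter mentioned in this abstract, a generic textbook version can be sketched as below. The abstract does not give the paper's window size or threshold, so `half_window`, `n_sigmas`, and the function name `hampel_filter` are hypothetical choices made only for illustration.

```python
import numpy as np


def hampel_filter(values, half_window=3, n_sigmas=3.0):
    """Replace points that deviate strongly from the local median.

    Returns the filtered series and a boolean mask of the replaced points.
    """
    values = np.asarray(values, dtype=float)
    filtered = values.copy()
    outliers = np.zeros(values.shape[0], dtype=bool)
    scale = 1.4826  # makes the MAD comparable to a standard deviation for Gaussian data

    for i in range(values.shape[0]):
        # Sliding window centred on i, truncated at the ends of the series.
        start = max(0, i - half_window)
        stop = min(values.shape[0], i + half_window + 1)
        window = values[start:stop]

        median = np.median(window)
        mad = scale * np.median(np.abs(window - median))

        # Flag the point if it lies more than n_sigmas scaled MADs from the median.
        if np.abs(values[i] - median) > n_sigmas * mad:
            filtered[i] = median
            outliers[i] = True

    return filtered, outliers
```

Applied to a sequence of predicted states, an isolated prediction that disagrees with its neighbours is replaced by the local median, which matches the abstract's description of detecting and correcting predicted values.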


48

Li, Xianling, Kai Zhang, Weijun Li, Yi Feng, and Ruonan Liu. "A Two-Stage Transfer Regression Convolutional Neural Network for Bearing Remaining Useful Life Prediction." Machines 10, no. 5 (May 12, 2022): 369. http://dx.doi.org/10.3390/machines10050369.


Abstract:

Recently, deep learning techniques have been successfully used for bearing remaining useful life (RUL) prediction. However, the degradation patterns of bearings can differ greatly from each other, so a trained model usually does not work well for RUL prediction of a new bearing. As a method that can adapt a model trained on source datasets to a different but related unlabeled target dataset, transfer learning shows the potential to solve this problem. Therefore, we propose a two-stage transfer regression (TR)-based bearing RUL prediction method. Firstly, the incipient fault point (IFP) is detected by a convolutional neural network (CNN) classifier to identify the start time of the degradation stage and label the training samples. Then, a transfer regression CNN with multiple losses is constructed for RUL prediction, including regression loss, classification loss, maximum mean discrepancy (MMD) and regularization loss, which can not only extract fault information from the fault classification loss for RUL prediction, but also minimize the probability distribution distance, thus helping the method to be trained in a domain-invariant way via the transfer regression algorithm. Finally, real data collected from run-to-failure bearing experiments are analyzed by the TR-based CNN method. The results and comparisons with state-of-the-art methods demonstrate the superiority and reliable performance of the proposed method for bearing RUL prediction.


49

Tong, Zhe, Wei Li, Bo Zhang, and Meng Zhang. "Bearing Fault Diagnosis Based on Domain Adaptation Using Transferable Features under Different Working Conditions." Shock and Vibration 2018 (June 28, 2018): 1–12. http://dx.doi.org/10.1155/2018/6714520.


Abstract:

Bearing failure is the most common failure mode in rotating machinery and can result in large financial losses or even casualties. However, complex structures around the bearing and variable working conditions in practice can lead to a large distribution difference in vibration signals between a training set and a test set, which causes the accuracy of fault diagnosis to drop. Thus, how to efficiently improve the performance of bearing fault diagnosis under different working conditions is always a primary challenge. In this paper, a novel method for bearing fault diagnosis under different working conditions is proposed based on domain adaptation using transferable features (DATF). The datasets of normal and faulty bearings are obtained through the fast Fourier transformation (FFT) of raw vibration signals under different motor speeds and load conditions. Then we reduce the differences in marginal and conditional distributions across domains simultaneously, based on maximum mean discrepancy (MMD) in feature space, by refining pseudo test labels, which can be obtained by the nearest-neighbor (NN) classifier built on training data; a robust transferable feature representation for training and test domains is then achieved after several iterations. With the help of the NN classifier trained on transferable features, bearing fault categories are finally identified accurately. Extensive experimental results show that the proposed method can identify bearing faults accurately under different working conditions and clearly outperforms competing approaches.


50

Ayalew, Melese, Shijie Zhou, Imran Memon, Md Belal Bin Heyat, Faijan Akhtar, and Xiaojuan Zhang. "View-Invariant Spatiotemporal Attentive Motion Planning and Control Network for Autonomous Vehicles." Machines 10, no. 12 (December 9, 2022): 1193. http://dx.doi.org/10.3390/machines10121193.


Abstract:

Autonomous driving vehicles (ADVs) are sleeping giant intelligent machines that perceive their environment and make driving decisions. Most existing ADSs are built as hand-engineered perception-planning-control pipelines. However, designing generalized handcrafted rules for autonomous driving in an urban environment is complex. An alternative approach is imitation learning (IL) from human driving demonstrations. However, most previous studies on IL for autonomous driving face several critical challenges: (1) poor generalization ability toward the unseen environment due to distribution shift problems such as changes in driving views and weather conditions; (2) lack of interpretability; and (3) mostly trained to learn the single driving task. To address these challenges, we propose a view-invariant spatiotemporal attentive planning and control network for autonomous vehicles. The proposed method first extracts spatiotemporal representations from images of a front and top driving view sequence through attentive Siamese 3DResNet. Then, the maximum mean discrepancy loss (MMD) is employed to minimize spatiotemporal discrepancies between these driving views and produce an invariant spatiotemporal representation, which reduces domain shift due to view change. Finally, the multitasking learning (MTL) method is employed to jointly train trajectory planning and high-level control tasks based on learned representations and previous motions. Results of extensive experimental evaluations on a large autonomous driving dataset with various weather/lighting conditions verified that the proposed method is effective for feasible motion planning and control in autonomous vehicles.

