Journal articles on the topic 'Out-of-distribution generalization'


Consult the top 50 journal articles for your research on the topic 'Out-of-distribution generalization.'


1

Ye, Nanyang, Lin Zhu, Jia Wang, Zhaoyu Zeng, Jiayao Shao, Chensheng Peng, Bikang Pan, Kaican Li, and Jun Zhu. "Certifiable Out-of-Distribution Generalization." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 9 (June 26, 2023): 10927–35. http://dx.doi.org/10.1609/aaai.v37i9.26295.

Abstract:
Machine learning methods suffer from test-time performance degeneration when faced with out-of-distribution (OoD) data whose distribution is not necessarily the same as the training data distribution. Although a plethora of algorithms have been proposed to mitigate this issue, it has been demonstrated that achieving better performance than empirical risk minimization (ERM) simultaneously on different types of distributional shift datasets is challenging for existing approaches. Moreover, without theoretical guarantees it is unknown how and to what extent these methods work on any given OoD datum. In this paper, we propose a certifiable out-of-distribution generalization method that provides provable OoD generalization performance guarantees via a functional optimization framework leveraging random distributions and max-margin learning for each input datum. With this approach, the proposed algorithmic scheme can provide certified accuracy for each input datum's prediction on the semantic space and achieves better performance simultaneously on OoD datasets dominated by correlation shifts or diversity shifts. Our code is available at https://github.com/ZlatanWilliams/StochasticDisturbanceLearning.
2

Yuan, Lingxiao, Harold S. Park, and Emma Lejeune. "Towards out of distribution generalization for problems in mechanics." Computer Methods in Applied Mechanics and Engineering 400 (October 2022): 115569. http://dx.doi.org/10.1016/j.cma.2022.115569.

3

Liu, Anji, Hongming Xu, Guy Van den Broeck, and Yitao Liang. "Out-of-Distribution Generalization by Neural-Symbolic Joint Training." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 10 (June 26, 2023): 12252–59. http://dx.doi.org/10.1609/aaai.v37i10.26444.

Abstract:
This paper develops a novel methodology to simultaneously learn a neural network and extract generalized logic rules. Different from prior neural-symbolic methods that require background knowledge and candidate logical rules to be provided, we aim to induce task semantics with minimal priors. This is achieved by a two-step learning framework that iterates between optimizing neural predictions of task labels and searching for a more accurate representation of the hidden task semantics. Notably, supervision works in both directions: (partially) induced task semantics guide the learning of the neural network and induced neural predictions admit an improved semantic representation. We demonstrate that our proposed framework is capable of achieving superior out-of-distribution generalization performance on two tasks: (i) learning multi-digit addition, where it is trained on short sequences of digits and tested on long sequences of digits; (ii) predicting the optimal action in the Tower of Hanoi, where the model is challenged to discover a policy independent of the number of disks in the puzzle.
4

Yu, Yemin, Luotian Yuan, Ying Wei, Hanyu Gao, Fei Wu, Zhihua Wang, and Xinhai Ye. "RetroOOD: Understanding Out-of-Distribution Generalization in Retrosynthesis Prediction." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 1 (March 24, 2024): 374–82. http://dx.doi.org/10.1609/aaai.v38i1.27791.

Abstract:
Machine learning-assisted retrosynthesis prediction models have been gaining widespread adoption, though their performance often degrades significantly when they are deployed in real-world applications involving out-of-distribution (OOD) molecules or reactions. Despite steady progress on standard benchmarks, our understanding of existing retrosynthesis prediction models under distribution shift remains limited. To this end, we first formally characterize two types of distribution shifts in retrosynthesis prediction and construct two groups of benchmark datasets. Next, through comprehensive experiments, we systematically compare state-of-the-art retrosynthesis prediction models on the two groups of benchmarks, revealing the limitations of previous in-distribution evaluation and re-examining the advantages of each model. Motivated by these empirical insights, we propose two model-agnostic techniques that can improve the OOD generalization of arbitrary off-the-shelf retrosynthesis prediction algorithms. Our preliminary experiments show their high potential, with an average performance improvement of 4.6%, and the established benchmarks serve as a foothold for further retrosynthesis prediction research towards OOD generalization.
5

Zhu, Lin, Xinbing Wang, Chenghu Zhou, and Nanyang Ye. "Bayesian Cross-Modal Alignment Learning for Few-Shot Out-of-Distribution Generalization." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 9 (June 26, 2023): 11461–69. http://dx.doi.org/10.1609/aaai.v37i9.26355.

Abstract:
Recent advances in large pre-trained models have shown promising results in few-shot learning. However, their generalization ability on two-dimensional Out-of-Distribution (OoD) data, i.e., correlation shift and diversity shift, has not been thoroughly investigated. Research has shown that, even with a significant amount of training data, few methods can achieve better performance than the standard empirical risk minimization (ERM) method in OoD generalization. This few-shot OoD generalization dilemma emerges as a challenging direction in deep neural network generalization research, where performance suffers from both overfitting on few-shot examples and OoD generalization error. In this paper, leveraging a broader supervision source, we explore a novel Bayesian cross-modal image-text alignment learning method (Bayes-CAL) to address this issue. Specifically, the model is designed so that only text representations are fine-tuned, via a Bayesian modelling approach with a gradient orthogonalization loss and an invariant risk minimization (IRM) loss. The Bayesian approach is essentially introduced to avoid overfitting the base classes observed during training and to improve generalization to broader unseen classes. The dedicated loss is introduced to achieve better image-text alignment by disentangling the causal and non-causal parts of image features. Numerical experiments demonstrate that Bayes-CAL achieves state-of-the-art OoD generalization performance on two-dimensional distribution shifts. Moreover, compared with CLIP-like models, Bayes-CAL yields more stable generalization performance on unseen classes. Our code is available at https://github.com/LinLLLL/BayesCAL.
6

Lavda, Frantzeska, and Alexandros Kalousis. "Semi-Supervised Variational Autoencoders for Out-of-Distribution Generation." Entropy 25, no. 12 (December 14, 2023): 1659. http://dx.doi.org/10.3390/e25121659.

Abstract:
Humans are able to quickly adapt to new situations, learn effectively with limited data, and create unique combinations of basic concepts. In contrast, generalizing to out-of-distribution (OOD) data and achieving combinatorial generalization are fundamental challenges for machine learning models. Moreover, obtaining high-quality labeled examples can be very time-consuming and expensive, particularly when specialized skills are required for labeling. To address these issues, we propose BtVAE, a method that utilizes conditional VAE models to achieve combinatorial generalization in certain scenarios and consequently to generate OOD data in a semi-supervised manner. Unlike previous approaches that use new factors of variation during testing, our method uses only existing attributes from the training data, but in ways that were not seen during training (e.g., small objects of a specific shape during training and large objects of the same shape during testing).
7

Su, Hang, and Wei Wang. "An Out-of-Distribution Generalization Framework Based on Variational Backdoor Adjustment." Mathematics 12, no. 1 (December 26, 2023): 85. http://dx.doi.org/10.3390/math12010085.

Abstract:
In practical applications, learning models that perform well even when the data distribution differs from the training set is essential and meaningful. Such problems are often referred to as out-of-distribution (OOD) generalization problems. In this paper, we propose a method for OOD generalization based on causal inference. Unlike prevalent OOD generalization methods, our approach does not require the environment labels associated with the data in the training set. We analyze the causes of distributional shifts in data from a causal modeling perspective and then propose a backdoor adjustment method based on variational inference. Finally, we construct a dedicated network structure to simulate the variational inference process. The proposed variational backdoor adjustment (VBA) framework can be combined with any mainstream backbone network. In addition to the theoretical derivation, we conduct experiments on different datasets to demonstrate that our method performs well in terms of prediction accuracy and generalization gap. Furthermore, by comparing the VBA framework with other mainstream OOD methods, we show that VBA performs better than these methods.
8

Cao, Linfeng, Aofan Jiang, Wei Li, Huaying Wu, and Nanyang Ye. "OoDHDR-Codec: Out-of-Distribution Generalization for HDR Image Compression." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 1 (June 28, 2022): 158–66. http://dx.doi.org/10.1609/aaai.v36i1.19890.

Abstract:
Recently, deep learning has been proven to be a promising approach to standard dynamic range (SDR) image compression. However, due to the wide luminance distribution of high dynamic range (HDR) images and the lack of large standard datasets, developing a deep model for HDR image compression is much more challenging. To tackle this issue, we view HDR data as distributional shifts of SDR data, so that HDR image compression can be modeled as an out-of-distribution (OoD) generalization problem. Herein, we propose a novel OoD HDR image compression framework (OoDHDR-codec). It learns a general representation across HDR and SDR environments, and allows the model to be trained effectively using a large set of SDR datasets supplemented with far fewer HDR samples. Specifically, OoDHDR-codec consists of two branches to process data from the two environments. The SDR branch is a standard black-box network. For the HDR branch, we develop a hybrid system that models luminance masking and tone mapping with white-box modules and performs content compression with black-box neural networks. To improve generalization from SDR training data to HDR data, we introduce an invariance regularization term to learn a common representation for both SDR and HDR compression. Extensive experimental results show that the OoDHDR-codec achieves strongly competitive in-distribution performance and state-of-the-art OoD performance. To the best of our knowledge, this is the first work to model HDR compression as an OoD generalization problem, and our OoD generalization algorithmic framework can be applied to any deep compression model beyond the network architectural choice demonstrated in the paper. Code is available at https://github.com/caolinfeng/OoDHDR-codec.
9

Deng, Bin, and Kui Jia. "Counterfactual Supervision-Based Information Bottleneck for Out-of-Distribution Generalization." Entropy 25, no. 2 (January 18, 2023): 193. http://dx.doi.org/10.3390/e25020193.

Abstract:
Learning invariant (causal) features for out-of-distribution (OOD) generalization has attracted extensive attention recently, and among the proposals, invariant risk minimization (IRM) is a notable solution. In spite of its theoretical promise for linear regression, the challenges of using IRM in linear classification problems remain. By introducing the information bottleneck (IB) principle into the learning of IRM, the IB-IRM approach has demonstrated its power to solve these challenges. In this paper, we further improve IB-IRM from two aspects. First, we show that the key assumption of support overlap of invariant features, used in IB-IRM to guarantee OOD generalization, is stronger than necessary, and that the optimal solution can still be achieved without it. Second, we illustrate two failure modes in which IB-IRM (and IRM) can fail to learn the invariant features, and to address such failures, we propose a Counterfactual Supervision-based Information Bottleneck (CSIB) learning algorithm that recovers the invariant features. By requiring counterfactual inference, CSIB works even when accessing data from a single environment. Empirical experiments on several datasets verify our theoretical results.
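
To make the IRM machinery referenced above concrete, here is a minimal PyTorch sketch of the IRMv1 penalty (Arjovsky et al.) that IB-IRM and CSIB build upon; this is our illustration under standard formulations, not the authors' code.

```python
import torch
import torch.nn.functional as F

def irm_penalty(logits, labels):
    """IRMv1 penalty: squared norm of the gradient of the per-environment
    risk with respect to a fixed dummy classifier scale w = 1."""
    scale = torch.ones(1, device=logits.device, requires_grad=True)
    loss = F.cross_entropy(logits * scale, labels)
    grad = torch.autograd.grad(loss, [scale], create_graph=True)[0]
    return (grad ** 2).sum()

def irm_objective(model, envs, lam=1.0):
    # envs: list of (x, y) batches, one per training environment
    risks, penalties = [], []
    for x, y in envs:
        logits = model(x)
        risks.append(F.cross_entropy(logits, y))
        penalties.append(irm_penalty(logits, y))
    return torch.stack(risks).mean() + lam * torch.stack(penalties).mean()
```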
10

Ashok, Arjun, Chaitanya Devaguptapu, and Vineeth N. Balasubramanian. "Learning Modular Structures That Generalize Out-of-Distribution (Student Abstract)." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 11 (June 28, 2022): 12905–6. http://dx.doi.org/10.1609/aaai.v36i11.21589.

Abstract:
Out-of-distribution (O.O.D.) generalization remains a key challenge for real-world machine learning systems. We describe a method for O.O.D. generalization that, through training, encourages models to preserve only those features in the network that are well reused across multiple training domains. Our method combines two complementary neuron-level regularizers with a probabilistic differentiable binary mask over the network to extract a modular sub-network that achieves better O.O.D. performance than the original network. Preliminary evaluation on two benchmark datasets corroborates the promise of our method.
11

Zou, Xin, and Weiwei Liu. "Coverage-Guaranteed Prediction Sets for Out-of-Distribution Data." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 15 (March 24, 2024): 17263–70. http://dx.doi.org/10.1609/aaai.v38i15.29673.

Abstract:
Out-of-distribution (OOD) generalization has attracted increasing research attention in recent years, due to its promising experimental results in real-world applications. In this paper, we study the confidence set prediction problem in the OOD generalization setting. Split conformal prediction (SCP) is an efficient framework for handling the confidence set prediction problem. However, the validity of SCP requires the examples to be exchangeable, which is violated in the OOD setting. Empirically, we show that trivially applying SCP results in a failure to maintain the marginal coverage when the unseen target domain is different from the source domain. To address this issue, we develop a method for forming confident prediction sets in the OOD setting and theoretically prove the validity of our method. Finally, we conduct experiments on simulated data to empirically verify the correctness of our theory and the validity of our proposed method.
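
As background for the entry above, the split conformal prediction baseline it analyzes can be sketched in a few lines; the nonconformity score and finite-sample quantile correction below follow the standard SCP recipe, not the paper's OOD-adapted method.

```python
import numpy as np

def split_conformal_sets(cal_probs, cal_labels, test_probs, alpha=0.1):
    """Standard split conformal prediction with the 1 - p_y(x) score.
    cal_probs: (n, K) softmax outputs on a held-out calibration set.
    Valid only when calibration and test data are exchangeable, which is
    exactly the assumption that breaks in the OOD setting studied above."""
    n = len(cal_labels)
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]   # nonconformity scores
    q_level = np.ceil((n + 1) * (1 - alpha)) / n         # finite-sample correction
    qhat = np.quantile(scores, min(q_level, 1.0), method="higher")
    return test_probs >= 1.0 - qhat                      # boolean set per class
```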
12

Bai, Haoyue, Rui Sun, Lanqing Hong, Fengwei Zhou, Nanyang Ye, Han-Jia Ye, S. H. Gary Chan, and Zhenguo Li. "DecAug: Out-of-Distribution Generalization via Decomposed Feature Representation and Semantic Augmentation." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 8 (May 18, 2021): 6705–13. http://dx.doi.org/10.1609/aaai.v35i8.16829.

Abstract:
While deep learning demonstrates a strong ability to handle independent and identically distributed (IID) data, it often struggles with out-of-distribution (OoD) generalization, where the test data come from a different distribution than the training data. Designing a general OoD generalization framework for a wide range of applications is challenging, mainly due to the different kinds of distribution shifts in the real world, such as shifts across domains or the extrapolation of correlation. Most previous approaches can only solve one specific kind of distribution shift, leading to unsatisfactory performance when applied across various OoD benchmarks. In this work, we propose DecAug, a novel decomposed feature representation and semantic augmentation approach for OoD generalization. Specifically, DecAug disentangles category-related and context-related features by orthogonalizing the two gradients (w.r.t. intermediate features) of the losses for predicting category and context labels, where category-related features contain causal information about the target object, while context-related features cause the distribution shifts between training and test data. Furthermore, we perform gradient-based augmentation on context-related features to improve the robustness of the learned representations. Experimental results show that DecAug outperforms other state-of-the-art methods on various OoD datasets, making it one of the very few methods that can deal with different types of OoD generalization challenges.
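
The gradient-orthogonalization step described above can be illustrated with a short sketch; the projection below is a generic Gram-Schmidt step under our own naming, not DecAug's actual implementation.

```python
import torch

def orthogonalize(g_cat, g_ctx, eps=1e-12):
    """Remove from the category gradient its component along the context
    gradient, so the two feature subspaces are decomposed (a sketch of the
    orthogonalization idea, not the authors' code)."""
    flat_cat, flat_ctx = g_cat.flatten(), g_ctx.flatten()
    proj = (flat_cat @ flat_ctx) / (flat_ctx @ flat_ctx + eps)
    return (flat_cat - proj * flat_ctx).view_as(g_cat)
```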
13

Fan, Caoyun, Wenqing Chen, Jidong Tian, Yitian Li, Hao He, and Yaohui Jin. "Unlock the Potential of Counterfactually-Augmented Data in Out-Of-Distribution Generalization." Expert Systems with Applications 238 (March 2024): 122066. http://dx.doi.org/10.1016/j.eswa.2023.122066.

14

Ramachandran, Sai Niranjan, Rudrabha Mukhopadhyay, Madhav Agarwal, C. V. Jawahar, and Vinay Namboodiri. "Understanding the Generalization of Pretrained Diffusion Models on Out-of-Distribution Data." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 13 (March 24, 2024): 14767–75. http://dx.doi.org/10.1609/aaai.v38i13.29395.

Abstract:
This work tackles the important task of understanding out-of-distribution behavior in two prominent types of generative models, i.e., GANs and diffusion models. Understanding this behavior is crucial to understanding their broader utility and risks, as these systems are increasingly deployed in our daily lives. Our first contribution is demonstrating that diffusion spaces outperform GANs' latent spaces in inverting high-quality OOD images. We also provide a theoretical analysis attributing this to the lack of prior holes in diffusion spaces. Our second significant contribution is a theoretical hypothesis that diffusion spaces can be projected onto a bounded hypersphere, enabling image manipulation through geodesic traversal between inverted images. Our analysis shows that different geodesics share common attributes for the same manipulation, which we leverage to perform various image manipulations. We conduct thorough empirical evaluations to support and validate our claims. Finally, our third contribution introduces a novel approach to few-shot sampling of out-of-distribution data by inverting a few images and sampling from the cluster formed by the inverted latents. The proposed technique achieves state-of-the-art results for the few-shot generation task in terms of image quality. Our research underscores the promise of diffusion spaces in out-of-distribution imaging and offers avenues for further exploration. More details about the project are available at http://cvit.iiit.ac.in/research/projects/cvit-projects/diffusionOOD.
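
One simple way to traverse a geodesic between two inverted latents on a hypersphere, as hypothesized above, is spherical linear interpolation; this is an illustrative approximation, and the paper's exact traversal may differ.

```python
import numpy as np

def slerp(z0, z1, t):
    """Spherical linear interpolation between two inverted latents for
    t in [0, 1]; follows a great-circle (geodesic) arc on the unit sphere."""
    z0n, z1n = z0 / np.linalg.norm(z0), z1 / np.linalg.norm(z1)
    omega = np.arccos(np.clip(z0n @ z1n, -1.0, 1.0))
    if np.isclose(omega, 0.0):
        return z0
    return (np.sin((1 - t) * omega) * z0 + np.sin(t * omega) * z1) / np.sin(omega)
```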
15

Jia, Tianrui, Haoyang Li, Cheng Yang, Tao Tao, and Chuan Shi. "Graph Invariant Learning with Subgraph Co-mixup for Out-of-Distribution Generalization." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 8 (March 24, 2024): 8562–70. http://dx.doi.org/10.1609/aaai.v38i8.28700.

Abstract:
Graph neural networks (GNNs) have been demonstrated to perform well in graph representation learning, but they often lack generalization capability when tackling out-of-distribution (OOD) data. Graph invariant learning methods, backed by the invariance principle over defined multiple environments, have shown effectiveness in dealing with this issue. However, existing methods rely heavily on well-predefined or accurately generated environment partitions, which are hard to obtain in practice, leading to sub-optimal OOD generalization performance. In this paper, we propose a novel graph invariant learning method based on an invariant and variant patterns co-mixup strategy, which is capable of jointly generating mixed multiple environments and capturing invariant patterns from the mixed graph data. Specifically, we first adopt a subgraph extractor to identify invariant subgraphs. Subsequently, we design a novel co-mixup strategy, i.e., jointly conducting environment mixup and invariant mixup. For the environment mixup, we mix the variant, environment-related subgraphs so as to generate sufficiently diverse multiple environments, which is important for guaranteeing the quality of graph invariant learning. For the invariant mixup, we mix the invariant subgraphs, further encouraging the model to capture the invariant patterns behind graphs while discarding spurious correlations for OOD generalization. We demonstrate that the proposed environment mixup and invariant mixup mutually promote each other. Extensive experiments on both synthetic and real-world datasets demonstrate that our method significantly outperforms the state of the art under various distribution shifts.
16

Zhang, Lily H., and Rajesh Ranganath. "Robustness to Spurious Correlations Improves Semantic Out-of-Distribution Detection." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 12 (June 26, 2023): 15305–12. http://dx.doi.org/10.1609/aaai.v37i12.26785.

Abstract:
Methods which utilize the outputs or feature representations of predictive models have emerged as promising approaches for out-of-distribution (OOD) detection of image inputs. However, as demonstrated in previous work, these methods struggle to detect OOD inputs that share nuisance values (e.g. background) with in-distribution inputs. The detection of shared-nuisance OOD (SN-OOD) inputs is particularly relevant in real-world applications, as anomalies and in-distribution inputs tend to be captured in the same settings during deployment. In this work, we provide a possible explanation for these failures and propose nuisance-aware OOD detection to address them. Nuisance-aware OOD detection substitutes a classifier trained via Empirical Risk Minimization (ERM) with one that 1. approximates a distribution where the nuisance-label relationship is broken and 2. yields representations that are independent of the nuisance under this distribution, both marginally and conditioned on the label. We can train a classifier to achieve these objectives using Nuisance-Randomized Distillation (NuRD), an algorithm developed for OOD generalization under spurious correlations. Output- and feature-based nuisance-aware OOD detection perform substantially better than their original counterparts, succeeding even when detection based on domain generalization algorithms fails to improve performance.
17

Gwon, Kyungpil, and Joonhyuk Yoo. "Out-of-Distribution (OOD) Detection and Generalization Improved by Augmenting Adversarial Mixup Samples." Electronics 12, no. 6 (March 16, 2023): 1421. http://dx.doi.org/10.3390/electronics12061421.

Abstract:
Deep neural network (DNN) models are usually built under the i.i.d. (independent and identically distributed), also known as in-distribution (ID), assumption on the training samples and test data. However, when models are deployed in real-world scenarios with distributional shifts, test data can be out-of-distribution (OOD), and both OOD detection and OOD generalization should be addressed simultaneously to ensure the reliability and safety of applied AI systems. Most existing OOD detectors pursue these two goals separately and are therefore sensitive to covariate shift rather than semantic shift. To alleviate this problem, this paper proposes a novel adversarial mixup (AM) training method, which simply performs OOD data augmentation to synthesize differently distributed data and designs a new AM loss function to learn how to handle OOD data. The proposed AM generates OOD samples that diverge significantly from the support of the training data distribution but are not completely disjoint from it, increasing the generalization capability of the OOD detector. In addition, the AM is combined with a distributional-distance-aware OOD detector at inference to detect semantic OOD samples more efficiently while remaining robust to covariate shift due to data tampering. Experimental evaluation validates that the designed AM is effective on both OOD detection and OOD generalization tasks compared to previous OOD detectors and data mixup methods.
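
For intuition, plain mixup-style OOD synthesis can be sketched as below; the paper's adversarial mixup additionally optimizes the mixing so samples diverge from the training support, which this sketch omits.

```python
import numpy as np

def mixup_ood_batch(x_batch, beta_a=0.4, beta_b=0.4, rng=None):
    """Synthesize differently-distributed samples by convexly mixing a batch
    of images with a shuffled copy of itself. Assumes a 4-D batch
    (N, H, W, C) or (N, C, H, W); this is plain mixup, not the AM method."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(beta_a, beta_b, size=(len(x_batch), 1, 1, 1))
    perm = rng.permutation(len(x_batch))
    return lam * x_batch + (1.0 - lam) * x_batch[perm]
```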
18

Maier, Anatol, and Christian Riess. "Reliable Out-of-Distribution Recognition of Synthetic Images." Journal of Imaging 10, no. 5 (May 1, 2024): 110. http://dx.doi.org/10.3390/jimaging10050110.

Abstract:
Generative adversarial networks (GANs) and diffusion models (DMs) have revolutionized the creation of synthetically generated but realistic-looking images. Distinguishing such generated images from real camera captures is one of the key tasks in current multimedia forensics research. One particular challenge is the generalization to unseen generators or post-processing. This can be viewed as an issue of handling out-of-distribution inputs. Forensic detectors can be hardened by the extensive augmentation of the training data or specifically tailored networks. Nevertheless, such precautions only manage but do not remove the risk of prediction failures on inputs that look reasonable to an analyst but in fact are out of the training distribution of the network. With this work, we aim to close this gap with a Bayesian Neural Network (BNN) that provides an additional uncertainty measure to warn an analyst of difficult decisions. More specifically, the BNN learns the task at hand and also detects potential confusion between post-processing and image generator artifacts. Our experiments show that the BNN achieves on-par performance with the state-of-the-art detectors while producing more reliable predictions on out-of-distribution examples.
19

Boccato, Tommaso, Alberto Testolin, and Marco Zorzi. "Learning Numerosity Representations with Transformers: Number Generation Tasks and Out-of-Distribution Generalization." Entropy 23, no. 7 (July 3, 2021): 857. http://dx.doi.org/10.3390/e23070857.

Abstract:
One of the most rapidly advancing areas of deep learning research aims at creating models that learn to disentangle the latent factors of variation in a data distribution. However, modeling joint probability mass functions is usually prohibitive, which motivates the use of conditional models that assume some information is given as input. In the domain of numerical cognition, deep learning architectures have successfully demonstrated that approximate numerosity representations can emerge in multi-layer networks that build latent representations of a set of images with a varying number of items. However, existing models have focused on tasks that require conditionally estimating numerosity information from a given image. Here, we focus on a set of much more challenging tasks, which require conditionally generating synthetic images containing a given number of items. We show that attention-based architectures operating at the pixel level can learn to produce well-formed images approximately containing a specific number of items, even when the target numerosity is not present in the training distribution.
20

Chen, Minghui, Cheng Wen, Feng Zheng, Fengxiang He, and Ling Shao. "VITA: A Multi-Source Vicinal Transfer Augmentation Method for Out-of-Distribution Generalization." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 1 (June 28, 2022): 321–29. http://dx.doi.org/10.1609/aaai.v36i1.19908.

Abstract:
Invariance to diverse types of image corruption, such as noise, blurring, or colour shifts, is essential to establish robust models in computer vision. Data augmentation has been the major approach in improving the robustness against common corruptions. However, the samples produced by popular augmentation strategies deviate significantly from the underlying data manifold. As a result, performance is skewed toward certain types of corruption. To address this issue, we propose a multi-source vicinal transfer augmentation (VITA) method for generating diverse on-manifold samples. The proposed VITA consists of two complementary parts: tangent transfer and integration of multi-source vicinal samples. The tangent transfer creates initial augmented samples for improving corruption robustness. The integration employs a generative model to characterize the underlying manifold built by vicinal samples, facilitating the generation of on-manifold samples. Our proposed VITA significantly outperforms the current state-of-the-art augmentation methods, demonstrated in extensive experiments on corruption benchmarks.
21

Xin, Shiji, Yifei Wang, Jingtong Su, and Yisen Wang. "On the Connection between Invariant Learning and Adversarial Training for Out-of-Distribution Generalization." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 9 (June 26, 2023): 10519–27. http://dx.doi.org/10.1609/aaai.v37i9.26250.

Abstract:
Despite impressive success in many tasks, deep learning models are shown to rely on spurious features, which causes them to fail catastrophically when generalizing to out-of-distribution (OOD) data. Invariant Risk Minimization (IRM) was proposed to alleviate this issue by extracting domain-invariant features for OOD generalization. Nevertheless, recent work shows that IRM is only effective for a certain type of distribution shift (e.g., correlation shift), while it fails in other cases (e.g., diversity shift). Meanwhile, another line of work, adversarial training (AT), has shown better domain transfer performance, suggesting that it has the potential to be an effective candidate for extracting domain-invariant features. This paper investigates this possibility by exploring the similarity between the IRM and AT objectives. Inspired by this connection, we propose domain-wise adversarial training (DAT), an AT-inspired method for alleviating distribution shift via domain-specific perturbations. Extensive experiments show that our proposed DAT can effectively remove domain-varying features and improve OOD generalization under both correlation shift and diversity shift.
22

Hassan, A., S. A. Dar, P. B. Ahmad, and B. A. Para. "A new generalization of Aradhana distribution: Properties and applications." Journal of Applied Mathematics, Statistics and Informatics 16, no. 2 (December 1, 2020): 51–66. http://dx.doi.org/10.2478/jamsi-2020-0009.

Abstract:
In this paper, we introduce a new generalization of the Aradhana distribution, called the weighted Aradhana distribution (WID). The statistical properties of this distribution are derived, and the model parameters are estimated by maximum likelihood estimation. A simulation study of the ML estimates of the parameters is carried out in the R software. Finally, an application to a real data set is presented to examine the significance of the newly introduced model.
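
For reference, the baseline Aradhana density (Shanker, 2016) and the generic weighted-distribution construction are shown below; the specific weight function w(x) used for the WID is not stated in this abstract, so it is left abstract here.

```latex
% Baseline Aradhana density and the generic weighting construction;
% the particular weight w(x) used by the paper is an assumption left open.
f(x;\theta) = \frac{\theta^{3}}{\theta^{2}+2\theta+2}\,(1+x)^{2}\,e^{-\theta x},
\qquad x > 0,\ \theta > 0,
\qquad
f_{w}(x;\theta) = \frac{w(x)\,f(x;\theta)}{\operatorname{E}[w(X)]}.
```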
23

Chen, Zhe, Zhiquan Ding, Xiaoling Zhang, Xin Zhang, and Tianqi Qin. "Improving Out-of-Distribution Generalization in SAR Image Scene Classification with Limited Training Samples." Remote Sensing 15, no. 24 (December 17, 2023): 5761. http://dx.doi.org/10.3390/rs15245761.

Abstract:
For practical maritime SAR image classification tasks with special imaging platforms, the scenes to be classified are often different from those in the training sets, and the quantity and diversity of the available training data can be extremely limited. This problem of out-of-distribution (OOD) generalization with limited training samples leads to a sharp drop in the performance of conventional deep learning algorithms. In this paper, a knowledge-guided neural network (KGNN) model is proposed to overcome these challenges. By analyzing the saliency features of various maritime SAR scenes, universal knowledge is summarized in descriptive sentences. A feature integration strategy is designed to assign this descriptive knowledge to a ResNet-18 backbone, addressing both the individual semantic information and the inherent relations of the entities in SAR images. The experimental results show that our KGNN method outperforms conventional deep learning models in OOD scenarios with varying training sample sizes and achieves higher robustness in handling distributional shifts caused by weather conditions, terrain type, and sensor characteristics. In addition, the KGNN model converges in far fewer epochs during training. The performance improvement indicates that the KGNN model learns representations guided by properties beneficial for OOD generalization with limited training samples.
24

Sha, Naijun. "A New Inference Approach for Type-II Generalized Birnbaum-Saunders Distribution." Stats 2, no. 1 (February 19, 2019): 148–63. http://dx.doi.org/10.3390/stats2010011.

Abstract:
The Birnbaum-Saunders (BS) distribution, with its generalizations, has been successfully applied in a wide variety of fields. One generalization, the type-II generalized BS (denoted GBS-II), has been developed and has attracted considerable attention in recent years. In this article, we propose a new, simple, and convenient inference procedure for the GBS-II distribution. An extensive simulation study is carried out to assess the performance of the method under various settings of parameter values with different sample sizes. Real data are analyzed for illustrative purposes to display the efficiency of the proposed method.
25

Sharifi-Noghabi, Hossein, Parsa Alamzadeh Harjandi, Olga Zolotareva, Colin C. Collins, and Martin Ester. "Out-of-distribution generalization from labelled and unlabelled gene expression data for drug response prediction." Nature Machine Intelligence 3, no. 11 (November 2021): 962–72. http://dx.doi.org/10.1038/s42256-021-00408-w.

26

Das, Siddhant, and Markus Nöth. "Times of arrival and gauge invariance." Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences 477, no. 2250 (June 2021): 20210101. http://dx.doi.org/10.1098/rspa.2021.0101.

Abstract:
We revisit the arguments underlying two well-known arrival-time distributions in quantum mechanics, viz., the Aharonov–Bohm–Kijowski (ABK) distribution, applicable for freely moving particles, and the quantum flux (QF) distribution. An inconsistency in the original axiomatic derivation of Kijowski’s result is pointed out, along with an inescapable consequence of the ‘negative arrival times’ inherent to this proposal (and generalizations thereof). The ABK free-particle restriction is lifted in a discussion of an explicit arrival-time set-up featuring a charged particle moving in a constant magnetic field. A natural generalization of the ABK distribution is in this case shown to be critically gauge-dependent. A direct comparison to the QF distribution, which does not exhibit this flaw, is drawn (its acknowledged drawback concerning the quantum backflow effect notwithstanding).
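
For orientation, the quantum flux (QF) arrival-time distribution discussed above is, up to normalization, the probability current evaluated at the detector position; the form below is the standard textbook expression in our notation, not necessarily the paper's.

```latex
% QF arrival-time distribution at a detector placed at x = L,
% up to normalization (standard form; notation is ours):
\Pi_{\mathrm{QF}}(\tau) \;\propto\; j(L,\tau)
  \;=\; \frac{\hbar}{m}\,\operatorname{Im}\!\left[\psi^{*}(L,\tau)\,\partial_{x}\psi(L,\tau)\right].
```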
27

Tan, Zhi, and Zhao-Fei Teng. "Image Domain Generalization Method based on Solving Domain Discrepancy Phenomenon." 電腦學刊 33, no. 3 (June 2022): 171–85. http://dx.doi.org/10.53106/199115992022063303014.

Abstract:
In order to solve the problem that recognition performance degrades noticeably when a model trained on a known data distribution is transferred to an unknown data distribution, a domain generalization method based on an attention mechanism and adversarial training is proposed. Firstly, a multi-level attention mechanism module is designed to capture the underlying abstract features of the image. Secondly, the loss limit of the generative adversarial network is increased, and a virtual enhanced domain that simulates a target domain of unknown data distribution is generated by adversarial training while preserving the consistency of data features and semantics. Finally, through a data mixing algorithm, the source domain and the virtual enhanced domain are mixed and fed into the model to improve the performance of the classifier. Experiments are carried out on five classic digit-recognition datasets and the CIFAR-10 series. The results show that the model learns a better decision boundary, generates a useful virtual enhanced domain, and significantly improves recognition accuracy after model transfer, improving average accuracy over the previous method by at least 2.5% and 3%, respectively.
28

Vasiliuk, Anton, Daria Frolova, Mikhail Belyaev, and Boris Shirokikh. "Limitations of Out-of-Distribution Detection in 3D Medical Image Segmentation." Journal of Imaging 9, no. 9 (September 18, 2023): 191. http://dx.doi.org/10.3390/jimaging9090191.

Abstract:
Deep learning models perform unreliably when the data come from a distribution different from the training one. In critical applications such as medical imaging, out-of-distribution (OOD) detection methods help to identify such data samples, preventing erroneous predictions. In this paper, we further investigate OOD detection effectiveness when applied to 3D medical image segmentation. We designed several OOD challenges representing clinically occurring cases and found that none of the methods achieved acceptable performance. Methods not dedicated to segmentation severely failed to perform in the designed setups; the best mean false-positive rate at a 95% true-positive rate (FPR) was 0.59. Segmentation-dedicated methods still achieved suboptimal performance, with the best mean FPR being 0.31 (lower is better). To indicate this suboptimality, we developed a simple method called Intensity Histogram Features (IHF), which performed comparably or better in the same challenges, with a mean FPR of 0.25. Our findings highlight the limitations of the existing OOD detection methods with 3D medical images and present a promising avenue for improving them. To facilitate research in this area, we release the designed challenges as a publicly available benchmark and formulate practical criteria to test the generalization of OOD detection beyond the suggested benchmark. We also propose IHF as a solid baseline to contest emerging methods.
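
The Intensity Histogram Features baseline described above admits a very small sketch; the bin count and the nearest-histogram scoring rule below are our assumptions, since the abstract does not fix them.

```python
import numpy as np

def intensity_histogram_features(volume, bins=64, vmin=None, vmax=None):
    """Normalized intensity histogram of a 3D volume (the core of IHF as
    described above; the bin count is an illustrative default)."""
    vmin = volume.min() if vmin is None else vmin
    vmax = volume.max() if vmax is None else vmax
    hist, _ = np.histogram(volume, bins=bins, range=(vmin, vmax))
    return hist / hist.sum()

def ihf_ood_score(test_volume, train_histograms, bins=64):
    # Score = L1 distance to the closest training histogram (higher = more OOD);
    # the scoring rule is our assumption, not necessarily the paper's.
    h = intensity_histogram_features(test_volume, bins)
    return min(np.abs(h - h_train).sum() for h_train in train_histograms)
```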
29

Bogin, Ben, Sanjay Subramanian, Matt Gardner, and Jonathan Berant. "Latent Compositional Representations Improve Systematic Generalization in Grounded Question Answering." Transactions of the Association for Computational Linguistics 9 (2021): 195–210. http://dx.doi.org/10.1162/tacl_a_00361.

Abstract:
Answering questions that involve multi-step reasoning requires decomposing them and using the answers of intermediate steps to reach the final answer. However, state-of-the-art models in grounded question answering often do not explicitly perform decomposition, leading to difficulties in generalization to out-of-distribution examples. In this work, we propose a model that computes a representation and denotation for all question spans in a bottom-up, compositional manner using a CKY-style parser. Our model induces latent trees, driven by end-to-end (answer) supervision only. We show that this inductive bias towards tree structures dramatically improves systematic generalization to out-of-distribution examples, compared to strong baselines on an arithmetic expressions benchmark as well as on Closure, a dataset that focuses on systematic generalization for grounded question answering. On this challenging dataset, our model reaches an accuracy of 96.1%, significantly higher than prior models that almost perfectly solve the task on a random, in-distribution split.
30

He, Rundong, Yue Yuan, Zhongyi Han, Fan Wang, Wan Su, Yilong Yin, Tongliang Liu, and Yongshun Gong. "Exploring Channel-Aware Typical Features for Out-of-Distribution Detection." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 11 (March 24, 2024): 12402–10. http://dx.doi.org/10.1609/aaai.v38i11.29132.

Abstract:
Detecting out-of-distribution (OOD) data is essential to ensuring the reliability of machine learning models deployed in real-world scenarios. Unlike most previous test-time OOD detection methods, which focus on designing OOD scores, we examine the challenges in OOD detection from the perspective of typicality, regarding a feature's high-probability region as its typical set. However, the existing typical-feature-based OOD detection method carries an implicit assumption: that the proportion of the typical feature set is fixed for every channel. According to our experimental analysis, each channel contributes differently to OOD detection, and adopting a fixed proportion for all channels causes several channels to lose too many typical features or incorporate too many abnormal features, resulting in low performance. Exploring channel-aware typical features is therefore crucial to better separating ID and OOD data. Driven by this insight, we propose expLoring channel-Aware tyPical featureS (LAPS). First, LAPS obtains the channel-aware typical set by calibrating the channel-level typical set with the global typical set derived from the mean and standard deviation. Then, LAPS rectifies the features into the channel-aware typical sets to obtain channel-aware typical features. Finally, LAPS leverages the channel-aware typical features to calculate the energy score for OOD detection. Theoretical and visual analyses verify that LAPS achieves a better bias-variance trade-off. Experiments verify the effectiveness and generalization of LAPS under different architectures and OOD scores.
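
Two ingredients of the pipeline above, feature rectification into a typical set and the energy score, can be sketched as follows; the per-channel bounds are assumed to be precomputed from ID data, whereas LAPS derives them by its own channel-level calibration.

```python
import torch

def rectify_to_typical_set(feats, lo, hi):
    """Clamp penultimate features channel-wise into a 'typical set' [lo, hi].
    Here lo and hi are assumed to be precomputed per-channel quantile bounds
    from ID data; LAPS calibrates channel-level statistics against global
    ones instead."""
    return torch.clamp(feats, min=lo, max=hi)

def energy_score(logits, temperature=1.0):
    # Energy-based OOD score (Liu et al.); lower energy = more in-distribution.
    return -temperature * torch.logsumexp(logits / temperature, dim=-1)
```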
31

Lee, Ingyun, Wooju Lee, and Hyun Myung. "Domain Generalization with Vital Phase Augmentation." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 4 (March 24, 2024): 2892–900. http://dx.doi.org/10.1609/aaai.v38i4.28070.

Abstract:
Deep neural networks have shown remarkable performance in image classification. However, their performance deteriorates significantly on corrupted input data. Domain generalization methods have been proposed to train robust models against out-of-distribution data. Data augmentation in the frequency domain is one such approach: it enables a model to learn phase features that establish domain-invariant representations by changing the amplitudes of the input data while preserving the phases. However, using fixed phases leaves the model susceptible to phase fluctuations, because amplitude and phase fluctuations commonly occur together in out-of-distribution data. In this study, to address this problem, we introduce an approach using finite variation of the phases of input data rather than keeping the phases fixed. Based on the assumption that the degree of domain-invariant features varies for each phase, we propose a method to distinguish phases based on this degree. In addition, we propose a method called vital phase augmentation (VIPAug), which applies the variation to the phases differently according to the degree of domain-invariant features in the given phases. The model comes to depend more on the vital phases that contain more domain-invariant features, attaining robustness to amplitude and phase fluctuations. We present experimental evaluations of our proposed approach, which exhibits improved performance on both clean and corrupted data. VIPAug achieved SOTA performance on the benchmark CIFAR-10 and CIFAR-100 datasets, as well as near-SOTA performance on the ImageNet-100 and ImageNet datasets. Our code is available at https://github.com/excitedkid/vipaug.
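
A generic frequency-domain phase augmentation, the base operation VIPAug refines, can be sketched as below; the uniform Gaussian phase jitter is our simplification, since VIPAug weights the variation by how vital each phase is.

```python
import numpy as np

def phase_jitter_augment(img, sigma=0.1, rng=None):
    """Frequency-domain augmentation that perturbs phases while keeping
    amplitudes fixed. Works on (H, W) or (H, W, C) arrays; the jitter
    strength sigma is an illustrative default."""
    rng = rng or np.random.default_rng()
    spectrum = np.fft.fft2(img, axes=(0, 1))
    amplitude, phase = np.abs(spectrum), np.angle(spectrum)
    phase = phase + rng.normal(0.0, sigma, size=phase.shape)
    return np.real(np.fft.ifft2(amplitude * np.exp(1j * phase), axes=(0, 1)))
```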
32

Ding, Kun, Haojian Zhang, Qiang Yu, Ying Wang, Shiming Xiang, and Chunhong Pan. "Weak Distribution Detectors Lead to Stronger Generalizability of Vision-Language Prompt Tuning." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 2 (March 24, 2024): 1528–36. http://dx.doi.org/10.1609/aaai.v38i2.27918.

Abstract:
We propose a generalized method for boosting the generalization ability of pre-trained vision-language models (VLMs) while fine-tuning on downstream few-shot tasks. The idea is realized by exploiting out-of-distribution (OOD) detection to predict whether a sample belongs to the base distribution or a novel distribution, and then using the score generated by a dedicated competition-based scoring function to fuse the zero-shot and few-shot classifiers. The fused classifier is dynamic: it biases towards the zero-shot classifier when a sample is more likely to come from the distribution the model was pre-trained on, leading to improved base-to-novel generalization ability. Our method operates only at test time, so it can boost existing methods without time-consuming re-training. Extensive experiments show that even weak distribution detectors can still improve VLMs' generalization ability. Specifically, with the help of OOD detectors, the harmonic mean of CoOp and ProGrad increases by 2.6 and 1.5 percentage points, respectively, over 11 recognition datasets in the base-to-novel setting.
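
The dynamic fusion idea above reduces to a convex combination driven by a detector score; the rule below is an illustrative stand-in for the paper's competition-based scoring function.

```python
import numpy as np

def fuse_classifiers(p_zero_shot, p_few_shot, base_score):
    """Dynamically fuse zero-shot and few-shot predictions with an
    OOD-detector score in [0, 1], where 1 means the sample looks like the
    pre-training 'base' distribution. The convex combination is our
    illustrative fusion rule, not the paper's exact scoring function."""
    s = np.clip(base_score, 0.0, 1.0)
    return s * p_zero_shot + (1.0 - s) * p_few_shot
```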
33

Simmachan, Teerawat, and Wikanda Phaphan. "Generalization of Two-Sided Length Biased Inverse Gaussian Distributions and Applications." Symmetry 14, no. 10 (September 20, 2022): 1965. http://dx.doi.org/10.3390/sym14101965.

Abstract:
The notion of length-biased distributions can be used to develop adequate models; a length-biased distribution is a special case of a weighted distribution. In this work, a new class of length-biased distribution, namely the two-sided length-biased inverse Gaussian (TS-LBIG) distribution, is introduced. The physical phenomenon of this scenario is described by the case of cracks developing from two sides. Since the probability density function of the original TS-LBIG distribution cannot be written in a closed-form expression, a generalized form is further introduced. Important properties such as the moment-generating function and the survival function cannot be provided in closed form, so we offer a different approach to solving this problem. Some distributional properties are investigated, and the parameters are estimated by the method of moments. Monte Carlo simulation studies are carried out to appraise the performance of the suggested estimators using bias, variance, and mean squared error. An application to a real dataset is presented for illustration. The results show that the suggested estimators perform better than those of the original study, and the proposed distribution provides a more appropriate fit than other candidate distributions, based on the Akaike information criterion.
34

Nain, Philippe. "On a generalization of the preemptive resume priority." Advances in Applied Probability 18, no. 1 (March 1986): 255–73. http://dx.doi.org/10.2307/1427245.

Abstract:
This paper considers a queueing system with two classes of customers and a single server, where the service policy is of threshold type. As soon as the amount of work required by the class 1 customers is greater than a fixed threshold, the class 1 customers get the server's attention; otherwise, the class 2 customers have priority. Service interruptions can occur for both classes of customers under this service mechanism, in which case the service interruption discipline is preemptive resume priority (PRP). This model, which turns out to be a generalization of the PRP queueing system, has potential applications in computer systems and communication networks. For Poisson inputs and an exponential (arbitrary) service-time distribution for class 1 (class 2) customers, we derive the Laplace–Stieltjes transform of the stationary joint distribution of the workload of the server by reducing the analysis to the resolution of a boundary value problem. Explicit formulas are obtained.
35

Zhang, Weifeng, Zhiyuan Wang, Kunpeng Zhang, Ting Zhong, and Fan Zhou. "DyCVAE: Learning Dynamic Causal Factors for Non-stationary Series Domain Generalization (Student Abstract)." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 13 (June 26, 2023): 16382–83. http://dx.doi.org/10.1609/aaai.v37i13.27051.

Abstract:
Learning domain-invariant representations is a major task of out-of-distribution generalization. To address this issue, recent efforts have taken causality into account, aiming to learn the causal factors relevant to the task. However, extending existing generalization methods to adapt to non-stationary time series may be ineffective, because they fail to model the underlying causal factors when temporal-domain shifts occur in addition to source-domain shifts, as pointed out by recent studies. To this end, we propose a novel model, DyCVAE, to learn dynamic causal factors. The results on synthetic and real datasets demonstrate the effectiveness of our proposed model for the task of generalization in the time-series domain.
36

Chen, Zhengyu, Teng Xiao, Kun Kuang, Zheqi Lv, Min Zhang, Jinluan Yang, Chengqiang Lu, Hongxia Yang, and Fei Wu. "Learning to Reweight for Generalizable Graph Neural Network." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 8 (March 24, 2024): 8320–28. http://dx.doi.org/10.1609/aaai.v38i8.28673.

Abstract:
Graph Neural Networks (GNNs) show promising results on graph tasks. However, the generalization ability of existing GNNs degrades when there are distribution shifts between testing and training graph data. The fundamental reason for this severe degeneration is that most GNNs are designed under the i.i.d. hypothesis; in such a setting, GNNs tend to exploit subtle statistical correlations in the training set for prediction, even when these are spurious correlations. In this paper, we study the generalization ability of GNNs in Out-Of-Distribution (OOD) settings. To solve this problem, we propose Learning to Reweight for Generalizable Graph Neural Networks (L2R-GNN), which enhances generalization for satisfactory performance on unseen testing graphs whose distributions differ from those of the training graphs. We propose a novel nonlinear graph decorrelation method, which substantially improves out-of-distribution generalization ability and compares favorably to previous methods in restraining the over-reduced sample size. The variables of the graph representation are clustered based on the stability of their correlations, and the graph decorrelation method learns weights to remove correlations between the variables of different clusters rather than between any two variables. In addition, we introduce an effective stochastic algorithm based on bi-level optimization for the L2R-GNN framework, which enables learning the optimal weights and the GNN parameters simultaneously and avoids overfitting. Experiments show that L2R-GNN greatly outperforms baselines on various graph prediction benchmarks under distribution shifts.
37

Welleck, Sean, Peter West, Jize Cao, and Yejin Choi. "Symbolic Brittleness in Sequence Models: On Systematic Generalization in Symbolic Mathematics." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 8 (June 28, 2022): 8629–37. http://dx.doi.org/10.1609/aaai.v36i8.20841.

Abstract:
Neural sequence models trained with maximum likelihood estimation have led to breakthroughs in many tasks, where success is defined by the gap between training and test performance. However, their ability to achieve stronger forms of generalization remains unclear. We consider the problem of symbolic mathematical integration, as it requires generalizing systematically beyond the training set. We develop a methodology for evaluating generalization that takes advantage of the problem domain's structure and access to a verifier. Despite promising in-distribution performance of sequence-to-sequence models in this domain, we demonstrate challenges in achieving robustness, compositionality, and out-of-distribution generalization, through both carefully constructed manual test suites and a genetic algorithm that automatically finds large collections of failures in a controllable manner. Our investigation highlights the difficulty of generalizing well with the predominant modeling and learning approach, and the importance of evaluating beyond the test set, across different aspects of generalization.
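
The verifier-based evaluation described above can be emulated with a computer algebra system; the harness below (ours, not the authors') accepts any antiderivative that differs from a correct one by a constant.

```python
import sympy as sp

def verify_integration(candidate: str, integrand: str, var: str = "x") -> bool:
    """Check a model's predicted antiderivative with a verifier, as in the
    evaluation methodology above: F is correct iff dF/dx - f simplifies to 0."""
    x = sp.Symbol(var)
    F, f = sp.sympify(candidate), sp.sympify(integrand)
    return sp.simplify(sp.diff(F, x) - f) == 0

# Example: the verifier accepts any antiderivative up to a constant.
assert verify_integration("x**2/2 + 7", "x")
```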
38

Nassar, Mazen, Sanku Dey, and Devendra Kumar. "Logarithm Transformed Lomax Distribution with Applications." Calcutta Statistical Association Bulletin 70, no. 2 (November 2018): 122–35. http://dx.doi.org/10.1177/0008068318808135.

Abstract:
In this article, we introduce a new method for generating distributions, which we refer to as the logarithm transformed (LT) method. Some statistical properties of the LT method are established. Based on the LT method, we introduce a new generalization of the Lomax distribution that provides better fits than the Lomax distribution and some of its known generalizations; we refer to the new distribution as the logarithmic transformed Lomax (LTL) distribution. Various properties of the LTL distribution, including explicit expressions for the moments, quantiles, moment generating function, incomplete moments, conditional moments, Rényi entropy, and order statistics, are derived. It is capable of allowing monotonically decreasing and upside-down bathtub-shaped hazard rates, depending on its parameters, so it turns out to be quite flexible for analysing non-negative real-life data. We discuss the estimation of the model parameters by the maximum likelihood method under a random censoring scheme. The proposed distribution is used to fit a censored data set and is shown to be more appropriate for the data than the compared distributions. 2010 Mathematics Subject Classification: 60E05, 60E10, 62E15.
40

Lotfollahi, Mohammad, Mohsen Naghipourfar, Fabian J. Theis, and F. Alexander Wolf. "Conditional out-of-distribution generation for unpaired data using transfer VAE." Bioinformatics 36, Supplement_2 (December 2020): i610—i617. http://dx.doi.org/10.1093/bioinformatics/btaa800.

Full text
Abstract:
Motivation While generative models have shown great success in sampling high-dimensional samples conditional on low-dimensional descriptors (stroke thickness in MNIST, hair color in CelebA, speaker identity in WaveNet), generation out-of-distribution poses fundamental problems due to the difficulty of learning a compact joint distribution across conditions. The canonical example of the conditional variational autoencoder (CVAE), for instance, does not explicitly relate conditions during training and, hence, has no explicit incentive to learn such a compact representation. Results We overcome this limitation of the CVAE by matching distributions across conditions using maximum mean discrepancy in the decoder layer that follows the bottleneck. This introduces a strong regularization both for reconstructing samples within the same condition and for transforming samples across conditions, resulting in much improved generalization. As this amounts to solving a style-transfer problem, we refer to the model as transfer VAE (trVAE). Benchmarking trVAE on high-dimensional image and single-cell RNA-seq data, we demonstrate higher robustness and higher accuracy than existing approaches. We also show qualitatively improved predictions by tackling previously problematic minority classes and multiple conditions in the context of cellular perturbation response to treatment and disease, based on high-dimensional single-cell gene expression data. For generic tasks, we improve Pearson correlations of high-dimensional estimated means and variances with their ground truths from 0.89 to 0.97 and 0.75 to 0.87, respectively. We further demonstrate that trVAE learns cell-type-specific responses after perturbation and improves the prediction of most cell-type-specific genes by 65%. Availability and implementation The trVAE implementation is available via github.com/theislab/trvae. The results of this article can be reproduced via github.com/theislab/trvae_reproducibility.
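The distribution-matching penalty described above is the standard maximum mean discrepancy; a minimal sketch of how it could be computed on decoder activations (the multi-scale kernel bandwidths are our assumptions, and this is the biased estimator, not necessarily the authors' exact choice):

import torch

def gaussian_kernel(a, b, gammas=(1e-3, 1e-2, 1e-1, 1.0)):
    d2 = torch.cdist(a, b) ** 2                       # pairwise squared distances
    return sum(torch.exp(-g * d2) for g in gammas)    # multi-scale RBF kernel

def mmd(h_cond0, h_cond1):
    """Biased squared-MMD estimate between activations of two conditions;
    driving it to zero matches the two distributions."""
    return (gaussian_kernel(h_cond0, h_cond0).mean()
            + gaussian_kernel(h_cond1, h_cond1).mean()
            - 2 * gaussian_kernel(h_cond0, h_cond1).mean())

# usage: add lambda_mmd * mmd(h0, h1) to the VAE loss, where h0 and h1 are
# first-decoder-layer activations for samples of the two conditions
h0, h1 = torch.randn(32, 128), torch.randn(32, 128) + 0.5
print(mmd(h0, h1).item())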
APA, Harvard, Vancouver, ISO, and other styles
41

Reyes, Jimmy, Mario A. Rojas, and Jaime Arrué. "A New Generalization of the Student’s t Distribution with an Application in Quantile Regression." Symmetry 13, no. 12 (December 17, 2021): 2444. http://dx.doi.org/10.3390/sym13122444.

Full text
Abstract:
In this work, we present a new generalization of the Student's t distribution. The new distribution is obtained as the quotient of two independent random variables: a standard normal distribution divided by a power of a chi-square distribution divided by its degrees of freedom. Thus, the new symmetric distribution has heavier tails than the Student's t distribution and extensions of the slash distribution. We develop a procedure for using quantile regression where the response variable or the residuals have high kurtosis. We give the density function expressed as an integral, and we obtain some important properties and some useful procedures for making inference, such as moment and maximum likelihood estimators. By way of illustration, we carry out two applications using real data: in the first, we provide maximum likelihood estimates for the parameters of the generalized Student's t distribution, the Student's t, the extended slash, the modified slash, the slash, and the double slash distributions; in the second, we perform quantile regression to fit a model where the response variable presents high kurtosis.
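The construction in the abstract generalizes the classical representation of a Student's t variable; writing the power as a free exponent q (the symbol is ours, chosen for illustration):

\[
T = \frac{Z}{\sqrt{V/\nu}} \quad \longrightarrow \quad T_q = \frac{Z}{\left(V/\nu\right)^{q}},
\qquad Z \sim N(0,1), \quad V \sim \chi^2_{\nu}, \quad Z \perp V,
\]

with q = 1/2 recovering the ordinary Student's t distribution and other exponents producing the heavier tails described above.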
APA, Harvard, Vancouver, ISO, and other styles
42

Mirzadeh, Saeed, and Anis Iranmanesh. "A new class of skew-logistic distribution." Mathematical Sciences 13, no. 4 (October 5, 2019): 375–85. http://dx.doi.org/10.1007/s40096-019-00306-8.

Full text
Abstract:
In this study, the researchers introduce a new class of the logistic distribution that can be used to model unimodal data with some skewness present. The new generalization is carried out using the basic idea of Nadarajah (Statistics 48(4):872–895, 2014) and is called the truncated-exponential skew-logistic (TESL) distribution. The TESL distribution is a member of the exponential family; therefore, the skewness parameter can be derived more easily. Meanwhile, some important statistical characteristics are presented; a real data set and simulation studies are used to evaluate the results. Also, the TESL distribution is compared to at least five other skew-logistic distributions.
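As a hedged sketch of the construction (we state the form from memory as an assumption; the exact expression should be checked against Nadarajah's paper), the truncated-exponential skew family applies

\[
G(x) = \frac{1 - e^{-\lambda F(x)}}{1 - e^{-\lambda}}, \qquad \lambda \neq 0, \qquad
F(x) = \frac{1}{1 + e^{-x}},
\]

where F is the baseline logistic CDF, so the single parameter \lambda controls the skewness of the TESL distribution.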
APA, Harvard, Vancouver, ISO, and other styles
43

Neeleman, Ad, and Kriszta Szendrői. "Radical Pro Drop and the Morphology of Pronouns." Linguistic Inquiry 38, no. 4 (October 2007): 671–714. http://dx.doi.org/10.1162/ling.2007.38.4.671.

Full text
Abstract:
We propose a new generalization governing the crosslinguistic distribution of radical pro drop (the type of pro drop found in Chinese). It occurs only in languages whose pronouns are agglutinating for case, number, or some other nominal feature. Other types of languages cannot omit pronouns freely, although they may have agreement-based pro drop. This generalization can for the most part be derived from three assumptions. (a) Spell-out rules for pronouns may target nonterminal categories. (b) Pro drop is zero spell-out (i.e., deletion) of regular pronouns. (c) Competition between spell-out rules is governed by the Elsewhere Principle. A full derivation relies on an acquisitional strategy motivated by the absence of negative evidence. We test our proposal using data from a sample of twenty languages and The World Atlas of Language Structures (Haspelmath et al. 2005).
APA, Harvard, Vancouver, ISO, and other styles
44

Hassan et al. "A new generalization of the inverse Lomax distribution with statistical properties and applications." International Journal of Advanced and Applied Sciences 8, no. 4 (April 2021): 89–97. http://dx.doi.org/10.21833/ijaas.2021.04.011.

Full text
Abstract:
In this paper, we introduce a new generalization of the inverse Lomax distribution with one extra shape parameter, the so-called power inverse Lomax (PIL) distribution, derived by using the power transformation method. We provide a more flexible density function with right-skewed, uni-modal, and reversed-J shapes. The new three-parameter lifetime distribution is capable of modeling decreasing, reversed-J, and upside-down hazard rate shapes. Some statistical properties of the PIL distribution are explored, such as the quantile measure, moments, moment generating function, incomplete moments, residual life function, and entropy measure. The estimation of the model parameters is discussed using the maximum likelihood, least squares, and weighted least squares methods. A simulation study is carried out to compare the efficiencies of the different estimation methods. This study indicates that the maximum likelihood estimates are more efficient than the corresponding least squares and weighted least squares estimates in most situations. Also, the mean square errors of all estimates decrease as the sample size increases. Further, two real data applications are provided in order to examine the flexibility of the PIL model by comparing it with some known distributions. The PIL model offers a more flexible distribution for modeling lifetime data and provides better fits than other models such as the inverse Lomax, inverse Weibull, and generalized inverse Weibull.
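The power transformation referred to above typically proceeds as follows (the parameterization is our guess at the standard convention, not necessarily the paper's): if X follows the inverse Lomax law with CDF F(x) = (1 + \lambda/x)^{-\alpha}, then Y = X^{1/\beta} has

\[
F_{\mathrm{PIL}}(y) = \bigl(1 + \lambda\, y^{-\beta}\bigr)^{-\alpha}, \qquad y > 0,\ \alpha, \beta, \lambda > 0,
\]

with the extra shape parameter \beta supplying the reversed-J and upside-down hazard shapes.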
APA, Harvard, Vancouver, ISO, and other styles
45

Li, Dasen, Zhendong Yin, Yanlong Zhao, Wudi Zhao, and Jiqing Li. "MLFAnet: A Tomato Disease Classification Method Focusing on OOD Generalization." Agriculture 13, no. 6 (May 29, 2023): 1140. http://dx.doi.org/10.3390/agriculture13061140.

Full text
Abstract:
Tomato disease classification based on images of leaves has received wide attention recently. As one of the best tomato disease classification methods, the convolutional neural network (CNN) has had an immense impact due to its impressive performance. However, this performance is verified on independent and identically distributed (IID) samples of tomato disease and breaks down dramatically on out-of-distribution (OOD) classification tasks. In this paper, we investigated corruption shifts, a vital component of OOD, and proposed a tomato disease classification method that improves generalization under corruption shift. We first adopted the discrete cosine transform (DCT) to obtain the low-frequency components. Then, the weight of the feature map was calculated from multiple low-frequency components, in order to reduce the influence of the high-frequency variation caused by corruption perturbations. The proposed method, termed multiple low-frequency attention network (MLFAnet), was verified on the ImageNet-C benchmark. The accuracy results and generalization performance confirmed the effectiveness of MLFAnet. The satisfactory generalization performance of our proposed classification method provides a reliable tool for the diagnosis of tomato disease.
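A rough sketch of the low-frequency attention idea (the frequency cutoff, the energy-based weighting, and all names are our assumptions; the paper's exact scheme differs in detail):

import numpy as np
from scipy.fft import dctn

def low_freq_channel_weights(feat, k=4):
    """feat: (C, H, W) feature map. Keep the k x k lowest DCT frequencies of
    each channel and turn their energy into a channel attention weight."""
    C, H, W = feat.shape
    energies = np.empty(C)
    for c in range(C):
        coeffs = dctn(feat[c], norm='ortho')          # 2-D DCT-II of one channel
        energies[c] = np.abs(coeffs[:k, :k]).sum()    # low-frequency energy only
    return energies / (energies.sum() + 1e-8)         # normalized attention weights

feat = np.random.randn(8, 32, 32)
print(low_freq_channel_weights(feat))

Because the corruption perturbations show up mainly as high-frequency variation, weighting channels by low-frequency energy damps their influence.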
APA, Harvard, Vancouver, ISO, and other styles
46

Xu, Xiaofeng, Ivor W. Tsang, and Chuancai Liu. "Improving Generalization via Attribute Selection on Out-of-the-Box Data." Neural Computation 32, no. 2 (February 2020): 485–514. http://dx.doi.org/10.1162/neco_a_01256.

Full text
Abstract:
Zero-shot learning (ZSL) aims to recognize unseen objects (test classes) given some other seen objects (training classes) by sharing information of attributes between different objects. Attributes are artificially annotated for objects and treated equally in recent ZSL tasks. However, some inferior attributes with poor predictability or poor discriminability may have negative impacts on the ZSL system performance. This letter first derives a generalization error bound for ZSL tasks. Our theoretical analysis verifies that selecting the subset of key attributes can improve the generalization performance of the original ZSL model, which uses all the attributes. Unfortunately, previous attribute selection methods have been conducted based on the seen data, and their selected attributes have poor generalization capability to the unseen data, which is unavailable in the training stage of ZSL tasks. Inspired by learning from pseudo-relevance feedback, this letter introduces out-of-the-box data—pseudo-data generated by an attribute-guided generative model—to mimic the unseen data. We then present an iterative attribute selection (IAS) strategy that iteratively selects key attributes based on the out-of-the-box data. Since the distribution of the generated out-of-the-box data is similar to that of the test data, the key attributes selected by IAS can be effectively generalized to test data. Extensive experiments demonstrate that IAS can significantly improve existing attribute-based ZSL methods and achieve state-of-the-art performance.
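A schematic sketch of the iterative selection loop (the scoring interface and the greedy re-selection are our simplifications of the general idea, not the authors' algorithm):

import numpy as np

def iterative_attribute_selection(attr_score, n_attrs, k, n_iters=5, seed=0):
    """attr_score(subset) -> per-attribute usefulness, as measured on
    out-of-the-box data generated under the current attribute subset."""
    rng = np.random.default_rng(seed)
    subset = rng.choice(n_attrs, size=k, replace=False)
    for _ in range(n_iters):
        scores = attr_score(subset)            # length-n_attrs usefulness vector
        subset = np.argsort(scores)[-k:]       # keep the k best attributes
    return np.sort(subset)

# toy usage with a fixed ground-truth usefulness vector
true_usefulness = np.linspace(0.0, 1.0, 20)
print(iterative_attribute_selection(lambda s: true_usefulness, 20, 5))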
APA, Harvard, Vancouver, ISO, and other styles
47

Wurmbrand, Susi. "Stripping and Topless Complements." Linguistic Inquiry 48, no. 2 (April 2017): 341–66. http://dx.doi.org/10.1162/ling_a_00245.

Full text
Abstract:
This article shows that stripping, the elision of declarative TPs, is possible not only in coordinate structures, but also in embedded clauses—however, only when the complementizer is absent. This Embedded Stripping Generalization is not predicted by earlier accounts of stripping, but it falls out from a certain combination of independently available assumptions. Specifically, I propose a zero Spell-Out view of ellipsis in a dynamic (or contextual) phasehood approach, which, together with the lack of a CP layer in that-less embedded clauses, derives this generalization in languages like English. I then briefly consider stripping in other languages and suggest that the analysis also has the flexibility to accommodate crosslinguistic differences in the distribution of stripping.
APA, Harvard, Vancouver, ISO, and other styles
48

Yu, Shujian. "The Analysis of Deep Neural Networks by Information Theory: From Explainability to Generalization." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 13 (June 26, 2023): 15462. http://dx.doi.org/10.1609/aaai.v37i13.26829.

Full text
Abstract:
Despite their great success in many artificial intelligence tasks, deep neural networks (DNNs) still suffer from a few limitations, such as poor generalization behavior for out-of-distribution (OOD) data and the "black-box" nature. Information theory offers fresh insights to solve these challenges. In this short paper, we briefly review the recent developments in this area, and highlight our contributions.
APA, Harvard, Vancouver, ISO, and other styles
49

Yu, Runpeng, Hong Zhu, Kaican Li, Lanqing Hong, Rui Zhang, Nanyang Ye, Shao-Lun Huang, and Xiuqiang He. "Regularization Penalty Optimization for Addressing Data Quality Variance in OoD Algorithms." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 8 (June 28, 2022): 8945–53. http://dx.doi.org/10.1609/aaai.v36i8.20877.

Full text
Abstract:
Due to the poor generalization performance of traditional empirical risk minimization (ERM) in the case of distributional shift, Out-of-Distribution (OoD) generalization algorithms receive increasing attention. However, OoD generalization algorithms overlook the great variance in the quality of training data, which significantly compromises the accuracy of these methods. In this paper, we theoretically reveal the relationship between training data quality and algorithm performance, and analyze the optimal regularization scheme for Lipschitz regularized invariant risk minimization. A novel algorithm is proposed based on the theoretical results to alleviate the influence of low quality data at both the sample level and the domain level. The experiments on both the regression and classification benchmarks validate the effectiveness of our method with statistical significance.
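For context, the invariant-risk-minimization penalty that such Lipschitz-regularized schemes build on can be sketched as follows (this is the generic IRMv1-style penalty, not the paper's specific regularization or its data-quality weighting):

import torch
import torch.nn.functional as F

def irm_penalty(logits, y):
    """Squared gradient of the per-environment risk w.r.t. a dummy scale
    multiplying the classifier output."""
    scale = torch.ones(1, requires_grad=True)
    loss = F.binary_cross_entropy_with_logits(logits * scale, y)
    (grad,) = torch.autograd.grad(loss, scale, create_graph=True)
    return (grad ** 2).sum()

logits = torch.randn(16)
y = torch.randint(0, 2, (16,)).float()
print(irm_penalty(logits, y).item())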
APA, Harvard, Vancouver, ISO, and other styles
50

Sinha, Samarth, Homanga Bharadhwaj, Anirudh Goyal, Hugo Larochelle, Animesh Garg, and Florian Shkurti. "DIBS: Diversity Inducing Information Bottleneck in Model Ensembles." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 11 (May 18, 2021): 9666–74. http://dx.doi.org/10.1609/aaai.v35i11.17163.

Full text
Abstract:
Although deep learning models have achieved state-of-the-art performance on a number of vision tasks, generalization over high-dimensional multi-modal data and reliable predictive uncertainty estimation are still active areas of research. Bayesian approaches, including Bayesian Neural Nets (BNNs), do not scale well to modern computer vision tasks, as they are difficult to train and have poor generalization under dataset shift. This motivates the need for effective ensembles that can generalize and give reliable uncertainty estimates. In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction. We explicitly optimize a diversity-inducing adversarial loss for learning the stochastic latent variables and thereby obtain the diversity in the output predictions necessary for modeling multi-modal data. We evaluate our method on benchmark datasets (MNIST, CIFAR100, TinyImageNet, and MIT Places 2) and, compared to the most competitive baselines, show significant improvements: over 10% relative improvement in classification accuracy, over 5% relative improvement in generalizing under dataset shift, and over 5% better predictive uncertainty estimation as inferred by efficient out-of-distribution (OOD) detection.
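A generic diversity penalty over ensemble predictions, sketched only to illustrate the goal (the paper's adversarial information-bottleneck objective over latent variables is more involved than this):

import torch

def prediction_diversity(probs):
    """probs: (M, N, K) softmax outputs of M ensemble members; returns the
    mean pairwise squared distance between members (higher = more diverse)."""
    M = probs.shape[0]
    total = torch.zeros(())
    for i in range(M):
        for j in range(i + 1, M):
            total = total + (probs[i] - probs[j]).pow(2).sum(dim=-1).mean()
    return total / (M * (M - 1) / 2)

probs = torch.softmax(torch.randn(4, 32, 10), dim=-1)
print(prediction_diversity(probs).item())   # maximize alongside the task loss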
APA, Harvard, Vancouver, ISO, and other styles