Journal articles on the topic 'Invariant representation learning'

To see the other types of publications on this topic, follow the link: Invariant representation learning.

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Consult the top 50 journal articles for your research on the topic 'Invariant representation learning.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse journal articles on a wide variety of disciplines and organise your bibliography correctly.

1

Zhu, Zheng-Mao, Shengyi Jiang, Yu-Ren Liu, Yang Yu, and Kun Zhang. "Invariant Action Effect Model for Reinforcement Learning." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 8 (June 28, 2022): 9260–68. http://dx.doi.org/10.1609/aaai.v36i8.20913.

Abstract:
Good representations can help RL agents perform concise modeling of their surroundings, and thus support effective decision-making in complex environments. Previous methods learn good representations by imposing extra constraints on dynamics. However, from a causal perspective, those methods do not fully consider the causation between an action and its effect, and therefore ignore the underlying relations among action effects on transitions. Based on the intuition that the same action always causes similar effects among different states, we induce such causation by taking the invariance of action effects among states as the relation. By explicitly utilizing such invariance, in this paper, we show that a better representation can be learned and potentially improves the sample efficiency and the generalization ability of the learned policy. We propose the Invariant Action Effect Model (IAEM) to capture the invariance in action effects, where the effect of an action is represented as the residual of representations from neighboring states. IAEM is composed of two parts: (1) a new contrastive-based loss to capture the underlying invariance of action effects; (2) an individual action effect model with a self-adapted weighting strategy to tackle the corner cases where the invariance does not hold. Extensive experiments on two benchmarks, i.e., Grid-World and Atari, show that the representations learned by IAEM preserve the invariance of action effects. Moreover, with the invariant action effect, IAEM can accelerate the learning process by 1.6x, rapidly generalize to new environments by fine-tuning on a few components, and outperform other dynamics-based representation methods by 1.4x in limited steps.
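
To make the two components concrete, here is a minimal PyTorch sketch, not the authors' code: the names `encoder`, `action_effect`, and `effect_invariance_loss` are illustrative, an InfoNCE-style loss stands in for the paper's contrastive term, and the self-adapted weighting is omitted.

```python
import torch
import torch.nn.functional as F

def action_effect(encoder, s, s_next):
    # The effect of an action, represented as the residual between the
    # representations of neighboring states.
    return encoder(s_next) - encoder(s)

def effect_invariance_loss(effects, actions, temperature=0.1):
    # effects: (N, d) action-effect residuals; actions: (N,) action ids.
    # Residuals produced by the same action in different states are
    # treated as positive pairs and pulled together.
    z = F.normalize(effects, dim=1)
    n = z.size(0)
    sim = z @ z.t() / temperature
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, float("-inf"))
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos = (actions.unsqueeze(0) == actions.unsqueeze(1)) & ~self_mask
    if pos.any():
        return -log_prob[pos].mean()
    return effects.new_zeros(())
```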
2

Shui, Changjian, Boyu Wang, and Christian Gagné. "On the benefits of representation regularization in invariance based domain generalization." Machine Learning 111, no. 3 (January 1, 2022): 895–915. http://dx.doi.org/10.1007/s10994-021-06080-w.

Abstract:
A crucial aspect of reliable machine learning is to design a deployable system that generalizes to new, related but unobserved environments. Domain generalization aims to alleviate such a prediction gap between the observed and unseen environments. Previous approaches commonly incorporated learning the invariant representation to achieve good empirical performance. In this paper, we reveal that merely learning the invariant representation is vulnerable to related unseen environments. To this end, we derive a novel theoretical analysis to control the unseen test environment error in representation learning, which highlights the importance of controlling the smoothness of the representation. In practice, our analysis further inspires an efficient regularization method to improve robustness in domain generalization. The proposed regularization is orthogonal to, and can be straightforwardly adopted in, existing domain generalization algorithms that ensure invariant representation learning. Empirical results show that our algorithm outperforms the base versions on various datasets and invariance criteria.
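
The abstract does not spell out the regularizer, so the following is only a generic illustration of "controlling the smoothness of the representation" as an additive penalty: a Hutchinson-style estimate of the squared Jacobian norm of a featurizer. All names (`featurizer`, `jacobian_penalty`) are assumptions, not the paper's construction.

```python
import torch

def jacobian_penalty(featurizer, x):
    # Hutchinson-style estimate of E||J(x) v||^2 over v ~ N(0, I), an
    # unbiased proxy for the squared Frobenius norm of the Jacobian of
    # the featurizer at x -- one generic way to penalize non-smoothness.
    # x is a batched input of shape (N, ...).
    x = x.clone().requires_grad_(True)
    z = featurizer(x)
    v = torch.randn_like(z)
    (grad,) = torch.autograd.grad((z * v).sum(), x, create_graph=True)
    return grad.pow(2).flatten(1).sum(dim=1).mean()

# e.g. total = task_loss + invariance_loss + lam * jacobian_penalty(f, x)
```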
3

Hyun, Jaeguk, ChanYong Lee, Hoseong Kim, Hyunjung Yoo, and Eunjin Koh. "Learning Domain Invariant Representation via Self-Regularization." Journal of the Korea Institute of Military Science and Technology 24, no. 4 (August 5, 2021): 382–91. http://dx.doi.org/10.9766/kimst.2021.24.4.382.

Abstract:
Unsupervised domain adaptation often gives impressive solutions to handle domain shift of data. Most current approaches assume that unlabeled target data for training are abundant. This assumption is not always true in practice. To tackle this issue, we propose a general solution to the domain gap minimization problem without any target data. Our method consists of two regularization steps. The first step is pixel regularization by arbitrary style transfer. Recently, some methods have brought style transfer algorithms into the domain adaptation and domain generalization process. They use style transfer algorithms to remove texture bias in source domain data. We also use style transfer algorithms to remove texture bias, but our method depends on neither the domain adaptation nor the domain generalization paradigm. The second regularization step is feature regularization by feature alignment. By adding a feature alignment loss term to the model loss, the model learns domain-invariant representation more efficiently. We evaluate our regularization method in several experiments on both small and large datasets. From the experiments, we show that our model can learn domain-invariant representation as well as unsupervised domain adaptation methods do.
4

Aggarwal, Karan, Shafiq Joty, Luis Fernandez-Luque, and Jaideep Srivastava. "Adversarial Unsupervised Representation Learning for Activity Time-Series." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 834–41. http://dx.doi.org/10.1609/aaai.v33i01.3301834.

Abstract:
Sufficient physical activity and restful sleep play a major role in the prevention and cure of many chronic conditions. Being able to proactively screen and monitor such chronic conditions would be a big step forward for overall health. The rapid increase in the popularity of wearable devices provides a significant new source, making it possible to track the user's lifestyle in real time. In this paper, we propose a novel unsupervised representation learning technique called activity2vec that learns and "summarizes" the discrete-valued activity time-series. It learns the representations with three components: (i) the co-occurrence and magnitude of the activity levels in a time-segment, (ii) neighboring context of the time-segment, and (iii) promoting subject-invariance with adversarial training. We evaluate our method on four disorder prediction tasks using linear classifiers. Empirical evaluation demonstrates that our proposed method scales and performs better than many strong baselines. The adversarial regime helps improve the generalizability of our representations by promoting subject-invariant features. We also show that using the representations at the level of a day works best, since human activity is structured in terms of daily routines.
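
Component (iii) is the one that enforces invariance; a common way to implement such adversarial subject-invariance is a gradient reversal layer, sketched below in PyTorch. This is a standard construction, not necessarily the authors' exact scheme, and `subject_clf` is an assumed auxiliary classifier.

```python
import torch
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    # Identity on the forward pass; negated, scaled gradient on the
    # backward pass, so the encoder is trained to fool the classifier.
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

def subject_adversarial_loss(features, subject_ids, subject_clf, lam=1.0):
    # The subject classifier tries to identify who produced the signal;
    # through the reversed gradient, the encoder learns features from
    # which the subject cannot be recovered (subject invariance).
    logits = subject_clf(GradReverse.apply(features, lam))
    return F.cross_entropy(logits, subject_ids)
```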
5

Wu, Yue, Hongfu Liu, Jun Li, and Yun Fu. "Improving face representation learning with center invariant loss." Image and Vision Computing 79 (November 2018): 123–32. http://dx.doi.org/10.1016/j.imavis.2018.09.010.

6

Byrne, Patrick, and Suzanna Becker. "A Principle for Learning Egocentric-Allocentric Transformation." Neural Computation 20, no. 3 (March 2008): 709–37. http://dx.doi.org/10.1162/neco.2007.10-06-361.

Abstract:
Numerous single-unit recording studies have found mammalian hippocampal neurons that fire selectively for the animal's location in space, independent of its orientation. The population of such neurons, commonly known as place cells, is thought to maintain an allocentric, or orientation-independent, internal representation of the animal's location in space, as well as mediating long-term storage of spatial memories. The fact that spatial information from the environment must reach the brain via sensory receptors in an inherently egocentric, or viewpoint-dependent, fashion leads to the question of how the brain learns to transform egocentric sensory representations into allocentric ones for long-term memory storage. Additionally, if these long-term memory representations of space are to be useful in guiding motor behavior, then the reverse transformation, from allocentric to egocentric coordinates, must also be learned. We propose that orientation-invariant representations can be learned by neural circuits that follow two learning principles: minimization of reconstruction error and maximization of representational temporal inertia. Two different neural network models are presented that adhere to these learning principles, the first by direct optimization through gradient descent and the second using a more biologically realistic circuit based on the restricted Boltzmann machine (Hinton, 2002; Smolensky, 1986). Both models lead to orientation-invariant representations, with the latter demonstrating place-cell-like responses when trained on a linear track environment.
7

Xu, Qi, Liang Yao, Zhengkai Jiang, Guannan Jiang, Wenqing Chu, Wenhui Han, Wei Zhang, Chengjie Wang, and Ying Tai. "DIRL: Domain-Invariant Representation Learning for Generalizable Semantic Segmentation." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 3 (June 28, 2022): 2884–92. http://dx.doi.org/10.1609/aaai.v36i3.20193.

Abstract:
Model generalization to unseen scenes is crucial to real-world applications, such as autonomous driving, which require robust vision systems. To enhance model generalization, domain generalization through learning the domain-invariant representation has been widely studied. However, most existing works learn the shared feature space within multi-source domains but ignore the characteristics of the feature itself (e.g., the feature's sensitivity to domain-specific style). Therefore, we propose Domain-invariant Representation Learning (DIRL) for domain generalization, which utilizes feature sensitivity as a feature prior to guide the enhancement of the model's generalization capability. The guidance is reflected in two ways: (1) feature re-calibration, which introduces the Prior Guided Attention Module (PGAM) to emphasize insensitive features and suppress sensitive features; (2) feature whitening, which proposes Guided Feature Whitening (GFW) to remove feature correlations that are sensitive to domain-specific style. We construct the domain-invariant representation by suppressing the effect of domain-specific style on the quality and correlation of the features. As a result, our method is simple yet effective, and can enhance the robustness of various backbone networks with little computational cost. Extensive experiments over multiple domain-generalizable segmentation tasks show the superiority of our approach over other methods.
8

Qin, Cao, Yunzhou Zhang, Yan Liu, Sonya Coleman, Dermot Kerr, and Guanghao Lv. "Appearance-invariant place recognition by adversarially learning disentangled representation." Robotics and Autonomous Systems 131 (September 2020): 103561. http://dx.doi.org/10.1016/j.robot.2020.103561.

9

Liang, Sen, Zhi-ze Zhou, Yu-dong Guo, Xuan Gao, Ju-yong Zhang, and Hu-jun Bao. "Facial landmark disentangled network with variational autoencoder." Applied Mathematics-A Journal of Chinese Universities 37, no. 2 (June 2022): 290–305. http://dx.doi.org/10.1007/s11766-022-4589-0.

Abstract:
Learning disentangled representations of data is a key problem in deep learning. Specifically, disentangling 2D facial landmarks into different factors (e.g., identity and expression) is widely used in applications such as face reconstruction, face reenactment, and talking heads. However, due to the sparsity of landmarks and the lack of accurate labels for the factors, it is hard to learn the disentangled representation of landmarks. To address these problems, we propose a simple and effective model named FLD-VAE to disentangle arbitrary facial landmarks into identity and expression latent representations, based on a Variational Autoencoder framework. Besides, we propose three invariant loss functions at both the latent and data levels to constrain the invariance of representations during the training stage. Moreover, we implement an identity preservation loss to further enhance the representation ability of the identity factor. To the best of our knowledge, this is the first work to disentangle identity and expression factors simultaneously, end-to-end, from a single set of facial landmarks.
10

Bradski, Gary, Gail A. Carpenter, and Stephen Grossberg. "Working Memory Networks for Learning Temporal Order with Application to Three-Dimensional Visual Object Recognition." Neural Computation 4, no. 2 (March 1992): 270–86. http://dx.doi.org/10.1162/neco.1992.4.2.270.

Abstract:
Working memory neural networks, called Sustained Temporal Order REcurrent (STORE) models, encode the invariant temporal order of sequential events in short-term memory (STM). Inputs to the networks may be presented with widely differing growth rates, amplitudes, durations, and interstimulus intervals without altering the stored STM representation. The STORE temporal order code is designed to enable groupings of the stored events to be stably learned and remembered in real time, even as new events perturb the system. Such invariance and stability properties are needed in neural architectures which self-organize learned codes for variable-rate speech perception, sensorimotor planning, or three-dimensional (3-D) visual object recognition. Using such a working memory, a self-organizing architecture for invariant 3-D visual object recognition is described. The new model is based on the model of Seibert and Waxman (1990a), which builds a 3-D representation of an object from a temporally ordered sequence of its two-dimensional (2-D) aspect graphs. The new model, called an ARTSTORE model, consists of the following cascade of processing modules: Invariant Preprocessor → ART 2 → STORE Model → ART 2 → Outstar Network.
11

Weng, Juyang, Tianyu Luwang, Hong Lu, and Xiangyang Xue. "A Multilayer In-Place Learning Network for Development of General Invariances." International Journal of Humanoid Robotics 4, no. 2 (June 2007): 281–320. http://dx.doi.org/10.1142/s0219843607001072.

Abstract:
Currently, there is a lack of general-purpose, in-place learning engines that incrementally learn multiple tasks, to develop "soft" multi-task-shared invariances in the intermediate internal representation while a developmental robot interacts with its environment. In-place learning is a biologically inspired concept, rooted in the genomic equivalence principle, meaning that each neuron is responsible for its own development while interacting with its environment. With in-place learning, there is no need for a separate learning network. Computationally, biologically inspired in-place learning provides unusually efficient learning algorithms whose simplicity, low computational complexity, and generality set them apart from typical conventional learning algorithms. We present in this paper the multiple-layer in-place learning network (MILN) for this ambitious goal. As a key requirement for autonomous mental development, the network enables unsupervised and supervised learning to occur concurrently, depending on whether motor supervision signals are available at the motor end (the last layer) during the agent's interactions with the environment. We present principles based on which MILN automatically develops invariant neurons in different layers, and explain why such invariant neuronal clusters are important for learning later tasks in open-ended development. From sequentially sensed sensory streams, the proposed MILN incrementally develops a hierarchy of internal representations. Global invariance is achieved through multilayer invariances, with invariance increasing from early layers to later layers. Experimental results with statistical performance measures are presented to show the effects of these principles.
12

Shankar, Karthik H., and Marc W. Howard. "A Scale-Invariant Internal Representation of Time." Neural Computation 24, no. 1 (January 2012): 134–93. http://dx.doi.org/10.1162/neco_a_00212.

Abstract:
We propose a principled way to construct an internal representation of the temporal stimulus history leading up to the present moment. A set of leaky integrators performs a Laplace transform on the stimulus function, and a linear operator approximates the inversion of the Laplace transform. The result is a representation of stimulus history that retains information about the temporal sequence of stimuli. This procedure naturally represents more recent stimuli more accurately than less recent stimuli; the decrement in accuracy is precisely scale invariant. This procedure also yields time cells that fire at specific latencies following the stimulus with a scale-invariant temporal spread. Combined with a simple associative memory, this representation gives rise to a moment-to-moment prediction that is also scale invariant in time. We propose that this scale-invariant representation of temporal stimulus history could serve as an underlying representation accessible to higher-level behavioral and cognitive mechanisms. In order to illustrate the potential utility of this scale-invariant representation in a variety of fields, we sketch applications using minimal performance functions to problems in classical conditioning, interval timing, scale-invariant learning in autoshaping, and the persistence of the recency effect in episodic memory across timescales.
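
The construction lends itself to a compact numerical sketch. Below is an illustrative numpy version under assumptions of my own (Euler integration, a geometric grid of rate constants, k = 4): a bank of leaky integrators computes the Laplace transform of the stimulus history, and Post's inversion formula (the k-th derivative with respect to s) recovers a fuzzy, scale-invariant timeline.

```python
import math
import numpy as np

def scale_invariant_history(signal, s_values, k=4, dt=1.0):
    # Bank of leaky integrators dF/dt = -s*F + f(t): the Laplace
    # transform of the stimulus history, one unit per rate constant s.
    F = np.zeros(len(s_values))
    for f_t in signal:
        F += dt * (-s_values * F + f_t)
    # Approximate inversion via Post's formula: the k-th derivative of
    # F with respect to s, evaluated on the (nonuniform) s grid.
    dF = F.copy()
    for _ in range(k):
        dF = np.gradient(dF, s_values)
    f_tilde = ((-1) ** k / math.factorial(k)) * s_values ** (k + 1) * dF
    tau_star = k / s_values   # the past time each output unit represents
    return tau_star, f_tilde

# e.g. a pulse 60 steps in the past, read out on a log-spaced timeline
s = np.geomspace(0.005, 0.5, 60)
sig = np.zeros(100)
sig[40] = 1.0
tau, trace = scale_invariant_history(sig, s)
```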
13

Bae, Soo Hyun, Inkyu Choi, and Nam Soo Kim. "Disentangled Feature Learning for Noise-Invariant Speech Enhancement." Applied Sciences 9, no. 11 (June 3, 2019): 2289. http://dx.doi.org/10.3390/app9112289.

Abstract:
Most of the recently proposed deep learning-based speech enhancement techniques have focused on designing the neural network architectures as a black box. However, it is often beneficial to understand what kinds of hidden representations the model has learned. Since the real-world speech data are drawn from a generative process involving multiple entangled factors, disentangling the speech factor can encourage the trained model to result in better performance for speech enhancement. With the recent success in learning disentangled representation using neural networks, we explore a framework for disentangling speech and noise, which has not been exploited in the conventional speech enhancement algorithms. In this work, we propose a novel noise-invariant speech enhancement method which manipulates the latent features to distinguish between the speech and noise features in the intermediate layers using an adversarial training scheme. To compare the performance of the proposed method with other conventional algorithms, we conducted experiments in both the matched and mismatched noise conditions using the TIMIT and TSP speech datasets. Experimental results show that our model successfully disentangles the speech and noise latent features. Consequently, the proposed model not only achieves better enhancement performance but also offers a more robust noise-invariant property than the conventional speech enhancement techniques.
14

Wei, Yuheng, Junzhao Du, Hui Liu, and Zhipeng Zhang. "CentriForce: Multiple-Domain Adaptation for Domain-Invariant Speaker Representation Learning." IEEE Signal Processing Letters 29 (2022): 807–11. http://dx.doi.org/10.1109/lsp.2022.3154237.

15

Qin, Yidan, Max Allan, Yisong Yue, Joel W. Burdick, and Mahdi Azizian. "Learning Invariant Representation of Tasks for Robust Surgical State Estimation." IEEE Robotics and Automation Letters 6, no. 2 (April 2021): 3208–15. http://dx.doi.org/10.1109/lra.2021.3063014.

16

Wu, Junjun, Qingwu Shi, Qinghua Lu, Xilin Liu, Xiaoman Zhu, and Zeqin Lin. "Learning invariant semantic representation for long-term robust visual localization." Engineering Applications of Artificial Intelligence 111 (May 2022): 104793. http://dx.doi.org/10.1016/j.engappai.2022.104793.

17

Michler, Frank, Reinhard Eckhorn, and Thomas Wachtler. "Using Spatiotemporal Correlations to Learn Topographic Maps for Invariant Object Recognition." Journal of Neurophysiology 102, no. 2 (August 2009): 953–64. http://dx.doi.org/10.1152/jn.90651.2008.

Abstract:
The retinal image of visual objects can vary drastically with changes of viewing angle. Nevertheless, our visual system is capable of recognizing objects fairly invariant of viewing angle. Under natural viewing conditions, different views of the same object tend to occur in temporal proximity, thereby generating temporal correlations in the sequence of retinal images. Such spatial and temporal stimulus correlations can be exploited for learning invariant representations. We propose a biologically plausible mechanism that implements this learning strategy using the principle of self-organizing maps. We developed a network of spiking neurons that uses spatiotemporal correlations in the inputs to map different views of objects onto a topographic representation. After learning, different views of the same object are represented in a connected neighborhood of neurons. Model neurons of a higher processing area that receive unspecific input from a local neighborhood in the map show view-invariant selectivities for visual objects. The findings suggest a functional relevance of cortical topographic maps.
18

Wiskott, Laurenz, and Terrence J. Sejnowski. "Slow Feature Analysis: Unsupervised Learning of Invariances." Neural Computation 14, no. 4 (April 1, 2002): 715–70. http://dx.doi.org/10.1162/089976602317318938.

Abstract:
Invariant features of temporally varying signals are useful for analysis and classification. Slow feature analysis (SFA) is a new method for learning invariant or slowly varying features from a vectorial input signal. It is based on a nonlinear expansion of the input signal and application of principal component analysis to this expanded signal and its time derivative. It is guaranteed to find the optimal solution within a family of functions directly and can learn to extract a large number of decorrelated features, which are ordered by their degree of invariance. SFA can be applied hierarchically to process high-dimensional input signals and extract complex features. SFA is applied first to complex cell tuning properties based on simple cell output, including disparity and motion. Then more complicated input-output functions are learned by repeated application of SFA. Finally, a hierarchical network of SFA modules is presented as a simple model of the visual system. The same unstructured network can learn translation, size, rotation, contrast, or, to a lesser degree, illumination invariance for one-dimensional objects, depending on only the training stimulus. Surprisingly, only a few training objects suffice to achieve good generalization to new objects. The generated representation is suitable for object recognition. Performance degrades if the network is trained to learn multiple invariances simultaneously.
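
As a concrete reading of the algorithm in the first half of the abstract, here is a minimal numpy sketch (mine, not the authors' code): quadratic expansion, whitening, then the directions of smallest variance of the time derivative, ordered by slowness.

```python
import numpy as np

def quadratic_expand(x):
    # Nonlinear expansion: all degree-1 and degree-2 monomials of x (T, d).
    i, j = np.triu_indices(x.shape[1])
    return np.hstack([x, x[:, i] * x[:, j]])

def sfa(x, n_out=2):
    # Slow feature analysis on the (expanded) signal x of shape (T, d):
    # whiten, then keep the directions in which the discrete time
    # derivative has the smallest variance -- the slowest features.
    x = x - x.mean(axis=0)
    eigval, eigvec = np.linalg.eigh(np.cov(x, rowvar=False))
    keep = eigval > 1e-10
    whitener = eigvec[:, keep] / np.sqrt(eigval[keep])
    z = x @ whitener
    dz = np.diff(z, axis=0)
    dval, dvec = np.linalg.eigh(np.cov(dz, rowvar=False))  # ascending
    return z @ dvec[:, :n_out]   # columns ordered slowest-first

# e.g. slow = sfa(quadratic_expand(signal), n_out=2)
```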
19

Xie, Jiu-Cheng, Chi-Man Pun, and Kin-Man Lam. "Implicit and Explicit Feature Purification for Age-Invariant Facial Representation Learning." IEEE Transactions on Information Forensics and Security 17 (2022): 399–412. http://dx.doi.org/10.1109/tifs.2022.3142998.

20

Zhao, Shuyang, Jianwu Li, and Jiaxing Wang. "Disentangled representation learning and residual GAN for age-invariant face verification." Pattern Recognition 100 (April 2020): 107097. http://dx.doi.org/10.1016/j.patcog.2019.107097.

21

Zhang, Yang, Changhui Hu, and Xiaobo Lu. "IL-GAN: Illumination-invariant representation learning for single sample face recognition." Journal of Visual Communication and Image Representation 59 (February 2019): 501–13. http://dx.doi.org/10.1016/j.jvcir.2019.02.007.

22

Shao, Ming, Yizhe Zhang, and Yun Fu. "Collaborative Random Faces-Guided Encoders for Pose-Invariant Face Representation Learning." IEEE Transactions on Neural Networks and Learning Systems 29, no. 4 (April 2018): 1019–32. http://dx.doi.org/10.1109/tnnls.2017.2648122.

23

Kang, Hyungu, and Seokho Kang. "Semi-supervised rotation-invariant representation learning for wafer map pattern analysis." Engineering Applications of Artificial Intelligence 120 (April 2023): 105864. http://dx.doi.org/10.1016/j.engappai.2023.105864.

24

Waydo, Stephen, and Christof Koch. "Unsupervised Learning of Individuals and Categories from Images." Neural Computation 20, no. 5 (May 2008): 1165–78. http://dx.doi.org/10.1162/neco.2007.03-07-493.

Abstract:
Motivated by the existence of highly selective, sparsely firing cells observed in the human medial temporal lobe (MTL), we present an unsupervised method for learning and recognizing object categories from unlabeled images. In our model, a network of nonlinear neurons learns a sparse representation of its inputs through an unsupervised expectation-maximization process. We show that the application of this strategy to an invariant feature-based description of natural images leads to the development of units displaying sparse, invariant selectivity for particular individuals or image categories much like those observed in the MTL data.
25

Cao, Yingxin, Laiyi Fu, Jie Wu, Qinke Peng, Qing Nie, Jing Zhang, and Xiaohui Xie. "SAILER: scalable and accurate invariant representation learning for single-cell ATAC-seq processing and integration." Bioinformatics 37, Supplement 1 (July 1, 2021): i317–i326. http://dx.doi.org/10.1093/bioinformatics/btab303.

Abstract:
Motivation: Single-cell sequencing assay for transposase-accessible chromatin (scATAC-seq) provides new opportunities to dissect epigenomic heterogeneity and elucidate transcriptional regulatory mechanisms. However, computational modeling of scATAC-seq data is challenging due to its high dimension, extreme sparsity, complex dependencies and high sensitivity to confounding factors from various sources. Results: Here, we propose a new deep generative model framework, named SAILER, for analyzing scATAC-seq data. SAILER aims to learn a low-dimensional nonlinear latent representation of each cell that defines its intrinsic chromatin state, invariant to extrinsic confounding factors like read depth and batch effects. SAILER adopts the conventional encoder-decoder framework to learn the latent representation but imposes additional constraints to ensure the independence of the learned representations from the confounding factors. Experimental results on both simulated and real scATAC-seq datasets demonstrate that SAILER learns better and biologically more meaningful representations of cells than other methods. Its noise-free cell embeddings bring in significant benefits in downstream analyses: clustering and imputation based on SAILER result in 6.9% and 18.5% improvements over existing methods, respectively. Moreover, because no matrix factorization is involved, SAILER can easily scale to process millions of cells. We implemented SAILER into a software package, freely available to all for large-scale scATAC-seq data analysis. Availability and implementation: The software is publicly available at https://github.com/uci-cbcl/SAILER. Supplementary information: Supplementary data are available at Bioinformatics online.
26

Jurewicz, Mateusz, and Leon Derczynski. "Set-to-Sequence Methods in Machine Learning: A Review." Journal of Artificial Intelligence Research 71 (August 12, 2021): 885–924. http://dx.doi.org/10.1613/jair.1.12839.

Abstract:
Machine learning on sets towards sequential output is an important and ubiquitous task, with applications ranging from language modelling and meta-learning to multi-agent strategy games and power grid optimization. Combining elements of representation learning and structured prediction, its two primary challenges include obtaining a meaningful, permutation invariant set representation and subsequently utilizing this representation to output a complex target permutation. This paper provides a comprehensive introduction to the field as well as an overview of important machine learning methods tackling both of these key challenges, with a detailed qualitative comparison of selected model architectures.
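
For the first challenge, permutation-invariant set encodings are often of the sum-decomposition form popularized by Deep Sets. A toy sketch, with illustrative stand-in maps rather than anything from the paper:

```python
import numpy as np

def set_representation(elements, phi, rho):
    # Sum-decomposition rho(sum_i phi(x_i)): summation is order-
    # independent, so any permutation of the input set yields the
    # same representation.
    return rho(sum(phi(x) for x in elements))

# toy 'networks' standing in for learned maps
phi = np.tanh
rho = lambda z: z / (1.0 + np.linalg.norm(z))

a = set_representation([np.array([1.0, 2.0]), np.array([3.0, -1.0])], phi, rho)
b = set_representation([np.array([3.0, -1.0]), np.array([1.0, 2.0])], phi, rho)
assert np.allclose(a, b)   # invariant to input order
```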
27

Qin, RuoXi, Huike Zhang, LingYun Jiang, Kai Qiao, Jinjin Hai, Jian Chen, Junling Xu, Dapeng Shi, and Bin Yan. "Multicenter Computer-Aided Diagnosis for Lymph Nodes Using Unsupervised Domain-Adaptation Networks Based on Cross-Domain Confounding Representations." Computational and Mathematical Methods in Medicine 2020 (January 24, 2020): 1–10. http://dx.doi.org/10.1155/2020/3709873.

Abstract:
To achieve robust, high-performance computer-aided diagnosis systems for lymph nodes, CT images are typically collected from multiple centers, which causes models to perform inconsistently across different data source centers. The variability adaptation problem for lymph node data, which is related to the domain adaptation problem in deep learning, differs from the general domain adaptation problem because of the typically larger CT image size and more complex data distributions. Therefore, domain adaptation for this problem needs to consider the shared feature representation and even the conditioning information of each domain, so that the adaptation network can capture significant discriminative representations in a domain-invariant space. This paper extracts domain-invariant features based on a cross-domain confounding representation and proposes a cycle-consistency learning framework to encourage the network to preserve class-conditioning information through cross-domain image translations. Compared with different domain adaptation methods, the accuracy of our method is at least 4.4 percentage points higher on multicenter lymph node data. The pixel-level cross-domain image mapping and the semantic-level cycle consistency provide a stable confounding representation with class-conditioning information to achieve effective domain adaptation under complex feature distributions.
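
The cycle-consistency term referenced here is, in its generic CycleGAN-style form, easy to state; the sketch below uses illustrative names (`G_st`, `G_ts`) and omits the paper's class-conditioning and adversarial terms.

```python
import torch

def cycle_consistency_loss(G_st, G_ts, x_s, x_t):
    # G_st: source -> target image translator; G_ts: target -> source.
    # Translating across domains and back should reproduce the input,
    # which discourages the translators from discarding content.
    cycle_s = torch.mean(torch.abs(G_ts(G_st(x_s)) - x_s))
    cycle_t = torch.mean(torch.abs(G_st(G_ts(x_t)) - x_t))
    return cycle_s + cycle_t
```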
28

Warikoo, Neha, Yung-Chun Chang, and Shang-Pin Ma. "Gradient Boosting over Linguistic-Pattern-Structured Trees for Learning Protein–Protein Interaction in the Biomedical Literature." Applied Sciences 12, no. 20 (October 11, 2022): 10199. http://dx.doi.org/10.3390/app122010199.

Abstract:
Protein-based studies contribute significantly to gathering functional information about biological systems; therefore, the protein–protein interaction detection task is one of the most researched topics in the biomedical literature. To this end, many state-of-the-art systems using syntactic tree kernels (TK) and deep learning have been developed. However, these models are computationally complex and have limited learning interpretability. In this paper, we introduce a linguistic-pattern-representation-based Gradient-Tree Boosting model, i.e., LpGBoost. It uses linguistic patterns to optimize and generate semantically relevant representation vectors for learning over the gradient-tree boosting. The patterns are learned via unsupervised modeling by clustering invariant semantic features. These linguistic representations are semi-interpretable with rich semantic knowledge, and owing to their shallow representation, they are also computationally less expensive. Our experiments with six protein–protein interaction (PPI) corpora demonstrate that LpGBoost outperforms the SOTA tree-kernel models, as well as the CNN-based interaction detection studies for BioInfer and AIMed corpora.
29

Yuan, Zixuan, Hao Liu, Renjun Hu, Denghui Zhang, and Hui Xiong. "Self-Supervised Prototype Representation Learning for Event-Based Corporate Profiling." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 5 (May 18, 2021): 4644–52. http://dx.doi.org/10.1609/aaai.v35i5.16594.

Abstract:
Event-based corporate profiling aims to assess the evolving operational status of the corresponding corporate from its event sequence. Existing studies on corporate profiling have partially addressed the problem via (i) case-by-case empirical analysis by leveraging traditional financial methods, or (ii) automatic profile inference by reformulating the problem into a supervised learning task. However, both approaches heavily rely on domain knowledge and are labor-intensive. More importantly, the task-specific nature of both approaches prevents the obtained corporate profiles from being applied to diversified downstream applications. To this end, in this paper, we propose a Self-Supervised Prototype Representation Learning (SePaL) framework for dynamic corporate profiling. By exploiting the topological information of an event graph and exploring self-supervised learning techniques, SePaL can obtain unified corporate representations that are robust to event noises and can be easily fine-tuned to benefit various downstream applications with only a small amount of annotated data. Specifically, we first infer the initial cluster distribution of noise-resistant event prototypes based on latent representations of events. Then, we construct four permutation-invariant self-supervision signals to guide the representation learning of the event prototype. In terms of applications, we exploit the learned time-evolving corporate representations for both stock price spike prediction and corporate default risk evaluation. Experimental results on two real-world corporate event datasets demonstrate the effectiveness of SePaL for these two applications.
30

Achille, Alessandro, and Stefano Soatto. "A Separation Principle for Control in the Age of Deep Learning." Annual Review of Control, Robotics, and Autonomous Systems 1, no. 1 (May 28, 2018): 287–307. http://dx.doi.org/10.1146/annurev-control-060117-105140.

Abstract:
We review the problem of defining and inferring a state for a control system based on complex, high-dimensional, highly uncertain measurement streams, such as videos. Such a state, or representation, should contain all and only the information needed for control and discount nuisance variability in the data. It should also have finite complexity, ideally modulated depending on available resources. This representation is what we want to store in memory in lieu of the data, as it separates the control task from the measurement process. For the trivial case with no dynamics, a representation can be inferred by minimizing the information bottleneck Lagrangian in a function class realized by deep neural networks. The resulting representation has much higher dimension than the data (already in the millions) but is smaller in the sense of information content, retaining only what is needed for the task. This process also yields representations that are invariant to nuisance factors and have maximally independent components. We extend these ideas to the dynamic case, where the representation is the posterior density of the task variable given the measurements up to the current time, which is in general much simpler than the prediction density maintained by the classical Bayesian filter. Again, this can be finitely parameterized using a deep neural network, and some applications are already beginning to emerge. No explicit assumption of Markovianity is needed; instead, complexity trades off approximation of an optimal representation, including the degree of Markovianity.
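
For the static case described mid-abstract, the information bottleneck Lagrangian takes the standard form below (notation is mine: x the data, y the task variable, z the representation, minimized over encoders p(z|x)); the I(z;x) term enforces minimality, and hence invariance to nuisances, while I(z;y) preserves task-relevant information.

```latex
\min_{p(z \mid x)} \; \mathcal{L}_{\mathrm{IB}} \;=\; I(z;x) \;-\; \beta\, I(z;y), \qquad \beta > 0 .
```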
31

Hu, Chaofan, Zhichao Zhou, Biao Wang, WeiGuang Zheng, and Shuilong He. "Tensor Transfer Learning for Intelligence Fault Diagnosis of Bearing with Semisupervised Partial Label Learning." Journal of Sensors 2021 (December 13, 2021): 1–11. http://dx.doi.org/10.1155/2021/6205890.

Abstract:
A new tensor transfer approach is proposed in this paper for rotating machinery intelligent fault diagnosis with semisupervised partial label learning. Firstly, the vibration signals are constructed as a three-way tensor via trial, condition, and channel. Secondly, to adapt the source and target domain tensor representations directly, without vectorization, a domain adaptation (DA) approach named tensor-aligned invariant subspace learning (TAISL) is first proposed for tensor representation when testing and training data are drawn from different distributions. Then, semisupervised partial label learning (SSPLL) is first introduced to tackle the problem that it is hard to label a large number of instances and much data is left unlabeled. Ultimately, the proposed method is used to identify faults. The effectiveness and feasibility of the proposed method have been thoroughly validated by transfer fault experiments. The experimental results show that the presented technique can achieve better performance.
32

Ding, Huijie, and Arthur K. L. Lin. "Feature Extraction Based on Non-Subsampled Shearlet Transform (NSST) with Application to SAR Image Data." Mathematical Problems in Engineering 2020 (November 19, 2020): 1–6. http://dx.doi.org/10.1155/2020/8885887.

Abstract:
Considering the deficiencies of feature extraction for synthetic aperture radar (SAR) images, an SAR target recognition method based on the non-subsampled shearlet transform (NSST) was proposed. NSST was used to decompose an SAR image into multilevel representations. These representations were translation-invariant, and they could well reflect the dominant and detailed properties of the target. During the machine learning classification stage, joint sparse representation was employed to jointly represent the multilevel representations. The joint sparse representation could represent individual components independently while considering the inner correlations between different components. Therefore, the precision of the joint representation could be enhanced. Finally, the target label of the test sample was determined according to the overall reconstruction error. Experiments were conducted on the MSTAR dataset to examine the proposed method, and the results confirmed its validity and robustness under the standard operating condition, configuration variance, depression angle variance, and noise corruption.
33

Gu, Bin, and Wu Guo. "Dynamic Convolution With Global-Local Information for Session-Invariant Speaker Representation Learning." IEEE Signal Processing Letters 29 (2022): 404–8. http://dx.doi.org/10.1109/lsp.2021.3136141.

34

Guo, Tiantian, Yang Chen, Minglei Shi, Xiangyu Li, and Michael Q. Zhang. "Integration of single cell data by disentangled representation learning." Nucleic Acids Research 50, no. 2 (November 24, 2021): e8-e8. http://dx.doi.org/10.1093/nar/gkab978.

Abstract:
Recent developments in single-cell RNA-sequencing technologies have led to exponential growth of single-cell sequencing datasets across different conditions. Combining these datasets helps to better understand cellular identity and function. However, it is challenging to integrate datasets from different laboratories or technologies due to batch effects, which are interspersed with biological variance. To overcome this problem, we have proposed Single Cell Integration by Disentangled Representation Learning (SCIDRL), a domain-adaptation-based method, to learn low-dimensional representations invariant to batch effects. This method can efficiently remove batch effects while retaining cell type purity. We applied it to thirteen diverse simulated and real datasets. Benchmark results show that SCIDRL outperforms other methods in most cases and exhibits excellent performance in two common situations: (i) effective integration of batch-shared rare cell types and preservation of batch-specific rare cell types; (ii) reliable integration of datasets with different cell compositions. This demonstrates that SCIDRL offers a valuable tool for researchers to decode the enigma of cell heterogeneity.
35

Zhang, Zhenduo, Yongru Chen, Wenming Yang, Guijin Wang, and Qingmin Liao. "Pose-Invariant Face Recognition via Adaptive Angular Distillation." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 3 (June 28, 2022): 3390–98. http://dx.doi.org/10.1609/aaai.v36i3.20249.

Abstract:
Pose-invariant face recognition is a practically useful but challenging task. This paper introduces a novel method to learn pose-invariant feature representations without normalizing profile faces to frontal ones or learning disentangled features. We first design a novel strategy to learn pose-invariant feature embeddings by distilling the angular knowledge of frontal faces extracted by a teacher network to a student network, which enables the handling of faces with large pose variations. In this way, the features of faces across variant poses can cluster compactly for the same person, creating a pose-invariant face representation. Secondly, we propose a Pose-Adaptive Angular Distillation loss to mitigate the negative effect of the uneven distribution of face poses in the training dataset, paying more attention to samples with large pose variations. Extensive experiments on two challenging benchmarks (IJB-A and CFP-FP) show that our approach consistently outperforms existing methods.
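
A generic angular-distillation term of the kind described might look as follows; this is a hedged sketch with assumed names (`class_centers` as the angular reference), and it omits the pose-adaptive weighting that is the paper's contribution.

```python
import torch
import torch.nn.functional as F

def angular_distillation_loss(student_emb, teacher_emb, class_centers):
    # Compare embeddings to class centers purely through angles
    # (cosine similarities), and push the student's angular profile
    # toward the teacher's frontal-face profile.
    c = F.normalize(class_centers, dim=1)
    s = F.normalize(student_emb, dim=1) @ c.t()
    t = F.normalize(teacher_emb, dim=1) @ c.t()
    return F.kl_div(F.log_softmax(s, dim=1),
                    F.softmax(t, dim=1), reduction="batchmean")
```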
36

Roschin, Vadim Y., Alexander A. Frolov, Yves Burnod, and Marc A. Maier. "A Neural Network Model for the Acquisition of a Spatial Body Scheme Through Sensorimotor Interaction." Neural Computation 23, no. 7 (July 2011): 1821–34. http://dx.doi.org/10.1162/neco_a_00138.

Abstract:
This letter presents a novel unsupervised sensory matching learning technique for the development of an internal representation of three-dimensional information. The representation is invariant with respect to the sensory modalities involved. Acquisition of the internal representation is demonstrated with a neural network model of a sensorimotor system of a simple model creature, consisting of a tactile-sensitive body and a multiple-degrees-of-freedom arm with proprioceptive sensitivity. Acquisition of the 3D representation, as well as a distributed representation of the body scheme, occurs through sensorimotor interactions (i.e., the sensory-motor experience of the creature). Convergence of the learning is demonstrated through computer simulations for the model creature with a 7-DoF arm and a spherical body covered by 20 tactile fields.
37

Jia, Xibin, Ya Jin, Xing Su, and Yongli Hu. "Domain-invariant representation learning using an unsupervised domain adversarial adaptation deep neural network." Neurocomputing 355 (August 2019): 209–20. http://dx.doi.org/10.1016/j.neucom.2019.04.033.

38

Kang, Hua, Qianyi Huang, and Qian Zhang. "Augmented Adversarial Learning for Human Activity Recognition with Partial Sensor Sets." Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 6, no. 3 (September 6, 2022): 1–30. http://dx.doi.org/10.1145/3550285.

Abstract:
Human activity recognition (HAR) plays an important role in a wide range of applications, such as health monitoring and gaming. Inertial sensors attached to body segments constitute a critical sensing system for HAR. Diverse inertial sensor datasets for HAR have been released with the intention of attracting collective efforts and saving the data collection burden. However, these datasets are heterogeneous in terms of subjects and sensor positions. The coupling of these two factors makes it hard to generalize the model to a new application scenario, where there are unseen subjects and new sensor position combinations. In this paper, we design a framework to combine heterogeneous data to learn a general representation for HAR, so that it can work for new applications. We propose an Augmented Adversarial Learning framework for HAR (AALH) to learn generalizable representations to deal with diverse combinations of sensor positions and subject discrepancies. We train an adversarial neural network to map various sensor sets' data into a common latent representation space which is domain-invariant and class-discriminative. We enrich the latent representation space by a hybrid missing strategy and complement each subject domain with a multi-domain mixup method, and they significantly improve model generalization. Experiment results on two HAR datasets demonstrate that the proposed method significantly outperforms previous methods on unseen subjects and new sensor position combinations.
39

Pham, Huy Hieu, Houssam Salmane, Louahdi Khoudour, Alain Crouzil, Pablo Zegers, and Sergio A. Velastin. "Spatio–Temporal Image Representation of 3D Skeletal Movements for View-Invariant Action Recognition with Deep Convolutional Neural Networks." Sensors 19, no. 8 (April 24, 2019): 1932. http://dx.doi.org/10.3390/s19081932.

Abstract:
Designing motion representations for 3D human action recognition from skeleton sequences is an important yet challenging task. An effective representation should be robust to noise, invariant to viewpoint changes and result in a good performance with low-computational demand. Two main challenges in this task include how to efficiently represent spatio–temporal patterns of skeletal movements and how to learn their discriminative features for classification tasks. This paper presents a novel skeleton-based representation and a deep learning framework for 3D action recognition using RGB-D sensors. We propose to build an action map called SPMF (Skeleton Posture-Motion Feature), which is a compact image representation built from skeleton poses and their motions. An Adaptive Histogram Equalization (AHE) algorithm is then applied on the SPMF to enhance their local patterns and form an enhanced action map, namely Enhanced-SPMF. For learning and classification tasks, we exploit Deep Convolutional Neural Networks based on the DenseNet architecture to learn directly an end-to-end mapping between input skeleton sequences and their action labels via the Enhanced-SPMFs. The proposed method is evaluated on four challenging benchmark datasets, including both individual actions, interactions, multiview and large-scale datasets. The experimental results demonstrate that the proposed method outperforms previous state-of-the-art approaches on all benchmark tasks, whilst requiring low computational time for training and inference.
40

Cohen, Ido, Eli David, and Nathan Netanyahu. "Supervised and Unsupervised End-to-End Deep Learning for Gene Ontology Classification of Neural In Situ Hybridization Images." Entropy 21, no. 3 (February 26, 2019): 221. http://dx.doi.org/10.3390/e21030221.

Abstract:
In recent years, large datasets of high-resolution mammalian neural images have become available, which has prompted active research on the analysis of gene expression data. Traditional image processing methods are typically applied for learning functional representations of genes, based on their expressions in these brain images. In this paper, we describe a novel end-to-end deep learning-based method for generating compact, translation-invariant representations of in situ hybridization (ISH) images. In contrast to traditional image processing methods, our method relies, instead, on deep convolutional denoising autoencoders (CDAE) for processing raw pixel inputs, and generating the desired compact image representations. We provide an in-depth description of our deep learning-based approach, and present extensive experimental results, demonstrating that representations extracted by CDAE can help learn features of functional gene ontology categories for their classification in a highly accurate manner. Our method improves the previous state-of-the-art classification rate (Liscovitch et al.) from an average AUC of 0.92 to 0.997, i.e., it achieves a 96% reduction in error rate. Furthermore, the representation vectors generated by our method are more compact in comparison to previous state-of-the-art methods, allowing for a more efficient high-level representation of images. These results are obtained with significantly downsampled images in comparison to the original high-resolution ones, further underscoring the robustness of our proposed method.
41

Ayalew, Melese, Shijie Zhou, Imran Memon, Md Belal Bin Heyat, Faijan Akhtar, and Xiaojuan Zhang. "View-Invariant Spatiotemporal Attentive Motion Planning and Control Network for Autonomous Vehicles." Machines 10, no. 12 (December 9, 2022): 1193. http://dx.doi.org/10.3390/machines10121193.

Abstract:
Autonomous driving vehicles (ADVs) are sleeping-giant intelligent machines that perceive their environment and make driving decisions. Most existing ADVs are built as hand-engineered perception-planning-control pipelines. However, designing generalized handcrafted rules for autonomous driving in an urban environment is complex. An alternative approach is imitation learning (IL) from human driving demonstrations. However, most previous studies on IL for autonomous driving face several critical challenges: (1) poor generalization to unseen environments due to distribution shift problems such as changes in driving views and weather conditions; (2) lack of interpretability; and (3) models mostly trained to learn a single driving task. To address these challenges, we propose a view-invariant spatiotemporal attentive planning and control network for autonomous vehicles. The proposed method first extracts spatiotemporal representations from images of front and top driving view sequences through an attentive Siamese 3DResNet. Then, the maximum mean discrepancy (MMD) loss is employed to minimize spatiotemporal discrepancies between these driving views and produce an invariant spatiotemporal representation, which reduces domain shift due to view change. Finally, multitask learning (MTL) is employed to jointly train trajectory planning and high-level control tasks based on learned representations and previous motions. Results of extensive experimental evaluations on a large autonomous driving dataset with various weather/lighting conditions verified that the proposed method is effective for feasible motion planning and control in autonomous vehicles.
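
The MMD term used to align the two views is standard and easy to sketch; below is a biased RBF-kernel estimator in numpy (the names and the single-bandwidth choice are mine, not from the paper).

```python
import numpy as np

def rbf_mmd2(X, Y, sigma=1.0):
    # Biased estimate of the squared maximum mean discrepancy between
    # feature samples X (n, d) and Y (m, d) under an RBF kernel -- the
    # kind of term minimized between front-view and top-view
    # representations so the two views become indistinguishable.
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2.0 * sigma ** 2))
    return k(X, X).mean() + k(Y, Y).mean() - 2.0 * k(X, Y).mean()

# e.g. total = planning_loss + control_loss + lam * rbf_mmd2(z_front, z_top)
```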
42

Li, Chao, Xin Min, Shouqian Sun, Wenqian Lin, and Zhichuan Tang. "DeepGait: A Learning Deep Convolutional Representation for View-Invariant Gait Recognition Using Joint Bayesian." Applied Sciences 7, no. 3 (February 23, 2017): 210. http://dx.doi.org/10.3390/app7030210.

43

Özdenizci, Ozan, Safaa Eldeeb, Andaç Demir, Deniz Erdoğmuş, and Murat Akçakaya. "EEG-based texture roughness classification in active tactile exploration with invariant representation learning networks." Biomedical Signal Processing and Control 67 (May 2021): 102507. http://dx.doi.org/10.1016/j.bspc.2021.102507.

44

Mao, Ye, Farzaneh Khoshnevisan, Thomas Price, Tiffany Barnes, and Min Chi. "Cross-Lingual Adversarial Domain Adaptation for Novice Programming." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 7 (June 28, 2022): 7682–90. http://dx.doi.org/10.1609/aaai.v36i7.20735.

Abstract:
Student modeling sits at the epicenter of adaptive learning technology. In contrast to the voluminous work on student modeling for well-defined domains such as algebra, there has been little research on student modeling in programming (SMP) due to data scarcity caused by the unbounded solution spaces of open-ended programming exercises. In this work, we focus on two essential SMP tasks, program classification and early prediction of student success, and propose a Cross-Lingual Adversarial Domain Adaptation (CrossLing) framework that can leverage a large programming dataset to learn features that improve SMP models built using a much smaller dataset in a different programming language. Our framework maintains one globally invariant latent representation across both datasets via an adversarial learning process, while allocating domain-specific models for each dataset to extract local latent representations that cannot and should not be united. By separating globally shared representations from domain-specific representations, our framework outperforms existing state-of-the-art methods for both SMP tasks.
45

Ma, Chunmei, Qing Zhu, Shuang Wu, and Bin Liu. "Representation Learning from Time Labelled Heterogeneous Data for Mobile Crowdsensing." Mobile Information Systems 2016 (2016): 1–10. http://dx.doi.org/10.1155/2016/2097243.

Abstract:
Mobile crowdsensing is a new paradigm that can utilize pervasive smartphones to collect and analyze data to benefit users. However, sensory data gathered by smartphones usually involve different data types because of different granularities and multiple sensor sources. Besides, the data are also time-labelled. The heterogeneous and time-sequential data raise new challenges for data analysis. Some existing solutions try to learn each type of data one by one and analyze them separately, without considering time information. In addition, traditional methods also have to determine phone orientation, because some sensors equipped in smartphones are orientation-related. In this paper, we argue that a combination of multiple sensors can represent an invariant feature for a crowdsensing context. Therefore, we propose a new representation learning method for heterogeneous data with time labels to extract typical features using deep learning. We show that our proposed method can effectively adapt to data generated with different orientations. Furthermore, we test the performance of the proposed method by recognizing two groups of mobile activities, walking/cycling and driving/bus, with smartphone sensors. It achieves precisions of 98.6% and 93.7% in distinguishing cycling from walking and bus from driving, respectively.
46

Cho, KyungHyun, Tapani Raiko, and Alexander Ilin. "Enhanced Gradient for Training Restricted Boltzmann Machines." Neural Computation 25, no. 3 (March 2013): 805–31. http://dx.doi.org/10.1162/neco_a_00397.

Full text
Abstract:
Restricted Boltzmann machines (RBMs) are often used as building blocks in the greedy learning of deep networks. However, training this simple model can be laborious. Traditional learning algorithms often converge only with the right choice of metaparameters, which specify, for example, the learning-rate schedule and the scale of the initial weights; they are also sensitive to the specific data representation. An equivalent RBM can be obtained by flipping some bits and changing the weights and biases accordingly, but traditional learning rules are not invariant to such transformations. Without careful tuning of these training settings, traditional algorithms can easily get stuck or even diverge. In this letter, we present an enhanced gradient that is derived to be invariant to bit-flipping transformations. We show experimentally that the enhanced gradient yields more stable training of RBMs, both with a fixed learning rate and with an adaptive one.
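As a rough illustration, the enhanced gradient replaces the plain ⟨vh⟩ statistics with per-distribution covariances and re-centers the bias updates around averaged activations; the NumPy sketch below is reconstructed from a reading of the letter's update rules and should be treated as an assumption, not a faithful reproduction:

```python
import numpy as np

def enhanced_gradient(v_d, h_d, v_m, h_m):
    """Sketch of the bit-flip-invariant RBM gradient (assumed form).
    v_d, h_d: visible/hidden statistics under the data distribution;
    v_m, h_m: the same under the model distribution; all (batch, units)."""
    vd, hd = v_d.mean(0), h_d.mean(0)
    vm, hm = v_m.mean(0), h_m.mean(0)
    cov_d = v_d.T @ h_d / len(v_d) - np.outer(vd, hd)
    cov_m = v_m.T @ h_m / len(v_m) - np.outer(vm, hm)
    dW = cov_d - cov_m                # covariance difference: flip-invariant
    h_dm, v_dm = (hd + hm) / 2, (vd + vm) / 2
    db = vd - vm - dW @ h_dm          # re-centered visible-bias update
    dc = hd - hm - dW.T @ v_dm        # re-centered hidden-bias update
    return dW, db, dc
```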
APA, Harvard, Vancouver, ISO, and other styles
47

Mel, Bartlett W., and József Fiser. "Minimizing Binding Errors Using Learned Conjunctive Features." Neural Computation 12, no. 4 (April 1, 2000): 731–62. http://dx.doi.org/10.1162/089976600300015574.

Full text
Abstract:
We have studied some of the design trade-offs governing visual representations based on spatially invariant conjunctive feature detectors, with an emphasis on the susceptibility of such systems to false-positive recognition errors—Malsburg's classical binding problem. We begin by deriving an analytical model that makes explicit how recognition performance is affected by the number of objects that must be distinguished, the number of features included in the representation, the complexity of individual objects, and the clutter load, that is, the amount of visual material in the field of view in which multiple objects must be simultaneously recognized, independent of pose, and without explicit segmentation. Using the domain of text to model object recognition in cluttered scenes, we show that with corrections for the nonuniform probability and nonindependence of text features, the analytical model achieves good fits to measured recognition rates in simulations involving a wide range of clutter loads, word sizes, and feature counts. We then introduce a greedy algorithm for feature learning, derived from the analytical model, which grows a representation by choosing those conjunctive features that are most likely to distinguish objects from the cluttered backgrounds in which they are embedded. We show that the representations produced by this algorithm are compact, decorrelated, and heavily weighted toward features of low conjunctive order. Our results provide a more quantitative basis for understanding when spatially invariant conjunctive features can support unambiguous perception in multiobject scenes, and lead to several insights regarding the properties of visual representations optimized for specific recognition tasks.
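A toy version of the greedy feature-growing idea in the text domain, with character n-grams standing in for conjunctive features and a crude separation score standing in for the paper's likelihood-based criterion (both assumptions, for illustration only):

```python
def ngrams(s, n):
    return {s[i:i + n] for i in range(len(s) - n + 1)}

def greedy_features(words, clutter, n=2, k=10):
    """Greedily pick n-grams that separate target words from clutter."""
    candidates = set().union(*(ngrams(w, n) for w in words))
    chosen = []
    for _ in range(min(k, len(candidates))):
        # Score: count of target words containing the feature minus count
        # of clutter strings containing it (stand-in for the paper's model).
        best = max(candidates - set(chosen),
                   key=lambda f: sum(f in w for w in words)
                               - sum(f in c for c in clutter))
        chosen.append(best)
    return chosen

print(greedy_features(["invariant", "representation"], ["xqzzvk", "pppqq"], k=5))
```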
APA, Harvard, Vancouver, ISO, and other styles
48

Mel, Bartlett W., and József Fiser. "Minimizing Binding Errors Using Learned Conjunctive Features." Neural Computation 12, no. 2 (February 1, 2000): 247–78. http://dx.doi.org/10.1162/089976600300015772.

Full text
APA, Harvard, Vancouver, ISO, and other styles
49

O'Reilly, Randall C., Jacob L. Russin, Maryam Zolfaghar, and John Rohrlich. "Deep Predictive Learning in Neocortex and Pulvinar." Journal of Cognitive Neuroscience 33, no. 6 (May 1, 2021): 1158–96. http://dx.doi.org/10.1162/jocn_a_01708.

Full text
Abstract:
How do humans learn from raw sensory experience? Throughout life, but most obviously in infancy, we learn without explicit instruction. We propose a detailed biological mechanism for the widely embraced idea that learning is driven by the differences between predictions and actual outcomes (i.e., predictive error-driven learning). Specifically, numerous weak projections into the pulvinar nucleus of the thalamus generate top–down predictions, and sparse driver inputs from lower areas supply the actual outcome, originating in Layer 5 intrinsic bursting neurons. Thus, the outcome representation is only briefly activated, roughly every 100 msec (i.e., 10 Hz, alpha), resulting in a temporal difference error signal, which drives local synaptic changes throughout the neocortex. This results in a biologically plausible form of error backpropagation learning. We implemented these mechanisms in a large-scale model of the visual system and found that the simulated inferotemporal pathway learns to systematically categorize 3-D objects according to invariant shape properties, based solely on predictive learning from raw visual inputs. These categories match human judgments on the same stimuli and are consistent with neural representations in inferotemporal cortex in primates.
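Stripped of all biological detail, the core computational claim, learning driven by the difference between a top-down prediction and the actual outcome, reduces to a delta-rule update; the following caricature (with an assumed linear "world" A) only demonstrates the error-driven learning loop, not the authors' large-scale model:

```python
import numpy as np

rng = np.random.default_rng(0)
A = 0.9 * np.eye(8)                 # assumed "world" dynamics producing outcomes
W = np.zeros((8, 8))                # top-down prediction weights
for _ in range(2000):
    x = rng.standard_normal(8)      # current sensory state
    prediction = W @ x              # top-down prediction of the next state
    outcome = A @ x                 # actual driver input, ~100 msec later
    error = outcome - prediction    # temporal-difference-style error signal
    W += 0.01 * np.outer(error, x)  # local, error-driven synaptic change
print(np.abs(W - A).max())          # W approaches the true dynamics
```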
APA, Harvard, Vancouver, ISO, and other styles
50

Mo, Y., T. Qian, and W. Mi. "Sparse representation in Szegő kernels through reproducing kernel Hilbert space theory with applications." International Journal of Wavelets, Multiresolution and Information Processing 13, no. 04 (July 2015): 1550030. http://dx.doi.org/10.1142/s0219691315500307.

Full text
Abstract:
This paper establishes generalization bounds for learning with complex-valued data, which serve as a theoretical foundation for a complex support vector machine (SVM). Drawing on these bounds, a complex SVM approach based on the Szegő kernel of the Hardy space H²(𝔻) is formulated and applied to the frequency-domain identification of discrete linear time-invariant systems (LTIS). Experiments show that the proposed algorithm is effective in applications.
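For reference, the Szegő reproducing kernel of the Hardy space H²(𝔻) is K(z, w) = 1/(1 − z w̄) for points in the open unit disk, so the Gram matrices such a complex SVM would consume can be built directly; the snippet below only demonstrates the kernel itself, not the paper's SVM formulation:

```python
import numpy as np

def szego_gram(Z, W):
    """Z: (n,) and W: (m,) complex points in the open unit disk |z| < 1."""
    return 1.0 / (1.0 - np.outer(Z, np.conj(W)))

Z = 0.5 * np.exp(2j * np.pi * np.random.rand(4))  # points inside the disk
G = szego_gram(Z, Z)
print(np.allclose(G, G.conj().T))  # Hermitian, as a reproducing kernel must be
```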
APA, Harvard, Vancouver, ISO, and other styles