Journal articles on the topic 'State representation learning'

Consult the top 50 journal articles for your research on the topic 'State representation learning.'

1

Xu, Cai, Wei Zhao, Jinglong Zhao, Ziyu Guan, Yaming Yang, Long Chen, and Xiangyu Song. "Progressive Deep Multi-View Comprehensive Representation Learning." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 9 (June 26, 2023): 10557–65. http://dx.doi.org/10.1609/aaai.v37i9.26254.

Abstract:
Multi-view Comprehensive Representation Learning (MCRL) aims to synthesize information from multiple views to learn comprehensive representations of data items. Prevalent deep MCRL methods typically concatenate synergistic view-specific representations or average aligned view-specific representations in the fusion stage. However, the performance of synergistic fusion methods inevitably degenerates or even fails when some views are missing in real-world applications, while alignment-based fusion methods usually cannot fully exploit the complementarity of multi-view data. To address these drawbacks, in this work we present a Progressive Deep Multi-view Fusion (PDMF) method. Since the multi-view comprehensive representation should contain complete information while the view-specific data contain only partial information, we argue that directly learning the mapping from partial information to complete information is unstable. Hence, PDMF employs a progressive learning strategy comprising pre-training and fine-tuning stages. In the pre-training stage, PDMF decodes the auxiliary comprehensive representation to the view-specific data. It also captures the consistency and complementarity by learning the relations between the dimensions of the auxiliary comprehensive representation and all views. In the fine-tuning stage, PDMF learns the mapping from the original data to the comprehensive representation with the help of the auxiliary comprehensive representation and relations. Experiments conducted on a synthetic toy dataset and 4 real-world datasets show that PDMF outperforms state-of-the-art baseline methods. The code is released at https://github.com/winterant/PDMF.
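For illustration, the two-stage recipe above can be sketched in a few lines. This is a hedged toy reconstruction, assuming free per-sample auxiliary representations and plain linear decoders/encoders; names, dimensions, and training details are illustrative assumptions, not the authors' implementation (see their repository for the real one).

```python
# Toy sketch of PDMF's progressive strategy: (1) pre-train auxiliary
# comprehensive representations z_aux with view-specific decoders, then
# (2) fine-tune an encoder from the views to z_aux. Illustrative only.
import torch
import torch.nn as nn

n, d_z, view_dims = 256, 32, [64, 48]            # samples, latent size, view sizes
views = [torch.randn(n, d) for d in view_dims]   # toy two-view data

# Stage 1 (pre-training): z_aux is a free parameter decoded into every view,
# so it must gather complete (consistent + complementary) information.
z_aux = nn.Parameter(torch.randn(n, d_z))
decoders = nn.ModuleList(nn.Linear(d_z, d) for d in view_dims)
opt = torch.optim.Adam([z_aux, *decoders.parameters()], lr=1e-2)
for _ in range(200):
    loss = sum(((dec(z_aux) - x) ** 2).mean() for dec, x in zip(decoders, views))
    opt.zero_grad(); loss.backward(); opt.step()

# Stage 2 (fine-tuning): learn the mapping from the original views to the
# comprehensive representation, supervised by the auxiliary one.
encoder = nn.Linear(sum(view_dims), d_z)
opt2 = torch.optim.Adam(encoder.parameters(), lr=1e-2)
for _ in range(200):
    z_hat = encoder(torch.cat(views, dim=1))
    loss = ((z_hat - z_aux.detach()) ** 2).mean()
    opt2.zero_grad(); loss.backward(); opt2.step()
```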
2

Yue, Yang, Bingyi Kang, Zhongwen Xu, Gao Huang, and Shuicheng Yan. "Value-Consistent Representation Learning for Data-Efficient Reinforcement Learning." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 9 (June 26, 2023): 11069–77. http://dx.doi.org/10.1609/aaai.v37i9.26311.

Abstract:
Deep reinforcement learning (RL) algorithms suffer severe performance degradation when interaction data is scarce, which limits their real-world application. Recently, visual representation learning has been shown to be effective and promising for boosting sample efficiency in RL. These methods usually rely on contrastive learning and data augmentation to train a transition model, which is different from how the model is used in RL---performing value-based planning. Accordingly, the representations learned by these visual methods may be good for recognition but not optimal for estimating state value and solving the decision problem. To address this issue, we propose a novel method, called value-consistent representation learning (VCR), to learn representations that are directly related to decision-making. More specifically, VCR trains a model to predict the future state (also referred to as the "imagined state") based on the current one and a sequence of actions. Instead of aligning this imagined state with a real state returned by the environment, VCR applies a Q-value head to both states and obtains two distributions of action values. Then a distance is computed and minimized to force the imagined state to produce an action-value prediction similar to that of the real state. We develop two implementations of the above idea for discrete and continuous action spaces respectively. We conduct experiments on the Atari 100k and DeepMind Control Suite benchmarks to validate their effectiveness for improving sample efficiency. Our methods achieve new state-of-the-art performance for search-free RL algorithms.
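The value-consistency idea is concrete enough to sketch. Below is a hedged, minimal rendering for discrete actions: an action-conditioned transition model produces the imagined latent, and a shared Q head turns both the imagined and the real latent into action-value distributions whose divergence is minimized. All module shapes and the KL choice are assumptions for illustration, not the paper's exact losses.

```python
# Minimal value-consistency sketch (discrete actions): match the action-value
# distributions of the imagined and real next states under a shared Q head.
import torch
import torch.nn as nn
import torch.nn.functional as F

d_obs, d_z, n_act = 16, 32, 4
encoder = nn.Linear(d_obs, d_z)
transition = nn.Linear(d_z + n_act, d_z)   # predicts the "imagined" next latent
q_head = nn.Linear(d_z, n_act)

def value_consistency_loss(obs, action, next_obs):
    z = encoder(obs)
    a = F.one_hot(action, n_act).float()
    z_imagined = transition(torch.cat([z, a], dim=-1))
    z_real = encoder(next_obs)
    q_imagined = F.log_softmax(q_head(z_imagined), dim=-1)
    q_real = F.softmax(q_head(z_real.detach()), dim=-1)   # stop-grad target
    return F.kl_div(q_imagined, q_real, reduction="batchmean")

obs = torch.randn(8, d_obs)
act = torch.randint(0, n_act, (8,))
nxt = torch.randn(8, d_obs)
print(value_consistency_loss(obs, act, nxt))
```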
3

de Bruin, Tim, Jens Kober, Karl Tuyls, and Robert Babuska. "Integrating State Representation Learning Into Deep Reinforcement Learning." IEEE Robotics and Automation Letters 3, no. 3 (July 2018): 1394–401. http://dx.doi.org/10.1109/lra.2018.2800101.

4

Chen, Haoqiang, Yadong Liu, Zongtan Zhou, and Ming Zhang. "A2C: Attention-Augmented Contrastive Learning for State Representation Extraction." Applied Sciences 10, no. 17 (August 26, 2020): 5902. http://dx.doi.org/10.3390/app10175902.

Abstract:
Reinforcement learning (RL) faces a series of challenges, including learning efficiency and generalization. The state representation used to train RL is one of the important factors behind these challenges. In this paper, we explore providing a more efficient state representation for RL, using contrastive learning as the representation extraction method. We propose an attention mechanism implementation and extend an existing contrastive learning method by embedding the attention mechanism, obtaining an attention-augmented contrastive learning method called A2C. Using the state representation from A2C, the robot achieves better learning efficiency and generalization than with state-of-the-art representations. Moreover, our attention mechanism is shown to capture correlations between pixels at arbitrary distances, which is conducive to capturing more accurate obstacle information. Finally, removing the attention mechanism from A2C reduces the achievable rewards by more than 70%, which indicates the important role of the attention mechanism.
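The claim about correlations at arbitrary pixel distance corresponds to the standard self-attention pattern, where every spatial position attends to every other. A minimal sketch of such a block is below; it is a generic non-local attention layer for illustration, not the paper's exact architecture.

```python
# Generic pixel-level self-attention: each of the H*W positions attends to
# all others, so correlations are not limited by a convolutional receptive field.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialSelfAttention(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.q = nn.Conv2d(channels, channels // 2, 1)
        self.k = nn.Conv2d(channels, channels // 2, 1)
        self.v = nn.Conv2d(channels, channels, 1)

    def forward(self, x):                         # x: (B, C, H, W)
        b, c, h, w = x.shape
        q = self.q(x).flatten(2).transpose(1, 2)  # (B, HW, C/2)
        k = self.k(x).flatten(2)                  # (B, C/2, HW)
        v = self.v(x).flatten(2).transpose(1, 2)  # (B, HW, C)
        attn = F.softmax(q @ k / (c // 2) ** 0.5, dim=-1)   # (B, HW, HW)
        out = (attn @ v).transpose(1, 2).reshape(b, c, h, w)
        return x + out                            # residual connection

print(SpatialSelfAttention(32)(torch.randn(2, 32, 16, 16)).shape)
```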
5

Ong, Sylvie, Yuri Grinberg, and Joelle Pineau. "Mixed Observability Predictive State Representations." Proceedings of the AAAI Conference on Artificial Intelligence 27, no. 1 (June 30, 2013): 746–52. http://dx.doi.org/10.1609/aaai.v27i1.8680.

Abstract:
Learning accurate models of agent behaviours is crucial for controlling systems where the agents' and environment's dynamics are unknown. This is a challenging problem, but structural assumptions can be leveraged to tackle it effectively. In particular, many systems exhibit mixed observability: observations of some system components are essentially perfect and noiseless, while observations of other components are imperfect, aliased, or noisy. In this paper we present a new model learning framework, the mixed observability predictive state representation (MO-PSR), which extends previously known predictive state representations to the case of mixed observability systems. We present a learning algorithm that is scalable to large amounts of data and to large mixed observability domains, and provide a theoretical analysis of its learning consistency and computational complexity. Empirical results demonstrate that our algorithm is capable of learning accurate models, at a larger scale than the generic predictive state representation, by leveraging the mixed observability properties.
6

Maier, Marc, Brian Taylor, Huseyin Oktay, and David Jensen. "Learning Causal Models of Relational Domains." Proceedings of the AAAI Conference on Artificial Intelligence 24, no. 1 (July 3, 2010): 531–38. http://dx.doi.org/10.1609/aaai.v24i1.7695.

Abstract:
Methods for discovering causal knowledge from observational data have been a persistent topic of AI research for several decades. Essentially all of this work focuses on knowledge representations for propositional domains. In this paper, we present several key algorithmic and theoretical innovations that extend causal discovery to relational domains. We provide strong evidence that effective learning of causal models is enhanced by relational representations. We present an algorithm, relational PC, that learns causal dependencies in a state-of-the-art relational representation, and we identify the key representational and algorithmic innovations that make the algorithm possible. Finally, we prove the algorithm's theoretical correctness and demonstrate its effectiveness on synthetic and real data sets.
7

Lesort, Timothée, Natalia Díaz-Rodríguez, Jean-François Goudou, and David Filliat. "State representation learning for control: An overview." Neural Networks 108 (December 2018): 379–92. http://dx.doi.org/10.1016/j.neunet.2018.07.006.

8

Chornozhuk, S. "The New Geometric “State-Action” Space Representation for Q-Learning Algorithm for Protein Structure Folding Problem." Cybernetics and Computer Technologies, no. 3 (October 27, 2020): 59–73. http://dx.doi.org/10.34229/2707-451x.20.3.6.

Abstract:
Introduction. Spatial protein structure folding is an important and current problem in computational biology. From the mathematical model of the task, it can easily be concluded that finding an optimal protein conformation in a three-dimensional grid is an NP-hard problem. Therefore reinforcement learning techniques such as the Q-learning approach can be used to solve the problem. The article proposes a new geometric “state-action” space representation which differs significantly from all alternative representations used for this problem. The purpose of the article is to analyze existing representations of the state and action spaces for the Q-learning algorithm applied to the protein structure folding problem, reveal their advantages and disadvantages, and propose the new geometric “state-action” space representation. The goal is then to compare the existing and proposed approaches and draw conclusions, also describing possible directions of further research. Result. The proposed algorithm is compared with others on the basis of 10 known chains of length 48 first proposed in [16]. For each of the chains, the Q-learning algorithm with the proposed representation outperformed the same Q-learning algorithm with the alternative existing representations in terms of both the average and the minimal energy of the resulting conformations. Moreover, many existing representations were designed for 2D protein structure prediction; during the experiments, both the existing and the proposed representations were adapted to solve the problem in 3D, which is a more computationally demanding task. Conclusion. The quality of the Q-learning algorithm with the proposed geometric “state-action” space representation has been experimentally confirmed, which shows that further research is promising. Several possible directions of future research, such as combining the proposed approach with deep learning techniques, have already been suggested. Keywords: Spatial protein structure, combinatorial optimization, relative coding, machine learning, Q-learning, Bellman equation, state space, action space, basis in 3D space.
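The state/action encoding is the paper's contribution; the learning rule itself is standard tabular Q-learning. For readers unfamiliar with it, a generic sketch of the Bellman update the method relies on is shown below (toy states and actions, not the paper's conformation encoding):

```python
# Standard tabular Q-learning with an epsilon-greedy policy. The geometric
# "state-action" encoding of conformations would plug in as the keys here.
import random
from collections import defaultdict

alpha, gamma, eps = 0.1, 0.95, 0.2
Q = defaultdict(float)                    # Q[(state, action)] -> value

def q_update(s, a, reward, s_next, actions_next):
    best_next = max(Q[(s_next, a2)] for a2 in actions_next)
    Q[(s, a)] += alpha * (reward + gamma * best_next - Q[(s, a)])

def epsilon_greedy(s, actions):
    if random.random() < eps:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(s, a)])

# toy usage with placeholder states/actions
q_update("s0", "turn_left", -1.0, "s1", ["turn_left", "turn_right"])
print(Q[("s0", "turn_left")])
```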
9

Zhang, Yujia, Lai-Man Po, Xuyuan Xu, Mengyang Liu, Yexin Wang, Weifeng Ou, Yuzhi Zhao, and Wing-Yin Yu. "Contrastive Spatio-Temporal Pretext Learning for Self-Supervised Video Representation." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 3 (June 28, 2022): 3380–89. http://dx.doi.org/10.1609/aaai.v36i3.20248.

Abstract:
Spatio-temporal representation learning is critical for video self-supervised representation. Recent approaches mainly use contrastive learning and pretext tasks. However, these approaches learn representation by discriminating sampled instances via feature similarity in the latent space while ignoring the intermediate state of the learned representations, which limits the overall performance. In this work, taking into account the degree of similarity of sampled instances as the intermediate state, we propose a novel pretext task - spatio-temporal overlap rate (STOR) prediction. It stems from the observation that humans are capable of discriminating the overlap rates of videos in space and time. This task encourages the model to discriminate the STOR of two generated samples to learn the representations. Moreover, we employ a joint optimization combining pretext tasks with contrastive learning to further enhance the spatio-temporal representation learning. We also study the mutual influence of each component in the proposed scheme. Extensive experiments demonstrate that our proposed STOR task can favor both contrastive learning and pretext tasks and the joint optimization scheme can significantly improve the spatio-temporal representation in video understanding. The code is available at https://github.com/Katou2/CSTP.
10

Li, Dongfen, Lichao Meng, Jingjing Li, Ke Lu, and Yang Yang. "Domain adaptive state representation alignment for reinforcement learning." Information Sciences 609 (September 2022): 1353–68. http://dx.doi.org/10.1016/j.ins.2022.07.156.

11

Razmi, Niloufar, and Matthew R. Nassar. "Adaptive Learning through Temporal Dynamics of State Representation." Journal of Neuroscience 42, no. 12 (February 1, 2022): 2524–38. http://dx.doi.org/10.1523/jneurosci.0387-21.2022.

12

Liu, Qiyuan, Qi Zhou, Rui Yang, and Jie Wang. "Robust Representation Learning by Clustering with Bisimulation Metrics for Visual Reinforcement Learning with Distractions." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 7 (June 26, 2023): 8843–51. http://dx.doi.org/10.1609/aaai.v37i7.26063.

Abstract:
Recent work has shown that representation learning plays a critical role in sample-efficient reinforcement learning (RL) from pixels. Unfortunately, in real-world scenarios, representation learning is usually fragile to task-irrelevant distractions such as variations in background or viewpoint. To tackle this problem, we propose a novel clustering-based approach, namely Clustering with Bisimulation Metrics (CBM), which learns robust representations by grouping visual observations in the latent space. Specifically, CBM alternates between two steps: (1) grouping observations by measuring their bisimulation distances to the learned prototypes; (2) learning a set of prototypes according to the current cluster assignments. Computing cluster assignments with bisimulation metrics enables CBM to capture task-relevant information, as bisimulation metrics quantify the behavioral similarity between observations. Moreover, CBM encourages the consistency of representations within each group, which facilitates filtering out task-irrelevant information and thus induces robust representations against distractions. An appealing feature is that CBM can achieve sample-efficient representation learning even if multiple distractions exist simultaneously. Experiments demonstrate that CBM significantly improves the sample efficiency of popular visual RL algorithms and achieves state-of-the-art performance on both multiple and single distraction settings. The code is available at https://github.com/MIRALab-USTC/RL-CBM.
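The alternation in CBM can be sketched compactly. The following is a hedged toy version with a simplified bisimulation-style distance (reward difference plus discounted distance between predicted next latents) and a moving-average prototype update; the prototype rewards, shapes, and update rule are illustrative assumptions rather than the paper's learned procedure.

```python
# Toy alternation: (1) assign observations to prototypes under a
# bisimulation-style distance, (2) update prototypes from their members.
import torch

gamma, n, k, d = 0.99, 64, 8, 16
z_next = torch.randn(n, d)      # predicted next-state latents per observation
rewards = torch.randn(n)        # rewards per observation
proto_next = torch.randn(k, d)  # prototype next-state latents (assumed)
proto_r = torch.randn(k)        # prototype rewards (assumed)

# Step 1: behavioral distance |r_i - r_k| + gamma * ||next_i - next_k||
dist = (rewards[:, None] - proto_r[None, :]).abs() \
     + gamma * torch.cdist(z_next, proto_next)
assign = dist.argmin(dim=1)     # cluster assignment per observation

# Step 2: moving-average prototype update from assigned members
for j in range(k):
    members = z_next[assign == j]
    if len(members) > 0:
        proto_next[j] = 0.9 * proto_next[j] + 0.1 * members.mean(dim=0)
```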
13

Jin, Xu, Teng Huang, Ke Wen, Mengxian Chi, and Hong An. "HistoSSL: Self-Supervised Representation Learning for Classifying Histopathology Images." Mathematics 11, no. 1 (December 26, 2022): 110. http://dx.doi.org/10.3390/math11010110.

Abstract:
The success of image classification depends on copious annotated images for training. Annotating histopathology images is costly and laborious. Although several successful self-supervised representation learning approaches have been introduced, they are still insufficient to consider the unique characteristics of histopathology images. In this work, we propose the novel histopathology-oriented self-supervised representation learning framework (HistoSSL) to efficiently extract representations from unlabeled histopathology images at three levels: global, cell, and stain. The model transfers remarkably to downstream tasks: colorectal tissue phenotyping on the NCTCRC dataset and breast cancer metastasis recognition on the CAMELYON16 dataset. HistoSSL achieved higher accuracies than state-of-the-art self-supervised learning approaches, which proved the robustness of the learned representations.
14

Luo, Dezhao, Chang Liu, Yu Zhou, Dongbao Yang, Can Ma, Qixiang Ye, and Weiping Wang. "Video Cloze Procedure for Self-Supervised Spatio-Temporal Learning." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 07 (April 3, 2020): 11701–8. http://dx.doi.org/10.1609/aaai.v34i07.6840.

Abstract:
We propose a novel self-supervised method, referred to as Video Cloze Procedure (VCP), to learn rich spatial-temporal representations. VCP first generates “blanks” by withholding video clips and then creates “options” by applying spatio-temporal operations on the withheld clips. Finally, it fills the blanks with “options” and learns representations by predicting the categories of operations applied on the clips. VCP can act as either a proxy task or a target task in self-supervised learning. As a proxy task, it converts rich self-supervised representations into video clip operations (options), which enhances the flexibility and reduces the complexity of representation learning. As a target task, it can assess learned representation models in a uniform and interpretable manner. With VCP, we train spatial-temporal representation models (3D-CNNs) and apply such models on action recognition and video retrieval tasks. Experiments on commonly used benchmarks show that the trained models outperform the state-of-the-art self-supervised models with significant margins.
15

Park, Deog-Yeong, and Ki-Hoon Lee. "Practical Algorithmic Trading Using State Representation Learning and Imitative Reinforcement Learning." IEEE Access 9 (2021): 152310–21. http://dx.doi.org/10.1109/access.2021.3127209.

16

Chen, Hanxiao. "Robotic Manipulation with Reinforcement Learning, State Representation Learning, and Imitation Learning (Student Abstract)." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 18 (May 18, 2021): 15769–70. http://dx.doi.org/10.1609/aaai.v35i18.17881.

Abstract:
Humans possess the advanced ability to grab, hold, and manipulate objects with dexterous hands. What about robots? Can they interact with the surrounding world intelligently to achieve certain goals (e.g., grasping, object relocation)? Robotic manipulation is central to achieving the promise of robotics and has immense potential for wide application in scenarios such as industry, hospitals, and homes. In this work, we aim to address multiple robotic manipulation tasks such as grasping, button-pushing, and door-opening with reinforcement learning (RL), state representation learning (SRL), and imitation learning. For the different tasks, we built PyBullet or MuJoCo simulated environments ourselves and independently explored three different learning-style methods to successfully solve them: (1) standard reinforcement learning methods; (2) combined state representation learning (SRL) and RL approaches; (3) imitation-learning-bootstrapped RL algorithms.
17

Wang, Xingqi, Mengrui Zhang, Bin Chen, Dan Wei, and Yanli Shao. "Dynamic Weighted Multitask Learning and Contrastive Learning for Multimodal Sentiment Analysis." Electronics 12, no. 13 (July 7, 2023): 2986. http://dx.doi.org/10.3390/electronics12132986.

Abstract:
Multimodal sentiment analysis (MSA) has attracted increasing attention in recent years. This paper focuses on the representation learning of multimodal data to achieve better prediction results. We propose a model that assists in learning modality representations with multitask learning and contrastive learning. In addition, our approach obtains dynamic weights by considering the homoscedastic uncertainty of each task in multitask learning. Specifically, we design two groups of subtasks, which predict the sentiment polarity of unimodal and bimodal representations, to assist in learning representations through a hard parameter-sharing mechanism in the upstream neural network. A loss weight is learned according to the homoscedastic uncertainty of each task. Moreover, a training strategy based on contrastive learning is designed to balance the inconsistency between training and inference caused by the randomness of the dropout layer; this method minimizes the MSE between the two submodels. Experimental results on the MOSI and MOSEI datasets show our method achieves better performance than current state-of-the-art methods by comprehensively considering intramodality and intermodality interaction information.
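The dynamic weighting described here is commonly formulated as in Kendall et al.'s homoscedastic-uncertainty scheme: each task i gets a learned log-variance s_i and contributes exp(-s_i)·L_i + s_i to the total loss, so noisier tasks are automatically down-weighted. A minimal sketch of that common formulation follows; the exact variant the paper uses is an assumption.

```python
# Homoscedastic-uncertainty task weighting: learnable log-variances trade off
# task losses; high-uncertainty tasks receive smaller effective weights.
import torch
import torch.nn as nn

class UncertaintyWeighting(nn.Module):
    def __init__(self, n_tasks):
        super().__init__()
        self.log_vars = nn.Parameter(torch.zeros(n_tasks))  # s_i = log sigma_i^2

    def forward(self, task_losses):
        total = 0.0
        for s, loss in zip(self.log_vars, task_losses):
            total = total + torch.exp(-s) * loss + s
        return total

weighting = UncertaintyWeighting(n_tasks=3)
losses = [torch.tensor(0.8), torch.tensor(1.5), torch.tensor(0.3)]
print(weighting(losses))    # scalar training loss with learned weights
```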
18

Rives, Alexander, Joshua Meier, Tom Sercu, Siddharth Goyal, Zeming Lin, Jason Liu, Demi Guo, et al. "Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences." Proceedings of the National Academy of Sciences 118, no. 15 (April 5, 2021): e2016239118. http://dx.doi.org/10.1073/pnas.2016239118.

Abstract:
In the field of artificial intelligence, a combination of scale in data and model capacity enabled by unsupervised learning has led to major advances in representation learning and statistical generation. In the life sciences, the anticipated growth of sequencing promises unprecedented data on natural sequence diversity. Protein language modeling at the scale of evolution is a logical step toward predictive and generative artificial intelligence for biology. To this end, we use unsupervised learning to train a deep contextual language model on 86 billion amino acids across 250 million protein sequences spanning evolutionary diversity. The resulting model contains information about biological properties in its representations. The representations are learned from sequence data alone. The learned representation space has a multiscale organization reflecting structure from the level of biochemical properties of amino acids to remote homology of proteins. Information about secondary and tertiary structure is encoded in the representations and can be identified by linear projections. Representation learning produces features that generalize across a range of applications, enabling state-of-the-art supervised prediction of mutational effect and secondary structure and improving state-of-the-art features for long-range contact prediction.
19

Chang, Xinglong, Jianrong Wang, Rui Guo, Yingkui Wang, and Weihao Li. "Asymmetric Graph Contrastive Learning." Mathematics 11, no. 21 (October 31, 2023): 4505. http://dx.doi.org/10.3390/math11214505.

Abstract:
Learning effective graph representations in an unsupervised manner is a popular research topic in graph data analysis. Recently, contrastive learning has shown its success in unsupervised graph representation learning. However, how to avoid collapsing solutions in contrastive learning methods remains a critical challenge. In this paper, a simple method is proposed to solve this problem for graph representation learning, which differs from existing commonly used techniques (such as negative samples or a predictor network). The proposed model relies mainly on an asymmetric design consisting of two graph neural networks (GNNs) of unequal depth, which learn node representations from two augmented views, and defines the contrastive loss only on positive sample pairs. The simple method has lower computational and memory complexity than existing methods. Furthermore, a theoretical analysis proves that the asymmetric design avoids collapsing solutions when trained together with a stop-gradient operation. Our method is compared to nine state-of-the-art methods on six real-world datasets to demonstrate its validity and superiority. Ablation experiments further validate the essential role of the asymmetric architecture.
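The asymmetric design is easy to picture in code: a deeper online GNN and a shallower target GNN encode two augmented views, and the loss uses only positive pairs with a stop-gradient on the shallow branch. The sketch below uses a dense toy GCN to stay self-contained; the depths, dimensions, and cosine loss are illustrative assumptions, not the authors' implementation.

```python
# Asymmetric two-branch graph encoder with positive-pair-only cosine loss
# and stop-gradient, sketched with a dense toy GCN.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DenseGCN(nn.Module):
    def __init__(self, dims):                    # dims = [in, h1, ...] per layer
        super().__init__()
        self.layers = nn.ModuleList(nn.Linear(a, b) for a, b in zip(dims, dims[1:]))

    def forward(self, x, adj):                   # adj: normalized adjacency
        for layer in self.layers:
            x = F.relu(layer(adj @ x))
        return x

n, d = 50, 16
adj = torch.eye(n)                               # placeholder graph
x1, x2 = torch.randn(n, d), torch.randn(n, d)    # two augmented views
deep = DenseGCN([d, 32, 32, 32])                 # deeper online branch
shallow = DenseGCN([d, 32])                      # shallower target branch

z1, z2 = deep(x1, adj), shallow(x2, adj)
loss = -F.cosine_similarity(z1, z2.detach()).mean()  # positives only + stop-grad
print(loss)
```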
20

Xing, Jinwei, Takashi Nagata, Kexin Chen, Xinyun Zou, Emre Neftci, and Jeffrey L. Krichmar. "Domain Adaptation In Reinforcement Learning Via Latent Unified State Representation." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 12 (May 18, 2021): 10452–59. http://dx.doi.org/10.1609/aaai.v35i12.17251.

Abstract:
Despite the recent success of deep reinforcement learning (RL), domain adaptation remains an open problem. Although the generalization ability of RL agents is critical for the real-world applicability of deep RL, zero-shot policy transfer is still a challenging problem, since even minor visual changes can make a trained agent fail completely in a new task. To address this issue, we propose a two-stage RL agent that first learns a latent unified state representation (LUSR) that is consistent across multiple domains, and then performs RL training in one source domain based on LUSR. The cross-domain consistency of LUSR allows the policy acquired in the source domain to generalize to other target domains without extra training. We first demonstrate our approach on variants of CarRacing games with customized manipulations, and then verify it in CARLA, an autonomous driving simulator with more complex and realistic visual observations. Our results show that this approach achieves state-of-the-art domain adaptation performance in related RL tasks and outperforms prior approaches based on latent-representation RL and image-to-image translation.
21

Zhu, Yi, Lei Li, and Xindong Wu. "Stacked Convolutional Sparse Auto-Encoders for Representation Learning." ACM Transactions on Knowledge Discovery from Data 15, no. 2 (April 2021): 1–21. http://dx.doi.org/10.1145/3434767.

Abstract:
Deep learning seeks to achieve excellent performance for representation learning in image datasets. However, supervised deep learning models such as convolutional neural networks require a large number of labeled images, which is intractable in many applications, while unsupervised deep learning models like the stacked denoising auto-encoder cannot employ label information. Meanwhile, the redundancy of image data causes performance degradation in representation learning for the aforementioned models. To address these problems, we propose a semi-supervised deep learning framework called the stacked convolutional sparse auto-encoder, which can learn robust and sparse representations from image data with few labeled records. More specifically, the framework is constructed by stacking layers: in each layer, higher-level feature representations are generated from the features of lower layers in a convolutional way, with kernels learned by a sparse auto-encoder. Meanwhile, to solve the data redundancy problem, a Reconstruction Independent Component Analysis algorithm is trained on patches to sphere the input data. The label information is encoded using a Softmax Regression model for semi-supervised learning. With this framework, higher-level representations are learned by layers mapping from image data, which can boost the performance of subsequent base classifiers such as support vector machines. Extensive experiments demonstrate the superior classification performance of our framework compared to several state-of-the-art representation learning methods.
22

Wang, Sheng, Liyong Chen, and Furong Peng. "Multiview Latent Representation Learning with Feature Diversity for Clustering." Mathematical Problems in Engineering 2022 (July 11, 2022): 1–12. http://dx.doi.org/10.1155/2022/1866636.

Abstract:
To analyze and manage data, multiple features are extracted for robust and accurate description. Given the complementary nature of multi-view data, it is important to find a low-dimensional compact representation that leverages multiple views and yields better clustering performance. In this paper, we present multiview latent representation learning with feature diversity for clustering (MvLRFD). To remove noise while retaining the information of each view, matrix factorization is adopted to obtain the latent representation of each view. To fuse these diverse latent representations, the final representation is constructed by concatenating the latent representation of each view. With the help of the Hilbert–Schmidt Independence Criterion (HSIC), the diversity of features in the final representation is maximized to exploit complementary information. A new co-regularization strategy is introduced to relate the data similarities under the final representation to those under the latent one. A partition entropy regularization is adopted to control the uniformity of the similarity matrix values and the weight of each view. Several experiments on four real datasets verify the effectiveness of MvLRFD compared to state-of-the-art algorithms.
23

Keller, Patrick, Abdoul Kader Kaboré, Laura Plein, Jacques Klein, Yves Le Traon, and Tegawendé F. Bissyandé. "What You See is What it Means! Semantic Representation Learning of Code based on Visualization and Transfer Learning." ACM Transactions on Software Engineering and Methodology 31, no. 2 (April 30, 2022): 1–34. http://dx.doi.org/10.1145/3485135.

Abstract:
Recent successes in training word embeddings for Natural Language Processing (NLP) tasks have encouraged a wave of research on representation learning for source code, which builds on similar NLP methods. The overall objective is to produce code embeddings that capture the maximum of program semantics. State-of-the-art approaches invariably rely on a syntactic representation (i.e., raw lexical tokens, abstract syntax trees, or intermediate representation tokens) to generate embeddings, which are criticized in the literature as non-robust or non-generalizable. In this work, we investigate a novel embedding approach based on the intuition that source code has visual patterns of semantics, and we use these patterns to address the outstanding challenge of identifying semantic code clones. We propose the WySiWiM ("What You See Is What It Means") approach, in which visual representations of source code are fed into powerful pre-trained image classification neural networks from the field of computer vision to benefit from the practical advantages of transfer learning. We evaluate the proposed embedding approach on the task of vulnerable code prediction in source code and on two variations of the task of semantic code clone identification: code clone detection (a binary classification problem) and code classification (a multi-class classification problem). Experiments on BigCloneBench (Java) and Open Judge (C) show that, although simple, our WySiWiM approach performs as effectively as state-of-the-art approaches such as ASTNN or TBCNN. We also show with data from NVD and SARD that the WySiWiM representation can be used to learn a vulnerable code detector with reasonable performance (accuracy ∼90%). We further explore the influence of different steps in our approach, such as the choice of visual representations or the classification algorithm, and discuss the promises and limitations of this research direction.
24

SCARPETTA, SILVIA, ZHAOPING LI, and JOHN HERTZ. "LEARNING IN AN OSCILLATORY CORTICAL MODEL." Fractals 11, supp01 (February 2003): 291–300. http://dx.doi.org/10.1142/s0218348x03001951.

Abstract:
We study a model of generalized Hebbian learning in asymmetric oscillatory neural networks modeling cortical areas such as the hippocampus and olfactory cortex. The learning rule is based on experimentally observed synaptic plasticity, in particular long-term potentiation and long-term depression of synaptic efficacies depending on the relative timing of the pre- and postsynaptic activities during learning. The learned memory or representational states can be encoded by both the amplitude and the phase patterns of the oscillating neural populations, enabling more efficient and robust information coding than in conventional models of associative memory or input representation. Depending on the class of nonlinearity of the activation function, the model can function as an associative memory for oscillatory patterns (class II nonlinearity) or can generalize from or interpolate between the learned states, appropriate for the function of input representation (class I nonlinearity). In the former case, simulations of the model exhibit a first-order transition between the "disordered" state and the "ordered" memory state.
25

Zang, Hongyu, Xin Li, and Mingzhong Wang. "SimSR: Simple Distance-Based State Representations for Deep Reinforcement Learning." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 8 (June 28, 2022): 8997–9005. http://dx.doi.org/10.1609/aaai.v36i8.20883.

Abstract:
This work explores how to learn robust and generalizable state representations from image-based observations with deep reinforcement learning methods. Addressing the computational complexity, stringent assumptions, and representation collapse challenges in existing work on bisimulation metrics, we devise the Simple State Representation (SimSR) operator. SimSR enables us to design a stochastic approximation method that can practically learn the mapping functions (encoders) from observations to a latent representation space. In addition to a theoretical analysis and comparison with existing work, we experimented and compared our approach with recent state-of-the-art solutions on visual MuJoCo tasks. The results show that our model generally achieves better performance, robustness, and generalization.
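Distance-based representation objectives of this family are simple to write down. Below is a hedged sketch in the bisimulation spirit: the cosine distance between two latent states is regressed onto the absolute reward difference plus the discounted distance between their next latents. The pairing scheme and the exact metric are assumptions for illustration, not SimSR's precise operator.

```python
# Bisimulation-style distance matching between randomly paired batch elements.
import torch
import torch.nn.functional as F

def distance_matching_loss(z, z_next, reward, gamma=0.99):
    idx = torch.randperm(z.size(0))                       # random partner per sample
    d = 1.0 - F.cosine_similarity(z, z[idx])              # latent distance
    with torch.no_grad():                                 # bootstrap target
        target = (reward - reward[idx]).abs() \
               + gamma * (1.0 - F.cosine_similarity(z_next, z_next[idx]))
    return F.mse_loss(d, target)

z = torch.randn(32, 64, requires_grad=True)
z_next, r = torch.randn(32, 64), torch.randn(32)
print(distance_matching_loss(z, z_next, r))
```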
26

Zhu, Zixin, Le Wang, Wei Tang, Ziyi Liu, Nanning Zheng, and Gang Hua. "Learning Disentangled Classification and Localization Representations for Temporal Action Localization." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 3 (June 28, 2022): 3644–52. http://dx.doi.org/10.1609/aaai.v36i3.20277.

Abstract:
A common approach to Temporal Action Localization (TAL) is to generate action proposals and then perform action classification and localization on them. For each proposal, existing methods universally use a shared proposal-level representation for both tasks. However, our analysis indicates that this shared representation focuses on the most discriminative frames for classification, e.g., “take-offs” rather than “run-ups” in distinguishing “high jump” from “long jump”, while frames most relevant to localization, such as the start and end frames of an action, are largely ignored. In other words, such a shared representation cannot simultaneously handle both classification and localization tasks well, and it makes precise TAL difficult. To address this challenge, this paper disentangles the shared representation into classification and localization representations. The disentangled classification representation focuses on the most discriminative frames, and the disentangled localization representation focuses on the action phase as well as the action start and end. Our model can be divided into two sub-networks, i.e., the disentanglement network and the context-based aggregation network. The disentanglement network is an autoencoder that learns orthogonal hidden variables for classification and localization. The context-based aggregation network aggregates the classification and localization representations by modeling local and global contexts. We evaluate our proposed method on two popular benchmarks for TAL, where it outperforms all state-of-the-art methods.
27

Zeng, Fanrui, Yingjie Sun, and Yizhou Li. "MRLBot: Multi-Dimensional Representation Learning for Social Media Bot Detection." Electronics 12, no. 10 (May 19, 2023): 2298. http://dx.doi.org/10.3390/electronics12102298.

Abstract:
Social media bots pose potential threats to the online environment, and continuously evolving anti-detection technologies require bot detection methods to be more reliable and general. Current detection methods encounter challenges including limited generalization ability, susceptibility to evasion in traditional feature engineering, and insufficient exploration of user relationships. To tackle these challenges, this paper proposes MRLBot, a social media bot detection framework based on unsupervised representation learning. We design a behavior representation learning model that utilizes a Transformer and a CNN encoder–decoder to simultaneously extract global and local features from behavioral information. Furthermore, a network representation learning model is proposed that introduces intra- and outer-community-oriented random walks to learn structural features and community connections from the relationship graph. Finally, the behavior and relationship representation learning models are combined to generate fused representations for bot detection. Experimental results on four publicly available social network datasets demonstrate that the proposed method has certain advantages over state-of-the-art detection methods in this field.
28

Yang, Di, Yaohui Wang, Quan Kong, Antitza Dantcheva, Lorenzo Garattoni, Gianpiero Francesca, and François Brémond. "Self-Supervised Video Representation Learning via Latent Time Navigation." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 3 (June 26, 2023): 3118–26. http://dx.doi.org/10.1609/aaai.v37i3.25416.

Abstract:
Self-supervised video representation learning has typically aimed at maximizing the similarity between different temporal segments of one video, in order to enforce feature persistence over time. This leads to a loss of pertinent information related to temporal relationships, rendering actions such as 'enter' and 'leave' indistinguishable. To mitigate this limitation, we propose Latent Time Navigation (LTN), a time-parameterized contrastive learning strategy that is streamlined to capture fine-grained motions. Specifically, we maximize the representation similarity between different video segments of one video, while keeping their representations time-aware along a subspace of the latent representation code that includes an orthogonal basis to represent temporal changes. Our extensive experimental analysis suggests that learning video representations by LTN consistently improves the performance of action classification in fine-grained and human-oriented tasks (e.g., on the Toyota Smarthome dataset). In addition, we demonstrate that our proposed model, when pre-trained on Kinetics-400, generalizes well to the unseen real-world video benchmark datasets UCF101 and HMDB51, achieving state-of-the-art performance in action recognition.
29

Li, Xiutian, Siqi Sun, and Rui Feng. "Causal Representation Learning via Counterfactual Intervention." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 4 (March 24, 2024): 3234–42. http://dx.doi.org/10.1609/aaai.v38i4.28108.

Abstract:
Existing causal representation learning methods are based on the causal graph they build. However, due to the omission of bias within the causal graph, they essentially encourage models to learn biased causal effects in latent space. In this paper, we propose a novel causal disentangling framework that aims to learn unbiased causal effects. We first introduce inductive and dataset biases into the traditional causal graph for the physical concepts of interest. Then, we eliminate the negative effects of these two biases by counterfactual intervention with a reweighted loss function for learning unbiased causal effects. Finally, we employ the causal effects in a VAE to endow the latent representations with causality. In particular, we highlight that removing biases is regarded here as part of the learning process for unbiased causal effects, which is crucial for improving causal disentanglement performance. Through extensive experiments on real-world and synthetic datasets, we show that our method outperforms different baselines and obtains state-of-the-art results for causal representation learning.
30

Kim, Jung-Hoon, Yizhen Zhang, Kuan Han, Zheyu Wen, Minkyu Choi, and Zhongming Liu. "Representation learning of resting state fMRI with variational autoencoder." NeuroImage 241 (November 2021): 118423. http://dx.doi.org/10.1016/j.neuroimage.2021.118423.

31

Humbert, Pierre, Clement Dubost, Julien Audiffren, and Laurent Oudre. "Apprenticeship Learning for a Predictive State Representation of Anesthesia." IEEE Transactions on Biomedical Engineering 67, no. 7 (July 2020): 2052–63. http://dx.doi.org/10.1109/tbme.2019.2954348.

32

Liu, Feng, Ruiming Tang, Xutao Li, Weinan Zhang, Yunming Ye, Haokun Chen, Huifeng Guo, Yuzhou Zhang, and Xiuqiang He. "State representation modeling for deep reinforcement learning based recommendation." Knowledge-Based Systems 205 (October 2020): 106170. http://dx.doi.org/10.1016/j.knosys.2020.106170.

33

Mo, Yujie, Liang Peng, Jie Xu, Xiaoshuang Shi, and Xiaofeng Zhu. "Simple Unsupervised Graph Representation Learning." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 7 (June 28, 2022): 7797–805. http://dx.doi.org/10.1609/aaai.v36i7.20748.

Abstract:
In this paper, we propose a simple unsupervised graph representation learning method to conduct effective and efficient contrastive learning. Specifically, the proposed multiplet loss explores the complementary information between structural information and neighbor information to enlarge the inter-class variation, and adds an upper-bound loss to keep a finite distance between positive embeddings and anchor embeddings, reducing the intra-class variation. Both enlarging the inter-class variation and reducing the intra-class variation result in a small generalization error, thereby yielding an effective model. Furthermore, our method removes the data augmentation and discriminator components widely used in previous graph contrastive learning methods, while still outputting low-dimensional embeddings, leading to an efficient model. Experimental results on various real-world datasets demonstrate the effectiveness and efficiency of our method compared to state-of-the-art methods. The source codes are released at https://github.com/YujieMo/SUGRL.
34

Achille, Alessandro, and Stefano Soatto. "A Separation Principle for Control in the Age of Deep Learning." Annual Review of Control, Robotics, and Autonomous Systems 1, no. 1 (May 28, 2018): 287–307. http://dx.doi.org/10.1146/annurev-control-060117-105140.

Abstract:
We review the problem of defining and inferring a state for a control system based on complex, high-dimensional, highly uncertain measurement streams, such as videos. Such a state, or representation, should contain all and only the information needed for control and discount nuisance variability in the data. It should also have finite complexity, ideally modulated depending on available resources. This representation is what we want to store in memory in lieu of the data, as it separates the control task from the measurement process. For the trivial case with no dynamics, a representation can be inferred by minimizing the information bottleneck Lagrangian in a function class realized by deep neural networks. The resulting representation has much higher dimension than the data (already in the millions) but is smaller in the sense of information content, retaining only what is needed for the task. This process also yields representations that are invariant to nuisance factors and have maximally independent components. We extend these ideas to the dynamic case, where the representation is the posterior density of the task variable given the measurements up to the current time, which is in general much simpler than the prediction density maintained by the classical Bayesian filter. Again, this can be finitely parameterized using a deep neural network, and some applications are already beginning to emerge. No explicit assumption of Markovianity is needed; instead, complexity trades off approximation of an optimal representation, including the degree of Markovianity.
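For reference, the classical information bottleneck Lagrangian mentioned in the abstract can be written as follows (notation assumed: x the data, y the task variable, z the representation; deep-learning variants typically replace the second term with a cross-entropy on the task):

```latex
\min_{p(z \mid x)} \; \mathcal{L}\big[p(z \mid x)\big] \;=\; I(x; z) \;-\; \beta\, I(z; y)
```

Minimizing I(x;z) compresses away nuisance variability in the data, while the β-weighted term retains the information needed for the task.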
35

Li, Zhengyi, Menglu Li, Lida Zhu, and Wen Zhang. "Improving PTM Site Prediction by Coupling of Multi-Granularity Structure and Multi-Scale Sequence Representation." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 1 (March 24, 2024): 188–96. http://dx.doi.org/10.1609/aaai.v38i1.27770.

Abstract:
Protein post-translational modification (PTM) site prediction is a fundamental task in bioinformatics. Several computational methods have been developed to predict PTM sites, but existing methods ignore structure information and merely utilize protein sequences. Furthermore, a more fine-grained structure representation learning method is urgently needed, as PTM is a biological event that occurs at atom granularity. In this paper, we propose a PTM site prediction method based on the Coupling of Multi-Granularity structure and Multi-Scale sequence representation, PTM-CMGMS for brevity. Specifically, multi-granularity structure-aware representation learning is designed to learn neighborhood structure representations at the amino acid, atom, and whole-protein granularity from AlphaFold-predicted structures, followed by contrastive learning to optimize the structure representations. Additionally, multi-scale sequence representation learning is used to extract context sequence information, and a motif generated by aligning all context sequences of PTM sites assists the prediction. Extensive experiments on three datasets show that PTM-CMGMS outperforms state-of-the-art methods. Source code can be found at https://github.com/LZY-HZAU/PTM-CMGMS.
36

Grigoryeva, Lyudmila, Allen Hart, and Juan-Pablo Ortega. "Learning strange attractors with reservoir systems." Nonlinearity 36, no. 9 (July 27, 2023): 4674–708. http://dx.doi.org/10.1088/1361-6544/ace492.

Abstract:
This paper shows that the celebrated embedding theorem of Takens is a particular case of a much more general statement, according to which randomly generated linear state-space representations of generic observations of an invertible dynamical system carry in their wake an embedding of the phase-space dynamics into the chosen Euclidean state space. This embedding coincides with a natural generalized synchronization that arises in this setup and yields a topological conjugacy between the state-space dynamics driven by the generic observations of the dynamical system and the dynamical system itself. This result provides additional tools for the representation, learning, and analysis of chaotic attractors and sheds additional light on the reservoir computing phenomenon that appears in the context of recurrent neural networks.
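For context, the delay-coordinate construction in Takens' theorem, which the paper generalizes to randomly generated linear state-space (reservoir) maps, takes only a few lines of code. A minimal illustrative sketch with a toy signal:

```python
# Delay-coordinate (Takens) embedding of a scalar observation series:
# stack lagged copies so each row is a point [x(t), x(t+tau), ..., x(t+(m-1)tau)].
import numpy as np

def delay_embed(series, dim, tau):
    n = len(series) - (dim - 1) * tau
    return np.column_stack([series[i * tau: i * tau + n] for i in range(dim)])

t = np.linspace(0, 60, 3000)
x = np.sin(t) + 0.5 * np.sin(2.2 * t)    # toy quasi-periodic observation
emb = delay_embed(x, dim=3, tau=25)      # points in R^3 tracing the attractor
print(emb.shape)                         # (2950, 3)
```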
37

Kefato, Zekarias, and Sarunas Girdzijauskas. "Gossip and Attend: Context-Sensitive Graph Representation Learning." Proceedings of the International AAAI Conference on Web and Social Media 14 (May 26, 2020): 351–59. http://dx.doi.org/10.1609/icwsm.v14i1.7305.

Abstract:
Graph representation learning (GRL) is a powerful technique for learning low-dimensional vector representations of high-dimensional and often sparse graphs. Most studies explore the structure and metadata associated with the graph using random walks and employ unsupervised or semi-supervised learning schemes. Learning in these methods is context-free, resulting in only a single representation per node. Recently, studies have questioned the adequacy of a single representation and proposed context-sensitive approaches, which are capable of extracting multiple node representations for different contexts. This has proved highly effective in applications such as link prediction and ranking. However, most of these methods rely on additional textual features that require complex and expensive RNNs or CNNs to capture high-level features, or rely on a community detection algorithm to identify the multiple contexts of a node. In this study we show that, in order to extract high-quality context-sensitive node representations, it is not necessary to rely on supplementary node features or to employ computationally heavy and complex models. We propose Goat, a context-sensitive algorithm inspired by gossip communication and a mutual attention mechanism that operates simply over the structure of the graph. We show the efficacy of Goat using 6 real-world datasets on link prediction and node clustering tasks and compare it against 12 popular and state-of-the-art (SOTA) baselines. Goat consistently outperforms them and achieves up to 12% and 19% gains over the best-performing methods on link prediction and clustering tasks, respectively.
38

BREEDEN, JOSEPH L., and NORMAN H. PACKARD. "A LEARNING ALGORITHM FOR OPTIMAL REPRESENTATION OF EXPERIMENTAL DATA." International Journal of Bifurcation and Chaos 04, no. 02 (April 1994): 311–26. http://dx.doi.org/10.1142/s0218127494000228.

Abstract:
We have developed a procedure for finding optimal representations of experimental data. Criteria for optimality vary according to context; an optimal state space representation will be one that best suits one’s stated goal for reconstruction. We consider an ∞-dimensional set of possible reconstruction coordinate systems that include time delays, derivatives, and many other possible coordinates; and any optimality criterion is specified as a real valued functional on this space. We present a method for finding the optima using a learning algorithm based upon the genetic algorithm and evolutionary programming. The learning algorithm machinery for finding optimal representations is independent of the definition of optimality, and thus provides a general tool useful in a wide variety of contexts.
39

Liu, Shengli, Xiaowen Zhu, Zewei Cao, and Gang Wang. "Deep 1D Landmark Representation Learning for Space Target Pose Estimation." Remote Sensing 14, no. 16 (August 18, 2022): 4035. http://dx.doi.org/10.3390/rs14164035.

Abstract:
Monocular vision-based pose estimation for known uncooperative space targets plays an increasingly important role in on-orbit operations. Existing state-of-the-art methods for space target pose estimation build 2D-3D correspondences to recover the target pose, with landmark regression as a key component. The 2D heatmap representation is the dominant descriptor in landmark regression. However, its quantization error grows dramatically under low-resolution input conditions, and extra post-processing is usually needed to compute accurate 2D pixel coordinates of landmarks from the heatmaps. To overcome these problems, we propose a novel 1D landmark representation that encodes the horizontal and vertical pixel coordinates of a landmark as two independent 1D vectors. Furthermore, we propose a space target landmark regression network that regresses the locations of landmarks in the image using the 1D landmark representation. Comprehensive experiments conducted on the SPEED dataset show that the proposed 1D landmark representation helps the proposed landmark regression network outperform existing state-of-the-art methods at various input resolutions, especially low ones. Based on the 2D landmarks predicted by the proposed network, the space target pose estimation error is also smaller than that of existing state-of-the-art methods under all input resolution conditions.
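The 1D representation admits a compact decoding step. A hedged sketch: the network outputs one vector over image columns and one over rows per landmark, and a softmax followed by an expectation (soft-argmax) recovers sub-pixel coordinates with no 2D heatmap post-processing. The soft-argmax decoding is an illustrative choice, not necessarily the paper's exact decoder.

```python
# Decode a landmark from two independent 1D vectors via soft-argmax.
import torch
import torch.nn.functional as F

def decode_1d_landmark(logits_x, logits_y):
    """logits_x: (B, W) over columns, logits_y: (B, H) over rows -> (B, 2)."""
    px = F.softmax(logits_x, dim=-1)
    py = F.softmax(logits_y, dim=-1)
    xs = torch.arange(logits_x.size(-1), dtype=torch.float32)
    ys = torch.arange(logits_y.size(-1), dtype=torch.float32)
    x = (px * xs).sum(dim=-1)            # expected column (sub-pixel)
    y = (py * ys).sum(dim=-1)            # expected row (sub-pixel)
    return torch.stack([x, y], dim=-1)

print(decode_1d_landmark(torch.randn(4, 192), torch.randn(4, 120)))
```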
40

Zhang, Jingran, Xing Xu, Fumin Shen, Huimin Lu, Xin Liu, and Heng Tao Shen. "Enhancing Audio-Visual Association with Self-Supervised Curriculum Learning." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 4 (May 18, 2021): 3351–59. http://dx.doi.org/10.1609/aaai.v35i4.16447.

Abstract:
The recent success of audio-visual representation learning can be largely attributed to its pervasive concurrency property, which can be used as a self-supervision signal to extract correlation information. While most recent works focus on capturing the shared associations between the audio and visual modalities, they rarely consider multiple audio-video pairs at once and pay little attention to exploiting the valuable information within each modality. To tackle this problem, we propose a novel audio-visual representation learning method dubbed self-supervised curriculum learning (SSCL) in a teacher-student learning manner. Specifically, taking advantage of contrastive learning, a two-stage scheme is exploited, which transfers cross-modal information between the teacher and student models as a phased process. The proposed SSCL approach regards the pervasive property of audio-visual concurrency as latent supervision and mutually distills structural knowledge from visual to audio data. Notably, SSCL can learn discriminative audio and visual representations for various downstream applications. Extensive experiments conducted on both action video recognition and audio sound recognition tasks show the remarkably improved performance of the SSCL method compared with state-of-the-art self-supervised audio-visual representation learning methods.
41

Han, Ruijiang, Wei Wang, Yuxi Long, and Jiajie Peng. "Deep Representation Debiasing via Mutual Information Minimization and Maximization (Student Abstract)." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 11 (June 28, 2022): 12965–66. http://dx.doi.org/10.1609/aaai.v36i11.21619.

Abstract:
Deep representation learning has succeeded in several fields. However, pre-trained deep representations are usually biased and make downstream models sensitive to different attributes. In this work, we propose a post-processing unsupervised deep representation debiasing algorithm, DeepMinMax, which can obtain unbiased representations directly from pre-trained representations without re-training or fine-tuning the entire model. The experimental results on synthetic and real-world datasets indicate that DeepMinMax outperforms the existing state-of-the-art algorithms on downstream tasks.
42

Li, Fengpeng, Jiabao Li, Wei Han, Ruyi Feng, and Lizhe Wang. "Unsupervised Representation High-Resolution Remote Sensing Image Scene Classification via Contrastive Learning Convolutional Neural Network." Photogrammetric Engineering & Remote Sensing 87, no. 8 (August 1, 2021): 577–91. http://dx.doi.org/10.14358/pers.87.8.577.

Full text
Abstract:
Inspired by the outstanding achievements of deep learning, supervised deep representation methods for high-spatial-resolution remote sensing image scene classification have obtained state-of-the-art performance. However, supervised deep learning representation methods need a considerable amount of labeled data to capture class-specific features, which limits their application when only a few labeled training samples are available. An unsupervised deep learning representation method for high-resolution remote sensing image scene classification is proposed in this work to address this issue. The proposed contrastive learning method narrows the distance between positive view pairs (color channels belonging to the same image) and widens the gap between negative view pairs (color channels from different images) to obtain class-specific representations of the input data without any supervised information. The classifier uses features extracted by the convolutional neural network (CNN)-based feature extractor, together with the label information of the training data, to define the space of each category, and then makes predictions on the test data using linear regression. Compared with existing unsupervised deep learning representation methods for high-resolution remote sensing image scene classification, the contrastive learning CNN achieves state-of-the-art performance on three benchmark data sets of different scales: the small-scale RSSCN7 data set, the midscale aerial image data set, and the large-scale NWPU-RESISC45 data set.
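The view construction described above is simple enough to sketch. The snippet below is a hedged PyTorch illustration of treating color channels of the same image as positive views and channels of different images as negatives; the encoder, projection head, and loss hyperparameters are assumptions.

```python
# Hedged sketch: channel-based positive/negative views plus an NT-Xent loss.
import torch
import torch.nn.functional as F

def channel_views(images):
    """images: (B, 3, H, W) -> two single-channel views per image."""
    return images[:, 0:1], images[:, 1:2]   # e.g. red and green channels

def nt_xent(z1, z2, temperature=0.5):
    """z1, z2: embeddings of the two views; matching rows are positives."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature
    targets = torch.arange(z1.size(0))
    return F.cross_entropy(logits, targets)
```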
APA, Harvard, Vancouver, ISO, and other styles
43

Hallac, Ibrahim Riza, Betul Ay, and Galip Aydin. "User Representation Learning for Social Networks: An Empirical Study." Applied Sciences 11, no. 12 (June 13, 2021): 5489. http://dx.doi.org/10.3390/app11125489.

Full text
Abstract:
Gathering useful insights from social media data has gained great interest in recent years. User representation can be a key task in mining the publicly available, user-generated rich content offered by social media platforms. The way to automatically create meaningful observations about users of a social network is to obtain real-valued vectors for the users with user embedding representation learning models. In this study, we present one of the most comprehensive studies in the literature on learning high-quality social media user representations by leveraging state-of-the-art text representation approaches. We propose a novel doc2vec-based representation method, which can encode both textual and non-textual information of a social media user into a low-dimensional vector. In addition, various experiments were performed to investigate the performance of text representation techniques and concepts including word2vec, doc2vec, GloVe, NumberBatch, FastText, BERT, ELMo, and TF-IDF. We also share a new social media dataset comprising data from 500 manually selected Twitter users in five predefined groups. The dataset contains different activity data such as comments, retweets, likes, and locations, as well as the actual tweets composed by the users.
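The doc2vec-based idea can be illustrated in a few lines with gensim. The sketch below is an assumption about the setup rather than the authors' pipeline: each user's pooled tweets become one tagged document whose learned vector serves as the user representation, and all hyperparameters are illustrative.

```python
# Hedged sketch: one document per user, tagged with the user id.
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

users = {
    "user_a": "example tweets written by user a ...",
    "user_b": "example tweets written by user b ...",
}
corpus = [TaggedDocument(words=text.split(), tags=[uid])
          for uid, text in users.items()]

model = Doc2Vec(corpus, vector_size=100, window=5, min_count=1, epochs=40)
user_vec = model.dv["user_a"]   # low-dimensional user representation
```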
APA, Harvard, Vancouver, ISO, and other styles
44

Liu, Jiexi, and Songcan Chen. "TimesURL: Self-Supervised Contrastive Learning for Universal Time Series Representation Learning." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 12 (March 24, 2024): 13918–26. http://dx.doi.org/10.1609/aaai.v38i12.29299.

Full text
Abstract:
Learning universal time series representations applicable to various types of downstream tasks is challenging but valuable in real applications. Recently, researchers have attempted to leverage the success of self-supervised contrastive learning (SSCL) in Computer Vision (CV) and Natural Language Processing (NLP) to tackle time series representation. Nevertheless, due to the special temporal characteristics, relying solely on empirical guidance from other domains may be ineffective for time series and difficult to adapt to multiple downstream tasks. To this end, we review the three parts involved in SSCL: 1) designing augmentation methods for positive pairs, 2) constructing (hard) negative pairs, and 3) designing the SSCL loss. For 1) and 2), we find that unsuitable positive and negative pair construction may introduce inappropriate inductive biases that neither preserve temporal properties nor provide sufficient discriminative features. For 3), exploring only segment- or instance-level semantic information is not enough for learning universal representations. To remedy these issues, we propose a novel self-supervised framework named TimesURL. Specifically, we first introduce a frequency-temporal-based augmentation that keeps the temporal property unchanged. We then construct double Universums as a special kind of hard negative to guide better contrastive learning. Additionally, we introduce time reconstruction as a joint optimization objective with contrastive learning to capture both segment-level and instance-level information. As a result, TimesURL can learn high-quality universal representations and achieves state-of-the-art performance on 6 different downstream tasks, including short- and long-term forecasting, imputation, classification, anomaly detection, and transfer learning.
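To make the first ingredient concrete, here is a hedged NumPy sketch of a frequency-domain augmentation in the spirit of the frequency-temporal augmentation mentioned above; TimesURL's exact scheme, its double-Universum negatives, and the reconstruction objective are not reproduced, and the drop rate is an assumption.

```python
# Hedged sketch: build a positive view by masking random frequency components.
import numpy as np

def freq_augment(x, drop_rate=0.1, rng=np.random.default_rng(0)):
    """Zero a random fraction of frequency components of a 1D series."""
    spec = np.fft.rfft(x)
    mask = rng.random(spec.shape) > drop_rate   # keep most components
    return np.fft.irfft(spec * mask, n=len(x))

x = np.sin(np.linspace(0, 8 * np.pi, 256))
x_aug = freq_augment(x)   # perturbed view with temporal structure preserved
```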
APA, Harvard, Vancouver, ISO, and other styles
45

Perrinet, Laurent U. "Role of Homeostasis in Learning Sparse Representations." Neural Computation 22, no. 7 (July 2010): 1812–36. http://dx.doi.org/10.1162/neco.2010.05-08-795.

Full text
Abstract:
Neurons in the input layer of primary visual cortex in primates develop edge-like receptive fields. One approach to understanding the emergence of this response is to state that neural activity has to efficiently represent sensory data with respect to the statistics of natural scenes. Furthermore, it is believed that such an efficient coding is achieved using a competition across neurons so as to generate a sparse representation, that is, where a relatively small number of neurons are simultaneously active. Indeed, different models of sparse coding, coupled with Hebbian learning and homeostasis, have been proposed that successfully match the observed emergent response. However, the specific role of homeostasis in learning such sparse representations is still largely unknown. By quantitatively assessing the efficiency of the neural representation during learning, we derive a cooperative homeostasis mechanism that optimally tunes the competition between neurons within the sparse coding algorithm. We apply this homeostasis while learning small patches taken from natural images and compare its efficiency with state-of-the-art algorithms. Results show that while different sparse coding algorithms give similar coding results, the homeostasis provides an optimal balance for the representation of natural images within the population of neurons. Competition in sparse coding is optimized when it is fair. By contributing to optimizing statistical competition across neurons, homeostasis is crucial in providing a more efficient solution to the emergence of independent components.
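To make the mechanism concrete, the following NumPy fragment is a loose sketch of homeostatically regulated sparse coding: a per-atom gain biases matching-pursuit selection toward under-used dictionary atoms so that activation probabilities equalize. The specific exponential gain update is an assumption for illustration, not the paper's derived rule.

```python
# Hedged sketch: matching pursuit with a homeostatic gain on atom selection.
import numpy as np

def homeo_matching_pursuit(X, D, n_steps=3, eta=0.01):
    """X: (n_samples, dim) patches; D: (n_atoms, dim) unit-norm dictionary."""
    gain = np.ones(len(D))
    usage = np.zeros(len(D))
    for x in X:
        residual = x.copy()
        for _ in range(n_steps):
            c = D @ residual
            k = np.argmax(np.abs(c) * gain)    # gain-modulated competition
            residual -= c[k] * D[k]
            usage[k] += 1
        # homeostasis: damp over-used atoms, boost under-used ones
        gain *= np.exp(-eta * (usage / usage.sum() - 1.0 / len(D)))
    return gain, usage

rng = np.random.default_rng(0)
D = rng.normal(size=(32, 16))
D /= np.linalg.norm(D, axis=1, keepdims=True)
gain, usage = homeo_matching_pursuit(rng.normal(size=(100, 16)), D)
```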
APA, Harvard, Vancouver, ISO, and other styles
46

Naseem, Usman, Imran Razzak, Shah Khalid Khan, and Mukesh Prasad. "A Comprehensive Survey on Word Representation Models: From Classical to State-of-the-Art Word Representation Language Models." ACM Transactions on Asian and Low-Resource Language Information Processing 20, no. 5 (June 23, 2021): 1–35. http://dx.doi.org/10.1145/3434237.

Full text
Abstract:
Word representation has always been an important research area in the history of natural language processing (NLP). Understanding such complex text data is imperative, given that it is rich in information and can be used widely across various applications. In this survey, we explore different word representation models and their power of expression, from the classical to modern-day state-of-the-art word representation language models (LMs). We describe a variety of text representation methods and model designs that have blossomed in the context of NLP, including SOTA LMs. These models can transform large volumes of text into effective vector representations that capture the same semantic information. Further, such representations can be utilized by various machine learning (ML) algorithms for a variety of NLP-related tasks. In the end, this survey briefly discusses the commonly used ML- and DL-based classifiers, evaluation metrics, and the applications of these word embeddings in different NLP tasks.
APA, Harvard, Vancouver, ISO, and other styles
47

Janner, Michael, Karthik Narasimhan, and Regina Barzilay. "Representation Learning for Grounded Spatial Reasoning." Transactions of the Association for Computational Linguistics 6 (December 2018): 49–61. http://dx.doi.org/10.1162/tacl_a_00004.

Full text
Abstract:
The interpretation of spatial references is highly contextual, requiring joint inference over both language and the environment. We consider the task of spatial reasoning in a simulated environment, where an agent can act and receive rewards. The proposed model learns a representation of the world steered by instruction text. This design allows for precise alignment of local neighborhoods with corresponding verbalizations, while also handling global references in the instructions. We train our model with reinforcement learning using a variant of generalized value iteration. The model outperforms state-of-the-art approaches on several metrics, yielding a 45% reduction in goal localization error.
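The planning machinery underlying the model can be illustrated with plain value iteration on a toy grid; the sketch below is an assumption-laden stand-in (fixed reward, deterministic moves) and does not model the paper's text-conditioned rewards or learned state representation.

```python
# Hedged sketch: value iteration on a 5x5 grid with a single goal cell.
import numpy as np

H, W, gamma = 5, 5, 0.9
reward = np.full((H, W), -0.04)   # small step penalty (assumption)
reward[4, 4] = 1.0                # goal cell (assumption)
V = np.zeros((H, W))

for _ in range(50):               # sweep until approximately converged
    V_new = np.copy(V)
    for i in range(H):
        for j in range(W):
            neighbours = [V[max(i - 1, 0), j], V[min(i + 1, H - 1), j],
                          V[i, max(j - 1, 0)], V[i, min(j + 1, W - 1)]]
            V_new[i, j] = reward[i, j] + gamma * max(neighbours)
    V = V_new
```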
APA, Harvard, Vancouver, ISO, and other styles
48

Xu, Xiao, Chenfei Wu, Shachar Rosenman, Vasudev Lal, Wanxiang Che, and Nan Duan. "BridgeTower: Building Bridges between Encoders in Vision-Language Representation Learning." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 9 (June 26, 2023): 10637–47. http://dx.doi.org/10.1609/aaai.v37i9.26263.

Full text
Abstract:
Vision-Language (VL) models with the Two-Tower architecture have dominated visual-language representation learning in recent years. Current VL models either use lightweight uni-modal encoders and learn to extract, align and fuse both modalities simultaneously in a deep cross-modal encoder, or feed the last-layer uni-modal representations from the deep pre-trained uni-modal encoders into the top cross-modal encoder. Both approaches potentially restrict vision-language representation learning and limit model performance. In this paper, we propose BridgeTower, which introduces multiple bridge layers that build a connection between the top layers of uni-modal encoders and each layer of the cross-modal encoder. This enables effective bottom-up cross-modal alignment and fusion between visual and textual representations of different semantic levels of pre-trained uni-modal encoders in the cross-modal encoder. Pre-trained with only 4M images, BridgeTower achieves state-of-the-art performance on various downstream vision-language tasks. In particular, on the VQAv2 test-std set, BridgeTower achieves an accuracy of 78.73%, outperforming the previous state-of-the-art model METER by 1.09% with the same pre-training data and almost negligible additional parameters and computational costs. Notably, when further scaling the model, BridgeTower achieves an accuracy of 81.15%, surpassing models that are pre-trained on orders-of-magnitude larger datasets. Code and checkpoints are available at https://github.com/microsoft/BridgeTower.
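The bridge-layer idea can be gestured at in a few lines of PyTorch. The module below is a hedged simplification: an intermediate uni-modal representation is normalized and added into the cross-modal stream. BridgeTower's actual bridge design has more detail (see the linked repository).

```python
# Hedged sketch: fuse a uni-modal layer output into the cross-modal stream.
import torch
import torch.nn as nn

class BridgeLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.norm = nn.LayerNorm(dim)

    def forward(self, cross_modal_x, uni_modal_x):
        # residual connection from an intermediate uni-modal layer
        return cross_modal_x + self.norm(uni_modal_x)

bridge = BridgeLayer(768)
fused = bridge(torch.randn(1, 16, 768), torch.randn(1, 16, 768))
```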
APA, Harvard, Vancouver, ISO, and other styles
49

Jamshaid, Umar. "Optimal Query Execution Plan with Deep Reinforcement Learning." International Journal for Electronic Crime Investigation 5, no. 3 (April 6, 2022): 23-28. http://dx.doi.org/10.54692/ijeci.2022.050386.

Full text
Abstract:
We examine the use of deep reinforcement learning for query optimization. The technique is to incrementally build query execution plans by encoding features of sub-queries using a learned representation. We specifically focus on the design of the state transition function and the state representation problem. We present preliminary results and investigate how the state representation could be used to further improve query optimization with reinforcement learning.
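The incremental-construction idea can be sketched with tabular Q-learning over join states; everything below (the three relations, the toy cost table, the state encoding) is an illustrative assumption, not the paper's system.

```python
# Hedged sketch: learn a join order where a state is the set of relations
# joined so far and the reward is the negative cost of the next join.
import random
from collections import defaultdict

relations = ["R", "S", "T"]
cost = {("R", "S"): 10, ("S", "R"): 10, ("S", "T"): 5,
        ("T", "S"): 5, ("R", "T"): 50, ("T", "R"): 50}

Q = defaultdict(float)
alpha, epsilon = 0.1, 0.2

for _ in range(2000):
    state = ()
    while len(state) < len(relations):
        actions = [r for r in relations if r not in state]
        a = (random.choice(actions) if random.random() < epsilon
             else max(actions, key=lambda r: Q[(state, r)]))
        step_cost = 0 if not state else min(cost[(p, a)] for p in state)
        next_state = tuple(sorted(state + (a,)))
        best_next = max((Q[(next_state, r)] for r in relations
                         if r not in next_state), default=0.0)
        Q[(state, a)] += alpha * (-step_cost + best_next - Q[(state, a)])
        state = next_state
```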
APA, Harvard, Vancouver, ISO, and other styles
50

Guo, Jifeng, Zhiqi Pang, Wenbo Sun, Shi Li, and Yu Chen. "Redundancy Removal Adversarial Active Learning Based on Norm Online Uncertainty Indicator." Computational Intelligence and Neuroscience 2021 (October 25, 2021): 1–10. http://dx.doi.org/10.1155/2021/4752568.

Full text
Abstract:
Active learning aims to select the most valuable unlabelled samples for annotation. In this paper, we propose a redundancy removal adversarial active learning (RRAAL) method based on a norm online uncertainty indicator, which selects samples based on their distribution, uncertainty, and redundancy. RRAAL includes a representation generator, a state discriminator, and a redundancy removal module (RRM). The purpose of the representation generator is to learn the feature representation of a sample, and the state discriminator predicts the state of the feature vector after concatenation. We add a sample discriminator to the representation generator to improve its representation learning ability and design a norm online uncertainty indicator (Norm-OUI) to provide a more accurate uncertainty score to the state discriminator. In addition, we design an RRM based on a greedy algorithm to reduce the number of redundant samples in the labelled pool. The experimental results on four datasets show that the state discriminator, Norm-OUI, and RRM improve the performance of RRAAL, and that RRAAL outperforms previous state-of-the-art active learning methods.
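The selection loop the abstract describes can be sketched independently of the adversarial training. Below is a hedged NumPy illustration that ranks unlabelled samples by an uncertainty score and greedily drops redundant candidates; predictive entropy stands in for the Norm-OUI score, whose exact form is not reproduced here, and the similarity threshold is an assumption.

```python
# Hedged sketch: uncertainty-ranked selection with greedy redundancy removal.
import numpy as np

def select(features, probs, budget, sim_threshold=0.9):
    """features: (N, d) embeddings; probs: (N, C) predicted class probs."""
    uncertainty = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    chosen = []
    for i in np.argsort(-uncertainty):       # most uncertain first
        if all(f[i] @ f[j] < sim_threshold for j in chosen):
            chosen.append(i)                  # keep only non-redundant picks
        if len(chosen) == budget:
            break
    return chosen
```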
APA, Harvard, Vancouver, ISO, and other styles