Journal articles on the topic "States representation learning"

Consult the top 50 journal articles for your research on the topic "States representation learning".

1

Konidaris, George, Leslie Pack Kaelbling, and Tomas Lozano-Perez. "From Skills to Symbols: Learning Symbolic Representations for Abstract High-Level Planning". Journal of Artificial Intelligence Research 61 (January 31, 2018): 215–89. http://dx.doi.org/10.1613/jair.5575.

Abstract:
We consider the problem of constructing abstract representations for planning in high-dimensional, continuous environments. We assume an agent equipped with a collection of high-level actions, and construct representations provably capable of evaluating plans composed of sequences of those actions. We first consider the deterministic planning case, and show that the relevant computation involves set operations performed over sets of states. We define the specific collection of sets that is necessary and sufficient for planning, and use them to construct a grounded abstract symbolic representation that is provably suitable for deterministic planning. The resulting representation can be expressed in PDDL, a canonical high-level planning domain language; we construct such a representation for the Playroom domain and solve it in milliseconds using an off-the-shelf planner. We then consider probabilistic planning, which we show requires generalizing from sets of states to distributions over states. We identify the specific distributions required for planning, and use them to construct a grounded abstract symbolic representation that correctly estimates the expected reward and probability of success of any plan. In addition, we show that learning the relevant probability distributions corresponds to specific instances of probabilistic density estimation and probabilistic classification. We construct an agent that autonomously learns the correct abstract representation of a computer game domain, and rapidly solves it. Finally, we apply these techniques to create a physical robot system that autonomously learns its own symbolic representation of a mobile manipulation task directly from sensorimotor data---point clouds, map locations, and joint angles---and then plans using that representation. Together, these results establish a principled link between high-level actions and abstract representations, a concrete theoretical foundation for constructing abstract representations with provable properties, and a practical mechanism for autonomously learning abstract high-level representations.
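The deterministic-planning computation described in this abstract, evaluating a plan via set operations over sets of states, can be illustrated with a small sketch. The following is a hedged, simplified illustration (not the authors' code): each high-level action is assumed to come with an initiation set and a fixed image set, which flattens the paper's image operator, and all names are hypothetical.

```python
# Minimal sketch: a plan of high-level actions is feasible if, starting from the
# initial states, every state we might be in lies in the next action's initiation
# set, after which the action's image set describes where we may end up.

def plan_feasible(start_states, plan, initiation, image):
    """start_states: set of states; plan: list of action names;
    initiation[a] and image[a]: sets of states for action a."""
    reachable = set(start_states)
    for action in plan:
        if not reachable <= initiation[action]:  # action must be executable everywhere we might be
            return False
        reachable = image[action]                # states we may occupy after executing the action
    return True

# Toy usage over an abstract state space {0, 1, 2, 3}.
initiation = {"grasp": {0, 1}, "move": {2}}
image = {"grasp": {2}, "move": {3}}
print(plan_feasible({0}, ["grasp", "move"], initiation, image))  # True
```

Treating the image as a fixed set per action is a simplification; the paper defines the image relative to the set of states the action is executed from.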
2

SCARPETTA, SILVIA, ZHAOPING LI, and JOHN HERTZ. "LEARNING IN AN OSCILLATORY CORTICAL MODEL". Fractals 11, supp01 (February 2003): 291–300. http://dx.doi.org/10.1142/s0218348x03001951.

Abstract:
We study a model of generalized-Hebbian learning in asymmetric oscillatory neural networks modeling cortical areas such as hippocampus and olfactory cortex. The learning rule is based on the synaptic plasticity observed experimentally, in particular long-term potentiation and long-term depression of the synaptic efficacies depending on the relative timing of the pre- and postsynaptic activities during learning. The learned memory or representational states can be encoded by both the amplitude and the phase patterns of the oscillating neural populations, enabling more efficient and robust information coding than in conventional models of associative memory or input representation. Depending on the class of nonlinearity of the activation function, the model can function as an associative memory for oscillatory patterns (nonlinearity of class II) or can generalize from or interpolate between the learned states, appropriate for the function of input representation (nonlinearity of class I). In the former case, simulations of the model exhibit a first-order transition between the "disordered state" and the "ordered" memory state.
3

Zhu, Zheng-Mao, Shengyi Jiang, Yu-Ren Liu, Yang Yu, and Kun Zhang. "Invariant Action Effect Model for Reinforcement Learning". Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 8 (June 28, 2022): 9260–68. http://dx.doi.org/10.1609/aaai.v36i8.20913.

Abstract:
Good representations can help RL agents perform concise modeling of their surroundings, and thus support effective decision-making in complex environments. Previous methods learn good representations by imposing extra constraints on dynamics. However, from the causal perspective, the causation between an action and its effect is not fully considered in those methods, which leads them to ignore the underlying relations among the action effects on transitions. Based on the intuition that the same action always causes similar effects in different states, we induce such causation by taking the invariance of action effects across states as the relation. By explicitly utilizing this invariance, we show in this paper that a better representation can be learned, which potentially improves the sample efficiency and the generalization ability of the learned policy. We propose the Invariant Action Effect Model (IAEM) to capture the invariance in action effects, where the effect of an action is represented as the residual of representations from neighboring states. IAEM is composed of two parts: (1) a new contrastive-based loss to capture the underlying invariance of action effects; (2) an individual action-effect module with a self-adapted weighting strategy to handle the corner cases where the invariance does not hold. Extensive experiments on two benchmarks, i.e., Grid-World and Atari, show that the representations learned by IAEM preserve the invariance of action effects. Moreover, with the invariant action effect, IAEM can accelerate the learning process by 1.6x, rapidly generalize to new environments by fine-tuning on a few components, and outperform other dynamics-based representation methods by 1.4x in limited steps.
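As a rough illustration of the residual-effect idea described in this abstract, the sketch below represents the effect of an action as the difference between embeddings of neighboring states and encourages the effects of the same action taken in different states to agree. It is only a hedged sketch of the general idea, not the IAEM implementation: the encoder, the cosine-similarity surrogate loss, and all names are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

encoder = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 32))  # toy state encoder

def action_effect(s, s_next):
    # Effect of an action represented as the residual between neighboring state embeddings.
    return encoder(s_next) - encoder(s)

def invariance_loss(effect_a, effect_b):
    # The same action taken in two different states should produce similar effects;
    # a simple similarity term stands in for the contrastive loss mentioned in the abstract.
    return 1.0 - F.cosine_similarity(effect_a, effect_b, dim=-1).mean()

s1, s1_next = torch.randn(16, 8), torch.randn(16, 8)  # transitions under some action a
s2, s2_next = torch.randn(16, 8), torch.randn(16, 8)  # the same action a taken in other states
loss = invariance_loss(action_effect(s1, s1_next), action_effect(s2, s2_next))
loss.backward()
```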
4

Yue, Yang, Bingyi Kang, Zhongwen Xu, Gao Huang, and Shuicheng Yan. "Value-Consistent Representation Learning for Data-Efficient Reinforcement Learning". Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 9 (June 26, 2023): 11069–77. http://dx.doi.org/10.1609/aaai.v37i9.26311.

Abstract:
Deep reinforcement learning (RL) algorithms suffer severe performance degradation when the interaction data is scarce, which limits their real-world application. Recently, visual representation learning has been shown to be effective and promising for boosting sample efficiency in RL. These methods usually rely on contrastive learning and data augmentation to train a transition model, which is different from how the model is used in RL---performing value-based planning. Accordingly, the representation learned by these visual methods may be good for recognition but not optimal for estimating state value and solving the decision problem. To address this issue, we propose a novel method, called value-consistent representation learning (VCR), to learn representations that are directly related to decision-making. More specifically, VCR trains a model to predict the future state (also referred to as the "imagined state") based on the current one and a sequence of actions. Instead of aligning this imagined state with a real state returned by the environment, VCR applies a Q-value head to both states and obtains two distributions of action values. Then a distance is computed and minimized to force the imagined state to produce an action-value prediction similar to that of the real state. We develop two implementations of the above idea for the discrete and continuous action spaces respectively. We conduct experiments on Atari 100k and DeepMind Control Suite benchmarks to validate their effectiveness for improving sample efficiency. It has been demonstrated that our methods achieve new state-of-the-art performance for search-free RL algorithms.
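The value-consistency idea in this abstract can be sketched compactly: apply a shared Q head to both the imagined and the real next state and minimize a divergence between the two action-value distributions. The code below is a hedged sketch for the discrete-action case under assumed toy dimensions; the encoder, transition model, KL choice, and stop-gradient on the real branch are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

n_actions = 6
encoder = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 32))
transition = nn.Sequential(nn.Linear(32 + n_actions, 64), nn.ReLU(), nn.Linear(64, 32))
q_head = nn.Linear(32, n_actions)

def value_consistency_loss(obs, action_onehot, next_obs):
    z = encoder(obs)
    z_imagined = transition(torch.cat([z, action_onehot], dim=-1))  # "imagined" next state
    z_real = encoder(next_obs).detach()                             # real next state (held fixed here)
    log_q_imagined = F.log_softmax(q_head(z_imagined), dim=-1)      # action-value distribution (imagined)
    q_real = F.softmax(q_head(z_real), dim=-1)                      # action-value distribution (real)
    return F.kl_div(log_q_imagined, q_real, reduction="batchmean")  # force consistent value predictions

obs, next_obs = torch.randn(4, 10), torch.randn(4, 10)
action = F.one_hot(torch.randint(0, n_actions, (4,)), n_actions).float()
loss = value_consistency_loss(obs, action, next_obs)
```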
5

Chornozhuk, S. "The New Geometric “State-Action” Space Representation for Q-Learning Algorithm for Protein Structure Folding Problem". Cybernetics and Computer Technologies, no. 3 (October 27, 2020): 59–73. http://dx.doi.org/10.34229/2707-451x.20.3.6.

Abstract:
Introduction. Spatial protein structure folding is an important and timely problem in computational biology. From the mathematical model of the task it is easy to conclude that finding an optimal protein conformation on a three-dimensional grid is an NP-hard problem. Therefore reinforcement learning techniques such as the Q-learning approach can be used to solve the problem. The article proposes a new geometric “state-action” space representation which differs significantly from all alternative representations used for this problem. The purpose of the article is to analyze existing state- and action-space representations for the Q-learning algorithm applied to the protein structure folding problem, reveal their advantages and disadvantages, and propose the new geometric “state-action” space representation. The goal is then to compare the existing and the proposed approaches, draw conclusions, and describe possible directions for further research. Result. The proposed algorithm is compared with others on the basis of 10 known chains of length 48 first proposed in [16]. For each of the chains the Q-learning algorithm with the proposed representation outperformed the same Q-learning algorithm with the alternative existing representations in terms of both the average and the minimal energy values of the resulting conformations. Moreover, many of the existing representations were designed for 2D protein structure prediction; during the experiments both the existing and the proposed representations were slightly adapted to solve the problem in 3D, which is a more computationally demanding task. Conclusion. The quality of the Q-learning algorithm with the proposed geometric “state-action” space representation has been experimentally confirmed, which shows that further research is promising. Several possible next steps, such as combining the proposed approach with deep learning techniques, have already been suggested. Keywords: Spatial protein structure, combinatorial optimization, relative coding, machine learning, Q-learning, Bellman equation, state space, action space, basis in 3D space.
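For reference, the Q-learning update that the proposed representation plugs into is the standard tabular rule derived from the Bellman equation. The snippet below is a generic version and says nothing about the paper's geometric state-action encoding; the encoding of protein conformations into states and actions would replace the plain tuples used here.

```python
import random
from collections import defaultdict

def q_learning_step(Q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.95):
    """One tabular Q-learning update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])

def epsilon_greedy(Q, state, actions, eps=0.1):
    """Exploration policy commonly paired with Q-learning."""
    if random.random() < eps:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

Q = defaultdict(float)  # states/actions here would be the chosen geometric "state-action" codes
```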
6

Lamanna, Leonardo, Alfonso Emilio Gerevini, Alessandro Saetti, Luciano Serafini, and Paolo Traverso. "On-line Learning of Planning Domains from Sensor Data in PAL: Scaling up to Large State Spaces". Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 13 (May 18, 2021): 11862–69. http://dx.doi.org/10.1609/aaai.v35i13.17409.

Abstract:
We propose an approach to learn an extensional representation of a discrete deterministic planning domain from observations in a continuous space navigated by the agent's actions. This is achieved through the use of a perception function providing the likelihood of a real-valued observation being in a given state of the planning domain after executing an action. The agent learns the extensional representation of the domain (the set of states and the transitions between states caused by actions) and the perception function on-line, while it acts to accomplish its task. In order to provide a practical approach that can scale up to large state spaces, a “draft” intensional (PDDL-based) model of the planning domain is used to guide the exploration of the environment and learn the states and state transitions. The proposed approach uses a novel algorithm to (i) construct the extensional representation of the domain by interleaving symbolic planning in the PDDL intensional representation and search in the state transition graph of the extensional representation; (ii) incrementally refine the intensional representation by taking into account information about the actions that the agent cannot execute. An experimental analysis shows that the novel approach can scale up to large state spaces, thus overcoming the scalability limits of previous work.
7

Sapena, Oscar, Eva Onaindia, and Eliseo Marzal. "Automated feature extraction for planning state representation". Inteligencia Artificial 27, no. 74 (October 10, 2024): 227–42. http://dx.doi.org/10.4114/intartif.vol27iss74pp227-242.

Abstract:
Deep learning methods have recently emerged as a mechanism for generating embeddings of planning states without the need to predefine feature spaces. In this work, we advocate for an automated, cost-effective and interpretable approach to extract representative features of planning states from high-level language. We present a technique that builds on object types and yields a generalization over an entire planning domain, making it possible to encode numerical state and goal information of individual planning tasks. The proposed representation is then evaluated in a task for learning heuristic functions for particular domains. A comparative analysis with one of the best current sequential planners and a recent ML-based approach demonstrates the efficacy of our method in improving planner performance.
8

O’Donnell, Ryan, and John Wright. "Learning and testing quantum states via probabilistic combinatorics and representation theory". Current Developments in Mathematics 2021, no. 1 (2021): 43–94. http://dx.doi.org/10.4310/cdm.2021.v2021.n1.a2.

9

Zhang, Hengyuan, Suyao Zhao, Ruiheng Liu, Wenlong Wang, Yixin Hong, and Runjiu Hu. "Automatic Traffic Anomaly Detection on the Road Network with Spatial-Temporal Graph Neural Network Representation Learning". Wireless Communications and Mobile Computing 2022 (June 20, 2022): 1–12. http://dx.doi.org/10.1155/2022/4222827.

Abstract:
Traffic anomaly detection is an essential part of an intelligent transportation system. Automatic traffic anomaly detection can provide sufficient decision-support information for road network operators, travelers, and other stakeholders. This research proposes a novel automatic traffic anomaly detection method based on spatial-temporal graph neural network representation learning. We divide traffic anomaly detection into two steps: the first is learning the implicit graph feature representation of multivariate time series of traffic flows with a graph attention model to predict the traffic states; the second detects traffic anomalies by computing a graph deviation score that compares the predicted traffic states with the observed ones. Experiments on real network datasets show that, with an end-to-end workflow and a spatial-temporal representation of traffic states, this method can detect traffic anomalies accurately and automatically and achieves better performance than the baselines.
10

Dayan, Peter. "Improving Generalization for Temporal Difference Learning: The Successor Representation". Neural Computation 5, no. 4 (July 1993): 613–24. http://dx.doi.org/10.1162/neco.1993.5.4.613.

Abstract:
Estimation of returns over time, the focus of temporal difference (TD) algorithms, imposes particular constraints on good function approximators or representations. Appropriate generalization between states is determined by how similar their successors are, and representations should follow suit. This paper shows how TD machinery can be used to learn such representations, and illustrates, using a navigation task, the appropriately distributed nature of the result.
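The successor representation mentioned above has a compact TD update of the form M(s, ·) ← M(s, ·) + α(1_s + γ M(s', ·) − M(s, ·)), where 1_s is a one-hot vector at the current state. The sketch below is a minimal tabular rendering of that standard update, not code from the paper.

```python
import numpy as np

def sr_td_update(M, s, s_next, alpha=0.1, gamma=0.9):
    """TD update for the successor representation matrix M (n_states x n_states):
    M[s] estimates the expected discounted future occupancy of every state when starting from s."""
    onehot = np.zeros(M.shape[0])
    onehot[s] = 1.0
    M[s] += alpha * (onehot + gamma * M[s_next] - M[s])
    return M

M = np.zeros((5, 5))
for s, s_next in [(0, 1), (1, 2), (2, 3), (3, 4)] * 200:  # transitions along a simple chain
    sr_td_update(M, s, s_next)
# Rows of M now generalize between states according to how similar their successors are.
```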
11

Gershman, Samuel J., Christopher D. Moore, Michael T. Todd, Kenneth A. Norman, and Per B. Sederberg. "The Successor Representation and Temporal Context". Neural Computation 24, no. 6 (June 2012): 1553–68. http://dx.doi.org/10.1162/neco_a_00282.

Abstract:
The successor representation was introduced into reinforcement learning by Dayan (1993) as a means of facilitating generalization between states with similar successors. Although reinforcement learning in general has been used extensively as a model of psychological and neural processes, the psychological validity of the successor representation has yet to be explored. An interesting possibility is that the successor representation can be used not only for reinforcement learning but for episodic learning as well. Our main contribution is to show that a variant of the temporal context model (TCM; Howard & Kahana, 2002), an influential model of episodic memory, can be understood as directly estimating the successor representation using the temporal difference learning algorithm (Sutton & Barto, 1998). This insight leads to a generalization of TCM and new experimental predictions. In addition to casting a new normative light on TCM, this equivalence suggests a previously unexplored point of contact between different learning systems.
12

M. Mounika, L. Sahithi, K. Prasanna Lakshmi, K. Praveenya, and N. Ashok Kumar. "Quantum driven deep learning for enhanced diabetic retinopathy detection". World Journal of Advanced Research and Reviews 22, no. 1 (April 30, 2024): 055–60. http://dx.doi.org/10.30574/wjarr.2024.22.1.0964.

Abstract:
Traditional convolutional neural networks (CNNs) have shown potential for recognizing diabetic retinopathy (DR). However, developments in quantum computing open the possibility of improved feature representation. We propose a hybrid approach that combines classical CNNs with quantum circuits to capitalize on both classical and quantum information for DR classification. Using the Keras and Qiskit frameworks, our model encodes image features into quantum states, allowing for richer representations. Through experiments on a collection of retinal images, our model displays competitive performance, with excellent reliability and precision in categorizing DR severity levels. This combination of classical and quantum paradigms offers a fresh approach to enhancing DR diagnosis and therapy.
13

Robins, Anthony V. "MULTIPLE REPRESENTATIONS IN CONNECTIONIST SYSTEMS". International Journal of Neural Systems 02, no. 04 (January 1991): 345–62. http://dx.doi.org/10.1142/s0129065791000327.

Abstract:
This paper proposes an extension to the basic framework of distributed representation through the learning and use of different sorts of information—“multiple representations”—in connectionist/neural network systems. In current distributed networks units are typically ascribed only one “representing” or information carrying state (activation). Similarly, connections carry a single piece of information (a weight derived from the structure of the population of patterns). In this paper we explore units and connections with multiple information carrying states. In this extended framework, multiple distributed representations can coexist with a given pattern of activation. Processing may be based on the interaction of these representations and multiple learning processes can occur simultaneously in a network. We illustrate these extensions using (in addition to patterns of activation) “centrality distribution” representations. Centrality distributions are applied to two tasks, the representation of category and type hierarchy information and the highlighting of exceptional mappings to speed up learning. We suggest that the use of multiple distributed representations in a network can increase the flexibility and power of connectionist systems while remaining within the subsymbolic paradigm. This topic is of particular relevance in the context of the recent interest in the limitations of connectionism and the interface between connectionist and symbolic methods.
14

Li, Xinlin, Changhe Fan, and Chengyue Su. "Self-Supervised Learning for Speech-Based Detection of Depressive States". Frontiers in Computing and Intelligent Systems 11, no. 2 (February 27, 2025): 106–9. https://doi.org/10.54097/1cspmj65.

Abstract:
This study aims to enhance the accuracy of depression detection by leveraging representation learning from audio data. Depression speech datasets are sparse and costly to annotate. Therefore, a self-supervised pre-training approach is employed to improve the performance, generalization capability, and training efficiency of downstream tasks. When processing unlabeled data, pre-trained audio representations based on self-supervised learning may be affected by noisy data if a significant amount of noise or errors is present. Consequently, it is necessary to effectively analyze long-distance sequence data to enhance anti-interference capabilities. However, traditional LSTM models have limitations in context extraction and robustness to input outliers. Thus, an improved method named CNN-BiLSTM is proposed in this paper. The network initializes the LSTM's embedding layer with pre-trained word vectors and extracts spatial and temporal features separately to ensure a full and complete expression of the useful input information. Different weights are assigned based on the importance of the features to obtain fused features. Additionally, a random forest is used for classification to mitigate the risk of overfitting and to provide good performance when processing high-dimensional data. Experimental results show that the proposed model exhibits good classification performance on the depression dataset, outperforming traditional methods and state-of-the-art approaches.
15

Wu, Bo, Yan Peng Feng, and Hong Yan Zheng. "A Model-Based Factored Bayesian Reinforcement Learning Approach". Applied Mechanics and Materials 513-517 (February 2014): 1092–95. http://dx.doi.org/10.4028/www.scientific.net/amm.513-517.1092.

Abstract:
Bayesian reinforcement learning has turned out to be an effective solution to the optimal tradeoff between exploration and exploitation. However, in practical applications, the exponential growth of the learning parameters is the main impediment to online planning and learning. To overcome this problem, we bring factored representations, model-based learning, and Bayesian reinforcement learning together in a new approach. Firstly, we exploit a factored representation to describe the states in order to reduce the number of learning parameters, and adopt a Bayesian inference method to learn the unknown structure and parameters simultaneously. Then, we use an online point-based value iteration algorithm to plan and learn. The experimental results show that the proposed approach is an effective way to improve learning efficiency in large-scale state spaces.
16

Alvi, Maira, Tim French, Philip Keymer, and Rachel Cardell-Oliver. "Automated State Estimation for Summarizing the Dynamics of Complex Urban Systems Using Representation Learning". Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 21 (March 24, 2024): 23020–26. http://dx.doi.org/10.1609/aaai.v38i21.30344.

Abstract:
Complex urban systems can be difficult to monitor, diagnose and manage because the complete states of such systems are only partially observable with sensors. State estimation techniques can be used to determine the underlying dynamic behavior of such complex systems with their highly non-linear processes and external time-variant influences. States can be estimated by clustering observed sensor readings. However, clustering performance degrades as the number of sensors and readings (i.e. feature dimension) increases. To address this problem, we propose a framework that learns a feature-centric lower dimensional representation of data for clustering to support analysis of system dynamics. We propose Unsupervised Feature Attention with Compact Representation (UFACR) to rank features contributing to a cluster assignment. These weighted features are then used to learn a reduced-dimension temporal representation of the data with a deep-learning model. The resulting low-dimensional representation can be effectively clustered into states. UFACR is evaluated on real-world and synthetic wastewater treatment plant data sets, and feature ranking outcomes were validated by wastewater treatment domain experts. Our quantitative and qualitative experimental analyses demonstrate the effectiveness of UFACR for uncovering system dynamics in an automated and unsupervised manner to offer guidance to wastewater engineers to enhance industrial productivity and treatment efficiency.
17

Yamashita, Kodai, and Tomoki Hamagami. "Reinforcement Learning for POMDP Environments Using State Representation with Reservoir Computing". Journal of Advanced Computational Intelligence and Intelligent Informatics 26, no. 4 (July 20, 2022): 562–69. http://dx.doi.org/10.20965/jaciii.2022.p0562.

Abstract:
One of the challenges in reinforcement learning concerns partially observable Markov decision processes (POMDPs). In this case, an agent cannot observe the true state of the environment and may perceive different states as the same. Our proposed method uses the agent’s time-series information to deal with this imperfect perception problem. In particular, the proposed method uses reservoir computing to transform the time series of observations into a non-linear state. A typical model of reservoir computing, the echo state network (ESN), transforms raw observations into reservoir states. The proposed method is named dual-ESN reinforcement learning; it uses two ESNs specialized for observation and action information. The experimental results show the effectiveness of the proposed method in environments where imperfect perception problems occur.
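For readers unfamiliar with reservoir computing, the state transformation underlying an echo state network is a fixed random recurrent map whose output can serve as the agent's state in a POMDP. The sketch below shows the standard leaky ESN update only; it is not the paper's dual-ESN architecture, and the sizes and scaling constants are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_res = 4, 200

W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))   # fixed random input weights
W = rng.uniform(-0.5, 0.5, (n_res, n_res))     # fixed random recurrent weights
W *= 0.9 / max(abs(np.linalg.eigvals(W)))      # keep spectral radius below 1 (echo state property)

def esn_step(x, obs, leak=0.3):
    """One reservoir update: x non-linearly summarizes the history of observations."""
    return (1 - leak) * x + leak * np.tanh(W_in @ obs + W @ x)

x = np.zeros(n_res)
for obs in rng.normal(size=(50, n_in)):        # feed a stream of raw observations
    x = esn_step(x, obs)
# x can now be used as the state representation fed to the reinforcement learner.
```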
18

Yan, Yan, Xu-Cheng Yin, Sujian Li, Mingyuan Yang, and Hong-Wei Hao. "Learning Document Semantic Representation with Hybrid Deep Belief Network". Computational Intelligence and Neuroscience 2015 (2015): 1–9. http://dx.doi.org/10.1155/2015/650527.

Abstract:
High-level abstraction, for example semantic representation, is vital for document classification and retrieval. However, how to learn document semantic representation is still an open question in information retrieval and natural language processing. In this paper, we propose a new Hybrid Deep Belief Network (HDBN) which uses a Deep Boltzmann Machine (DBM) on the lower layers together with a Deep Belief Network (DBN) on the upper layers. The advantage of the DBM is that it employs undirected connections when training the weight parameters, which allows the states of nodes on each layer to be sampled more successfully and is also an effective way to remove noise from different document representation types; the DBN further deepens the abstraction of the document, enabling the model to learn a sufficient semantic representation. At the same time, we explore different input strategies for distributed semantic representation. Experimental results show that our model performs better when using word embeddings instead of single words as input.
19

Zeng, Zheng, Rodney M. Goodman, and Padhraic Smyth. "Learning Finite State Machines With Self-Clustering Recurrent Networks". Neural Computation 5, no. 6 (November 1993): 976–90. http://dx.doi.org/10.1162/neco.1993.5.6.976.

Abstract:
Recent work has shown that recurrent neural networks have the ability to learn finite state automata from examples. In particular, networks using second-order units have been successful at this task. In studying the performance and learning behavior of such networks we have found that the second-order network model attempts to form clusters in activation space as its internal representation of states. However, these learned states become unstable as longer and longer test input strings are presented to the network. In essence, the network “forgets” where the individual states are in activation space. In this paper we propose a new method to force such a network to learn stable states by introducing discretization into the network and using a pseudo-gradient learning rule to perform training. The essence of the learning rule is that in doing gradient descent, it makes use of the gradient of a sigmoid function as a heuristic hint in place of that of the hard-limiting function, while still using the discretized value in the feedback update path. The new structure uses isolated points in activation space instead of vague clusters as its internal representation of states. It is shown to have similar capabilities in learning finite state automata as the original network, but without the instability problem. The proposed pseudo-gradient learning rule may also be used as a basis for training other types of networks that have hard-limiting threshold activation functions.
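The pseudo-gradient rule described in this abstract, a hard-limiting activation in the forward and feedback paths with the gradient of a sigmoid used as a heuristic during backpropagation, is close to what is now often implemented as a straight-through-style estimator. The sketch below is one possible modern rendering in PyTorch, offered as an assumption about how the idea could be coded rather than a reproduction of the original training procedure.

```python
import torch

class PseudoGradStep(torch.autograd.Function):
    """Hard threshold in the forward pass; sigmoid derivative as a heuristic gradient in the backward pass."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return (x > 0).float()                 # discretized activation used in the feedback path

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        s = torch.sigmoid(x)
        return grad_output * s * (1 - s)       # sigmoid gradient replaces the hard limiter's zero gradient

def recurrent_step(W, U, h_prev, inp):
    # Hidden states become isolated points in activation space instead of vague clusters.
    return PseudoGradStep.apply(inp @ W + h_prev @ U)
```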
20

Brantley, Kianté, Soroush Mehri, and Geoff J. Gordon. "Successor Feature Sets: Generalizing Successor Representations Across Policies". Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 13 (May 18, 2021): 11774–81. http://dx.doi.org/10.1609/aaai.v35i13.17399.

Abstract:
Successor-style representations have many advantages for reinforcement learning: for example, they can help an agent generalize from past experience to new goals, and they have been proposed as explanations of behavioral and neural data from human and animal learners. They also form a natural bridge between model-based and model-free RL methods: like the former they make predictions about future experiences, and like the latter they allow efficient prediction of total discounted rewards. However, successor-style representations are not optimized to generalize across policies: typically, we maintain a limited-length list of policies, and share information among them by representation learning or GPI. Successor-style representations also typically make no provision for gathering information or reasoning about latent variables. To address these limitations, we bring together ideas from predictive state representations, belief space value iteration, successor features, and convex analysis: we develop a new, general successor-style representation, together with a Bellman equation that connects multiple sources of information within this representation, including different latent states, policies, and reward functions. The new representation is highly expressive: for example, it lets us efficiently read off an optimal policy for a new reward function, or a policy that imitates a new demonstration. For this paper, we focus on exact computation of the new representation in small, known environments, since even this restricted setting offers plenty of interesting questions. Our implementation does not scale to large, unknown environments --- nor would we expect it to, since it generalizes POMDP value iteration, which is difficult to scale. However, we believe that future work will allow us to extend our ideas to approximate reasoning in large, unknown environments. We conduct experiments to explore which of the potential barriers to scaling are most pressing.
21

Supianto, Ahmad Afif, Satrio Agung Wicaksono, Fitra A. Bachtiar, Admaja Dwi Herlambang, Yusuke Hayashi, and Tsukasa Hirashima. "Web-based Application for Visual Representation of Learners' Problem-Posing Learning Pattern". Journal of Information Technology and Computer Science 4, no. 1 (June 27, 2019): 103. http://dx.doi.org/10.25126/jitecs.20194172.

Abstract:
The analysis of learners' learning processes in interactive learning media tends to involve a huge amount of data gathering. The analysis aims to explore patterns and relationships in such data to understand the learning experience and to identify learners' trap states in the learning process. Such data analysis tends to be burdensome for stakeholders such as class instructors (i.e., teachers) and educational researchers. This research proposes a web-based software application that uses visual artifacts as an educational data mining approach to analyze learners' learning processes in interactive learning media. The developed software visualizes the learner's activity sequence and, furthermore, identifies learning paths toward bottleneck states caused by a lack of understanding of a given problem. This information is then passed to instructors as key input for creating proper feedback to the learners. As a case study, this research uses the data log of Monsakun, a digital learning environment focused on arithmetic exercises with story-based questions that follows a problem-posing approach with the integration of mathematical sentences.
22

Janowicz, Maciej, and Andrzej Zembrzuski. "Guessing quantum states from images of their zeros in the complex plane". Machine Graphics and Vision 32, no. 3/4 (December 18, 2023): 147–59. http://dx.doi.org/10.22630/mgv.2022.31.3.8.

Abstract:
The problem of determining the wave function of a physical system based on the graphical representation of its zeros is considered. It can be dealt with by invoking the Bargmann representation in which the wave functions are represented by analytic functions with an appropriate definition of the scalar product. The Weierstrass factorization theorem can then be applied. Examples of states that can be guessed from the pictorial representation of zeros by both the human eye and, possibly, by machine learning systems are given. The quality of recognition by the latter has been tested using Convolutional Neural Networks.
23

ARENA, PAOLO, LUIGI FORTUNA, MATTIA FRASCA, DAVIDE LOMBARDO, LUCA PATANÈ, and PAOLO CRUCITTI. "TURING PATTERNS IN RD-CNNs FOR THE EMERGENCE OF PERCEPTUAL STATES IN ROVING ROBOTS". International Journal of Bifurcation and Chaos 17, no. 01 (January 2007): 107–27. http://dx.doi.org/10.1142/s0218127407017203.

Abstract:
Behavior-based robotics considers perception as a holistic process, strongly connected to the behavioral needs of the robot. We present a bio-inspired framework for sensing-perception-action, applied to a roving robot in a random foraging task. Perception is here considered as a complex and emergent phenomenon in which a huge amount of information coming from sensors is used to form an abstract and concise representation of the environment, useful for taking a suitable action or sequence of actions. In this work a model for perceptual representation is formalized by means of RD-CNNs showing Turing patterns. These patterns are used as attractive states for particular sets of environmental conditions in order to associate a proper action via reinforcement learning. Learning is also introduced at the afferent stage to shape the environmental information according to the particular emerging pattern. The basins of attraction of the Turing patterns are thus dynamically tuned by unsupervised learning in order to form an internal, abstract and plastic representation of the environment, as recorded by the sensors.
24

Hadra, Mohammad, and Iman Abdelrahman. "Automatic EEG-based Alertness Classification using Sparse Representation and Dictionary Learning". Journal of Biomedical Engineering and Medical Imaging 7, no. 5 (November 8, 2020): 19–28. http://dx.doi.org/10.14738/jbemi.75.9264.

Abstract:
Automation of human alertness identification has been widely investigated in recent decades. Many applications can benefit from automatic alertness state identification, such as driver fatigue detection, vigilance monitoring of monotonous-task workers and sleep studies in the medical field. Many researchers have tried to exploit different behavioural aspects for vigilance detection, such as eye movement, head position and facial expression. On the other hand, some biomedical signals like ECG, EEG and heart rhythm are also exploited; however, there is a consensus on the superiority of the EEG signal for alertness classification due to its close relation to different human vigilance states. In this paper, we propose an automatic method for vigilance detection using a single EEG channel along with sparse representation and dictionary learning. We used the Discrete Wavelet Packet Transform to extract features related to different human vigilance states, and we use other well-known classifiers to compare the performance of our proposed method. Classification with sparse representation and dictionary learning produced better accuracy than the other methods.
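A common way to turn sparse representation and a learned dictionary into a classifier (the generic scheme, not necessarily this paper's exact pipeline) is to sparse-code a test feature vector over class-labeled atoms and pick the class whose atoms give the smallest reconstruction residual. The sketch below assumes wavelet-packet features are already extracted; the dimensions, the Lasso coder, and all names are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import Lasso

def src_classify(x, D, atom_labels, alpha=0.05):
    """Sparse-representation classification: code x over dictionary D (features x atoms),
    then assign the class whose atoms reconstruct x with the smallest residual."""
    coder = Lasso(alpha=alpha, fit_intercept=False, max_iter=5000)
    coder.fit(D, x)                          # treat the dictionary columns as regressors
    coef = coder.coef_
    residuals = {}
    for c in np.unique(atom_labels):
        mask = atom_labels == c
        residuals[c] = np.linalg.norm(x - D[:, mask] @ coef[mask])
    return min(residuals, key=residuals.get)

# Toy usage: 20-dimensional EEG features, two alertness classes with 15 atoms each.
rng = np.random.default_rng(1)
D = rng.normal(size=(20, 30))
labels = np.array([0] * 15 + [1] * 15)
print(src_classify(rng.normal(size=20), D, labels))
```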
25

De Giacomo, Giuseppe, Marco Favorito, Luca Iocchi, and Fabio Patrizi. "Imitation Learning over Heterogeneous Agents with Restraining Bolts". Proceedings of the International Conference on Automated Planning and Scheduling 30 (June 1, 2020): 517–21. http://dx.doi.org/10.1609/icaps.v30i1.6747.

Abstract:
A common problem in Reinforcement Learning (RL) is that the reward function is hard to express. This can be overcome by resorting to Inverse Reinforcement Learning (IRL), which consists in first obtaining a reward function from a set of execution traces generated by an expert agent, and then making the learning agent learn the expert's behavior; this is known as Imitation Learning (IL). Typical IRL solutions rely on a numerical representation of the reward function, which raises problems related to the adopted optimization procedures. We describe an IL method where the execution traces generated by the expert agent, possibly via planning, are used to produce a logical (as opposed to numerical) specification of the reward function, to be incorporated into a device known as Restraining Bolt (RB). The RB can be attached to the learning agent to drive the learning process and ultimately make it imitate the expert. We show that IL can be applied to heterogeneous agents, with the expert, the learner and the RB using different representations of the environment's actions and states, without specifying mappings among their representations.
26

Benjamin, Ari S., and Konrad P. Kording. "A role for cortical interneurons as adversarial discriminators". PLOS Computational Biology 19, no. 9 (September 28, 2023): e1011484. http://dx.doi.org/10.1371/journal.pcbi.1011484.

Abstract:
The brain learns representations of sensory information from experience, but the algorithms by which it does so remain unknown. One popular theory formalizes representations as inferred factors in a generative model of sensory stimuli, meaning that learning must improve this generative model and inference procedure. This framework underlies many classic computational theories of sensory learning, such as Boltzmann machines, the Wake/Sleep algorithm, and a more recent proposal that the brain learns with an adversarial algorithm that compares waking and dreaming activity. However, in order for such theories to provide insights into the cellular mechanisms of sensory learning, they must be first linked to the cell types in the brain that mediate them. In this study, we examine whether a subtype of cortical interneurons might mediate sensory learning by serving as discriminators, a crucial component in an adversarial algorithm for representation learning. We describe how such interneurons would be characterized by a plasticity rule that switches from Hebbian plasticity during waking states to anti-Hebbian plasticity in dreaming states. Evaluating the computational advantages and disadvantages of this algorithm, we find that it excels at learning representations in networks with recurrent connections but scales poorly with network size. This limitation can be partially addressed if the network also oscillates between evoked activity and generative samples on faster timescales. Consequently, we propose that an adversarial algorithm with interneurons as discriminators is a plausible and testable strategy for sensory learning in biological systems.
27

Gao, Kaizhi, Tianyu Wang, Zhongjing Ma, and Suli Zou. "Winnie: Task-Oriented Dialog System with Structure-Aware Contrastive Learning and Enhanced Policy Planning". Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 16 (March 24, 2024): 18021–29. http://dx.doi.org/10.1609/aaai.v38i16.29758.

Abstract:
Pre-trained encoder-decoder models are widely applied in Task-Oriented Dialog (TOD) systems on the session level, mainly focusing on modeling the dialog semantic information. Dialogs imply structural information indicating the interaction among user utterances, belief states, database search results, system acts and responses, which is also crucial for TOD systems. In addition, for the system acts, additional pre-training and datasets are considered to improve their accuracies, undoubtedly introducing a burden. Therefore, a novel end-to-end TOD system named Winnie is proposed in this paper to improve the TOD performance. First, to make full use of the intrinsic structural information, supervised contrastive learning is adopted to narrow the gap in the representation space between text representations of the same category and enlarge the overall continuous representation margin between text representations of different categories in dialog context. Then, a system act classification task is introduced for policy optimization during fine-tuning. Empirical results show that Winnie substantially improves the performance of the TOD system. By introducing the supervised contrastive and system act classification losses, Winnie achieves state-of-the-art results on benchmark datasets, including MultiWOZ2.2, In-Car, and Camrest676. Their end-to-end combined scores are improved by 3.2, 1.9, and 1.1 points, respectively.
28

Montero Quispe, Kevin G., Daniel M. S. Utyiama, Eulanda M. dos Santos, Horácio A. B. F. Oliveira, and Eduardo J. P. Souto. "Applying Self-Supervised Representation Learning for Emotion Recognition Using Physiological Signals". Sensors 22, no. 23 (November 23, 2022): 9102. http://dx.doi.org/10.3390/s22239102.

Abstract:
The use of machine learning (ML) techniques in affective computing applications focuses on improving the user experience in emotion recognition. The collection of input data (e.g., physiological signals), together with expert annotations are part of the established standard supervised learning methodology used to train human emotion recognition models. However, these models generally require large amounts of labeled data, which is expensive and impractical in the healthcare context, in which data annotation requires even more expert knowledge. To address this problem, this paper explores the use of the self-supervised learning (SSL) paradigm in the development of emotion recognition methods. This approach makes it possible to learn representations directly from unlabeled signals and subsequently use them to classify affective states. This paper presents the key concepts of emotions and how SSL methods can be applied to recognize affective states. We experimentally analyze and compare self-supervised and fully supervised training of a convolutional neural network designed to recognize emotions. The experimental results using three emotion datasets demonstrate that self-supervised representations can learn widely useful features that improve data efficiency, are widely transferable, are competitive when compared to their fully supervised counterparts, and do not require the data to be labeled for learning.
29

Niu, Yijie, Wu Deng, Xuesong Zhang, Yuchun Wang, Guoqing Wang, Yanjuan Wang, and Pengpeng Zhi. "A Sparse Learning Method with Regularization Parameter as a Self-Adaptation Strategy for Rolling Bearing Fault Diagnosis". Electronics 12, no. 20 (October 16, 2023): 4282. http://dx.doi.org/10.3390/electronics12204282.

Abstract:
Sparsity-based fault diagnosis methods have achieved great success. However, fault classification is still challenging because of neglected potential knowledge. This paper proposes a combined sparse representation deep learning (SR-DEEP) method for rolling bearing fault diagnosis. Firstly, the SR-DEEP method utilizes prior domain knowledge to establish a sparsity-based fault model. Then, based on this model, the corresponding regularization parameter regression networks are trained for different running states, whose core is to explore the latent relationship between the regularization parameters and running states. Subsequently, the performance of the fault classification is improved by embedding the trained regularization parameter regression networks into the sparse representation classification method. This strategy improves the adaptability of the sparse regularization parameter, further improving the performance of the fault classification method. Finally, the applicability of the SR-DEEP method for rolling bearing fault diagnosis is validated with the CWRU platform and QPZZ-II platform, demonstrating that SR-DEEP yields superior accuracies of 100% and 99.20% for diagnosing four and five running states, respectively. Comparative studies show that the SR-DEEP method outperforms four sparse representation methods and seven classical deep learning classification methods in terms of the classification performance.
30

Cai, Yuanying, Chuheng Zhang, Wei Shen, Xuyun Zhang, Wenjie Ruan, and Longbo Huang. "RePreM: Representation Pre-training with Masked Model for Reinforcement Learning". Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 6 (June 26, 2023): 6879–87. http://dx.doi.org/10.1609/aaai.v37i6.25842.

Abstract:
Inspired by the recent success of sequence modeling in RL and the use of masked language models for pre-training, we propose a masked model for pre-training in RL, RePreM (Representation Pre-training with Masked Model), which trains the encoder combined with transformer blocks to predict the masked states or actions in a trajectory. RePreM is simple but effective compared to existing representation pre-training methods in RL. It avoids algorithmic sophistication (such as data augmentation or estimating multiple models) with sequence modeling and generates a representation that captures long-term dynamics well. Empirically, we demonstrate the effectiveness of RePreM in various tasks, including dynamics prediction, transfer learning, and sample-efficient RL with both value-based and actor-critic methods. Moreover, we show that RePreM scales well with dataset size, dataset quality, and the scale of the encoder, which indicates its potential towards big RL models.
31

BOXER, PAUL A. "LEARNING NAIVE PHYSICS BY VISUAL OBSERVATION: USING QUALITATIVE SPATIAL REPRESENTATIONS AND PROBABILISTIC REASONING". International Journal of Computational Intelligence and Applications 01, no. 03 (September 2001): 273–85. http://dx.doi.org/10.1142/s146902680100024x.

Abstract:
Autonomous robots are unsuccessful at operating in complex, unconstrained environments. They lack the ability to learn about the physical behavior of different objects through the use of vision. We combine Bayesian networks and qualitative spatial representation to learn general physical behavior by visual observation. We input training scenarios that allow the system to observe and learn normal physical behavior. The position and velocity of the visible objects are represented as qualitative states. Transitions between these states over time are entered as evidence into a Bayesian network. The network provides probabilities of future transitions to produce predictions of future physical behavior. We use test scenarios to determine how well the approach discriminates between normal and abnormal physical behavior and actively predicts future behavior. We examine the ability of the system to learn three naive physical concepts, "no action at a distance", "solidity" and "movement on continuous paths". We conclude that the combination of qualitative spatial representations and Bayesian network techniques is capable of learning these three rules of naive physics.
32

Lanchantin, Jack, Sainbayar Sukhbaatar, Gabriel Synnaeve, Yuxuan Sun, Kavya Srinet, and Arthur Szlam. "A Data Source for Reasoning Embodied Agents". Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 7 (June 26, 2023): 8438–46. http://dx.doi.org/10.1609/aaai.v37i7.26017.

Abstract:
Recent progress in using machine learning models for reasoning tasks has been driven by novel model architectures, large-scale pre-training protocols, and dedicated reasoning datasets for fine-tuning. In this work, to further pursue these advances, we introduce a new data generator for machine reasoning that integrates with an embodied agent. The generated data consists of templated text queries and answers, matched with world-states encoded into a database. The world-states are a result of both world dynamics and the actions of the agent. We show the results of several baseline models on instantiations of train sets. These include pre-trained language models fine-tuned on a text-formatted representation of the database, and graph-structured Transformers operating on a knowledge-graph representation of the database. We find that these models can answer some questions about the world-state, but struggle with others. These results hint at new research directions in designing neural reasoning models and database representations. Code to generate the data and train the models will be released at github.com/facebookresearch/neuralmemory
33

Zhou, Hangbo, Gang Zhang, and Yong-Wei Zhang. "Neural network representation and optimization of thermoelectric states of multiple interacting quantum dots". Physical Chemistry Chemical Physics 22, no. 28 (2020): 16165–73. http://dx.doi.org/10.1039/d0cp02894k.

Abstract:
We perform quantum master equation calculations and machine learning to investigate the thermoelectric properties of multiple interacting quantum dots, including electrical conductance, Seebeck coefficient, thermal conductance and ZT.
34

Yu, Jia, Huiling Peng, Guoqiang Wang, and Nianfeng Shi. "A topical VAEGAN-IHMM approach for automatic story segmentation". Mathematical Biosciences and Engineering 21, no. 7 (2024): 6608–30. http://dx.doi.org/10.3934/mbe.2024289.

Abstract:
Feature representations with rich topic information can greatly improve the performance of story segmentation tasks. VAEGAN offers distinct advantages in feature learning by combining the variational autoencoder (VAE) and generative adversarial network (GAN), which not only captures intricate data representations through VAE's probabilistic encoding and decoding mechanism but also enhances feature diversity and quality via GAN's adversarial training. To better learn the topical domain representation, we used a topical classifier to supervise the training process of VAEGAN. Based on the learned feature, a segmentor splits the document into shorter ones with different topics. The hidden Markov model (HMM) is a popular approach for story segmentation, in which stories are viewed as instances of topics (hidden states). The number of states has to be set manually but it is often unknown in real scenarios. To solve this problem, we proposed an infinite HMM (IHMM) approach which utilized an HDP prior on transition matrices over countably infinite state spaces to automatically infer the number of states from the data. Given a running text, a blocked Gibbs sampler labeled the states with topic classes. The position where the topic changed was a story boundary. Experimental results on the TDT2 corpus demonstrated that the proposed topical VAEGAN-IHMM approach was significantly better than the traditional HMM method in story segmentation tasks and achieved state-of-the-art performance.
35

Anggraini, Nanda Ayu, Eka Fitria Ningsih, Choirudin Choirudin, Rani Darmayanti, and Diyan Triyanto. "Application of the AIR learning model using song media to improve students’ mathematical representational ability". AMCA Journal of Science and Technology 2, no. 1 (November 11, 2022): 28–33. http://dx.doi.org/10.51773/ajst.v2i1.264.

Abstract:
One of the overarching goals of teaching mathematics is the capacity for mathematical representation. Students must be able to communicate through visuals, graphs, diagrams, or other types of representation. The learning model must enable students to play an active role in class, learn more through experiments, ask questions, and receive answers to their questions. This study aimed to determine the increase in students' mathematical representation abilities after following the Auditory Intellectual Repetition (AIR) learning model assisted by song media. The research uses a quantitative method with a one-group pretest-posttest design. It was conducted at Ma'arif 5 Metro Middle School in the even semester of the 2022/2023 school year. The sample consisted of 23 class VII students. A test instrument was used for data collection, and the data were analyzed with a paired-sample t-test. The study concludes that the Auditory Intellectual Repetition (AIR) learning model using song media can improve students' mathematical representation skills.
36

Hennig, Jay A., Sandra A. Romero Pinto, Takahiro Yamaguchi, Scott W. Linderman, Naoshige Uchida, and Samuel J. Gershman. "Emergence of belief-like representations through reinforcement learning". PLOS Computational Biology 19, no. 9 (September 11, 2023): e1011067. http://dx.doi.org/10.1371/journal.pcbi.1011067.

Abstract:
To behave adaptively, animals must learn to predict future reward, or value. To do this, animals are thought to learn reward predictions using reinforcement learning. However, in contrast to classical models, animals must learn to estimate value using only incomplete state information. Previous work suggests that animals estimate value in partially observable tasks by first forming “beliefs”—optimal Bayesian estimates of the hidden states in the task. Although this is one way to solve the problem of partial observability, it is not the only way, nor is it the most computationally scalable solution in complex, real-world environments. Here we show that a recurrent neural network (RNN) can learn to estimate value directly from observations, generating reward prediction errors that resemble those observed experimentally, without any explicit objective of estimating beliefs. We integrate statistical, functional, and dynamical systems perspectives on beliefs to show that the RNN’s learned representation encodes belief information, but only when the RNN’s capacity is sufficiently large. These results illustrate how animals can estimate value in tasks without explicitly estimating beliefs, yielding a representation useful for systems with limited capacity.
37

Francois-Lavet, Vincent, Guillaume Rabusseau, Joelle Pineau, Damien Ernst and Raphael Fonteneau. "On Overfitting and Asymptotic Bias in Batch Reinforcement Learning with Partial Observability". Journal of Artificial Intelligence Research 65 (May 5, 2019): 1–30. http://dx.doi.org/10.1613/jair.1.11478.

Abstract
This paper provides an analysis of the tradeoff between asymptotic bias (suboptimality with unlimited data) and overfitting (additional suboptimality due to limited data) in the context of reinforcement learning with partial observability. Our theoretical analysis formally characterizes that while potentially increasing the asymptotic bias, a smaller state representation decreases the risk of overfitting. This analysis relies on expressing the quality of a state representation by bounding $L_1$ error terms of the associated belief states. Theoretical results are empirically illustrated when the state representation is a truncated history of observations, both on synthetic POMDPs and on a large-scale POMDP in the context of smartgrids, with real-world data. Finally, similarly to known results in the fully observable setting, we also briefly discuss and empirically illustrate how using function approximators and adapting the discount factor may enhance the tradeoff between asymptotic bias and overfitting in the partially observable context.
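The state representation studied empirically above is a truncated history of observations. A minimal sketch of building such a representation, assuming NumPy and with illustrative names, is shown below; the window length k is the knob that trades asymptotic bias (small k) against overfitting risk (large k).

import numpy as np

def truncated_history(observations, k):
    """Map a sequence of observations to per-step features made of the
    last k observations (zero-padded at the start of the episode)."""
    observations = np.asarray(observations, dtype=float)
    T, d = observations.shape
    feats = np.zeros((T, k * d))
    for t in range(T):
        window = observations[max(0, t - k + 1): t + 1]   # at most k rows
        feats[t, -window.size:] = window.ravel()          # right-aligned, newest last
    return feats

obs = np.random.randn(6, 2)                 # toy episode: 6 steps, 2-dim observations
print(truncated_history(obs, k=3).shape)    # (6, 6)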
38

Liao, Weijian, Zongzhang Zhang and Yang Yu. "Policy-Independent Behavioral Metric-Based Representation for Deep Reinforcement Learning". Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 7 (June 26, 2023): 8746–54. http://dx.doi.org/10.1609/aaai.v37i7.26052.

Abstract
Behavioral metrics can calculate the distance between states or state-action pairs from differences in rewards and transitions. By virtue of their capability to filter out task-irrelevant information in theory, using them to shape a state embedding space has become a new trend in representation learning for deep reinforcement learning (RL), especially when there are explicit distracting factors in observation backgrounds. However, due to the tight coupling between the metric and the RL policy, such metric-based methods may result in less informative embedding spaces, which can weaken their aid to the baseline RL algorithm and even require more samples to learn. We resolve this by proposing a new behavioral metric. It decouples the learning of the RL policy and the metric owing to its independence from the RL policy. We theoretically justify its scalability to continuous state and action spaces and design a practical way to incorporate it into an RL procedure as a representation learning target. We evaluate our approach on DeepMind control tasks with default and distracting backgrounds. Under statistically reliable evaluation protocols, our experiments demonstrate that our approach is superior to previous metric-based methods in terms of sample efficiency and asymptotic performance in both backgrounds.
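For intuition about what a behavioral metric computes, here is a generic bisimulation-style fixed-point iteration on a small tabular MDP. It is not the metric proposed in the paper: the expected next-state distance under an independent coupling is used as a simple stand-in for the optimal-transport term, and all quantities are illustrative.

import numpy as np

def behavioral_distance(R, P, gamma=0.9, iters=200):
    """Generic bisimulation-style distance on a tabular MDP.
    R: (S, A) rewards, P: (S, A, S) transition probabilities.
    The transition term is the expected distance between independently
    sampled next states, a simple surrogate for the Wasserstein term."""
    S, A = R.shape
    d = np.zeros((S, S))
    r_diff = np.abs(R[:, None, :] - R[None, :, :])            # (S, S, A)
    for _ in range(iters):
        exp_d = np.einsum('xap,yaq,pq->xya', P, P, d)         # (S, S, A)
        d = (r_diff + gamma * exp_d).max(-1)                  # max over actions
    return d

# Tiny 3-state, 2-action MDP with arbitrary (valid) dynamics.
rng = np.random.default_rng(0)
R = rng.random((3, 2))
P = rng.random((3, 2, 3)); P /= P.sum(-1, keepdims=True)
print(behavioral_distance(R, P).round(3))

In metric-based representation learning, a distance of this kind is used as a regression target for the gap between state embeddings.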
39

Corli, Sebastiano, Lorenzo Moro, Davide E. Galli and Enrico Prati. "Casting Rubik’s Group into a Unitary Representation for Reinforcement Learning". Journal of Physics: Conference Series 2533, no. 1 (June 1, 2023): 012006. http://dx.doi.org/10.1088/1742-6596/2533/1/012006.

Abstract
Rubik’s Cube is one of the most famous combinatorial puzzles, involving nearly 4.3 × 10^19 possible configurations. However, only a single configuration matches the solved one. Its mathematical description is expressed by the Rubik’s group, whose elements define how its layers rotate. We develop a unitary representation of the Rubik’s group and a quantum formalism to describe the Cube based on its geometrical constraints. Using single particle quantum states, we describe the cubies as bosons for corners and fermions for edges. By introducing a set of four Ising-like Hamiltonians, we managed to set the solved configuration of the Cube as the global ground state for all the Hamiltonians. To reach the ground state of all the Hamiltonian operators, we made use of a Deep Reinforcement Learning algorithm based on a Hamiltonian reward. The Rubik’s Cube is successfully solved through four phases, each phase driven by a corresponding Hamiltonian reward based on its energy spectrum. We call our algorithm QUBE, as it employs quantum mechanics to tackle the combinatorial problem of solving the Rubik’s Cube. Embedding combinatorial problems into the quantum mechanics formalism suggests new possible algorithms and future implementations on quantum hardware.
40

Hirshorn, Elizabeth A., Yuanning Li, Michael J. Ward, R. Mark Richardson, Julie A. Fiez and Avniel Singh Ghuman. "Decoding and disrupting left midfusiform gyrus activity during word reading". Proceedings of the National Academy of Sciences 113, no. 29 (June 20, 2016): 8162–67. http://dx.doi.org/10.1073/pnas.1604126113.

Abstract
The nature of the visual representation for words has been fiercely debated for over 150 y. We used direct brain stimulation, pre- and postsurgical behavioral measures, and intracranial electroencephalography to provide support for, and elaborate upon, the visual word form hypothesis. This hypothesis states that activity in the left midfusiform gyrus (lmFG) reflects visually organized information about words and word parts. In patients with electrodes placed directly in their lmFG, we found that disrupting lmFG activity through stimulation, and later surgical resection in one of the patients, led to impaired perception of whole words and letters. Furthermore, using machine-learning methods to analyze the electrophysiological data from these electrodes, we found that information contained in early lmFG activity was consistent with an orthographic similarity space. Finally, the lmFG contributed to at least two distinguishable stages of word processing, an early stage that reflects gist-level visual representation sensitive to orthographic statistics, and a later stage that reflects more precise representation sufficient for the individuation of orthographic word forms. These results provide strong support for the visual word form hypothesis and demonstrate that across time the lmFG is involved in multiple stages of orthographic representation.
41

Cresswell, Stephen and Peter Gregory. "Generalised Domain Model Acquisition from Action Traces". Proceedings of the International Conference on Automated Planning and Scheduling 21 (March 22, 2011): 42–49. http://dx.doi.org/10.1609/icaps.v21i1.13476.

Abstract
One approach to the problem of formulating domain models for planning is to learn the models from example action sequences. The LOCM system demonstrated the feasibility of learning domain models from example action sequences only, with no observation of states before, during or after the plans. LOCM uses an object-centred representation, in which each object is represented by a single parameterised state machine. This makes it powerful for learning domains which fit within that representation, but there are some well-known domains which do not. This paper introduces LOCM2, a novel algorithm in which the domain representation of LOCM is generalised to allow multiple parameterised state machines to represent a single object. This extends the coverage of domains for which an adequate domain model can be learned. The LOCM2 algorithm is described and evaluated by testing domain learning from example plans from published results of past International Planning Competitions.
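The object-centred view described above can be illustrated with a small sketch that builds, for each object appearing in an action trace, a transition graph whose events are (action name, argument position) pairs. This is only the data-gathering step in the spirit of LOCM, not the LOCM or LOCM2 algorithm itself, which additionally merges events into states and induces parameters; the trace is a made-up Blocksworld-style example.

from collections import defaultdict

def object_transition_graphs(trace):
    """trace: list of (action_name, [obj1, obj2, ...]) tuples.
    For each object, record consecutive (action, argument-position) events;
    each such pair is treated as a transition of that object's state machine."""
    last_event = {}                        # object -> previous (action, pos)
    edges = defaultdict(set)               # object -> {(prev_event, next_event)}
    for action, args in trace:
        for pos, obj in enumerate(args):
            event = (action, pos)
            if obj in last_event:
                edges[obj].add((last_event[obj], event))
            last_event[obj] = event
    return edges

trace = [
    ("unstack", ["a", "b"]),
    ("putdown", ["a"]),
    ("pickup",  ["b"]),
    ("stack",   ["b", "a"]),
]
for obj, es in object_transition_graphs(trace).items():
    print(obj, sorted(es))

LOCM collapses such per-object event sequences into a single parameterised state machine per object; LOCM2, as the abstract notes, generalises this to several machines per object.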
42

Charalambous, Panayiotis, Julien Pettre, Vassilis Vassiliades, Yiorgos Chrysanthou and Nuria Pelechano. "GREIL-Crowds: Crowd Simulation with Deep Reinforcement Learning and Examples". ACM Transactions on Graphics 42, no. 4 (July 26, 2023): 1–15. http://dx.doi.org/10.1145/3592459.

Abstract
Simulating crowds with realistic behaviors is a difficult but very important task for a variety of applications. Quantifying how a person balances different conflicting criteria such as goal seeking, collision avoidance and moving within a group is not intuitive, especially if we consider that behaviors differ largely between people. Inspired by recent advances in Deep Reinforcement Learning, we propose Guided REinforcement Learning (GREIL) Crowds, a method that learns a model for pedestrian behaviors which is guided by reference crowd data. The model successfully captures behaviors such as goal seeking, being part of consistent groups without the need to define explicit relationships and wandering around seemingly without a specific purpose. Two fundamental concepts are important in achieving these results: (a) the per agent state representation and (b) the reward function. The agent state is a temporal representation of the situation around each agent. The reward function is based on the idea that people try to move into situations/states in which they feel comfortable. Therefore, in order for agents to stay in a comfortable state space, we first obtain a distribution of states extracted from real crowd data; then we evaluate states based on how much of an outlier they are compared to such a distribution. We demonstrate that our system can capture and simulate many complex and subtle crowd interactions in varied scenarios. Additionally, the proposed method generalizes to unseen situations, generates consistent behaviors and does not suffer from the limitations of other data-driven and reinforcement learning approaches.
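One simple way to realise the "reward states that are not outliers with respect to reference data" idea is to score an agent state by its log-density under a model fitted to reference crowd states. The sketch below uses a kernel density estimate from scikit-learn with random placeholder data; it illustrates the general recipe, not the paper's exact reward.

import numpy as np
from sklearn.neighbors import KernelDensity

# Placeholder "reference" agent states; in the paper these are temporal,
# per-agent state representations extracted from real crowd data.
rng = np.random.default_rng(1)
reference_states = rng.normal(size=(5000, 4))

kde = KernelDensity(bandwidth=0.5).fit(reference_states)

def comfort_reward(state):
    """Higher log-density under the reference distribution -> higher reward;
    outlier states (rarely seen in real crowds) are penalised."""
    return float(kde.score_samples(np.atleast_2d(state))[0])

print(comfort_reward(np.zeros(4)))        # typical state: relatively high score
print(comfort_reward(np.full(4, 6.0)))    # far from the data: much lower score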
43

BARRETO, GUILHERME DE A. and ALUIZIO F. R. ARAÚJO. "Unsupervised Learning and Recall of Temporal Sequences: An Application to Robotics". International Journal of Neural Systems 09, no. 03 (June 1999): 235–42. http://dx.doi.org/10.1142/s012906579900023x.

Abstract
This paper describes an unsupervised neural network model for learning and recall of temporal patterns. The model comprises two groups of synaptic weights, named competitive feedforward and Hebbian feedback, which are responsible for encoding the static and temporal features of the sequence respectively. Three additional mechanisms allow the network to deal with complex sequences: context units, a neuron commitment equation, and redundancy in the representation of sequence states. The proposed network encodes a set of robot trajectories which may contain states in common, and retrieves them accurately in the correct order. Further tests evaluate the fault-tolerance and noise sensitivity of the proposed model.
44

Zou, Eric, Erik Long and Erhai Zhao. "Learning a compass spin model with neural network quantum states". Journal of Physics: Condensed Matter 34, no. 12 (January 7, 2022): 125802. http://dx.doi.org/10.1088/1361-648x/ac43ff.

Abstract
Neural network quantum states provide a novel representation of the many-body states of interacting quantum systems and open up a promising route to solve frustrated quantum spin models that evade other numerical approaches. Yet its capacity to describe complex magnetic orders with large unit cells has not been demonstrated, and its performance in a rugged energy landscape has been questioned. Here we apply restricted Boltzmann machines (RBMs) and stochastic gradient descent to seek the ground states of a compass spin model on the honeycomb lattice, which unifies the Kitaev model, Ising model and the quantum 120° model with a single tuning parameter. We report calculation results on the variational energy, order parameters and correlation functions. The phase diagram obtained is in good agreement with the predictions of tensor network ansatz, demonstrating the capacity of RBMs in learning the ground states of frustrated quantum spin Hamiltonians. The limitations of the calculation are discussed. A few strategies are outlined to address some of the challenges in machine learning frustrated quantum magnets.
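The basic object in this line of work is an RBM wavefunction over spin configurations, sampled according to |ψ|². Below is a minimal sketch with real parameters for simplicity (studies of frustrated models typically need complex parameters) and a single-spin-flip Metropolis sampler; all sizes and initial values are illustrative, and the variational optimisation itself is omitted.

import numpy as np

rng = np.random.default_rng(0)
N, M = 8, 16                               # visible spins, hidden units
a = 0.01 * rng.standard_normal(N)          # visible biases
b = 0.01 * rng.standard_normal(M)          # hidden biases
W = 0.01 * rng.standard_normal((N, M))     # couplings

def log_psi(s):
    """log of the (real, positive) RBM amplitude for s in {-1, +1}^N."""
    theta = b + s @ W
    return a @ s + np.sum(np.log(2.0 * np.cosh(theta)))

def metropolis_sample(n_samples, n_skip=10):
    """Single-spin-flip Metropolis sampling from |psi|^2."""
    s = rng.choice([-1.0, 1.0], size=N)
    samples = []
    for it in range(n_samples * n_skip):
        i = rng.integers(N)
        s_new = s.copy(); s_new[i] *= -1
        if np.log(rng.random()) < 2.0 * (log_psi(s_new) - log_psi(s)):
            s = s_new
        if it % n_skip == 0:
            samples.append(s.copy())
    return np.array(samples)

print(metropolis_sample(100).mean(axis=0))   # estimated magnetisation per site

In a full variational Monte Carlo calculation, such samples are used to estimate the local energy of the target Hamiltonian and its gradient with respect to (a, b, W), which stochastic gradient descent then minimises.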
45

CASTELLANO, GIOVANNA, CIRO CASTIELLO, DANILO DELL'AGNELLO, ANNA MARIA FANELLI, CORRADO MENCAR and MARIA ALESSANDRA TORSELLO. "LEARNING FUZZY USER PROFILES FOR RESOURCE RECOMMENDATION". International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 18, no. 04 (August 2010): 389–410. http://dx.doi.org/10.1142/s0218488510006611.

Abstract
Recommender systems are capable of assisting users by quickly providing them with relevant resources according to their interests or preferences. The efficacy of a recommender system is strictly connected with the possibility of creating meaningful user profiles, including information about user preferences, interests, goals, usage data and interactive behavior. In particular, analysis of user preferences is important to predict user behaviors and make appropriate recommendations. In this paper, we present a fuzzy framework to represent, learn and update user profiles. The representation of a user profile is based on a structured model of user cognitive states, including a competence profile, a preference profile and an acquaintance profile. The strategy for deriving and updating profiles is to record the sequence of resources accessed by each user, and to update preference profiles accordingly, so as to suggest similar resources at subsequent user accesses. The adaptation of the preference profile is performed continuously, but in earlier stages it is more sensitive to updates (plastic phase) while in later stages it is less sensitive (stable phase) to allow resource recommendation. Simulation results are reported to show the effectiveness of the proposed approach.
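The plastic-then-stable update strategy can be illustrated with a toy preference vector whose learning rate decays as the number of observed accesses grows. Everything here, the class name, the decay schedule and the category indices, is hypothetical and is not the fuzzy model of the cited framework.

import numpy as np

class PreferenceProfile:
    """Toy preference profile over resource categories.
    Early updates move the profile a lot (plastic phase); as the update
    count grows, the rate shrinks and the profile stabilises (stable phase)."""
    def __init__(self, n_categories, eta0=0.5, tau=20.0):
        self.p = np.full(n_categories, 1.0 / n_categories)
        self.t = 0
        self.eta0, self.tau = eta0, tau

    def update(self, accessed_category):
        self.t += 1
        eta = self.eta0 * self.tau / (self.tau + self.t)    # decaying rate
        target = np.zeros_like(self.p); target[accessed_category] = 1.0
        self.p = (1 - eta) * self.p + eta * target
        self.p /= self.p.sum()

profile = PreferenceProfile(4)
for c in [0, 0, 2, 0, 1, 0]:      # simulated sequence of accessed categories
    profile.update(c)
print(profile.p.round(3))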
46

Guo, Chao and Rongrong Ren. "Learning and Problem Representation in Foreign Policy Decision-Making: China's Decision to Enter the Korean War Revisited". Public Administration Quarterly 27, no. 3 (September 2003): 274–310. http://dx.doi.org/10.1177/073491490302700302.

Abstract
This article attempts to study the role of learning and problem representation in Chinese foreign policy decision-making as illustrated by the case of the Chinese decision to enter the Korean War. Drawing on cognitive theories of learning and problem representation, the authors argue that, in the past three decades prior to the outbreak of the Korean War, Chinese communist leaders learned through trial-and-error experimentation and through success and failure and developed their image of the United States as the biggest imperialist enemy and that this enemy image led to their representation of the Korean War problem as the American aggression into China which seriously constrained the generation of alternatives among which the Chinese policy-makers could choose. This article concludes with several theoretical and policy implications.
47

Trevarthen, Colwyn and Kenneth J. Aitken. "Brain development, infant communication, and empathy disorders: Intrinsic factors in child mental health". Development and Psychopathology 6, no. 4 (1994): 597–633. http://dx.doi.org/10.1017/s0954579400004703.

Abstract
Disorders of emotion, communication, and learning in early childhood are considered in light of evidence on human brain growth from embryo stages. We cite microbehavioral evidence indicating that infants are born able to express the internal activity of their brains, including dynamic “motive states” that drive learning. Infant expressions stimulate the development of imitative and reciprocal relations with corresponding dynamic brain states of caregivers. The infant's mind must have an “innate self-with-other representation” of the inter-mind correspondence and reciprocity of feelings that can be generated with an adult. Primordial motive systems appear in subcortical and limbic systems of the embryo before the cerebral cortex. These are presumed to continue to guide the growth of a child's brain after birth. We propose that an “intrinsic motive formation” is assembled prenatally and is ready at birth to share emotion with caregivers for regulation of the child's cortical development, on which cultural cognition and learning depend. The intrinsic potentiality for “intersubjectivity” can be disorganized if the epigenetic program for the infant's brain fails. Indeed, many psychological disorders of childhood can be traced to faults in early stages of brain development when core motive systems form.
48

Whitehead, Steven D. and Dana H. Ballard. "Active Perception and Reinforcement Learning". Neural Computation 2, no. 4 (December 1990): 409–19. http://dx.doi.org/10.1162/neco.1990.2.4.409.

Abstract
This paper considers adaptive control architectures that integrate active sensorimotor systems with decision systems based on reinforcement learning. One unavoidable consequence of active perception is that the agent's internal representation often confounds external world states. We call this phenomenon perceptual aliasing and show that it destabilizes existing reinforcement learning algorithms with respect to the optimal decision policy. A new decision system that overcomes these difficulties is described. The system incorporates a perceptual subcycle within the overall decision cycle and uses a modified learning algorithm to suppress the effects of perceptual aliasing. The result is a control architecture that learns not only how to solve a task but also where to focus its attention in order to collect necessary sensory information.
49

Tian, Yuan. "Music emotion representation based on non-negative matrix factorization algorithm and user label information". PeerJ Computer Science 9 (September 25, 2023): e1590. http://dx.doi.org/10.7717/peerj-cs.1590.

Abstract
Music emotion representation learning forms the foundation of user emotion recognition, addressing the challenges posed by the vast volume of digital music data and the scarcity of emotion annotation data. This article introduces a novel music emotion representation model, leveraging the nonnegative matrix factorization algorithm (NMF) to derive emotional embeddings of music by utilizing user-generated listening lists and emotional labels. This approach facilitates emotion recognition by positioning music within the emotional space. Furthermore, a dedicated music emotion recognition algorithm is formulated, alongside the proposal of a user emotion recognition model, which employs similarity-weighted calculations to obtain user emotion representations. Experimental findings demonstrate the method’s convergence after a mere 400 iterations, yielding a remarkable 47.62% increase in F1 value across all emotion classes. In practical testing scenarios, the comprehensive accuracy rate of user emotion recognition attains an impressive 52.7%, effectively discerning emotions within seven emotion categories and accurately identifying users’ emotional states.
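The NMF-plus-similarity-weighting pipeline described above can be sketched as follows, assuming scikit-learn is available. The listening matrix, the song emotion labels and the weighting scheme are all placeholders chosen for illustration, not the paper's data or exact formulation.

import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
n_users, n_songs, k = 20, 50, 8
listen = rng.poisson(0.3, size=(n_users, n_songs)).astype(float)    # play counts

nmf = NMF(n_components=k, init="nndsvda", max_iter=500)
user_f = nmf.fit_transform(listen)        # (n_users, k) user factors
song_f = nmf.components_.T                # (n_songs, k) song embeddings

# Hypothetical one-hot emotion labels for the songs (7 emotion classes).
labels = np.zeros((n_songs, 7))
labels[np.arange(n_songs), rng.integers(7, size=n_songs)] = 1

def user_emotion(u):
    """Similarity-weighted average of song emotion labels for user u."""
    sims = song_f @ user_f[u]
    w = np.clip(sims, 0, None); w /= w.sum() + 1e-12
    return w @ labels

print(user_emotion(0).round(3))           # distribution over the 7 emotion classes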
50

Finn, Tobias Sebastian, Lucas Disson, Alban Farchi, Marc Bocquet and Charlotte Durand. "Representation learning with unconditional denoising diffusion models for dynamical systems". Nonlinear Processes in Geophysics 31, no. 3 (September 19, 2024): 409–31. http://dx.doi.org/10.5194/npg-31-409-2024.

Abstract
We propose denoising diffusion models for data-driven representation learning of dynamical systems. In this type of generative deep learning, a neural network is trained to denoise and reverse a diffusion process, where Gaussian noise is added to states from the attractor of a dynamical system. Iteratively applied, the neural network can then map samples from isotropic Gaussian noise to the state distribution. We showcase the potential of such neural networks in proof-of-concept experiments with the Lorenz 1963 system. Trained for state generation, the neural network can produce samples that are almost indistinguishable from those on the attractor. The model has thereby learned an internal representation of the system, applicable for different tasks other than state generation. As a first task, we fine-tune the pre-trained neural network for surrogate modelling by retraining its last layer and keeping the remaining network as a fixed feature extractor. In these low-dimensional settings, such fine-tuned models perform similarly to deep neural networks trained from scratch. As a second task, we apply the pre-trained model to generate an ensemble out of a deterministic run. Diffusing the run, and then iteratively applying the neural network, conditions the state generation, which allows us to sample from the attractor in the run's neighbouring region. To control the resulting ensemble spread and Gaussianity, we tune the diffusion time and, thus, the sampled portion of the attractor. While easier to tune, this proposed ensemble sampler can outperform tuned static covariances in ensemble optimal interpolation. Therefore, these two applications show that denoising diffusion models are a promising way towards representation learning for dynamical systems.
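Two ingredients of this setup are easy to sketch: generating training states on the Lorenz 1963 attractor and applying the forward (noising) diffusion to them. The denoising network and its training are omitted; the Euler step size and the noise schedule below are illustrative choices, not the paper's configuration.

import numpy as np

def lorenz63_trajectory(n_steps, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Integrate the Lorenz 1963 system with a simple Euler scheme."""
    x = np.array([1.0, 1.0, 1.0])
    traj = np.empty((n_steps, 3))
    for t in range(n_steps):
        dx = np.array([sigma * (x[1] - x[0]),
                       x[0] * (rho - x[2]) - x[1],
                       x[0] * x[1] - beta * x[2]])
        x = x + dt * dx
        traj[t] = x
    return traj

def forward_diffusion(x0, alphas_bar, t, rng):
    """Sample x_t ~ N(sqrt(alpha_bar_t) * x0, (1 - alpha_bar_t) * I)."""
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1.0 - alphas_bar[t]) * noise, noise

states = lorenz63_trajectory(5000)[1000:]          # discard the transient
betas = np.linspace(1e-4, 2e-2, 100)               # illustrative noise schedule
alphas_bar = np.cumprod(1.0 - betas)

rng = np.random.default_rng(0)
x_noisy, eps = forward_diffusion(states, alphas_bar, t=50, rng=rng)
# A denoising network would be trained to predict `eps` from (x_noisy, t);
# running it iteratively in reverse maps Gaussian samples back onto the attractor.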