Articoli di riviste: "Reinforcement Learning"

1

Deora, Merin, e Sumit Mathur. "Reinforcement Learning". IJARCCE 6, n. 4 (30 aprile 2017): 178–81. http://dx.doi.org/10.17148/ijarcce.2017.6433.

Testo completo

Gli stili APA, Harvard, Vancouver, ISO e altri

2

Barto, Andrew G. "Reinforcement Learning". IFAC Proceedings Volumes 31, n. 29 (ottobre 1998): 5. http://dx.doi.org/10.1016/s1474-6670(17)38315-5.

Testo completo

Gli stili APA, Harvard, Vancouver, ISO e altri

3

Woergoetter, Florentin, e Bernd Porr. "Reinforcement learning". Scholarpedia 3, n. 3 (2008): 1448. http://dx.doi.org/10.4249/scholarpedia.1448.

Testo completo

Gli stili APA, Harvard, Vancouver, ISO e altri

4

Moore, Brett L., Anthony G. Doufas e Larry D. Pyeatt. "Reinforcement Learning". Anesthesia & Analgesia 112, n. 2 (febbraio 2011): 360–67. http://dx.doi.org/10.1213/ane.0b013e31820334a7.

Testo completo

Gli stili APA, Harvard, Vancouver, ISO e altri

5

Likas, Aristidis. "A Reinforcement Learning Approach to Online Clustering". Neural Computation 11, n. 8 (1 novembre 1999): 1915–32. http://dx.doi.org/10.1162/089976699300016025.

Testo completo

Abstract (sommario):

A general technique is proposed for embedding online clustering algorithms based on competitive learning in a reinforcement learning framework. The basic idea is that the clustering system can be viewed as a reinforcement learning system that learns through reinforcements to follow the clustering strategy we wish to implement. In this sense, the reinforcement guided competitive learning (RGCL) algorithm is proposed that constitutes a reinforcement-based adaptation of learning vector quantization (LVQ) with enhanced clustering capabilities. In addition, we suggest extensions of RGCL and LVQ that are characterized by the property of sustained exploration and significantly improve the performance of those algorithms, as indicated by experimental tests on well-known data sets.

Gli stili APA, Harvard, Vancouver, ISO e altri

6

Mardhatillah, Elsy. "Teacher’s Reinforcement in English Classroom in MTSS Darul Makmur Sungai Cubadak". Indonesian Research Journal On Education 3, n. 1 (2 gennaio 2022): 825–32. http://dx.doi.org/10.31004/irje.v3i1.202.

Testo completo

Abstract (sommario):

This research was due to some problems found in MTsS Darul Makmur. First, some students were not motivated in learning. Second, sometime the teacher still uses Indonesian in giving reinforcements. Third, some Students did not care about the teacher's reinforcement. This study aimed to find out the types of reinforcement used by the teacher. Then, to find out the types of reinforcement often and rarely to be usedby the teacher. Then, to find out the reasons the teacher used certain reinforcements. Last, to find out how the teacher understands the reinforcement. This research used a qualitative approach. The design of this research was descriptive because the researcher made a description of the use of reinforcement by theteacher in the English classroom. In this research, the interview and observation sheets were used by the researcher. The researcher found that the type of reinforcement used by the teacher is positive reinforcement and negative reinforcement. First, there were two types of positive reinforcement used by teachers, namely verbal reinforcement and non-verbal reinforcement. The verbal often used by theteacher was a reinforcement in the form of words and reinforcement in the form of phrases. Then, verbal reinforcement in the form of sentences was never done by the teacher in the learning process. While the non-verbal reinforcement often used by the teacher was gestural, activity reinforcement, and proximity reinforcement. Second, the negative reinforcement often used by the teacher was a warning, gesture, and eye contact. Meanwhile, the negative reinforcement rarely used by the teacher was speech volume andpunishment. Third, the reasons teachers reinforce learning are to motivate students and make students feel appreciated and happy while learning.

Gli stili APA, Harvard, Vancouver, ISO e altri

7

Liaq, Mudassar, e Yungcheol Byun. "Autonomous UAV Navigation Using Reinforcement Learning". International Journal of Machine Learning and Computing 9, n. 6 (dicembre 2019): 756–61. http://dx.doi.org/10.18178/ijmlc.2019.9.6.869.

Testo completo

Gli stili APA, Harvard, Vancouver, ISO e altri

8

Alrammal, Muath, e Munir Naveed. "Monte-Carlo Based Reinforcement Learning (MCRL)". International Journal of Machine Learning and Computing 10, n. 2 (febbraio 2020): 227–32. http://dx.doi.org/10.18178/ijmlc.2020.10.2.924.

Testo completo

Gli stili APA, Harvard, Vancouver, ISO e altri

9

Nurmuhammet, Abdullayev. "DEEP REINFORCEMENT LEARNING ON STOCK DATA". Alatoo Academic Studies 23, n. 2 (30 giugno 2023): 505–18. http://dx.doi.org/10.17015/aas.2023.232.49.

Testo completo

Abstract (sommario):

This study proposes using Deep Reinforcement Learning (DRL) for stock trading decisions and prediction. DRL is a machine learning technique that enables agents to learn optimal strategies by interacting with their environment. The proposed model surpasses traditional models and can make informed trading decisions in real-time. The study highlights the feasibility of applying DRL in financial markets and its advantages in strategic decision- making. The model's ability to learn from market dynamics makes it a promising approach for stock market forecasting. Overall, this paper provides valuable insights into the use of DRL for stock trading decisions and prediction, establishing a strong case for its adoption in financial markets. Keywords: reinforcement learning, stock market, deep reinforcement learning.

Gli stili APA, Harvard, Vancouver, ISO e altri

10

Fan, ZiSheng. "An exploration of reinforcement learning and deep reinforcement learning". Applied and Computational Engineering 73, n. 1 (5 luglio 2024): 154–59. http://dx.doi.org/10.54254/2755-2721/73/20240386.

Testo completo

Abstract (sommario):

Today, machine learning is evolving so quickly that new algorithms are always appearing. Deep neural networks in particular have shown positive outcomes in a variety of areas, including computer vision, natural language processing, and time series prediction. Its development moves at a very sluggish pace due to the high threshold. Therefore, a thorough examination of the reinforcement learning field should be required. This essay examines both the deep learning algorithm and the reinforcement learning operational procedure. The study identifies information retrieval, data mining, intelligent speech, natural language processing, and reinforcement learning as key technologies. The scientific study of reinforcement learning has advanced remarkably quickly, and it is now being used to tackle important decision optimization issues at academic conferences and journal research work in computer networks, computer graphics, etc. Brief introductions and reviews of both types of models are provided in this paper, along with an understanding of some of the most cutting-edge reinforcement learning applications and approaches.

Gli stili APA, Harvard, Vancouver, ISO e altri

11

Myers, Catherine. "LEARNING WITH DELAYED REINFORCEMENT THROUGH ATTENTION-DRIVEN BUFFERING". International Journal of Neural Systems 01, n. 04 (gennaio 1991): 337–46. http://dx.doi.org/10.1142/s0129065791000376.

Testo completo

Abstract (sommario):

Learning with delayed reinforcement refers to situations where the reinforcement to a learning system occurs only at the end of a string of actions or outputs, and it must then be assigned back to the relevant actions. A method for accomplishing this is presented which buffers a small number of past actions based on the unpredictability of or attention to each as it occurs. This approach allows for the buffer size to be small, and yet learning can reach indefinitely far back into the past; it also allows the system to learn when reinforcement is not only delayed but also reinforcements from other unrelated actions may arrive during this delay. An example of a simulated food-finding creature is used to show the system at work in a predictive application where reinforcements show this interleaving behaviour.

Gli stili APA, Harvard, Vancouver, ISO e altri

12

Horie, Naoto, Tohgoroh Matsui, Koichi Moriyama, Atsuko Mutoh e Nobuhiro Inuzuka. "Multi-objective safe reinforcement learning: the relationship between multi-objective reinforcement learning and safe reinforcement learning". Artificial Life and Robotics 24, n. 3 (8 febbraio 2019): 352–59. http://dx.doi.org/10.1007/s10015-019-00523-3.

Testo completo

Gli stili APA, Harvard, Vancouver, ISO e altri

13

Lee, Dongsu, Chanin Eom, Sungwoo Choi, Sungkwan Kim e Minhae Kwon. "Survey on Practical Reinforcement Learning : from Imitation Learning to Offline Reinforcement Learning". Journal of Korean Institute of Communications and Information Sciences 48, n. 11 (30 novembre 2023): 1405–17. http://dx.doi.org/10.7840/kics.2023.48.11.1405.

Testo completo

Gli stili APA, Harvard, Vancouver, ISO e altri

14

Osogami, Takayuki, e Rudy Raymond. "Determinantal Reinforcement Learning". Proceedings of the AAAI Conference on Artificial Intelligence 33 (17 luglio 2019): 4659–66. http://dx.doi.org/10.1609/aaai.v33i01.33014659.

Testo completo

Abstract (sommario):

We study reinforcement learning for controlling multiple agents in a collaborative manner. In some of those tasks, it is insufficient for the individual agents to take relevant actions, but those actions should also have diversity. We propose the approach of using the determinant of a positive semidefinite matrix to approximate the action-value function in reinforcement learning, where we learn the matrix in a way that it represents the relevance and diversity of the actions. Experimental results show that the proposed approach allows the agents to learn a nearly optimal policy approximately ten times faster than baseline approaches in benchmark tasks of multi-agent reinforcement learning. The proposed approach is also shown to achieve the performance that cannot be achieved with conventional approaches in partially observable environment with exponentially large action space.

Gli stili APA, Harvard, Vancouver, ISO e altri

15

Pateria, Shubham, Budhitama Subagdja, Ah-hwee Tan e Chai Quek. "Hierarchical Reinforcement Learning". ACM Computing Surveys 54, n. 5 (giugno 2021): 1–35. http://dx.doi.org/10.1145/3453160.

Testo completo

Abstract (sommario):

Hierarchical Reinforcement Learning (HRL) enables autonomous decomposition of challenging long-horizon decision-making tasks into simpler subtasks. During the past years, the landscape of HRL research has grown profoundly, resulting in copious approaches. A comprehensive overview of this vast landscape is necessary to study HRL in an organized manner. We provide a survey of the diverse HRL approaches concerning the challenges of learning hierarchical policies, subtask discovery, transfer learning, and multi-agent learning using HRL. The survey is presented according to a novel taxonomy of the approaches. Based on the survey, a set of important open problems is proposed to motivate the future research in HRL. Furthermore, we outline a few suitable task domains for evaluating the HRL approaches and a few interesting examples of the practical applications of HRL in the Supplementary Material.

Gli stili APA, Harvard, Vancouver, ISO e altri

16

Matsui, Tohgoroh. "Compound Reinforcement Learning". Transactions of the Japanese Society for Artificial Intelligence 26 (2011): 330–34. http://dx.doi.org/10.1527/tjsai.26.330.

Testo completo

Gli stili APA, Harvard, Vancouver, ISO e altri

17

Daoyi Dong, Chunlin Chen, Hanxiong Li e Tzyh-Jong Tarn. "Quantum Reinforcement Learning". IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) 38, n. 5 (ottobre 2008): 1207–20. http://dx.doi.org/10.1109/tsmcb.2008.925743.

Testo completo

Gli stili APA, Harvard, Vancouver, ISO e altri

18

Farias, Vivek F., Ciamac C. Moallemi, Benjamin Van Roy e Tsachy Weissman. "Universal Reinforcement Learning". IEEE Transactions on Information Theory 56, n. 5 (maggio 2010): 2441–54. http://dx.doi.org/10.1109/tit.2010.2043762.

Testo completo

Gli stili APA, Harvard, Vancouver, ISO e altri

19

Morimoto, Jun, e Kenji Doya. "Robust Reinforcement Learning". Neural Computation 17, n. 2 (1 febbraio 2005): 335–59. http://dx.doi.org/10.1162/0899766053011528.

Testo completo

Abstract (sommario):

This letter proposes a new reinforcement learning (RL) paradigm that explicitly takes into account input disturbance as well as modeling errors. The use of environmental models in RL is quite popular for both off-line learning using simulations and for online action planning. However, the difference between the model and the real environment can lead to unpredictable, and often unwanted, results. Based on the theory of H∞ control, we consider a differential game in which a “disturbing” agent tries to make the worst possible disturbance while a “control” agent tries to make the best control input. The problem is formulated as finding a min-max solution of a value function that takes into account the amount of the reward and the norm of the disturbance. We derive online learning algorithms for estimating the value function and for calculating the worst disturbance and the best control in reference to the value function. We tested the paradigm, which we call robust reinforcement learning (RRL), on the control task of an inverted pendulum. In the linear domain, the policy and the value function learned by online algorithms coincided with those derived analytically by the linear H∞ control theory. For a fully nonlinear swing-up task, RRL achieved robust performance with changes in the pendulum weight and friction, while a standard reinforcement learning algorithm could not deal with these changes. We also applied RRL to the cart-pole swing-up task, and a robust swing-up policy was acquired.

Gli stili APA, Harvard, Vancouver, ISO e altri

20

Weiβ, Gerhard. "Distributed reinforcement learning". Robotics and Autonomous Systems 15, n. 1-2 (luglio 1995): 135–42. http://dx.doi.org/10.1016/0921-8890(95)00018-b.

Testo completo

Gli stili APA, Harvard, Vancouver, ISO e altri

21

Servedio, Maria R., Stein A. Sæther e Glenn-Peter Sætre. "Reinforcement and learning". Evolutionary Ecology 23, n. 1 (17 luglio 2007): 109–23. http://dx.doi.org/10.1007/s10682-007-9188-2.

Testo completo

Gli stili APA, Harvard, Vancouver, ISO e altri

22

ANDRECUT, M., e M. K. ALI. "FUZZY REINFORCEMENT LEARNING". International Journal of Modern Physics C 13, n. 05 (giugno 2002): 659–74. http://dx.doi.org/10.1142/s0129183102003450.

Testo completo

Abstract (sommario):

Fuzzy logic represents an extension of classical logic, giving modes of approximate reasoning in an environment of uncertainty and imprecision. Fuzzy inference systems incorporates human knowledge into their knowledge base on the conclusions of the fuzzy rules, which are affected by subjective decisions. In this paper we show how the reinforcement learning technique can be used to tune the conclusion part of a fuzzy inference system. The fuzzy reinforcement learning technique is illustrated using two examples: the cart centering problem and the autonomous navigation problem.

Gli stili APA, Harvard, Vancouver, ISO e altri

23

Zhu, Ruoqing, Donglin Zeng e Michael R. Kosorok. "Reinforcement Learning Trees". Journal of the American Statistical Association 110, n. 512 (2 ottobre 2015): 1770–84. http://dx.doi.org/10.1080/01621459.2015.1036994.

Testo completo

Gli stili APA, Harvard, Vancouver, ISO e altri

24

Oku, Makito, e Kazuyuki Aihara. "Networked reinforcement learning". Artificial Life and Robotics 13, n. 1 (dicembre 2008): 112–15. http://dx.doi.org/10.1007/s10015-008-0565-x.

Testo completo

Gli stili APA, Harvard, Vancouver, ISO e altri

25

Barto, Andrew G. "Reinforcement learning control". Current Opinion in Neurobiology 4, n. 6 (dicembre 1994): 888–93. http://dx.doi.org/10.1016/0959-4388(94)90138-4.

Testo completo

Gli stili APA, Harvard, Vancouver, ISO e altri

26

Hernandez-Orallo, Jose. "Constructive reinforcement learning". International Journal of Intelligent Systems 15, n. 3 (marzo 2000): 241–64. http://dx.doi.org/10.1002/(sici)1098-111x(200003)15:3<241::aid-int6>3.0.co;2-z.

Testo completo

Gli stili APA, Harvard, Vancouver, ISO e altri

27

Aydin, Mehmet Emin, Rafet Durgut e Abdur Rakib. "Why Reinforcement Learning?" Algorithms 17, n. 6 (20 giugno 2024): 269. http://dx.doi.org/10.3390/a17060269.

Testo completo

Gli stili APA, Harvard, Vancouver, ISO e altri

28

Muhammad Azhar, Mansoor Ahmed Khuhro, Muhammad Waqas, Umair Saeed e Mehar Khan Niazi. "Comprehensive Study on Reinforcement Learning and Deep Reinforcement Learning Schemes". Sir Syed University Research Journal of Engineering & Technology 14, n. 2 (27 dicembre 2024): 1–6. https://doi.org/10.33317/ssurj.638.

Testo completo

Abstract (sommario):

Reinforcement learning (RL) has emerged as a powerful tool for creating artificial intelligence systems (AIS) and solving problems which require sequential decision-making. Reinforcement learning has achieved some impressive achievements in recent years, surpassing humans in a variety of areas. According to recent research, deep learning (DL) techniques are used with techniques of reinforcement learning to recognize meaningful identification for a problem regarding high dimensional raw data input & enough to solve artificial general intelligence (AGI). In addition to the main concepts, this paper highlights the intuition behind the use RL and deep Q-network (DQN) over other algorithms. In this research paper, different methods and details for dealing with reinforcement learning difficulties have been presented. Finally, various difficulties of the reinforcement learning have been addressed.

Gli stili APA, Harvard, Vancouver, ISO e altri

29

Schweighofer, Nicolas, e Kenji Doya. "Meta-learning in Reinforcement Learning". Neural Networks 16, n. 1 (gennaio 2003): 5–9. http://dx.doi.org/10.1016/s0893-6080(02)00228-9.

Testo completo

Gli stili APA, Harvard, Vancouver, ISO e altri

30

Cetin, Edoardo, e Oya Celiktutan. "Learning Pessimism for Reinforcement Learning". Proceedings of the AAAI Conference on Artificial Intelligence 37, n. 6 (26 giugno 2023): 6971–79. http://dx.doi.org/10.1609/aaai.v37i6.25852.

Testo completo

Abstract (sommario):

Off-policy deep reinforcement learning algorithms commonly compensate for overestimation bias during temporal-difference learning by utilizing pessimistic estimates of the expected target returns. In this work, we propose Generalized Pessimism Learning (GPL), a strategy employing a novel learnable penalty to enact such pessimism. In particular, we propose to learn this penalty alongside the critic with dual TD-learning, a new procedure to estimate and minimize the magnitude of the target returns bias with trivial computational cost. GPL enables us to accurately counteract overestimation bias throughout training without incurring the downsides of overly pessimistic targets. By integrating GPL with popular off-policy algorithms, we achieve state-of-the-art results in both competitive proprioceptive and pixel-based benchmarks.

Gli stili APA, Harvard, Vancouver, ISO e altri

31

Chakraborty, Montosh, Shivakrishna Gouroju, Pinki Garg e Karthikeyan P. "PBL: An Effective Method Of Reinforcement Learning". International Journal of Integrative Medical Sciences 2, n. 6 (30 giugno 2015): 134–38. http://dx.doi.org/10.16965/ijims.2015.119.

Testo completo

Gli stili APA, Harvard, Vancouver, ISO e altri

32

De, Ashis, Barun Mazumdar, Aritra Dhabal, Saikat Bhattacharjee, Aridip Maity e Sourav Bandopadhyay. "Design of PID Controller using Reinforcement Learning". International Journal of Research Publication and Reviews 4, n. 11 (6 novembre 2023): 443–52. http://dx.doi.org/10.55248/gengpi.4.1123.113004.

Testo completo

Gli stili APA, Harvard, Vancouver, ISO e altri

33

Pusparini, Desy. "Giving Reinforcement with 2.0 Framework by Teacher: A Photovoice of Undergraduate Students in the EFL Classroom". JSSH (Jurnal Sains Sosial dan Humaniora) 3, n. 1 (13 agosto 2019): 21. http://dx.doi.org/10.30595/jssh.v3i1.3841.

Testo completo

Abstract (sommario):

Abstract - Reinforcement has been used in many areas of educational institution. In the learning activity, reinforcements are given by the teacher as feedback for what students have done. By using reinforcement in the learning activity, the students are expected to feel comfortable to show themselves by responding questions, giving feedback, and expressing their opinions in the class. This study aims to investigate the effect of giving reinforcement by the teacher towards student's learning motivation. This research used the photovoice method and SHOWeD Analysis. The participants are 27 students in 5th semester of English Education Department in online class, consists of 7 males and 20 females with the average age of around 19-21 years old. The finding shows that giving reinforcement encourage student's motivation in the learning activity. As the implication, teachers apply reinforcement in order to make the students high-motivated in the class.

Gli stili APA, Harvard, Vancouver, ISO e altri

34

Vafashoar, Reza, e Mohammad Reza Meybodi. "Reinforcement learning in learning automata and cellular learning automata via multiple reinforcement signals". Knowledge-Based Systems 169 (aprile 2019): 1–27. http://dx.doi.org/10.1016/j.knosys.2019.01.021.

Testo completo

Gli stili APA, Harvard, Vancouver, ISO e altri

35

Agrawal, Avinash J., Rashmi R. Welekar, Namita Parati, Pravin R. Satav, Uma Patel Thakur e Archana V. Potnurwar. "Reinforcement Learning and Advanced Reinforcement Learning to Improve Autonomous Vehicle Planning". International Journal on Recent and Innovation Trends in Computing and Communication 11, n. 7s (25 luglio 2023): 652–60. http://dx.doi.org/10.17762/ijritcc.v11i7s.7526.

Testo completo

Abstract (sommario):

Planning for autonomous vehicles is a challenging process that involves navigating through dynamic and unpredictable surroundings while making judgments in real-time. Traditional planning methods sometimes rely on predetermined rules or customized heuristics, which could not generalize well to various driving conditions. In this article, we provide a unique framework to enhance autonomous vehicle planning by fusing conventional RL methods with cutting-edge reinforcement learning techniques. To handle many elements of planning issues, our system integrates cutting-edge algorithms including deep reinforcement learning, hierarchical reinforcement learning, and meta-learning. Our framework helps autonomous vehicles make decisions that are more reliable and effective by utilizing the advantages of these cutting-edge strategies.With the use of the RLTT technique, an autonomous vehicle can learn about the intentions and preferences of human drivers by inferring the underlying reward function from expert behaviour that has been seen. The autonomous car can make safer and more human-like decisions by learning from expert demonstrations about the fundamental goals and limitations of driving. Large-scale simulations and practical experiments can be carried out to gauge the effectiveness of the suggested approach. On the basis of parameters like safety, effectiveness, and human likeness, the autonomous vehicle planning system's performance can be assessed. The outcomes of these assessments can help to inform future developments and offer insightful information about the strengths and weaknesses of the strategy.

Gli stili APA, Harvard, Vancouver, ISO e altri

36

Vamvoudakis, Kyriakos G., Yan Wan e Frank L. Lewis. "Workshop on Distributed Reinforcement Learning and Reinforcement-Learning Games [Conference Reports]". IEEE Control Systems 39, n. 6 (dicembre 2019): 122–24. http://dx.doi.org/10.1109/mcs.2019.2938053.

Testo completo

Gli stili APA, Harvard, Vancouver, ISO e altri

37

Liu, Shiyi. "Research of Multi-agent Deep Reinforcement Learning based on Value Factorization". Highlights in Science, Engineering and Technology 39 (1 aprile 2023): 848–54. http://dx.doi.org/10.54097/hset.v39i.6655.

Testo completo

Abstract (sommario):

One of the numerous multi-agents’ deep reinforcements learning methods and a hotspot for research in the field is multi-agent deep reinforcement learning based on value factorization. In order to effectively address the issues of environmental instability and the exponential expansion of action space in multi-agent systems, it uses some constraints to break down the joint action value function of the multi-agent system into a specific combination of individual action value functions. Firstly, in this paper, the reason for the factorization of value function is explained. The fundamentals of multi-agent deep reinforcement learning are then introduced. The multi-agent deep reinforcement learning algorithms based on value factorization may then be separated into simple factorization and attention-mechanism based algorithms depending on whether other mechanisms are incorporated and which various mechanisms are introduced. Then several typical algorithms are introduced and their advantages and disadvantages are compared and analyzed. Finally, the content of reinforcement learning elaborated in this paper is summarized.

Gli stili APA, Harvard, Vancouver, ISO e altri

38

Bae, Jung Ho, Yun-Seong Kang, Sukmin Yoon, Yong-Duk Kim e Sungho Kim. "Aircraft Reinforcement Learning using Curriculum Learning". Journal of KIISE 48, n. 6 (30 giugno 2021): 707–12. http://dx.doi.org/10.5626/jok.2021.48.6.707.

Testo completo

Gli stili APA, Harvard, Vancouver, ISO e altri

39

Matsubara, Takamitsu. "Learning Control Policies by Reinforcement Learning". Journal of the Robotics Society of Japan 36, n. 9 (2018): 597–600. http://dx.doi.org/10.7210/jrsj.36.597.

Testo completo

Gli stili APA, Harvard, Vancouver, ISO e altri

40

Fachantidis, Anestis, Matthew Taylor e Ioannis Vlahavas. "Learning to Teach Reinforcement Learning Agents". Machine Learning and Knowledge Extraction 1, n. 1 (6 dicembre 2017): 21–42. http://dx.doi.org/10.3390/make1010002.

Testo completo

Abstract (sommario):

In this article, we study the transfer learning model of action advice under a budget. We focus on reinforcement learning teachers providing action advice to heterogeneous students playing the game of Pac-Man under a limited advice budget. First, we examine several critical factors affecting advice quality in this setting, such as the average performance of the teacher, its variance and the importance of reward discounting in advising. The experiments show that the best performers are not always the best teachers and reveal the non-trivial importance of the coefficient of variation (CV) as a statistic for choosing policies that generate advice. The CV statistic relates variance to the corresponding mean. Second, the article studies policy learning for distributing advice under a budget. Whereas most methods in the relevant literature rely on heuristics for advice distribution, we formulate the problem as a learning one and propose a novel reinforcement learning algorithm capable of learning when to advise or not. The proposed algorithm is able to advise even when it does not have knowledge of the student’s intended action and needs significantly less training time compared to previous learning approaches. Finally, in this article, we argue that learning to advise under a budget is an instance of a more generic learning problem: Constrained Exploitation Reinforcement Learning.

Gli stili APA, Harvard, Vancouver, ISO e altri

41

NISHIZAWA, Chieko, e Hirokazu MATSUI. "Reinforcement learning with multiplex learning spaces". Proceedings of JSME annual Conference on Robotics and Mechatronics (Robomec) 2016 (2016): 1P1–04b3. http://dx.doi.org/10.1299/jsmermd.2016.1p1-04b3.

Testo completo

Gli stili APA, Harvard, Vancouver, ISO e altri

42

Yücesoy, Yiğit E. Yücesoy, e M. Borahan Tümer. "Hierarchical Reinforcement Learning with Context Detection (HRL-CD)". International Journal of Machine Learning and Computing 5, n. 5 (ottobre 2015): 353–58. http://dx.doi.org/10.7763/ijmlc.2015.v5.533.

Testo completo

Gli stili APA, Harvard, Vancouver, ISO e altri

43

G Soares Azhari, Teotino. "Semantic Reinforcement Learning Model for Education Question Answering". International Journal of Science and Research (IJSR) 12, n. 2 (5 febbraio 2023): 1648–53. http://dx.doi.org/10.21275/sr23213125341.

Testo completo

Gli stili APA, Harvard, Vancouver, ISO e altri

44

Dang, Ngoc Trung, e Phuong Nam Dao. "Data-Driven Reinforcement Learning Control for Quadrotor Systems". International Journal of Mechanical Engineering and Robotics Research 13, n. 5 (2024): 495–501. http://dx.doi.org/10.18178/ijmerr.13.5.495-501.

Testo completo

Abstract (sommario):

This paper aims to solve the tracking problem and optimality effectiveness of an Unmanned Aerial Vehicle (UAV) by model-free data Reinforcement Learning (RL) algorithms in both sub-systems of attitude and position. First, a cascade UAV model structure is given to establish the control system diagram with two corresponding attitude and position control loops. Second, based on the computation of the time derivative of the Bellman function by two different methods, the combination of the Bellman function and the optimal control is adopted to maintain the control signal as time converges to infinity with the addition of a discount factor. Third, according to off policy technique, the two proposed model-free RL algorithms are designed for attitude and position sub-systems in UAV control structure with a discount factor, respectively. In particular, the designed algorithms not only solve the trajectory tracking problem but also guarantee the optimality performance. Finally, an illustrative system is used to verify the performance of the proposed model-free data RL algorithms in the UAV control system.

Gli stili APA, Harvard, Vancouver, ISO e altri

45

Shrivastava, Soumya. "Role of Reinforcement Learning in Financial Management Strategy". International Journal of Science and Research (IJSR) 11, n. 1 (5 gennaio 2022): 1556–62. http://dx.doi.org/10.21275/sr22128210442.

Testo completo

Gli stili APA, Harvard, Vancouver, ISO e altri

46

White, Devin, Mingkang Wu, Ellen Novoseller, Vernon J. Lawhern, Nicholas Waytowich e Yongcan Cao. "Rating-Based Reinforcement Learning". Proceedings of the AAAI Conference on Artificial Intelligence 38, n. 9 (24 marzo 2024): 10207–15. http://dx.doi.org/10.1609/aaai.v38i9.28886.

Testo completo

Abstract (sommario):

This paper develops a novel rating-based reinforcement learning approach that uses human ratings to obtain human guidance in reinforcement learning. Different from the existing preference-based and ranking-based reinforcement learning paradigms, based on human relative preferences over sample pairs, the proposed rating-based reinforcement learning approach is based on human evaluation of individual trajectories without relative comparisons between sample pairs. The rating-based reinforcement learning approach builds on a new prediction model for human ratings and a novel multi-class loss function. We conduct several experimental studies based on synthetic ratings and real human ratings to evaluate the effectiveness and benefits of the new rating-based reinforcement learning approach.

Gli stili APA, Harvard, Vancouver, ISO e altri

47

Kim, Man-Je, Hyunsoo Park e Chang Wook Ahn. "Nondominated Policy-Guided Learning in Multi-Objective Reinforcement Learning". Electronics 11, n. 7 (28 marzo 2022): 1069. http://dx.doi.org/10.3390/electronics11071069.

Testo completo

Abstract (sommario):

Control intelligence is a typical field where there is a trade-off between target objectives, and researchers in this field have longed for artificial intelligence that achieves the target objectives. Multi-objective deep reinforcement learning was sufficient to satisfy this need. In particular, multi-objective deep reinforcement learning methods based on policy optimization are leading the optimization of control intelligence. However, multi-objective reinforcement learning has difficulties when finding various Pareto optimals of multi-objectives due to the greedy nature of reinforcement learning. We propose a method of policy assimilation to solve this problem. This method was applied to MO-V-MPO, one of preference-based multi-objective reinforcement learning, to increase diversity. The performance of this method has been verified through experiments in a continuous control environment.

Gli stili APA, Harvard, Vancouver, ISO e altri

48

Li, Chengan. "Research advanced in the integration of federated learning and reinforcement learning". Applied and Computational Engineering 40, n. 1 (21 febbraio 2024): 147–54. http://dx.doi.org/10.54254/2755-2721/40/20230641.

Testo completo

Abstract (sommario):

Reinforcement learning (RL) and federated learning (FL) are two important machine learning paradigms. Reinforcement learning is concerned with enabling intelligence to learn optimal policies when interacting with an environment, while federated learning is concerned with collaboratively training models on distributed equipment while preserving data privacy. In recent years, the fusion and complementarity of reinforcement learning, and federated learning have attracted increasing research interest, providing new directions for the development of the machine learning community. Focusing on the integration of reinforcement learning and federated learning, this paper introduces in detail the latest technological developments in the integration of reinforcement learning and federated learning, and discusses the main challenges, existing methods and future directions of this intersection. Specifically, based on the introduction of classical reinforcement learning and federated learning. In addition, this document introduces cutting-edge results on the integration of reinforcement learning and joint learning and discusses the problems and future directions of the integration.

Gli stili APA, Harvard, Vancouver, ISO e altri

49

Datta, Shounak, Yanjun Li, Matthew M. Ruppert, Yuanfang Ren, Benjamin Shickel, Tezcan Ozrazgat-Baslanti, Parisa Rashidi e Azra Bihorac. "Reinforcement learning in surgery". Surgery 170, n. 1 (luglio 2021): 329–32. http://dx.doi.org/10.1016/j.surg.2020.11.040.

Testo completo

Gli stili APA, Harvard, Vancouver, ISO e altri

50

Khan, Koffka, e Wayne Goodridge. "Reinforcement Learning In DASH". International Journal of Advanced Networking and Applications 11, n. 05 (2020): 4386–92. http://dx.doi.org/10.35444/ijana.2020.11052.

Testo completo

Gli stili APA, Harvard, Vancouver, ISO e altri

Articoli di riviste sul tema "Reinforcement Learning"

Cita una fonte nei formati APA, MLA, Chicago, Harvard e in molti altri stili