A ready-made bibliography on the topic "Factored reinforcement learning"
Create an accurate reference in APA, MLA, Chicago, Harvard, and many other citation styles
Table of contents
Browse lists of current articles, books, dissertations, conference abstracts, and other scholarly sources on the topic "Factored reinforcement learning".
Next to every work in the bibliography there is an "Add to bibliography" button. Press it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the scholarly publication as a ".pdf" file and read its abstract online, when such details are available in the source metadata.
Journal articles on the topic "Factored reinforcement learning"
Wu, Bo, Yan Peng Feng, and Hong Yan Zheng. "A Model-Based Factored Bayesian Reinforcement Learning Approach". Applied Mechanics and Materials 513-517 (February 2014): 1092–95. http://dx.doi.org/10.4028/www.scientific.net/amm.513-517.1092.
Li, Chao, Yupeng Zhang, Jianqi Wang, Yujing Hu, Shaokang Dong, Wenbin Li, Tangjie Lv, Changjie Fan, and Yang Gao. "Optimistic Value Instructors for Cooperative Multi-Agent Reinforcement Learning". Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 16 (March 24, 2024): 17453–60. http://dx.doi.org/10.1609/aaai.v38i16.29694.
Kveton, Branislav, and Georgios Theocharous. "Structured Kernel-Based Reinforcement Learning". Proceedings of the AAAI Conference on Artificial Intelligence 27, no. 1 (June 30, 2013): 569–75. http://dx.doi.org/10.1609/aaai.v27i1.8669.
Simão, Thiago D., and Matthijs T. J. Spaan. "Safe Policy Improvement with Baseline Bootstrapping in Factored Environments". Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 4967–74. http://dx.doi.org/10.1609/aaai.v33i01.33014967.
Truong, Van Binh, and Long Bao Le. "Electric vehicle charging design: The factored action based reinforcement learning approach". Applied Energy 359 (April 2024): 122737. http://dx.doi.org/10.1016/j.apenergy.2024.122737.
Simm, Jaak, Masashi Sugiyama, and Hirotaka Hachiya. "Multi-Task Approach to Reinforcement Learning for Factored-State Markov Decision Problems". IEICE Transactions on Information and Systems E95.D, no. 10 (2012): 2426–37. http://dx.doi.org/10.1587/transinf.e95.d.2426.
Wang, Zizhao, Caroline Wang, Xuesu Xiao, Yuke Zhu, and Peter Stone. "Building Minimal and Reusable Causal State Abstractions for Reinforcement Learning". Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 14 (March 24, 2024): 15778–86. http://dx.doi.org/10.1609/aaai.v38i14.29507.
Mohamad Hafiz Abu Bakar, Abu Ubaidah bin Shamsudin, Ruzairi Abdul Rahim, Zubair Adil Soomro, and Andi Adrianshah. "Comparison Method Q-Learning and SARSA for Simulation of Drone Controller using Reinforcement Learning". Journal of Advanced Research in Applied Sciences and Engineering Technology 30, no. 3 (May 15, 2023): 69–78. http://dx.doi.org/10.37934/araset.30.3.6978.
Kong, Minseok, and Jungmin So. "Empirical Analysis of Automated Stock Trading Using Deep Reinforcement Learning". Applied Sciences 13, no. 1 (January 3, 2023): 633. http://dx.doi.org/10.3390/app13010633.
Mutti, Mirco, Riccardo De Santi, Emanuele Rossi, Juan Felipe Calderon, Michael Bronstein, and Marcello Restelli. "Provably Efficient Causal Model-Based Reinforcement Learning for Systematic Generalization". Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 8 (June 26, 2023): 9251–59. http://dx.doi.org/10.1609/aaai.v37i8.26109.
Doctoral dissertations on the topic "Factored reinforcement learning"
Kozlova, Olga. "Hierarchical and factored reinforcement learning". Paris 6, 2010. http://www.theses.fr/2010PA066196.
Tournaire, Thomas. "Model-based reinforcement learning for dynamic resource allocation in cloud environments". Electronic Thesis or Diss., Institut polytechnique de Paris, 2022. http://www.theses.fr/2022IPPAS004.
Pełny tekst źródłaThe emergence of new technologies (Internet of Things, smart cities, autonomous vehicles, health, industrial automation, ...) requires efficient resource allocation to satisfy the demand. These new offers are compatible with new 5G network infrastructure since it can provide low latency and reliability. However, these new needs require high computational power to fulfill the demand, implying more energy consumption in particular in cloud infrastructures and more particularly in data centers. Therefore, it is critical to find new solutions that can satisfy these needs still reducing the power usage of resources in cloud environments. In this thesis we propose and compare new AI solutions (Reinforcement Learning) to orchestrate virtual resources in virtual network environments such that performances are guaranteed and operational costs are minimised. We consider queuing systems as a model for clouds IaaS infrastructures and bring learning methodologies to efficiently allocate the right number of resources for the users.Our objective is to minimise a cost function considering performance costs and operational costs. We go through different types of reinforcement learning algorithms (from model-free to relational model-based) to learn the best policy. Reinforcement learning is concerned with how a software agent ought to take actions in an environment to maximise some cumulative reward. We first develop queuing model of a cloud system with one physical node hosting several virtual resources. On this first part we assume the agent perfectly knows the model (dynamics of the environment and the cost function), giving him the opportunity to perform dynamic programming methods for optimal policy computation. Since the model is known in this part, we also concentrate on the properties of the optimal policies, which are threshold-based and hysteresis-based rules. This allows us to integrate the structural property of the policies into MDP algorithms. After providing a concrete cloud model with exponential arrivals with real intensities and energy data for cloud provider, we compare in this first approach efficiency and time computation of MDP algorithms against heuristics built on top of the queuing Markov Chain stationary distributions.In a second part we consider that the agent does not have access to the model of the environment and concentrate our work with reinforcement learning techniques, especially model-based reinforcement learning. We first develop model-based reinforcement learning methods where the agent can re-use its experience replay to update its value function. We also consider MDP online techniques where the autonomous agent approximates environment model to perform dynamic programming. This part is evaluated in a larger network environment with two physical nodes in tandem and we assess convergence time and accuracy of different reinforcement learning methods, mainly model-based techniques versus the state-of-the-art model-free methods (e.g. Q-Learning).The last part focuses on model-based reinforcement learning techniques with relational structure between environment variables. As these tandem networks have structural properties due to their infrastructure shape, we investigate factored and causal approaches built-in reinforcement learning methods to integrate this information. We provide the autonomous agent with a relational knowledge of the environment where it can understand how variables are related to each other. 
The main goal is to accelerate convergence in two ways: first, through a more compact factored representation, for which we devise an online factored-MDP algorithm that we evaluate against model-free and model-based reinforcement learning algorithms; and second, by integrating causal and counterfactual reasoning to tackle environments with partial observations and unobserved confounders.
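To make the threshold-based and hysteresis-based policies mentioned in this abstract concrete, here is a minimal Python sketch of a double-threshold activation rule. This is not code from the thesis; the class name and the threshold values are illustrative assumptions.

```python
# Minimal sketch of a hysteresis (double-threshold) activation policy:
# activate a server when the queue grows past an "up" threshold, and
# deactivate one only when it falls below a lower "down" threshold.
# The gap between the two thresholds avoids costly on/off oscillation.

class HysteresisPolicy:
    def __init__(self, up, down):
        # up[k]: queue length that triggers activating server k+1
        # down[k]: queue length that triggers deactivating server k+1
        assert all(d < u for d, u in zip(down, up))
        self.up, self.down = up, down

    def action(self, queue_len, active):
        """Return +1 (turn a server on), -1 (turn one off), or 0 (hold)."""
        if active < len(self.up) and queue_len >= self.up[active]:
            return +1
        if active > 0 and queue_len <= self.down[active - 1]:
            return -1
        return 0

# Example: with 3 servers, turn the k-th server on at queue lengths
# 4/8/12 and off at 1/4/7 (all values hypothetical).
policy = HysteresisPolicy(up=[4, 8, 12], down=[1, 4, 7])
print(policy.action(queue_len=9, active=1))  # -> +1: activate a second server
```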
Magnan, Jean-Christophe. "Représentations graphiques de fonctions et processus décisionnels Markoviens factorisés". Thesis, Paris 6, 2016. http://www.theses.fr/2016PA066042/document.
In decision-theoretic planning, the factored framework (Factored Markov Decision Process, FMDP) has produced several efficient algorithms for solving large sequential decision-making problems under uncertainty. The efficiency of these algorithms relies on data structures such as decision trees or algebraic decision diagrams (ADDs). These planning techniques are exploited in reinforcement learning by the SDyna architecture in order to solve large and unknown problems. However, the state-of-the-art learning and planning algorithms used in SDyna require the problem to be specified using binary variables only, and/or rely on data structures whose compactness can be improved. In this thesis, we present our work towards elaborating and using a new, more efficient and less restrictive data structure, and integrating it into a new instance of the SDyna architecture. In the first part, we present the state-of-the-art modeling tools used in algorithms that tackle large sequential decision-making problems under uncertainty, detailing modeling with decision trees and ADDs. We then introduce the Ordered and Reduced Graphical Representation of Function (ORGRF), a new data structure proposed in this thesis to address the various shortcomings of ADDs, and demonstrate that ORGRFs improve on ADDs for modeling large problems. In the second part, we turn to solving large sequential decision problems under uncertainty with dynamic programming. After introducing the main algorithms, we examine the factored alternatives in detail, identify their weaknesses, and describe our new algorithm, which improves on these points and exploits the previously introduced ORGRFs. In the last part, we discuss the use of FMDPs in reinforcement learning and introduce a new algorithm to learn the proposed data structure. With this learning algorithm, a new instance of the SDyna architecture based on ORGRFs is proposed: the SPIMDDI instance. We test its efficiency on several standard problems from the literature. Finally, we present further work around this new instance: a new algorithm for efficiently managing the exploration-exploitation trade-off, aiming to simplify F-RMax, and an application of SPIMDDI to the management of units in a real-time strategy video game.
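As a rough illustration of the factored-MDP idea this abstract builds on, the sketch below stores one small conditional distribution per state variable instead of one flat |S|x|S| transition table. Plain probability tables stand in here for the decision trees and ADDs the thesis actually studies; the variable names and probabilities are illustrative assumptions, not content from the thesis.

```python
# Factored transition model: P(s'|s) is a product of per-variable
# conditionals, each depending only on a small set of parent variables,
# as in a dynamic Bayesian network.

import itertools

# State variables and the parents each one depends on (under one action).
parents = {
    "busy":  ("busy", "queue"),  # P(busy' | busy, queue)
    "queue": ("queue",),         # P(queue' | queue)
}

# One small table per variable: parent values -> P(variable' = True).
cpt = {
    "busy":  {(False, False): 0.1, (False, True): 0.8,
              (True, False): 0.6, (True, True): 0.95},
    "queue": {(False,): 0.3, (True,): 0.7},
}

def transition_prob(state, next_state):
    """P(next_state | state) factorises as a product over variables."""
    p = 1.0
    for var, pa in parents.items():
        p_true = cpt[var][tuple(state[v] for v in pa)]
        p *= p_true if next_state[var] else 1.0 - p_true
    return p

s = {"busy": True, "queue": False}
for nxt in itertools.product([False, True], repeat=2):
    ns = {"busy": nxt[0], "queue": nxt[1]}
    print(ns, round(transition_prob(s, ns), 3))  # the four values sum to 1
```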
Heron, Michael James. "The ACCESS Framework: reinforcement learning for accessibility and cognitive support for older adults". Thesis, University of Dundee, 2011. https://discovery.dundee.ac.uk/en/studentTheses/0952d5ff-7a23-4c29-b050-fd799035652c.
Al-Safi, Abdullah Taha. "Social reinforcement and risk-taking factors to enhance creativity in Saudi Arabian school children". Thesis, Cardiff University, 1988. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.296226.
Abed-Alguni, Bilal Hashem Kalil. "Cooperative reinforcement learning for independent learners". Thesis, 2014. http://hdl.handle.net/1959.13/1052917.
Machine learning in multi-agent domains poses several research challenges. One challenge is how to model cooperation between reinforcement learners. Cooperation between independent reinforcement learners is known to accelerate convergence to optimal solutions. In large state-space problems, independent reinforcement learners normally cooperate to accelerate the learning process using decomposition techniques or knowledge-sharing strategies. This thesis presents two techniques for multi-agent reinforcement learning and a comparison study. The first technique is a formal decomposition model and an algorithm for distributed systems. The second technique is a cooperative Q-learning algorithm for multi-goal decomposable systems. The comparison study compares the performance of some of the best-known cooperative Q-learning algorithms for independent learners. Distributed systems are normally organised into two levels: the system and subsystem levels. This thesis presents a formal solution for the decomposition of Markov Decision Processes (MDPs) in distributed systems that takes advantage of the organisation of distributed systems and provides support for the migration of learners. This is accomplished through two proposals: a Distributed, Hierarchical Learning Model (DHLM) and an Intelligent Distributed Q-Learning algorithm (IDQL), which are based on three specialisations of agents: workers, tutors and consultants. Worker agents are the actual learners and performers of tasks, while tutor agents and consultant agents are coordinators at the subsystem level and the system level, respectively. A main duty of consultant and tutor agents is the assignment of problem space to worker agents. The experimental results in a distributed hunter-prey problem suggest that IDQL converges to a solution faster than the single-agent Q-learning approach. An important feature of DHLM is that it provides a solution for the migration of agents, which supports the IDQL algorithm in settings where the problem space of each worker agent can change dynamically; other hierarchical RL models do not cover this issue. Problems that have multiple goal states can be decomposed into sub-problems by taking advantage of the loose coupling among the goal states. In such problems, each goal state and its problem space form a sub-problem. This thesis introduces the Q-learning with Aggregation algorithm (QA-learning), an algorithm for problems with multiple goal states that is based on two roles: learner and tutor. A learner is an agent that learns and uses the knowledge of its neighbours (tutors) to construct its Q-table. A tutor is a learner that is ready to share its Q-table with its neighbours (learners). These roles are based on the concept of learners reusing tutors' sub-solutions. In this algorithm, each learner incorporates its tutors' knowledge into its own Q-table calculations, and a comprehensive solution can then be obtained by combining these partial solutions. The experimental results in an instance of the shortest-path problem suggest that the output of QA-learning is comparable to that of a single Q-learner whose problem space is the whole system, while converging to a solution faster. Cooperative Q-learning algorithms for independent learners accelerate the learning process of individual learners.
In this type of Q-learning, independent learners share and update their Q-values by following a sharing strategy after some episodes of independent learning. This thesis presents a comparison study of the performance of several well-known cooperative Q-learning algorithms (BEST-Q, AVE-Q, PSO-Q, and WSS), as well as an algorithm that aggregates their results. These algorithms are compared in two cases: an equal-experience case, in which the learners have equal learning time, and a different-experience case, in which the learners have different learning times. The comparison study also examines the effect of the frequency of Q-value sharing on the learning speed of independent learners. The experimental results in the equal-experience case indicate that sharing Q-values is not beneficial and produces results similar to single-agent Q-learning, while the results in the different-experience case suggest that the cooperative Q-learning algorithms all perform similarly, and better than single-agent Q-learning. In both cases, high-frequency sharing of Q-values accelerates convergence to optimal solutions compared to low-frequency sharing, and low-frequency sharing degrades the performance of the cooperative Q-learning algorithms.
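For concreteness, here is a minimal sketch of the kind of Q-value sharing step such algorithms perform. The averaging rule below mirrors AVE-Q in spirit only; the table shapes and values are illustrative assumptions, not code from the thesis.

```python
# Q-value sharing between independent learners: after some episodes of
# independent learning, each agent replaces its Q-table with an
# element-wise average of all agents' tables.

import numpy as np

def share_q_values(q_tables):
    """AVE-Q-style sharing: every learner adopts the mean of all
    learners' Q-tables. q_tables: list of (n_states, n_actions) arrays."""
    avg = np.mean(q_tables, axis=0)
    return [avg.copy() for _ in q_tables]

# Two learners, 3 states x 2 actions, after independent learning episodes.
rng = np.random.default_rng(0)
q_a, q_b = rng.random((3, 2)), rng.random((3, 2))
q_a, q_b = share_q_values([q_a, q_b])
assert np.allclose(q_a, q_b)  # both now hold the shared estimate
```

How often this sharing step runs is exactly the sharing-frequency parameter whose effect the comparison study above examines.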
Baker, Travis Edward. "Genetics, drugs, and cognitive control: uncovering individual differences in substance dependence". Thesis, 2012. http://hdl.handle.net/1828/4265.
Books on the topic "Factored reinforcement learning"
Sallans, Brian. Reinforcement learning for factored Markov decision processes. 2002.
Book chapters on the topic "Factored reinforcement learning"
Sigaud, Olivier, Martin V. Butz, Olga Kozlova, and Christophe Meyer. "Anticipatory Learning Classifier Systems and Factored Reinforcement Learning". In Anticipatory Behavior in Adaptive Learning Systems, 321–33. Berlin, Heidelberg: Springer Berlin Heidelberg, 2009. http://dx.doi.org/10.1007/978-3-642-02565-5_18.
Kozlova, Olga, Olivier Sigaud, and Christophe Meyer. "TeXDYNA: Hierarchical Reinforcement Learning in Factored MDPs". In From Animals to Animats 11, 489–500. Berlin, Heidelberg: Springer Berlin Heidelberg, 2010. http://dx.doi.org/10.1007/978-3-642-15193-4_46.
Kozlova, Olga, Olivier Sigaud, Pierre-Henri Wuillemin, and Christophe Meyer. "Considering Unseen States as Impossible in Factored Reinforcement Learning". In Machine Learning and Knowledge Discovery in Databases, 721–35. Berlin, Heidelberg: Springer Berlin Heidelberg, 2009. http://dx.doi.org/10.1007/978-3-642-04180-8_64.
Degris, Thomas, Olivier Sigaud, and Pierre-Henri Wuillemin. "Exploiting Additive Structure in Factored MDPs for Reinforcement Learning". In Lecture Notes in Computer Science, 15–26. Berlin, Heidelberg: Springer Berlin Heidelberg, 2008. http://dx.doi.org/10.1007/978-3-540-89722-4_2.
Coqueret, Guillaume, and Tony Guida. "Reinforcement learning". In Machine Learning for Factor Investing, 257–72. Boca Raton: Chapman and Hall/CRC, 2023. http://dx.doi.org/10.1201/9781003121596-20.
Klar, M., J. Mertes, M. Glatt, B. Ravani, and J. C. Aurich. "A Holistic Framework for Factory Planning Using Reinforcement Learning". In Proceedings of the 3rd Conference on Physical Modeling for Virtual Manufacturing Systems and Processes, 129–48. Cham: Springer International Publishing, 2023. http://dx.doi.org/10.1007/978-3-031-35779-4_8.
Wu, Tingyao, and Werner Van Leekwijck. "Factor Selection for Reinforcement Learning in HTTP Adaptive Streaming". In MultiMedia Modeling, 553–67. Cham: Springer International Publishing, 2014. http://dx.doi.org/10.1007/978-3-319-04114-8_47.
Alur, Rajeev, Osbert Bastani, Kishor Jothimurugan, Mateo Perez, Fabio Somenzi, and Ashutosh Trivedi. "Policy Synthesis and Reinforcement Learning for Discounted LTL". In Computer Aided Verification, 415–35. Cham: Springer Nature Switzerland, 2023. http://dx.doi.org/10.1007/978-3-031-37706-8_21.
Vitorino, João, Rui Andrade, Isabel Praça, Orlando Sousa, and Eva Maia. "A Comparative Analysis of Machine Learning Techniques for IoT Intrusion Detection". In Foundations and Practice of Security, 191–207. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-08147-7_13.
Hammler, Patric, Nicolas Riesterer, Gang Mu, and Torsten Braun. "Multi-Echelon Inventory Optimization Using Deep Reinforcement Learning". In Quantitative Models in Life Science Business, 73–93. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-11814-2_5.
Conference papers on the topic "Factored reinforcement learning"
Strehl, Alexander L. "Model-Based Reinforcement Learning in Factored-State MDPs". In 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning. IEEE, 2007. http://dx.doi.org/10.1109/adprl.2007.368176.
Sahin, Coskun, Erkin Cilden, and Faruk Polat. "Memory efficient factored abstraction for reinforcement learning". In 2015 IEEE 2nd International Conference on Cybernetics (CYBCONF). IEEE, 2015. http://dx.doi.org/10.1109/cybconf.2015.7175900.
Yao, Hengshuai, Csaba Szepesvari, Bernardo Avila Pires, and Xinhua Zhang. "Pseudo-MDPs and factored linear action models". In 2014 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL). IEEE, 2014. http://dx.doi.org/10.1109/adprl.2014.7010633.
Degris, Thomas, Olivier Sigaud, and Pierre-Henri Wuillemin. "Learning the structure of Factored Markov Decision Processes in reinforcement learning problems". In Proceedings of the 23rd International Conference on Machine Learning (ICML '06). New York, NY, USA: ACM Press, 2006. http://dx.doi.org/10.1145/1143844.1143877.
Wu, Bo, and Yanpeng Feng. "Monte-Carlo Bayesian Reinforcement Learning Using a Compact Factored Representation". In 2017 4th International Conference on Information Science and Control Engineering (ICISCE). IEEE, 2017. http://dx.doi.org/10.1109/icisce.2017.104.
Simão, Thiago D. "Safe and Sample-Efficient Reinforcement Learning Algorithms for Factored Environments". In Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19). California: International Joint Conferences on Artificial Intelligence Organization, 2019. http://dx.doi.org/10.24963/ijcai.2019/919.
Kroon, Mark, and Shimon Whiteson. "Automatic Feature Selection for Model-Based Reinforcement Learning in Factored MDPs". In 2009 International Conference on Machine Learning and Applications (ICMLA). IEEE, 2009. http://dx.doi.org/10.1109/icmla.2009.71.
France, Kordel K., and John W. Sheppard. "Factored Particle Swarm Optimization for Policy Co-training in Reinforcement Learning". In GECCO '23: Genetic and Evolutionary Computation Conference. New York, NY, USA: ACM, 2023. http://dx.doi.org/10.1145/3583131.3590376.
Simão, Thiago D., and Matthijs T. J. Spaan. "Structure Learning for Safe Policy Improvement". In Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19). California: International Joint Conferences on Artificial Intelligence Organization, 2019. http://dx.doi.org/10.24963/ijcai.2019/479.
Panda, Swetasudha, and Yevgeniy Vorobeychik. "Scalable Initial State Interdiction for Factored MDPs". In Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18). California: International Joint Conferences on Artificial Intelligence Organization, 2018. http://dx.doi.org/10.24963/ijcai.2018/667.
Reports on the topic "Factored reinforcement learning"
Rinaudo, Christina, William Leonard, Jaylen Hopson, Christopher Morey, Robert Hilborn, and Theresa Coumbe. Enabling understanding of artificial intelligence (AI) agent wargaming decisions through visualizations. Engineer Research and Development Center (U.S.), April 2024. http://dx.doi.org/10.21079/11681/48418.