
Dissertations / Theses on the topic 'Reinforcement Learning'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 dissertations / theses for your research on the topic 'Reinforcement Learning.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Press it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse dissertations / theses on a wide variety of disciplines and organise your bibliography correctly.

1

Izquierdo, Ayala Pablo. "Learning comparison: Reinforcement Learning vs Inverse Reinforcement Learning : How well does inverse reinforcement learning perform in simple markov decision processes in comparison to reinforcement learning?" Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-259371.

Abstract:
This research project develops a qualitative comparison between two learning approaches, Reinforcement Learning (RL) and Inverse Reinforcement Learning (IRL), on the Gridworld Markov Decision Process. The focus is on the second paradigm, IRL, as it is relatively new and little work has been done in this field of study. As observed, RL outperforms IRL, obtaining a correct solution in all the scenarios studied. However, the behaviour of the IRL algorithms can be improved, and this is shown and analyzed as part of the sco…
2

Seymour, B. J. "Aversive reinforcement learning." Thesis, University College London (University of London), 2010. http://discovery.ucl.ac.uk/800107/.

Abstract:
We hypothesise that human aversive learning can be described algorithmically by Reinforcement Learning models. Our first experiment uses a second-order conditioning design to study sequential outcome prediction. We show that aversive prediction errors are expressed robustly in the ventral striatum, supporting the validity of temporal difference algorithms (as in reward learning), and suggesting a putative critical area for appetitive-aversive interactions. With this in mind, the second experiment explores the nature of pain relief, which, as expounded in theories of motivational opponency, is r…
3

Akrour, Riad. "Robust Preference Learning-based Reinforcement Learning." Thesis, Paris 11, 2014. http://www.theses.fr/2014PA112236/document.

Abstract:
The contributions of this thesis centre on sequential decision making and, more specifically, on Reinforcement Learning (RL). Rooted in statistical learning alongside supervised and unsupervised learning, RL has grown in popularity over the last two decades thanks to both applied and theoretical breakthroughs. RL assumes that the agent (the learner) and its environment follow a stochastic Markovian decision process over a space of states and actions. The process is called a decision process because the agent is asked to ch…
4

Tabell, Johnsson Marco, and Ala Jafar. "Efficiency Comparison Between Curriculum Reinforcement Learning & Reinforcement Learning Using ML-Agents." Thesis, Blekinge Tekniska Högskola, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-20218.

5

Yang, Zhaoyuan Yang. "Adversarial Reinforcement Learning for Control System Design: A Deep Reinforcement Learning Approach." The Ohio State University, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=osu152411491981452.

6

Cortesi, Daniele. "Reinforcement Learning in Rogue." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2018. http://amslaurea.unibo.it/16138/.

Abstract:
In this work we use Reinforcement Learning to play the famous Rogue, a dungeon-crawler videogame and father of the rogue-like genre. By employing different algorithms we substantially improve on the results obtained in previous work, addressing and solving the problems that had arisen. We then devise and perform new experiments to test the limits of our solution and encounter additional, unexpected issues in the process. In one of the investigated scenarios we clearly see that our approach is not yet enough to perform even better than a random agent, and we propose ideas for future work.
7

Girgin, Sertan. "Abstraction In Reinforcement Learning." Phd thesis, METU, 2007. http://etd.lib.metu.edu.tr/upload/12608257/index.pdf.

Abstract:
Reinforcement learning is the problem faced by an agent that must learn behavior through trial-and-error interactions with a dynamic environment. Generally, the problem to be solved contains subtasks that repeat at different regions of the state space. Without any guidance, an agent has to learn the solutions of all subtask instances independently, which degrades the learning performance. In this thesis, we propose two approaches that build connections between different regions of the search space, leading to better utilization of gained experience and accelerated learning. In the fir…
8

Suay, Halit Bener. "Reinforcement Learning from Demonstration." Digital WPI, 2016. https://digitalcommons.wpi.edu/etd-dissertations/173.

Abstract:
Off-the-shelf Reinforcement Learning (RL) algorithms suffer from slow learning performance, partly because they are expected to learn a task from scratch merely through an agent's own experience. In this thesis, we show that learning from scratch is a limiting factor for the learning performance, and that when prior knowledge is available RL agents can learn a task faster. We evaluate relevant previous work and our own algorithms in various experiments. Our first contribution is the first implementation and evaluation of an existing interactive RL algorithm in a real-world domain with a human…
9

Gao, Yang. "Argumentation accelerated reinforcement learning." Thesis, Imperial College London, 2014. http://hdl.handle.net/10044/1/26603.

Abstract:
Reinforcement Learning (RL) is a popular statistical Artificial Intelligence (AI) technique for building autonomous agents, but it suffers from the curse of dimensionality: the computational requirement for obtaining the optimal policies grows exponentially with the size of the state space. Integrating heuristics into RL has proven to be an effective approach to combat this curse, but deriving high-quality heuristics from people's (typically conflicting) domain knowledge is challenging, and it has received little research attention. Argumentation theory is a logic-based AI technique well known for…
10

Alexander, John W. "Transfer in reinforcement learning." Thesis, University of Aberdeen, 2015. http://digitool.abdn.ac.uk:80/webclient/DeliveryManager?pid=227908.

Abstract:
The problem of developing skill repertoires autonomously in robotics and artificial intelligence is becoming ever more pressing. Currently, the issues of how to apply prior knowledge to new situations and which knowledge to apply have not been sufficiently studied. We present a transfer setting where a reinforcement learning agent faces multiple problem-solving tasks drawn from an unknown generative process, where each task has similar dynamics. The task dynamics are changed by varying the transition function between states. The tasks are presented sequentially, with the latest task presente…
11

Leslie, David S. "Reinforcement learning in games." Thesis, University of Bristol, 2004. http://hdl.handle.net/1983/420b3f4b-a8b3-4a65-be23-6d21f6785364.

12

Schneider, Markus. "Reinforcement Learning für Laufroboter." [S.l. : s.n.], 2007. http://nbn-resolving.de/urn:nbn:de:bsz:747-opus-344.

13

Wülfing, Jan [Verfasser], and Martin [Akademischer Betreuer] Riedmiller. "Stable deep reinforcement learning." Freiburg : Universität, 2019. http://d-nb.info/1204826188/34.

14

Zhang, Jingwei [Verfasser], and Wolfram [Akademischer Betreuer] Burgard. "Learning navigation policies with deep reinforcement learning." Freiburg : Universität, 2021. http://d-nb.info/1235325571/34.

15

Rottmann, Axel [Verfasser], and Wolfram [Akademischer Betreuer] Burgard. "Approaches to online reinforcement learning for miniature airships = Online Reinforcement Learning Verfahren für Miniaturluftschiffe." Freiburg : Universität, 2012. http://d-nb.info/1123473560/34.

16

Hengst, Bernhard (Computer Science & Engineering, Faculty of Engineering, UNSW). "Discovering hierarchy in reinforcement learning." Awarded by: University of New South Wales, Computer Science and Engineering, 2003. http://handle.unsw.edu.au/1959.4/20497.

Abstract:
This thesis addresses the open problem of automatically discovering hierarchical structure in reinforcement learning. Current algorithms for reinforcement learning fail to scale as problems become more complex. Many complex environments empirically exhibit hierarchy and can be modeled as interrelated subsystems, each in turn with hierarchic structure. Subsystems are often repetitive in time and space, meaning that they reoccur as components of different tasks or occur multiple times in different circumstances in the environment. A learning agent may sometimes scale to larger problems if it suc…
17

Blixt, Rikard, and Anders Ye. "Reinforcement learning AI to Hive." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-134908.

Abstract:
This report is about the game Hive, a unique board game. We first cover what Hive is, then detail our implementation of it, the issues we ran into during the implementation, and how we solved them. We also attempted to build an AI and, using reinforcement learning, teach it to become good at playing Hive. More precisely, we used two AIs that have no knowledge of Hive other than the game rules. This, however, turned out to be impossible within a reasonable timeframe: our estimate is that it would have to run on an upper-end home computer for at least 140 years…
18

Borgstrand, Richard, and Patrik Servin. "Reinforcement Learning AI till Fightingspel." Thesis, Blekinge Tekniska Högskola, Sektionen för datavetenskap och kommunikation, 2012. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-3113.

Abstract:
The project implemented two fighting-game artificial intelligences (AIs): one non-adaptive, more deterministic AI, and one adaptive, dynamic AI that uses reinforcement learning. This was done by scripting the behaviour of the AIs in a free 2D fighting-game engine called "MUGEN". The AIs use scripted sequences executed through MUGEN's own trigger and state system. This system checks whether the scripted, specified conditions are fulfilled for the AI to "trigger", that is, to perform the designated action. The more static AI was built…
19

Arnekvist, Isac. "Reinforcement learning for robotic manipulation." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-216386.

Abstract:
Reinforcement learning was recently used successfully for real-world robotic manipulation tasks, without the need for human demonstration, using a normalized advantage function algorithm (NAF). Limitations on the shape of the advantage function, however, raise doubts about what kind of policies can be learned using this method. For similar tasks, convolutional neural networks have been used for pose estimation from images taken with fixed-position cameras. For some applications, however, this might not be a valid assumption. It was also shown that the quality of policies for robotic tasks severely de…
20

Cleland, Benjamin George. "Reinforcement Learning for Racecar Control." The University of Waikato, 2006. http://hdl.handle.net/10289/2507.

Abstract:
This thesis investigates the use of reinforcement learning to learn to drive a racecar in the simulated environment of the Robot Automobile Racing Simulator. Real-life race driving is known to be difficult for humans, and expert human drivers use complex sequences of actions. There are a large number of variables, some of which change stochastically and all of which may affect the outcome. This makes driving a promising domain for testing and developing Machine Learning techniques that have the potential to be robust enough to work in the real world. Therefore the principles of the algorithm…
21

Kim, Min Sub (Computer Science & Engineering, Faculty of Engineering, UNSW). "Reinforcement learning by incremental patching." Awarded by: University of New South Wales, 2007. http://handle.unsw.edu.au/1959.4/39716.

Abstract:
This thesis investigates how an autonomous reinforcement learning agent can improve on an approximate solution by augmenting it with a small patch, which overrides the approximate solution at certain states of the problem. In reinforcement learning, many approximate solutions are smaller and easier to produce than 'flat' solutions that maintain distinct parameters for each fully enumerated state, but the best solution within the constraints of the approximation may fall well short of global optimality. This thesis proposes that the remaining gap to global optimality can be efficiently mini…
22

Patrascu, Relu-Eugen. "Adaptive exploration in reinforcement learning." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1999. http://www.collectionscanada.ca/obj/s4/f2/dsk2/ftp01/MQ35921.pdf.

23

Li, Jingxian. "Reinforcement learning using sensorimotor traces." Thesis, University of British Columbia, 2013. http://hdl.handle.net/2429/45590.

Abstract:
The skilled motions of humans and animals are the result of learning good solutions to difficult sensorimotor control problems. This thesis explores new models for using reinforcement learning to acquire motion skills, with potential applications to computer animation and robotics. Reinforcement learning offers a principled methodology for tackling control problems. However, it is difficult to apply in high-dimensional settings, such as the ones that we wish to explore, where the body can have many degrees of freedom, the environment can have significant complexity, and there can be fur…
24

Rummery, Gavin Adrian. "Problem solving with reinforcement learning." Thesis, University of Cambridge, 1995. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.363828.

25

McCabe, Jonathan Aiden. "Reinforcement learning in virtual reality." Thesis, University of Cambridge, 2010. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.608852.

26

Budhraja, Karan Kumar. "Neuroevolution Based Inverse Reinforcement Learning." Thesis, University of Maryland, Baltimore County, 2016. http://pqdtopen.proquest.com/#viewpdf?dispub=10140581.

Abstract:
Motivated by learning in nature, the problem of Learning from Demonstration is targeted at learning to perform tasks based on observed examples. One of the approaches to Learning from Demonstration is Inverse Reinforcement Learning, in which actions are observed to infer rewards. This work combines a feature-based state evaluation approach to Inverse Reinforcement Learning with neuroevolution, a paradigm for modifying neural networks based on their performance on a given task. Neural networks are used to learn from a demonstrated expert policy and are evolved to generate a policy simi…
27

Piano, Francesco. "Deep Reinforcement Learning con PyTorch." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2022. http://amslaurea.unibo.it/25340/.

Abstract:
Reinforcement Learning is a research field of Machine Learning in which an agent solves problems by choosing the most suitable action to perform through an iterative learning process, in a dynamic environment that provides incentives through rewards. Deep Learning, another Machine Learning approach, exploits an artificial neural network to apply representation-learning methods, with the aim of obtaining a data structure better suited to processing. Only recently has Deep Reinforcement Learning, created…
28

Kozlova, Olga. "Hierarchical and factored reinforcement learning." Paris 6, 2010. http://www.theses.fr/2010PA066196.

Abstract:
Hierarchical and factored reinforcement learning (HFRL) methods are based on the formalism of factored Markov decision processes (FMDPs) and hierarchical MDPs (HMDPs). In this thesis, we propose an HFRL method that uses indirect reinforcement learning approaches and the options formalism to solve decision-making problems in dynamic environments without prior knowledge of the problem structure. In the first contribution of this thesis, we show how to model problems where certain combina…
29

Blows, Curtly. "Reinforcement learning for telescope optimisation." Master's thesis, Faculty of Science, 2019. http://hdl.handle.net/11427/31352.

Abstract:
Reinforcement learning is a relatively new and unexplored branch of machine learning with a wide variety of applications. This study investigates reinforcement learning and provides an overview of its application to a variety of different problems. We then explore the possible use of reinforcement learning for telescope target selection and scheduling in astronomy, with the hope of effectively mimicking the choices made by professional astronomers. This is relevant as next-generation astronomy surveys will require near real-time decision making in response to high-speed transient discoveries. We…
30

Stigenberg, Jakob. "Scheduling using Deep Reinforcement Learning." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-284506.

Abstract:
As radio networks have continued to evolve in recent decades, so have their complexity and the difficulty of efficiently utilizing the available resources. In a cellular network, the scheduler controls the allocation of time, frequency and spatial resources to users in both uplink and downlink directions. The scheduler is therefore a key component in terms of efficient usage of network resources. Although the scope and characteristics of the resources available to schedulers are well defined in network standards, e.g. Long-Term Evolution or New Radio, their real implementation is not. Most previous work f…
31

Jesu, Alberto. "Reinforcement learning over encrypted data." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2021. http://amslaurea.unibo.it/23257/.

Abstract:
Reinforcement learning is a particular paradigm of machine learning that has recently proved, time and time again, to be a very effective and powerful approach. Cryptography, on the other hand, usually takes the opposite direction. While machine learning aims at analyzing data, cryptography aims at maintaining its privacy by hiding such data. However, the two techniques can be used jointly to create privacy-preserving models, able to make inferences on the data without leaking sensitive information. Despite the numerous studies performed on machine learning and cryptography, reinfor…
32

Suggs, Sterling. "Reinforcement Learning with Auxiliary Memory." BYU ScholarsArchive, 2021. https://scholarsarchive.byu.edu/etd/9028.

Abstract:
Deep reinforcement learning algorithms typically require vast amounts of data to train to a useful level of performance. Each time new data is encountered, the network must inefficiently update all of its parameters. Auxiliary memory units can help deep neural networks train more efficiently by separating computation from storage and providing a means to rapidly store and retrieve precise information. We present four deep reinforcement learning models augmented with external memory, and benchmark their performance on ten tasks from the Arcade Learning Environment. Our discussion and insights…
33

Liu, Chong. "Reinforcement learning with time perception." Thesis, University of Manchester, 2012. https://www.research.manchester.ac.uk/portal/en/theses/reinforcement-learning-with-time-perception(a03580bd-2dd6-4172-a061-90e8ac3022b8).html.

Abstract:
Classical value-estimation reinforcement learning algorithms do not perform very well in dynamic environments. On the other hand, the reinforcement learning of animals is quite flexible: they can adapt to dynamic environments very quickly and deal with noisy inputs very effectively. One feature that may contribute to animals' good performance in dynamic environments is that they learn and perceive the time to reward. In this research, we attempt to learn and perceive the time to reward and explore situations where the learned time information can be used to improve the performance of the learn…
34

Tluk, von Toschanowitz Katharina. "Relevance determination in reinforcement learning." Tönning Lübeck Marburg Der Andere Verl, 2009. http://d-nb.info/993341128/04.

35

Bonneau, Maxime. "Reinforcement Learning for 5G Handover." Thesis, Linköpings universitet, Statistik och maskininlärning, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-140816.

Abstract:
The development of the 5G network is in progress, and one part of the process that needs to be optimised is the handover. This operation, which consists of changing the base station (BS) providing data to a user equipment (UE), needs to be efficient enough to be a seamless operation. From the BS point of view, this operation should be as economical as possible while satisfying the UE's needs. In this thesis, the problem of 5G handover has been addressed, and the chosen tool to solve this problem is reinforcement learning. A review of the different methods proposed by reinforcement learning led to…
36

Ovidiu, Chelcea Vlad, and Björn Ståhl. "Deep Reinforcement Learning for Snake." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-239362.

Abstract:
The world has recently seen a large increase in research, development, and layman use of machine learning. Machine learning has a broad application domain, e.g., in marketing, production and finance. Although these applications have a predetermined set of rules or goals, this project deals with another aspect of machine learning: general intelligence. During the course of the project, a non-human player (known as an agent) learns how to play the game SNAKE without any outside influence or knowledge of the environment dynamics. After having the agent train for 66 hours and almost…
37

Edlund, Joar, and Jack Jönsson. "Reinforcement Learning for Video Games." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-239363.

Abstract:
We present an implementation of a specific type of deep reinforcement learning algorithm known as deep Q-learning. With a Convolutional Neural Network (CNN) combined with our Q-learning algorithm, we trained an agent to play the game of Snake. The input to the CNN is the raw pixel values from the Snake environment, and the output is a value function which estimates future rewards for different actions. We implemented the Q-learning algorithm on a grid-based and a pixel-based representation of the Snake environment and found that the algorithm can perform at human level on smaller grid-based repr…
38

Magnusson, Björn, and Måns Forslund. "SAFE AND EFFICIENT REINFORCEMENT LEARNING." Thesis, Örebro universitet, Institutionen för naturvetenskap och teknik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:oru:diva-76588.

Abstract:
Pre-programming a robot may be efficient to some extent, but since a human has to code the robot, it will only be as efficient as its programming. This problem can be solved by using machine learning, which lets the robot learn the most efficient way by itself. This thesis is a continuation of a previous work that covered the development of the framework Safe-To-Explore-State-Spaces (STESS) for safe robot manipulation. This thesis evaluates the efficiency of Q-Learning with normalized advantage function (NAF), a deep reinforcement learning algorithm, when integrated with the safety framework ST…
39

Liu, Bai S. M. Massachusetts Institute of Technology. "Reinforcement learning in network control." Thesis, Massachusetts Institute of Technology, 2019. https://hdl.handle.net/1721.1/122414.

Abstract:
S.M. thesis, Massachusetts Institute of Technology, Department of Aeronautics and Astronautics, 2019. With the rapid growth of information technology, network systems have become increasingly complex. In particular, designing network control policies requires knowledge of underlying network dynamics, which are often unknown and need to be learned. Existing reinforcement learning methods such as Q-Learning, Actor-Critic, etc. are heuristic and do not offer performance guarantees. In contrast, mode…
40

Garcelon, Evrard. "Constrained Exploration in Reinforcement Learning." Electronic Thesis or Diss., Institut polytechnique de Paris, 2022. http://www.theses.fr/2022IPPAG007.

Abstract:
A major application of automated machine learning is the personalization of the content recommended to different users. Generally, the algorithms underlying these systems are said to be supervised, that is, the data used during the learning phase are assumed to come from the same distribution. However, these data are generated by interactions between a user and these very algorithms. Thus, the recommendations for a user at a time t can modify the set of relevant recommendations at a later…
41

Wei, Ermo. "Learning to Play Cooperative Games via Reinforcement Learning." Thesis, George Mason University, 2019. http://pqdtopen.proquest.com/#viewpdf?dispub=13420351.

Abstract:
Being able to accomplish tasks with multiple learners through learning has long been a goal of the multiagent systems and machine learning communities. One of the main approaches people have taken is reinforcement learning, but due to certain conditions and restrictions, applying reinforcement learning in a multiagent setting has not achieved the same level of success as its single-agent counterparts. This thesis aims to improve coordination for agents in cooperative games by improving reinforcement learning algorithms in several ways. I begin by examining ce…
42

Stachenfeld, Kimberly. "Learning Neural Representations that Support Efficient Reinforcement Learning." Thesis, Princeton University, 2018. http://pqdtopen.proquest.com/#viewpdf?dispub=10824319.

Abstract:
RL has been transformative for neuroscience by providing a normative anchor for interpreting neural and behavioral data. End-to-end RL methods have scored impressive victories with minimal compromises in autonomy, hand-engineering, and generality. The cost of this minimalism in practice is that model-free RL methods are slow to learn and generalize poorly. Humans and animals exhibit substantially better flexibility and rapidly generalize learned information to new environments by learning invariants of the environment and features of the environment that support fast learning and rapid transfe…
43

Effraimidis, Dimitros. "Computation approaches for continuous reinforcement learning problems." Thesis, University of Westminster, 2016. https://westminsterresearch.westminster.ac.uk/item/q0y82/computation-approaches-for-continuous-reinforcement-learning-problems.

Abstract:
Optimisation theory is at the heart of any control process, where we seek to control the behaviour of a system through a set of actions. Linear control problems have been extensively studied, and optimal control laws have been identified. But the world around us is highly non-linear and unpredictable. For these dynamic systems, which do not possess the nice mathematical properties of their linear counterparts, classic control theory breaks down and other methods have to be employed. But nature thrives by optimising non-linear and over-complicated systems. Evolutionary Computing (EC) methods exploit…
44

Le, Piane Fabio. "Training cognitivo adattativo mediante Reinforcement Learning." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2018. http://amslaurea.unibo.it/17289/.

Abstract:
Multiple sclerosis (MS) is an autoimmune disease that affects the central nervous system, causing various organic and functional alterations. In particular, a significant percentage of patients develop deficits in different cognitive domains. To limit the progression of these deficits, specialist teams have devised protocols for cognitive rehabilitation. To attend rehabilitation sessions, patients must travel to specialized clinics, requiring the assistance of qualified personnel and carrying out the exercises with pen and paper. Subsequently, …
45

Mariani, Tommaso. "Deep reinforcement learning for industrial applications." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2020. http://amslaurea.unibo.it/20548/.

Full text
Abstract:
In recent years there has been growing attention from the research world and from companies in the field of Machine Learning. This interest, thanks mainly to the increasing availability of large amounts of data and the corresponding strengthening of the hardware needed to analyse them, has led to the birth of Deep Learning. The growing computing capacity and the use of mathematical optimization techniques, already studied in depth but with few applications due to limited computational power, have then allowed the development of a new approach called Reinforcement Learning. This thesis w
APA, Harvard, Vancouver, ISO, and other styles
46

Rossi, Martina. "Opponent Modelling using Inverse Reinforcement Learning." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2021. http://amslaurea.unibo.it/22263/.

Full text
Abstract:
A particularly active research area in artificial intelligence (AI) lately concerns the study of autonomous agents, which are increasingly widespread in everyday life as well. The main goal is to develop agents that interact efficiently with other agents or with humans. These interactions could be greatly simplified by the ability to autonomously infer the preferences of other entities and to adapt the agent's strategy accordingly. The aim of this thesis is therefore to implement a learning agent that
APA, Harvard, Vancouver, ISO, and other styles
47

Borga, Magnus. "Reinforcement Learning Using Local Adaptive Models." Licentiate thesis, Linköping University, Computer Vision, 1995. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-53352.

Full text
Abstract:
In this thesis, the theory of reinforcement learning is described and its relation to learning in biological systems is discussed. Some basic issues in reinforcement learning, the credit assignment problem and perceptual aliasing, are considered. The methods of temporal difference are described. Three important design issues are discussed: information representation and system architecture, rules for improving the behaviour and rules for the reward mechanisms. The use of local adaptive models in reinforcement learning is suggested and exemplified by some experiments. This idea is behind all
APA, Harvard, Vancouver, ISO, and other styles
48

Mastour, Eshgh Somayeh Sadat. "Distributed Reinforcement Learning for Overlay Networks." Thesis, KTH, Skolan för informations- och kommunikationsteknik (ICT), 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-92131.

Full text
Abstract:
In this thesis, we study Collaborative Reinforcement Learning (CRL) in the context of Information Retrieval in unstructured distributed systems. Collaborative reinforcement learning is an extension of reinforcement learning that supports multiple agents which both share value functions and cooperate to solve tasks. Specifically, we propose and develop an algorithm for searching in peer-to-peer systems using collaborative reinforcement learning. We present a search technique that achieves higher performance than currently available techniques, but is straightforward and practical enough to be eas
APA, Harvard, Vancouver, ISO, and other styles
49

Humphrys, Mark. "Action selection methods using reinforcement learning." Thesis, University of Cambridge, 1996. https://www.repository.cam.ac.uk/handle/1810/252269.

Full text
Abstract:
The Action Selection problem is the problem of run-time choice between conflicting and heterogeneous goals, a central problem in the simulation of whole creatures (as opposed to the solution of isolated uninterrupted tasks). This thesis argues that Reinforcement Learning has been overlooked in the solution of the Action Selection problem. Considering a decentralised model of mind, with internal tension and competition between selfish behaviors, this thesis introduces an algorithm called "W-learning", whereby different parts of the mind modify their behavior based on whether or not they are suc
APA, Harvard, Vancouver, ISO, and other styles
50

Namvar, Gharehshiran Omid. "Reinforcement learning in non-stationary games." Thesis, University of British Columbia, 2015. http://hdl.handle.net/2429/51993.

Full text
Abstract:
The unifying theme of this thesis is the design and analysis of adaptive procedures that are aimed at learning the optimal decision in the presence of uncertainty. The first part is devoted to strategic decision making involving multiple individuals with conflicting interests. This is the subject of non-cooperative game theory. The proliferation of social networks has led to new ways of sharing information. Individuals subscribe to social groups, in which their experiences are shared. These new information patterns facilitate the resolution of uncertainties. We present an adaptive learning alg
APA, Harvard, Vancouver, ISO, and other styles