Academic literature on the topic 'Reinforcement learning (Machine learning)'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Reinforcement learning (Machine learning).'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Journal articles on the topic "Reinforcement learning (Machine learning)"

1

Ishii, Shin, and Wako Yoshida. "Part 4: Reinforcement learning: Machine learning and natural learning." New Generation Computing 24, no. 3 (September 2006): 325–50. http://dx.doi.org/10.1007/bf03037338.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Wang, Zizhuang. "Temporal-Related Convolutional-Restricted-Boltzmann-Machine Capable of Learning Relational Order via Reinforcement Learning Procedure." International Journal of Machine Learning and Computing 7, no. 1 (February 2017): 1–8. http://dx.doi.org/10.18178/ijmlc.2017.7.1.610.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Butlin, Patrick. "Machine Learning, Functions and Goals." Croatian Journal of Philosophy 22, no. 66 (December 27, 2022): 351–70. http://dx.doi.org/10.52685/cjp.22.66.5.

Full text
Abstract:
Machine learning researchers distinguish between reinforcement learning and supervised learning and refer to reinforcement learning systems as “agents”. This paper vindicates the claim that systems trained by reinforcement learning are agents while those trained by supervised learning are not. Systems of both kinds satisfy Dretske’s criteria for agency, because they both learn to produce outputs selectively in response to inputs. However, reinforcement learning is sensitive to the instrumental value of outputs, giving rise to systems which exploit the effects of outputs on subsequent inputs to achieve good performance over episodes of interaction with their environments. Supervised learning systems, in contrast, merely learn to produce better outputs in response to individual inputs.
APA, Harvard, Vancouver, ISO, and other styles
4

Martín-Guerrero, José D., and Lucas Lamata. "Reinforcement Learning and Physics." Applied Sciences 11, no. 18 (September 16, 2021): 8589. http://dx.doi.org/10.3390/app11188589.

Full text
Abstract:
Machine learning techniques provide a remarkable tool for advancing scientific research, and this area has grown significantly in the past few years. In particular, reinforcement learning, an approach that maximizes a (long-term) reward by means of the actions taken by an agent in a given environment, can allow one to optimize scientific discovery in a variety of fields such as physics, chemistry, and biology. Moreover, physical systems, in particular quantum systems, may allow for more efficient reinforcement learning protocols. In this review, we describe recent results in the field of reinforcement learning and physics. We include standard reinforcement learning techniques in the computer science community for enhancing physics research, as well as the more recent and emerging area of quantum reinforcement learning, inside quantum machine learning, for improving reinforcement learning computations. (A minimal illustrative sketch of such an agent-environment learning loop follows this entry.)
APA, Harvard, Vancouver, ISO, and other styles
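The abstract above characterizes reinforcement learning as maximizing a (long-term) reward through the actions an agent takes in a given environment. As a point of reference, a minimal tabular Q-learning loop is sketched below; it is a generic illustration, not code from the cited review, and it assumes a Gymnasium-style environment interface (reset()/step(), a small discrete action space, hashable states).

```python
import random
from collections import defaultdict

def q_learning(env, episodes=500, alpha=0.1, gamma=0.99, epsilon=0.1):
    """Minimal tabular Q-learning. `env` is assumed to follow the Gymnasium API:
    reset() -> (state, info); step(a) -> (state, reward, terminated, truncated, info)."""
    q = defaultdict(float)  # (state, action) -> estimated return

    for _ in range(episodes):
        state, _ = env.reset()
        done = False
        while not done:
            # Epsilon-greedy exploration over the discrete action set.
            if random.random() < epsilon:
                action = env.action_space.sample()
            else:
                action = max(range(env.action_space.n), key=lambda a: q[(state, a)])

            next_state, reward, terminated, truncated, _ = env.step(action)
            done = terminated or truncated

            # One-step temporal-difference update toward the bootstrapped target.
            best_next = max(q[(next_state, a)] for a in range(env.action_space.n))
            target = reward + gamma * best_next * (not terminated)
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = next_state
    return q
```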
5

Liu, Yicen, Yu Lu, Xi Li, Wenxin Qiao, Zhiwei Li, and Donghao Zhao. "SFC Embedding Meets Machine Learning: Deep Reinforcement Learning Approaches." IEEE Communications Letters 25, no. 6 (June 2021): 1926–30. http://dx.doi.org/10.1109/lcomm.2021.3061991.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Popkov, Yuri S., Yuri A. Dubnov, and Alexey Yu Popkov. "Reinforcement Procedure for Randomized Machine Learning." Mathematics 11, no. 17 (August 23, 2023): 3651. http://dx.doi.org/10.3390/math11173651.

Full text
Abstract:
This paper is devoted to problem-oriented reinforcement methods for the numerical implementation of Randomized Machine Learning. We have developed a scheme of the reinforcement procedure based on the agent approach and Bellman’s optimality principle. This procedure ensures strictly monotonic properties of a sequence of local records in the iterative computational procedure of the learning process. The dependences of the dimensions of the neighborhood of the global minimum and the probability of its achievement on the parameters of the algorithm are determined. The convergence of the algorithm with the indicated probability to the neighborhood of the global minimum is proved.
APA, Harvard, Vancouver, ISO, and other styles
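For reference, the Bellman optimality principle invoked in the abstract above is commonly written, for a Markov decision process with transition kernel P, reward R, and discount factor γ, as the fixed-point conditions

```latex
V^{*}(s) = \max_{a} \sum_{s'} P(s' \mid s, a)\,\bigl[ R(s,a,s') + \gamma\, V^{*}(s') \bigr],
\qquad
Q^{*}(s,a) = \sum_{s'} P(s' \mid s, a)\,\bigl[ R(s,a,s') + \gamma \max_{a'} Q^{*}(s',a') \bigr].
```

These are the standard textbook forms, given here only for orientation; the cited paper develops its own reinforcement procedure on top of this principle.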
7

Crawford, Daniel, Anna Levit, Navid Ghadermarzy, Jaspreet S. Oberoi, and Pooya Ronagh. "Reinforcement learning using quantum Boltzmann machines." Quantum Information and Computation 18, no. 1&2 (February 2018): 51–74. http://dx.doi.org/10.26421/qic18.1-2-3.

Full text
Abstract:
We investigate whether quantum annealers with select chip layouts can outperform classical computers in reinforcement learning tasks. We associate a transverse field Ising spin Hamiltonian with a layout of qubits similar to that of a deep Boltzmann machine (DBM) and use simulated quantum annealing (SQA) to numerically simulate quantum sampling from this system. We design a reinforcement learning algorithm in which the set of visible nodes representing the states and actions of an optimal policy are the first and last layers of the deep network. In absence of a transverse field, our simulations show that DBMs are trained more effectively than restricted Boltzmann machines (RBM) with the same number of nodes. We then develop a framework for training the network as a quantum Boltzmann machine (QBM) in the presence of a significant transverse field for reinforcement learning. This method also outperforms the reinforcement learning method that uses RBMs.
APA, Harvard, Vancouver, ISO, and other styles
8

Lamata, Lucas. "Quantum Reinforcement Learning with Quantum Photonics." Photonics 8, no. 2 (January 28, 2021): 33. http://dx.doi.org/10.3390/photonics8020033.

Full text
Abstract:
Quantum machine learning has emerged as a promising paradigm that could accelerate machine learning calculations. Inside this field, quantum reinforcement learning aims at designing and building quantum agents that may exchange information with their environment and adapt to it, with the aim of achieving some goal. Different quantum platforms have been considered for quantum machine learning and specifically for quantum reinforcement learning. Here, we review the field of quantum reinforcement learning and its implementation with quantum photonics. This quantum technology may enhance quantum computation and communication, as well as machine learning, via the fruitful marriage between these previously unrelated fields.
APA, Harvard, Vancouver, ISO, and other styles
9

Sahu, Santosh Kumar, Anil Mokhade, and Neeraj Dhanraj Bokde. "An Overview of Machine Learning, Deep Learning, and Reinforcement Learning-Based Techniques in Quantitative Finance: Recent Progress and Challenges." Applied Sciences 13, no. 3 (February 2, 2023): 1956. http://dx.doi.org/10.3390/app13031956.

Full text
Abstract:
Forecasting the behavior of the stock market is a classic but difficult topic, one that has attracted the interest of both economists and computer scientists. Over the course of the last couple of decades, researchers have investigated linear models as well as models that are based on machine learning (ML), deep learning (DL), reinforcement learning (RL), and deep reinforcement learning (DRL) in order to create an accurate predictive model. Machine learning algorithms can now extract high-level financial market data patterns. Investors are using deep learning models to anticipate and evaluate stock and foreign exchange markets due to the advantage of artificial intelligence. Recent years have seen a proliferation of the deep reinforcement learning algorithm’s application in algorithmic trading. DRL agents, which combine price prediction and trading signal production, have been used to construct several completely automated trading systems or strategies. Our objective is to enable interested researchers to stay current and easily imitate earlier findings. In this paper, we have worked to explain the utility of Machine Learning, Deep Learning, Reinforcement Learning, and Deep Reinforcement Learning in Quantitative Finance (QF) and the Stock Market. We also outline potential future study paths in this area based on the overview that was presented before.
APA, Harvard, Vancouver, ISO, and other styles
10

Fang, Qiang, Wenzhuo Zhang, and Xitong Wang. "Visual Navigation Using Inverse Reinforcement Learning and an Extreme Learning Machine." Electronics 10, no. 16 (August 18, 2021): 1997. http://dx.doi.org/10.3390/electronics10161997.

Full text
Abstract:
In this paper, we focus on the challenges of training efficiency, the designation of reward functions, and generalization in reinforcement learning for visual navigation and propose a regularized extreme learning machine-based inverse reinforcement learning approach (RELM-IRL) to improve the navigation performance. Our contributions are mainly three-fold: First, a framework combining extreme learning machine with inverse reinforcement learning is presented. This framework can improve the sample efficiency and obtain the reward function directly from the image information observed by the agent and improve the generation for the new target and the new environment. Second, the extreme learning machine is regularized by multi-response sparse regression and the leave-one-out method, which can further improve the generalization ability. Simulation experiments in the AI-THOR environment showed that the proposed approach outperformed previous end-to-end approaches, thus, demonstrating the effectiveness and efficiency of our approach.
APA, Harvard, Vancouver, ISO, and other styles
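The approach above combines inverse reinforcement learning with a regularized extreme learning machine. As a rough illustration of the basic ELM ingredient only (a random, untrained hidden layer plus a closed-form regularized readout), here is a generic NumPy sketch; it uses plain ridge regression rather than the multi-response sparse regression and leave-one-out regularization of the cited paper, and it is not the RELM-IRL implementation.

```python
import numpy as np

def train_elm(X, y, hidden_units=100, ridge=1e-2, seed=0):
    """Fit a basic extreme learning machine: random hidden layer, ridge readout.
    X: (n_samples, n_features) inputs; y: (n_samples,) or (n_samples, n_outputs) targets."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], hidden_units))  # random input weights, never trained
    b = rng.normal(size=hidden_units)                # random biases
    H = np.tanh(X @ W + b)                           # hidden-layer activations
    # Output weights in closed form (regularized least squares, no backpropagation).
    beta = np.linalg.solve(H.T @ H + ridge * np.eye(hidden_units), H.T @ y)
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta
```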

Dissertations / Theses on the topic "Reinforcement learning (Machine learning)"

1

Hengst, Bernhard (Computer Science & Engineering, Faculty of Engineering, UNSW). "Discovering hierarchy in reinforcement learning." Awarded by: University of New South Wales, Computer Science and Engineering, 2003. http://handle.unsw.edu.au/1959.4/20497.

Full text
Abstract:
This thesis addresses the open problem of automatically discovering hierarchical structure in reinforcement learning. Current algorithms for reinforcement learning fail to scale as problems become more complex. Many complex environments empirically exhibit hierarchy and can be modeled as interrelated subsystems, each in turn with hierarchic structure. Subsystems are often repetitive in time and space, meaning that they reoccur as components of different tasks or occur multiple times in different circumstances in the environment. A learning agent may sometimes scale to larger problems if it successfully exploits this repetition. Evidence suggests that a bottom-up approach, which repetitively finds building blocks at one level of abstraction and uses them as background knowledge at the next level of abstraction, makes learning in many complex environments tractable. An algorithm called HEXQ is described that automatically decomposes and solves a multi-dimensional Markov decision problem (MDP) by constructing a multi-level hierarchy of interlinked subtasks without being given the model beforehand. The effectiveness and efficiency of the HEXQ decomposition depend largely on the choice of representation in terms of the variables, their temporal relationship, and whether the problem exhibits a type of constrained stochasticity. The algorithm is first developed for stochastic shortest path problems and then extended to infinite horizon problems. The operation of the algorithm is demonstrated using a number of examples including a taxi domain, various navigation tasks, the Towers of Hanoi, and a larger sporting problem. The main contributions of the thesis are the automation of (1) decomposition, (2) sub-goal identification, and (3) discovery of hierarchical structure for MDPs with states described by a number of variables or features. It points the way to further scaling opportunities that encompass approximations, partial observability, selective perception, relational representations and planning. The longer-term research aim is to train rather than program intelligent agents.
APA, Harvard, Vancouver, ISO, and other styles
2

Tabell Johnsson, Marco, and Ala Jafar. "Efficiency Comparison Between Curriculum Reinforcement Learning & Reinforcement Learning Using ML-Agents." Thesis, Blekinge Tekniska Högskola, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-20218.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Akrour, Riad. "Robust Preference Learning-based Reinforcement Learning." Thesis, Paris 11, 2014. http://www.theses.fr/2014PA112236/document.

Full text
Abstract:
The contributions of the thesis revolve around sequential decision making and more precisely Reinforcement Learning (RL). Rooted in Machine Learning in the same way as supervised and unsupervised learning, RL has quickly grown in popularity over the last two decades thanks to achievements on both the theoretical and applicative fronts. RL supposes that the learning agent and its environment follow a stochastic Markovian decision process over a state and action space. The process is a decision process because the agent must choose, at each time step, an action to take. It is stochastic because selecting a given action in a given state does not systematically yield the same next state but rather defines a distribution over the state space. It is Markovian because this distribution depends only on the current state-action pair. As a consequence of choosing an action, the agent receives a reward. The goal of RL is then to solve the underlying optimization problem of finding the behaviour that maximizes the sum of rewards over the agent's interaction with its environment. From an applicative point of view, a large spectrum of problems can be cast as RL problems, from Backgammon (TD-Gammon, one of the first great successes of RL and of Machine Learning in general, giving rise to a world-class player) to decision problems in the industrial and medical worlds. However, the optimization problem solved by RL depends on the prior definition of an adequate reward function, which requires a certain level of domain expertise as well as knowledge of the internal workings of RL algorithms. As such, the first contribution of the thesis is a new learning framework that lightens the requirements made of the user. The user no longer needs to know the exact solution of the problem, only to be able to indicate, between two behaviours exhibited by the agent, which one comes closer to the solution. Learning proceeds interactively between the user and the agent, and revolves around the following three points: i) the agent demonstrates a new behaviour, ii) the user compares it to the best behaviour so far, iii) the agent uses this feedback to update its preference model of the user and chooses the next behaviour to demonstrate. To reduce the number of interactions required before the agent finds the optimal behaviour, the second contribution of the thesis is a theoretically sound criterion that makes the trade-off between the sometimes contradictory desires of complying with the user's preferences and of demonstrating behaviours sufficiently different from those already proposed. The last contribution of the thesis is to ensure the robustness of the algorithm with respect to the evaluation errors that the user might make, which happen often in practice, especially in the initial phase of the interaction, when all the behaviours proposed by the agent are far from the expected solution. (A generic sketch of this three-step interaction loop follows this entry.)
APA, Harvard, Vancouver, ISO, and other styles
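The three-step interaction described in the abstract above (the agent demonstrates a behaviour, the user compares it with the best one so far, the agent updates its preference model and proposes the next behaviour) can be sketched generically as below. The callback names and the loop structure are illustrative assumptions, not the thesis implementation.

```python
def preference_based_rl(initial_behaviour, select_next_candidate,
                        user_prefers, update_preference_model, n_rounds=20):
    """Generic interactive preference-learning loop.

    initial_behaviour                           -- a starting behaviour (e.g. a policy rollout)
    select_next_candidate(best)                 -- proposes a new behaviour, trading off the
                                                   learned preferences against diversity
    user_prefers(candidate, best)               -- True if the user prefers `candidate`
    update_preference_model(a, b, a_preferred)  -- refines the agent's model of the user
    """
    best = initial_behaviour
    for _ in range(n_rounds):
        candidate = select_next_candidate(best)                    # i) exhibit a new behaviour
        candidate_wins = user_prefers(candidate, best)             # ii) expert gives a comparison
        update_preference_model(candidate, best, candidate_wins)   # iii) update the preference model
        if candidate_wins:
            best = candidate          # the preferred behaviour becomes the new reference
    return best
```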
4

Lee, Siu-keung (李少強). "Reinforcement learning for intelligent assembly automation." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2002. http://hub.hku.hk/bib/B31244397.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Tebbifakhr, Amirhossein. "Machine Translation For Machines." Doctoral thesis, Università degli studi di Trento, 2021. http://hdl.handle.net/11572/320504.

Full text
Abstract:
Traditionally, Machine Translation (MT) systems are developed by targeting fluency (i.e. output grammaticality) and adequacy (i.e. semantic equivalence with the source text) criteria that reflect the needs of human end-users. However, recent advancements in Natural Language Processing (NLP) and the introduction of NLP tools in commercial services have opened new opportunities for MT. A particularly relevant one is related to the application of NLP technologies in low-resource language settings, for which the paucity of training data reduces the possibility to train reliable services. In this specific condition, MT can come into play by enabling the so-called “translation-based” workarounds. The idea is simple: first, input texts in the low-resource language are translated into a resource-rich target language; then, the machine-translated text is processed by well-trained NLP tools in the target language; finally, the output of these downstream components is projected back to the source language. This results in a new scenario, in which the end-user of MT technology is no longer a human but another machine. We hypothesize that current MT training approaches are not the optimal ones for this setting, in which the objective is to maximize the performance of a downstream tool fed with machine-translated text rather than human comprehension. Under this hypothesis, this thesis introduces a new research paradigm, which we named “MT for machines”, addressing a number of questions that arise from this novel view of the MT problem. Are there different quality criteria for humans and machines? What makes a good translation from the machine standpoint? What are the trade-offs between the two notions of quality? How to pursue machine-oriented objectives? How to serve different downstream components with a single MT system? How to exploit knowledge transfer to operate in different language settings with a single MT system? Elaborating on these questions, this thesis: i) introduces a novel and challenging MT paradigm, ii) proposes an effective method based on Reinforcement Learning, analysing its possible variants, iii) extends the proposed method to multitask and multilingual settings so as to serve different downstream applications and languages with a single MT system, iv) studies the trade-off between machine-oriented and human-oriented criteria, and v) discusses the successful application of the approach in two real-world scenarios.
APA, Harvard, Vancouver, ISO, and other styles
6

Yang, Zhaoyuan Yang. "Adversarial Reinforcement Learning for Control System Design: A Deep Reinforcement Learning Approach." The Ohio State University, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=osu152411491981452.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Scholz, Jonathan. "Physics-based reinforcement learning for autonomous manipulation." Diss., Georgia Institute of Technology, 2015. http://hdl.handle.net/1853/54366.

Full text
Abstract:
With recent research advances, the dream of bringing domestic robots into our everyday lives has become more plausible than ever. Domestic robotics has grown dramatically in the past decade, with applications ranging from house cleaning to food service to health care. To date, the majority of the planning and control machinery for these systems is carefully designed by human engineers. A large portion of this effort goes into selecting the appropriate models and control techniques for each application, and these skills take years to master. Relieving the burden on human experts is therefore a central challenge for bringing robot technology to the masses. This work addresses this challenge by introducing a physics engine as a model space for an autonomous robot, and defining procedures for enabling robots to decide when and how to learn these models. We also present an appropriate space of motor controllers for these models, and introduce ways to intelligently select when to use each controller based on the estimated model parameters. We integrate these components into a framework called Physics-Based Reinforcement Learning, which features a stochastic physics engine as the core model structure. Together these methods enable a robot to adapt to unfamiliar environments without human intervention. The central focus of this thesis is on fast online model learning for objects with under-specified dynamics. We develop our approach across a diverse range of domestic tasks, starting with a simple table-top manipulation task, followed by a mobile manipulation task involving a single utility cart, and finally an open-ended navigation task with multiple obstacles impeding robot progress. We also present simulation results illustrating the efficiency of our method compared to existing approaches in the learning literature.
APA, Harvard, Vancouver, ISO, and other styles
8

Cleland, Andrew Lewis. "Bounding Box Improvement with Reinforcement Learning." PDXScholar, 2018. https://pdxscholar.library.pdx.edu/open_access_etds/4438.

Full text
Abstract:
In this thesis, I explore a reinforcement learning technique for improving bounding box localizations of objects in images. The model takes as input a bounding box already known to overlap an object and aims to improve the fit of the box through a series of transformations that shift the location of the box by translation, or change its size or aspect ratio. Over the course of these actions, the model adapts to new information extracted from the image. This active localization approach contrasts with existing bounding-box regression methods, which extract information from the image only once. I implement, train, and test this reinforcement learning model using data taken from the Portland State Dog-Walking image set. The model balances exploration with exploitation in training using an ε-greedy policy. I find that the performance of the model is sensitive to the ε-greedy configuration used during training, performing best when the epsilon parameter is set to very low values over the course of training. With ε = 0.01, I find the algorithm can improve bounding boxes in about 78% of test cases for the "dog" object category, and 76% for the "human" category. (A generic ε-greedy selection sketch follows this entry.)
APA, Harvard, Vancouver, ISO, and other styles
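The ε-greedy action selection mentioned above fits in a few lines; the sketch below is generic (the action-value representation is an assumption, not the thesis code), with the default ε = 0.01 echoing the low-exploration setting the abstract reports as performing best.

```python
import random

def epsilon_greedy(q_values, epsilon=0.01):
    """Return a random action index with probability epsilon, otherwise the greedy one.
    `q_values` is any sequence of estimated action values (e.g. one per box transformation)."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])
```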
9

Piano, Francesco. "Deep Reinforcement Learning con PyTorch." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2022. http://amslaurea.unibo.it/25340/.

Full text
Abstract:
Reinforcement Learning is a research field within Machine Learning in which an agent solves problems by choosing the most suitable action to take through an iterative learning process, in a dynamic environment that provides incentives through rewards. Deep Learning, likewise a Machine Learning approach, exploits an artificial neural network to apply representation-learning methods with the aim of obtaining a data structure better suited to processing. Only recently has Deep Reinforcement Learning, created by combining these two learning paradigms, made it possible to solve problems previously considered intractable, achieving considerable success and renewing researchers' interest in the application of Reinforcement Learning algorithms. This thesis studies Reinforcement Learning applied to simple problems and then examines how it can overcome its characteristic limits through the use of artificial neural networks, so as to be applied in a Deep Learning context using the PyTorch framework, a library currently widely used for scientific computing and Machine Learning. (A minimal illustrative PyTorch sketch of a deep Q-learning update follows this entry.)
APA, Harvard, Vancouver, ISO, and other styles
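As a minimal, generic illustration of the kind of deep Q-learning update that PyTorch makes convenient (network size, optimizer, and hyperparameters below are arbitrary assumptions, not taken from the thesis):

```python
import torch
import torch.nn as nn

q_net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))  # 4-dim state, 2 actions
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
gamma = 0.99

def td_update(states, actions, rewards, next_states, dones):
    """One gradient step of the Q-network toward the one-step TD target."""
    q_pred = q_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        q_next = q_net(next_states).max(dim=1).values
        target = rewards + gamma * q_next * (1.0 - dones)
    loss = nn.functional.mse_loss(q_pred, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Example call on a random batch of 8 transitions (placeholder data only).
batch = (torch.randn(8, 4), torch.randint(0, 2, (8,)), torch.randn(8),
         torch.randn(8, 4), torch.zeros(8))
td_update(*batch)
```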
10

Suggs, Sterling. "Reinforcement Learning with Auxiliary Memory." BYU ScholarsArchive, 2021. https://scholarsarchive.byu.edu/etd/9028.

Full text
Abstract:
Deep reinforcement learning algorithms typically require vast amounts of data to train to a useful level of performance. Each time new data is encountered, the network must inefficiently update all of its parameters. Auxiliary memory units can help deep neural networks train more efficiently by separating computation from storage, and providing a means to rapidly store and retrieve precise information. We present four deep reinforcement learning models augmented with external memory, and benchmark their performance on ten tasks from the Arcade Learning Environment. Our discussion and insights will be helpful for future RL researchers developing their own memory agents.
APA, Harvard, Vancouver, ISO, and other styles
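As a toy illustration of the "separate computation from storage" idea in the abstract above, the sketch below implements a tiny content-addressable key-value memory read by similarity-weighted attention; it is a generic, assumption-laden example, not one of the four memory-augmented agents benchmarked in the thesis.

```python
import numpy as np

class KeyValueMemory:
    """Toy external memory: fixed number of (key, value) slots, written in ring-buffer
    order and read with softmax attention over cosine similarities to a query key."""
    def __init__(self, slots, key_dim, value_dim):
        self.keys = np.zeros((slots, key_dim))
        self.values = np.zeros((slots, value_dim))
        self.next_slot = 0

    def write(self, key, value):
        i = self.next_slot % len(self.keys)          # overwrite the oldest slot when full
        self.keys[i], self.values[i] = key, value
        self.next_slot += 1

    def read(self, query):
        norms = np.linalg.norm(self.keys, axis=1) * np.linalg.norm(query) + 1e-8
        sims = self.keys @ query / norms             # cosine similarity per slot
        weights = np.exp(sims) / np.exp(sims).sum()  # softmax attention weights
        return weights @ self.values                 # similarity-weighted stored values
```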

Books on the topic "Reinforcement learning (Machine learning)"

1

Sutton, Richard S., ed. Reinforcement learning. Boston: Kluwer Academic Publishers, 1992.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
2

Sutton, Richard S. Reinforcement Learning. Boston, MA: Springer US, 1992.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
3

Kaelbling, Leslie Pack, ed. Recent advances in reinforcement learning. Boston: Kluwer Academic, 1996.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
4

Szepesvári, Csaba. Algorithms for reinforcement learning. San Rafael, CA: Morgan & Claypool, 2010.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
5

Kaelbling, Leslie Pack. Recent advances in reinforcement learning. Boston: Kluwer Academic, 1996.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
6

Sutton, Richard S. Reinforcement learning: An introduction. Cambridge, Mass: MIT Press, 1998.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
7

Kulkarni, Parag. Reinforcement and systemic machine learning for decision making. Hoboken, NJ: John Wiley & Sons, 2012.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
8

Kulkarni, Parag. Reinforcement and Systemic Machine Learning for Decision Making. Hoboken, NJ, USA: John Wiley & Sons, Inc., 2012. http://dx.doi.org/10.1002/9781118266502.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Whiteson, Shimon. Adaptive representations for reinforcement learning. Berlin: Springer Verlag, 2010.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
10

IWLCS 2006 (2006 Seattle, Wash.). Learning classifier systems: 10th international workshop, IWLCS 2006, Seattle, MA, USA, July 8, 2006, and 11th international workshop, IWLCS 2007, London, UK, July 8, 2007 : revised selected papers. Berlin: Springer, 2008.

Find full text
APA, Harvard, Vancouver, ISO, and other styles

Book chapters on the topic "Reinforcement learning (Machine learning)"

1

Kalita, Jugal. "Reinforcement Learning." In Machine Learning, 193–230. Boca Raton: Chapman and Hall/CRC, 2022. http://dx.doi.org/10.1201/9781003002611-5.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Zhou, Zhi-Hua. "Reinforcement Learning." In Machine Learning, 399–430. Singapore: Springer Singapore, 2021. http://dx.doi.org/10.1007/978-981-15-1967-3_16.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Geetha, T. V., and S. Sendhilkumar. "Reinforcement Learning." In Machine Learning, 271–94. Boca Raton: Chapman and Hall/CRC, 2023. http://dx.doi.org/10.1201/9781003290100-11.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Jo, Taeho. "Reinforcement Learning." In Machine Learning Foundations, 359–84. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-65900-4_16.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Buhmann, M. D., Prem Melville, Vikas Sindhwani, Novi Quadrianto, Wray L. Buntine, Luís Torgo, Xinhua Zhang, et al. "Reinforcement Learning." In Encyclopedia of Machine Learning, 849–51. Boston, MA: Springer US, 2011. http://dx.doi.org/10.1007/978-0-387-30164-8_714.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Kubat, Miroslav. "Reinforcement Learning." In An Introduction to Machine Learning, 277–86. Cham: Springer International Publishing, 2015. http://dx.doi.org/10.1007/978-3-319-20010-1_14.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Kubat, Miroslav. "Reinforcement Learning." In An Introduction to Machine Learning, 331–39. Cham: Springer International Publishing, 2017. http://dx.doi.org/10.1007/978-3-319-63913-0_17.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Labaca Castro, Raphael. "Reinforcement Learning." In Machine Learning under Malware Attack, 51–60. Wiesbaden: Springer Fachmedien Wiesbaden, 2023. http://dx.doi.org/10.1007/978-3-658-40442-0_6.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Coqueret, Guillaume, and Tony Guida. "Reinforcement learning." In Machine Learning for Factor Investing, 257–72. Boca Raton: Chapman and Hall/CRC, 2023. http://dx.doi.org/10.1201/9781003121596-20.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Norris, Donald J. "Reinforcement learning." In Machine Learning with the Raspberry Pi, 501–53. Berkeley, CA: Apress, 2019. http://dx.doi.org/10.1007/978-1-4842-5174-4_9.

Full text
APA, Harvard, Vancouver, ISO, and other styles

Conference papers on the topic "Reinforcement learning (Machine learning)"

1

"PREDICTION FOR CONTROL DELAY ON REINFORCEMENT LEARNING." In Special Session on Machine Learning. SciTePress - Science and and Technology Publications, 2011. http://dx.doi.org/10.5220/0003883405790586.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Fu, Cailing, Jochen Stollenwerk, and Carlo Holly. "Reinforcement learning for guiding optimization processes in optical design." In Applications of Machine Learning 2022, edited by Michael E. Zelinski, Tarek M. Taha, and Jonathan Howe. SPIE, 2022. http://dx.doi.org/10.1117/12.2632425.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Tittaferrante, Andrew, and Abdulsalam Yassine. "Benchmarking Offline Reinforcement Learning." In 2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA). IEEE, 2022. http://dx.doi.org/10.1109/icmla55696.2022.00044.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Bernstein, Alexander V., and E. V. Burnaev. "Reinforcement learning in computer vision." In Tenth International Conference on Machine Vision (ICMV 2017), edited by Jianhong Zhou, Petia Radeva, Dmitry Nikolaev, and Antanas Verikas. SPIE, 2018. http://dx.doi.org/10.1117/12.2309945.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Natarajan, Sriraam, Gautam Kunapuli, Kshitij Judah, Prasad Tadepalli, Kristian Kersting, and Jude Shavlik. "Multi-Agent Inverse Reinforcement Learning." In 2010 International Conference on Machine Learning and Applications (ICMLA). IEEE, 2010. http://dx.doi.org/10.1109/icmla.2010.65.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Xue, Jianyong, and Frédéric Alexandre. "Developmental Modular Reinforcement Learning." In ESANN 2022 - European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning. Louvain-la-Neuve (Belgium): Ciaco - i6doc.com, 2022. http://dx.doi.org/10.14428/esann/2022.es2022-19.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Urmanov, Marat, Madina Alimanova, and Askar Nurkey. "Training Unity Machine Learning Agents using reinforcement learning method." In 2019 15th International Conference on Electronics, Computer and Computation (ICECCO). IEEE, 2019. http://dx.doi.org/10.1109/icecco48375.2019.9043194.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Jin, Zhuo-Jun, Hui Qian, and Miao-Liang Zhu. "Gaussian processes in inverse reinforcement learning." In 2010 International Conference on Machine Learning and Cybernetics (ICMLC). IEEE, 2010. http://dx.doi.org/10.1109/icmlc.2010.5581063.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Arques Corrales, Pilar, and Fidel Aznar Gregori. "Swarm AGV Optimization Using Deep Reinforcement Learning." In MLMI '20: 2020 The 3rd International Conference on Machine Learning and Machine Intelligence. New York, NY, USA: ACM, 2020. http://dx.doi.org/10.1145/3426826.3426839.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Leopold, T., G. Kern-Isberner, and G. Peters. "Combining Reinforcement Learning and Belief Revision - A Learning System for Active Vision." In British Machine Vision Conference 2008. British Machine Vision Association, 2008. http://dx.doi.org/10.5244/c.22.48.

Full text
APA, Harvard, Vancouver, ISO, and other styles

Reports on the topic "Reinforcement learning (Machine learning)"

1

Singh, Satinder, Andrew G. Barto, and Nuttapong Chentanez. Intrinsically Motivated Reinforcement Learning. Fort Belvoir, VA: Defense Technical Information Center, January 2005. http://dx.doi.org/10.21236/ada440280.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Ghavamzadeh, Mohammad, and Sridhar Mahadevan. Hierarchical Multiagent Reinforcement Learning. Fort Belvoir, VA: Defense Technical Information Center, January 2004. http://dx.doi.org/10.21236/ada440418.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Harmon, Mance E., and Stephanie S. Harmon. Reinforcement Learning: A Tutorial. Fort Belvoir, VA: Defense Technical Information Center, January 1997. http://dx.doi.org/10.21236/ada323194.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Tadepalli, Prasad, and Alan Fern. Partial Planning Reinforcement Learning. Fort Belvoir, VA: Defense Technical Information Center, August 2012. http://dx.doi.org/10.21236/ada574717.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Vesselinov, Velimir Valentinov. Machine Learning. Office of Scientific and Technical Information (OSTI), January 2019. http://dx.doi.org/10.2172/1492563.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Valiant, L. G. Machine Learning. Fort Belvoir, VA: Defense Technical Information Center, January 1993. http://dx.doi.org/10.21236/ada283386.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Chase, Melissa P. Machine Learning. Fort Belvoir, VA: Defense Technical Information Center, April 1990. http://dx.doi.org/10.21236/ada223732.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Ghavamzadeh, Mohammad, and Sridhar Mahadevan. Hierarchical Average Reward Reinforcement Learning. Fort Belvoir, VA: Defense Technical Information Center, June 2003. http://dx.doi.org/10.21236/ada445728.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Johnson, Daniel W. Drive-Reinforcement Learning System Applications. Fort Belvoir, VA: Defense Technical Information Center, July 1992. http://dx.doi.org/10.21236/ada264514.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Kagie, Matthew J., and Park Hays. FORTE Machine Learning. Office of Scientific and Technical Information (OSTI), August 2016. http://dx.doi.org/10.2172/1561828.

Full text
APA, Harvard, Vancouver, ISO, and other styles