Ready-made bibliography on the topic "Reinforcement learning (Machine learning)"

Author: Grafiati

Published: June 4, 2021

Updated: September 7, 2023

See the lists of current articles, books, dissertations, conference abstracts, and other scholarly sources on the topic "Reinforcement learning (Machine learning)".

Journal articles on the topic "Reinforcement learning (Machine learning)"

Ishii, Shin, and Wako Yoshida. "Part 4: Reinforcement learning: Machine learning and natural learning". New Generation Computing 24, no. 3 (September 2006): 325–50. http://dx.doi.org/10.1007/bf03037338.

Wang, Zizhuang. "Temporal-Related Convolutional-Restricted-Boltzmann-Machine Capable of Learning Relational Order via Reinforcement Learning Procedure". International Journal of Machine Learning and Computing 7, no. 1 (February 2017): 1–8. http://dx.doi.org/10.18178/ijmlc.2017.7.1.610.

Butlin, Patrick. "Machine Learning, Functions and Goals". Croatian Journal of Philosophy 22, no. 66 (December 27, 2022): 351–70. http://dx.doi.org/10.52685/cjp.22.66.5.

Abstract:

Machine learning researchers distinguish between reinforcement learning and supervised learning and refer to reinforcement learning systems as “agents”. This paper vindicates the claim that systems trained by reinforcement learning are agents while those trained by supervised learning are not. Systems of both kinds satisfy Dretske’s criteria for agency, because they both learn to produce outputs selectively in response to inputs. However, reinforcement learning is sensitive to the instrumental value of outputs, giving rise to systems which exploit the effects of outputs on subsequent inputs to achieve good performance over episodes of interaction with their environments. Supervised learning systems, in contrast, merely learn to produce better outputs in response to individual inputs.

Martín-Guerrero, José D., and Lucas Lamata. "Reinforcement Learning and Physics". Applied Sciences 11, no. 18 (September 16, 2021): 8589. http://dx.doi.org/10.3390/app11188589.

Abstract:

Machine learning techniques provide a remarkable tool for advancing scientific research, and this area has grown significantly in the past few years. In particular, reinforcement learning, an approach that maximizes a (long-term) reward by means of the actions taken by an agent in a given environment, can help optimize scientific discovery in a variety of fields such as physics, chemistry, and biology. Moreover, physical systems, in particular quantum systems, may allow for more efficient reinforcement learning protocols. In this review, we describe recent results in the field of reinforcement learning and physics. We include standard reinforcement learning techniques in the computer science community for enhancing physics research, as well as the more recent and emerging area of quantum reinforcement learning, inside quantum machine learning, for improving reinforcement learning computations.

Liu, Yicen, Yu Lu, Xi Li, Wenxin Qiao, Zhiwei Li, and Donghao Zhao. "SFC Embedding Meets Machine Learning: Deep Reinforcement Learning Approaches". IEEE Communications Letters 25, no. 6 (June 2021): 1926–30. http://dx.doi.org/10.1109/lcomm.2021.3061991.

Popkov, Yuri S., Yuri A. Dubnov, and Alexey Yu Popkov. "Reinforcement Procedure for Randomized Machine Learning". Mathematics 11, no. 17 (August 23, 2023): 3651. http://dx.doi.org/10.3390/math11173651.

Abstract:

This paper is devoted to problem-oriented reinforcement methods for the numerical implementation of Randomized Machine Learning. We have developed a scheme of the reinforcement procedure based on the agent approach and Bellman’s optimality principle. This procedure ensures strictly monotonic properties of a sequence of local records in the iterative computational procedure of the learning process. The dependences of the dimensions of the neighborhood of the global minimum and the probability of its achievement on the parameters of the algorithm are determined. The convergence of the algorithm with the indicated probability to the neighborhood of the global minimum is proved.

Crawford, Daniel, Anna Levit, Navid Ghadermarzy, Jaspreet S. Oberoi, and Pooya Ronagh. "Reinforcement learning using quantum Boltzmann machines". Quantum Information and Computation 18, no. 1&2 (February 2018): 51–74. http://dx.doi.org/10.26421/qic18.1-2-3.

Abstract:

We investigate whether quantum annealers with select chip layouts can outperform classical computers in reinforcement learning tasks. We associate a transverse field Ising spin Hamiltonian with a layout of qubits similar to that of a deep Boltzmann machine (DBM) and use simulated quantum annealing (SQA) to numerically simulate quantum sampling from this system. We design a reinforcement learning algorithm in which the visible nodes representing the states and actions of an optimal policy are the first and last layers of the deep network. In the absence of a transverse field, our simulations show that DBMs are trained more effectively than restricted Boltzmann machines (RBMs) with the same number of nodes. We then develop a framework for training the network as a quantum Boltzmann machine (QBM) in the presence of a significant transverse field for reinforcement learning. This method also outperforms the reinforcement learning method that uses RBMs.

Lamata, Lucas. "Quantum Reinforcement Learning with Quantum Photonics". Photonics 8, no. 2 (January 28, 2021): 33. http://dx.doi.org/10.3390/photonics8020033.

Abstract:

Quantum machine learning has emerged as a promising paradigm that could accelerate machine learning calculations. Inside this field, quantum reinforcement learning aims at designing and building quantum agents that may exchange information with their environment and adapt to it, with the aim of achieving some goal. Different quantum platforms have been considered for quantum machine learning and specifically for quantum reinforcement learning. Here, we review the field of quantum reinforcement learning and its implementation with quantum photonics. This quantum technology may enhance quantum computation and communication, as well as machine learning, via the fruitful marriage between these previously unrelated fields.

Sahu, Santosh Kumar, Anil Mokhade, and Neeraj Dhanraj Bokde. "An Overview of Machine Learning, Deep Learning, and Reinforcement Learning-Based Techniques in Quantitative Finance: Recent Progress and Challenges". Applied Sciences 13, no. 3 (February 2, 2023): 1956. http://dx.doi.org/10.3390/app13031956.

Abstract:

Forecasting the behavior of the stock market is a classic but difficult topic, one that has attracted the interest of both economists and computer scientists. Over the course of the last couple of decades, researchers have investigated linear models as well as models that are based on machine learning (ML), deep learning (DL), reinforcement learning (RL), and deep reinforcement learning (DRL) in order to create an accurate predictive model. Machine learning algorithms can now extract high-level financial market data patterns. Investors are using deep learning models to anticipate and evaluate stock and foreign exchange markets due to the advantage of artificial intelligence. Recent years have seen a proliferation of the deep reinforcement learning algorithm’s application in algorithmic trading. DRL agents, which combine price prediction and trading signal production, have been used to construct several completely automated trading systems or strategies. Our objective is to enable interested researchers to stay current and easily imitate earlier findings. In this paper, we have worked to explain the utility of Machine Learning, Deep Learning, Reinforcement Learning, and Deep Reinforcement Learning in Quantitative Finance (QF) and the Stock Market. We also outline potential future study paths in this area based on the overview that was presented before.

Fang, Qiang, Wenzhuo Zhang, and Xitong Wang. "Visual Navigation Using Inverse Reinforcement Learning and an Extreme Learning Machine". Electronics 10, no. 16 (August 18, 2021): 1997. http://dx.doi.org/10.3390/electronics10161997.

Abstract:

In this paper, we focus on the challenges of training efficiency, the design of reward functions, and generalization in reinforcement learning for visual navigation and propose a regularized extreme learning machine-based inverse reinforcement learning approach (RELM-IRL) to improve the navigation performance. Our contributions are mainly three-fold: First, a framework combining extreme learning machine with inverse reinforcement learning is presented. This framework can improve the sample efficiency, obtain the reward function directly from the image information observed by the agent, and improve generalization to new targets and new environments. Second, the extreme learning machine is regularized by multi-response sparse regression and the leave-one-out method, which can further improve the generalization ability. Simulation experiments in the AI-THOR environment showed that the proposed approach outperformed previous end-to-end approaches, thus demonstrating the effectiveness and efficiency of our approach.

Dissertations and theses on the topic "Reinforcement learning (Machine learning)"

Hengst, Bernhard (Computer Science & Engineering, Faculty of Engineering, UNSW). "Discovering hierarchy in reinforcement learning". Awarded by: University of New South Wales, Computer Science and Engineering, 2003. http://handle.unsw.edu.au/1959.4/20497.

Abstract:

This thesis addresses the open problem of automatically discovering hierarchical structure in reinforcement learning. Current algorithms for reinforcement learning fail to scale as problems become more complex. Many complex environments empirically exhibit hierarchy and can be modeled as interrelated subsystems, each in turn with hierarchic structure. Subsystems are often repetitive in time and space, meaning that they reoccur as components of different tasks or occur multiple times in different circumstances in the environment. A learning agent may sometimes scale to larger problems if it successfully exploits this repetition. Evidence suggests that a bottom-up approach, which repeatedly finds building blocks at one level of abstraction and uses them as background knowledge at the next level of abstraction, makes learning in many complex environments tractable. An algorithm, called HEXQ, is described that automatically decomposes and solves a multi-dimensional Markov decision problem (MDP) by constructing a multi-level hierarchy of interlinked subtasks without being given the model beforehand. The effectiveness and efficiency of the HEXQ decomposition depend largely on the choice of representation in terms of the variables, their temporal relationship and whether the problem exhibits a type of constrained stochasticity. The algorithm is first developed for stochastic shortest path problems and then extended to infinite horizon problems. The operation of the algorithm is demonstrated using a number of examples including a taxi domain, various navigation tasks, the Towers of Hanoi and a larger sporting problem. The main contributions of the thesis are the automation of (1) decomposition, (2) sub-goal identification, and (3) discovery of hierarchical structure for MDPs with states described by a number of variables or features. It points the way to further scaling opportunities that encompass approximations, partial observability, selective perception, relational representations and planning. The longer term research aim is to train rather than program intelligent agents.

Tabell, Johnsson Marco, and Ala Jafar. "Efficiency Comparison Between Curriculum Reinforcement Learning & Reinforcement Learning Using ML-Agents". Thesis, Blekinge Tekniska Högskola, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-20218.

Akrour, Riad. "Robust Preference Learning-based Reinforcement Learning". Thesis, Paris 11, 2014. http://www.theses.fr/2014PA112236/document.

Abstract:

The contributions of this thesis revolve around sequential decision making and, more precisely, Reinforcement Learning (RL). Rooted in Machine Learning in the same way as supervised and unsupervised learning, RL has grown quickly in popularity over the last two decades thanks to achievements on both the theoretical and the applied fronts. RL assumes that the learning agent and its environment follow a stochastic Markov decision process over a state and action space. It is a decision process because the agent is asked to choose an action at each time step. It is stochastic because selecting a given action in a given state does not systematically lead to the same next state but rather defines a distribution over the state space. It is Markovian because this distribution depends only on the current state-action pair. As a consequence of choosing an action, the agent receives a reward. The goal of RL is then to solve the underlying optimization problem of finding the behaviour that maximizes the sum of rewards over the agent's whole interaction with its environment. From an applied point of view, a large spectrum of problems can be cast as RL problems, from Backgammon (TD-Gammon was one of the first major successes of RL and of Machine Learning in general, giving rise to a world-class expert player) to decision problems in the industrial and medical worlds. However, the optimization problem solved by RL depends on the prior definition of an adequate reward function, which requires a certain level of domain expertise as well as knowledge of the inner workings of RL algorithms. The first contribution of the thesis is therefore a learning framework that lightens the requirements placed on the user. The user no longer needs to know the exact solution of the problem, but only to be able to say which of two behaviours exhibited by the agent more closely matches the solution. Learning is interactive between the agent and the user and revolves around three main points: i) the agent demonstrates a behaviour; ii) the user compares it with the current best one; iii) the agent uses this feedback to update its model of the user's preferences and uses that model to find the next behaviour to demonstrate. To reduce the number of interactions required before finding the optimal behaviour, the second contribution of the thesis is a theoretically sound criterion that trades off the sometimes contradictory aims of complying with the user's preferences and demonstrating sufficiently different behaviours. The last contribution is to ensure the robustness of the algorithm to the feedback errors that the user might make, which happen often in practice, especially in the initial phase of the interaction, when all the behaviours are far from the expected solution.
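
The three-step interaction described in this abstract can be illustrated with a heavily simplified sketch. It is not the thesis's algorithm: the learned preference model, the criterion for choosing the next behaviour, and the robustness to feedback errors are all left out, and a fixed list of candidates stands in for the behaviours the agent would generate. The names `preference_loop`, `user_prefers`, and `closer_to_target` are hypothetical.

```python
def preference_loop(candidates, user_prefers):
    """Return the behaviour the user prefers after pairwise comparisons."""
    best = candidates[0]
    for shown in candidates[1:]:        # i) the agent demonstrates a behaviour
        if user_prefers(shown, best):   # ii) the user compares it with the best so far
            best = shown                # iii) the preferred behaviour is kept
    return best

# Hypothetical setup: behaviours are numbers and the user prefers values near 1.0.
TARGET = 1.0

def closer_to_target(a, b):
    """Simulated user feedback: True if behaviour a is preferred over b."""
    return abs(a - TARGET) < abs(b - TARGET)

best = preference_loop([0.0, 0.5, 0.9, 0.2], closer_to_target)
```

The user never supplies a reward value here, only a comparison between two behaviours, which is the relaxation of requirements the abstract refers to.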

Lee, Siu-keung (李少強). "Reinforcement learning for intelligent assembly automation". Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2002. http://hub.hku.hk/bib/B31244397.

Tebbifakhr, Amirhossein. "Machine Translation For Machines". Doctoral thesis, Università degli studi di Trento, 2021. http://hdl.handle.net/11572/320504.

Abstract:

Traditionally, Machine Translation (MT) systems are developed by targeting fluency (i.e. output grammaticality) and adequacy (i.e. semantic equivalence with the source text) criteria that reflect the needs of human end-users. However, recent advancements in Natural Language Processing (NLP) and the introduction of NLP tools in commercial services have opened new opportunities for MT. A particularly relevant one is related to the application of NLP technologies in low-resource language settings, for which the paucity of training data reduces the possibility to train reliable services. In this specific condition, MT can come into play by enabling the so-called “translation-based” workarounds. The idea is simple: first, input texts in the low-resource language are translated into a resource-rich target language; then, the machine-translated text is processed by well-trained NLP tools in the target language; finally, the output of these downstream components is projected back to the source language. This results in a new scenario, in which the end-user of MT technology is no longer a human but another machine. We hypothesize that current MT training approaches are not the optimal ones for this setting, in which the objective is to maximize the performance of a downstream tool fed with machine-translated text rather than human comprehension. Under this hypothesis, this thesis introduces a new research paradigm, which we named “MT for machines”, addressing a number of questions that arise from this novel view of the MT problem. Are there different quality criteria for humans and machines? What makes a good translation from the machine standpoint? What are the trade-offs between the two notions of quality? How to pursue machine-oriented objectives? How to serve different downstream components with a single MT system? How to exploit knowledge transfer to operate in different language settings with a single MT system? Elaborating on these questions, this thesis: i) introduces a novel and challenging MT paradigm, ii) proposes an effective method based on Reinforcement Learning analysing its possible variants, iii) extends the proposed method to multitask and multilingual settings so as to serve different downstream applications and languages with a single MT system, iv) studies the trade-off between machine-oriented and human-oriented criteria, and v) discusses the successful application of the approach in two real-world scenarios.
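
The "translation-based" workaround described in this abstract (translate, process, project back) can be sketched as a three-stage function. This is only an illustration under stated assumptions: the function `translation_based_pipeline`, the toy lexicon, and the keyword-based sentiment tool are hypothetical stand-ins, not components from the thesis.

```python
def translation_based_pipeline(text, translate, analyze, project_back):
    """Run a target-language NLP tool on machine-translated input."""
    translated = translate(text)       # step 1: source -> resource-rich language
    result = analyze(translated)       # step 2: well-trained downstream tool
    return project_back(result, text)  # step 3: map the output back to the source

# Toy stand-ins (all hypothetical): a word-level Italian-to-English lookup
# and a keyword-based sentiment tool.
LEXICON = {"ottimo": "great", "pessimo": "terrible", "film": "movie"}

def toy_translate(text):
    return " ".join(LEXICON.get(word, word) for word in text.lower().split())

def toy_sentiment(text):
    return "positive" if "great" in text.split() else "negative"

def toy_project_back(label, source_text):
    return {"text": source_text, "label": label}

out = translation_based_pipeline(
    "Ottimo film", toy_translate, toy_sentiment, toy_project_back
)
```

Because `translate` is passed in as a parameter, a translator tuned for the downstream tool's accuracy rather than for human readability could be swapped in without touching the other two stages, which is the setting the thesis studies.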

Yang, Zhaoyuan. "Adversarial Reinforcement Learning for Control System Design: A Deep Reinforcement Learning Approach". The Ohio State University, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=osu152411491981452.

Scholz, Jonathan. "Physics-based reinforcement learning for autonomous manipulation". Diss., Georgia Institute of Technology, 2015. http://hdl.handle.net/1853/54366.

Abstract:

With recent research advances, the dream of bringing domestic robots into our everyday lives has become more plausible than ever. Domestic robotics has grown dramatically in the past decade, with applications ranging from house cleaning to food service to health care. To date, the majority of the planning and control machinery for these systems are carefully designed by human engineers. A large portion of this effort goes into selecting the appropriate models and control techniques for each application, and these skills take years to master. Relieving the burden on human experts is therefore a central challenge for bringing robot technology to the masses. This work addresses this challenge by introducing a physics engine as a model space for an autonomous robot, and defining procedures for enabling robots to decide when and how to learn these models. We also present an appropriate space of motor controllers for these models, and introduce ways to intelligently select when to use each controller based on the estimated model parameters. We integrate these components into a framework called Physics-Based Reinforcement Learning, which features a stochastic physics engine as the core model structure. Together these methods enable a robot to adapt to unfamiliar environments without human intervention. The central focus of this thesis is on fast online model learning for objects with under-specified dynamics. We develop our approach across a diverse range of domestic tasks, starting with a simple table-top manipulation task, followed by a mobile manipulation task involving a single utility cart, and finally an open-ended navigation task with multiple obstacles impeding robot progress. We also present simulation results illustrating the efficiency of our method compared to existing approaches in the learning literature.

Cleland, Andrew Lewis. "Bounding Box Improvement with Reinforcement Learning". PDXScholar, 2018. https://pdxscholar.library.pdx.edu/open_access_etds/4438.

Abstract:

In this thesis, I explore a reinforcement learning technique for improving bounding box localizations of objects in images. The model takes as input a bounding box already known to overlap an object and aims to improve the fit of the box through a series of transformations that shift the location of the box by translation, or change its size or aspect ratio. Over the course of these actions, the model adapts to new information extracted from the image. This active localization approach contrasts with existing bounding-box regression methods, which extract information from the image only once. I implement, train, and test this reinforcement learning model using data taken from the Portland State Dog-Walking image set. The model balances exploration with exploitation in training using an ε-greedy policy. I find that the performance of the model is sensitive to the ε-greedy configuration used during training, performing best when the epsilon parameter is set to very low values over the course of training. With ε = 0.01, I find the algorithm can improve bounding boxes in about 78% of test cases for the "dog" object category, and 76% for the "human" category.
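
The ε-greedy rule mentioned in this abstract can be sketched in a few lines. This is a generic illustration of the technique, not the thesis's implementation: the function name `epsilon_greedy`, the list of value estimates, and the seeded random generator are assumptions made for the example.

```python
import random

def epsilon_greedy(action_values, epsilon, rng):
    """Pick a random action with probability epsilon, else the highest-valued one."""
    if rng.random() < epsilon:
        # Explore: choose any action index uniformly at random.
        return rng.randrange(len(action_values))
    # Exploit: choose the index of the largest estimated value.
    return max(range(len(action_values)), key=lambda i: action_values[i])

# Hypothetical value estimates for three box transformations.
values = [0.1, 0.7, 0.3]
rng = random.Random(0)
choices = [epsilon_greedy(values, 0.01, rng) for _ in range(1000)]
greedy_share = choices.count(1) / len(choices)
```

With ε as low as 0.01, almost every choice is the greedy one, so `greedy_share` stays close to 1; raising ε trades some of that exploitation for more exploration.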

Piano, Francesco. "Deep Reinforcement Learning con PyTorch". Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2022. http://amslaurea.unibo.it/25340/.

Abstract:

Reinforcement Learning is a research field of Machine Learning in which an agent solves problems by choosing the most suitable action to perform through an iterative learning process, in a dynamic environment that incentivizes it through rewards. Deep Learning, also a Machine Learning approach, uses an artificial neural network to apply representation learning methods in order to obtain a data structure better suited to processing. Only recently has Deep Reinforcement Learning, created by combining these two learning paradigms, made it possible to solve problems previously considered intractable, achieving considerable success and renewing researchers' interest in the application of Reinforcement Learning algorithms. This thesis studies Reinforcement Learning applied to simple problems in depth and then examines how it can overcome its characteristic limitations through the use of artificial neural networks, so that it can be applied in a Deep Learning context using the PyTorch framework, a library currently widely used for scientific computing and Machine Learning.

Suggs, Sterling. "Reinforcement Learning with Auxiliary Memory". BYU ScholarsArchive, 2021. https://scholarsarchive.byu.edu/etd/9028.

Abstract:

Deep reinforcement learning algorithms typically require vast amounts of data to train to a useful level of performance. Each time new data is encountered, the network must inefficiently update all of its parameters. Auxiliary memory units can help deep neural networks train more efficiently by separating computation from storage, and providing a means to rapidly store and retrieve precise information. We present four deep reinforcement learning models augmented with external memory, and benchmark their performance on ten tasks from the Arcade Learning Environment. Our discussion and insights will be helpful for future RL researchers developing their own memory agents.

Books on the topic "Reinforcement learning (Machine learning)"

Sutton, Richard S., ed. Reinforcement learning. Boston: Kluwer Academic Publishers, 1992.

Sutton, Richard S. Reinforcement Learning. Boston, MA: Springer US, 1992.

Kaelbling, Leslie Pack, ed. Recent advances in reinforcement learning. Boston: Kluwer Academic, 1996.

Szepesvári, Csaba. Algorithms for reinforcement learning. San Rafael, Calif. (1537 Fourth Street, San Rafael, CA 94901 USA): Morgan & Claypool, 2010.

Kaelbling, Leslie Pack. Recent advances in reinforcement learning. Boston: Kluwer Academic, 1996.

Sutton, Richard S. Reinforcement learning: An introduction. Cambridge, Mass: MIT Press, 1998.

Kulkarni, Parag. Reinforcement and systemic machine learning for decision making. Hoboken, NJ: John Wiley & Sons, 2012.

Kulkarni, Parag. Reinforcement and Systemic Machine Learning for Decision Making. Hoboken, NJ, USA: John Wiley & Sons, Inc., 2012. http://dx.doi.org/10.1002/9781118266502.

Whiteson, Shimon. Adaptive representations for reinforcement learning. Berlin: Springer Verlag, 2010.

IWLCS 2006 (2006 Seattle, Wash.). Learning classifier systems: 10th international workshop, IWLCS 2006, Seattle, MA, USA, July 8, 2006, and 11th international workshop, IWLCS 2007, London, UK, July 8, 2007 : revised selected papers. Berlin: Springer, 2008.

Book chapters on the topic "Reinforcement learning (Machine learning)"

Kalita, Jugal. "Reinforcement Learning". In Machine Learning, 193–230. Boca Raton: Chapman and Hall/CRC, 2022. http://dx.doi.org/10.1201/9781003002611-5.

Zhou, Zhi-Hua. "Reinforcement Learning". In Machine Learning, 399–430. Singapore: Springer Singapore, 2021. http://dx.doi.org/10.1007/978-981-15-1967-3_16.

Geetha, T. V., and S. Sendhilkumar. "Reinforcement Learning". In Machine Learning, 271–94. Boca Raton: Chapman and Hall/CRC, 2023. http://dx.doi.org/10.1201/9781003290100-11.

Jo, Taeho. "Reinforcement Learning". In Machine Learning Foundations, 359–84. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-65900-4_16.

Buhmann, M. D., Prem Melville, Vikas Sindhwani, Novi Quadrianto, Wray L. Buntine, Luís Torgo, Xinhua Zhang, et al. "Reinforcement Learning". In Encyclopedia of Machine Learning, 849–51. Boston, MA: Springer US, 2011. http://dx.doi.org/10.1007/978-0-387-30164-8_714.

Kubat, Miroslav. "Reinforcement Learning". In An Introduction to Machine Learning, 277–86. Cham: Springer International Publishing, 2015. http://dx.doi.org/10.1007/978-3-319-20010-1_14.

Kubat, Miroslav. "Reinforcement Learning". In An Introduction to Machine Learning, 331–39. Cham: Springer International Publishing, 2017. http://dx.doi.org/10.1007/978-3-319-63913-0_17.

Labaca Castro, Raphael. "Reinforcement Learning". In Machine Learning under Malware Attack, 51–60. Wiesbaden: Springer Fachmedien Wiesbaden, 2023. http://dx.doi.org/10.1007/978-3-658-40442-0_6.

Coqueret, Guillaume, and Tony Guida. "Reinforcement learning". In Machine Learning for Factor Investing, 257–72. Boca Raton: Chapman and Hall/CRC, 2023. http://dx.doi.org/10.1201/9781003121596-20.

Norris, Donald J. "Reinforcement learning". In Machine Learning with the Raspberry Pi, 501–53. Berkeley, CA: Apress, 2019. http://dx.doi.org/10.1007/978-1-4842-5174-4_9.

Conference papers on the topic "Reinforcement learning (Machine learning)"

"PREDICTION FOR CONTROL DELAY ON REINFORCEMENT LEARNING". In Special Session on Machine Learning. SciTePress - Science and Technology Publications, 2011. http://dx.doi.org/10.5220/0003883405790586.

Fu, Cailing, Jochen Stollenwerk, and Carlo Holly. "Reinforcement learning for guiding optimization processes in optical design". In Applications of Machine Learning 2022, edited by Michael E. Zelinski, Tarek M. Taha, and Jonathan Howe. SPIE, 2022. http://dx.doi.org/10.1117/12.2632425.

Tittaferrante, Andrew, and Abdulsalam Yassine. "Benchmarking Offline Reinforcement Learning". In 2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA). IEEE, 2022. http://dx.doi.org/10.1109/icmla55696.2022.00044.

Bernstein, Alexander V., and E. V. Burnaev. "Reinforcement learning in computer vision". In Tenth International Conference on Machine Vision (ICMV 2017), edited by Jianhong Zhou, Petia Radeva, Dmitry Nikolaev, and Antanas Verikas. SPIE, 2018. http://dx.doi.org/10.1117/12.2309945.

Natarajan, Sriraam, Gautam Kunapuli, Kshitij Judah, Prasad Tadepalli, Kristian Kersting, and Jude Shavlik. "Multi-Agent Inverse Reinforcement Learning". In 2010 International Conference on Machine Learning and Applications (ICMLA). IEEE, 2010. http://dx.doi.org/10.1109/icmla.2010.65.

Xue, Jianyong, and Frédéric Alexandre. "Developmental Modular Reinforcement Learning". In ESANN 2022 - European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning. Louvain-la-Neuve (Belgium): Ciaco - i6doc.com, 2022. http://dx.doi.org/10.14428/esann/2022.es2022-19.

Urmanov, Marat, Madina Alimanova, and Askar Nurkey. "Training Unity Machine Learning Agents using reinforcement learning method". In 2019 15th International Conference on Electronics, Computer and Computation (ICECCO). IEEE, 2019. http://dx.doi.org/10.1109/icecco48375.2019.9043194.

Jin, Zhuo-Jun, Hui Qian, and Miao-Liang Zhu. "Gaussian processes in inverse reinforcement learning". In 2010 International Conference on Machine Learning and Cybernetics (ICMLC). IEEE, 2010. http://dx.doi.org/10.1109/icmlc.2010.5581063.

Arques Corrales, Pilar, and Fidel Aznar Gregori. "Swarm AGV Optimization Using Deep Reinforcement Learning". In MLMI '20: 2020 The 3rd International Conference on Machine Learning and Machine Intelligence. New York, NY, USA: ACM, 2020. http://dx.doi.org/10.1145/3426826.3426839.

Leopold, T., G. Kern-Isberner, and G. Peters. "Combining Reinforcement Learning and Belief Revision - A Learning System for Active Vision". In British Machine Vision Conference 2008. British Machine Vision Association, 2008. http://dx.doi.org/10.5244/c.22.48.

Organizational reports on the topic "Reinforcement learning (Machine learning)"

Singh, Satinder, Andrew G. Barto, and Nuttapong Chentanez. Intrinsically Motivated Reinforcement Learning. Fort Belvoir, VA: Defense Technical Information Center, January 2005. http://dx.doi.org/10.21236/ada440280.

Ghavamzadeh, Mohammad, and Sridhar Mahadevan. Hierarchical Multiagent Reinforcement Learning. Fort Belvoir, VA: Defense Technical Information Center, January 2004. http://dx.doi.org/10.21236/ada440418.

Harmon, Mance E., and Stephanie S. Harmon. Reinforcement Learning: A Tutorial. Fort Belvoir, VA: Defense Technical Information Center, January 1997. http://dx.doi.org/10.21236/ada323194.

Tadepalli, Prasad, and Alan Fern. Partial Planning Reinforcement Learning. Fort Belvoir, VA: Defense Technical Information Center, August 2012. http://dx.doi.org/10.21236/ada574717.

Vesselinov, Velimir Valentinov. Machine Learning. Office of Scientific and Technical Information (OSTI), January 2019. http://dx.doi.org/10.2172/1492563.

Valiant, L. G. Machine Learning. Fort Belvoir, VA: Defense Technical Information Center, January 1993. http://dx.doi.org/10.21236/ada283386.

Chase, Melissa P. Machine Learning. Fort Belvoir, VA: Defense Technical Information Center, April 1990. http://dx.doi.org/10.21236/ada223732.

Ghavamzadeh, Mohammad, and Sridhar Mahadevan. Hierarchical Average Reward Reinforcement Learning. Fort Belvoir, VA: Defense Technical Information Center, June 2003. http://dx.doi.org/10.21236/ada445728.

Johnson, Daniel W. Drive-Reinforcement Learning System Applications. Fort Belvoir, VA: Defense Technical Information Center, July 1992. http://dx.doi.org/10.21236/ada264514.

Kagie, Matthew J., and Park Hays. FORTE Machine Learning. Office of Scientific and Technical Information (OSTI), August 2016. http://dx.doi.org/10.2172/1561828.
