Academic literature on the topic 'Reinforcement Learning Generalization'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Reinforcement Learning Generalization.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Journal articles on the topic "Reinforcement Learning Generalization"

1

Kwon, Sunggyu, and Kwang Y. Lee. "Generalization of Reinforcement Learning with CMAC." IFAC Proceedings Volumes 38, no. 1 (2005): 360–65. http://dx.doi.org/10.3182/20050703-6-cz-1902.01138.

2

Wu, Keyu, Min Wu, Zhenghua Chen, Yuecong Xu, and Xiaoli Li. "Generalizing Reinforcement Learning through Fusing Self-Supervised Learning into Intrinsic Motivation." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 8 (June 28, 2022): 8683–90. http://dx.doi.org/10.1609/aaai.v36i8.20847.

Abstract:
Despite the great potential of reinforcement learning (RL) in solving complex decision-making problems, generalization remains one of its key challenges, leading to difficulty in deploying learned RL policies to new environments. In this paper, we propose to improve the generalization of RL algorithms through fusing Self-supervised learning into Intrinsic Motivation (SIM). Specifically, SIM boosts representation learning through driving the cross-correlation matrix between the embeddings of augmented and non-augmented samples close to the identity matrix. This aims to increase the similarity between the embedding vectors of a sample and its augmented version while minimizing the redundancy between the components of these vectors. Meanwhile, the redundancy reduction based self-supervised loss is converted to an intrinsic reward to further improve generalization in RL via an auxiliary objective. As a general paradigm, SIM can be implemented on top of any RL algorithm. Extensive evaluations have been performed on a diversity of tasks. Experimental results demonstrate that SIM consistently outperforms the state-of-the-art methods and exhibits superior generalization capability and sample efficiency.
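To make the mechanism concrete: driving the cross-correlation matrix of augmented and non-augmented embeddings toward the identity is a Barlow Twins-style redundancy-reduction objective. The sketch below is a minimal Python/PyTorch rendering of such a loss; the function name, normalization details, and off-diagonal weight are illustrative assumptions, not the authors' implementation.

```python
import torch

def redundancy_reduction_loss(z_aug, z_orig, off_diag_weight=5e-3):
    """Drive the cross-correlation matrix of two embedding batches toward identity."""
    # Standardize each embedding dimension across the batch.
    z_aug = (z_aug - z_aug.mean(0)) / (z_aug.std(0) + 1e-8)
    z_orig = (z_orig - z_orig.mean(0)) / (z_orig.std(0) + 1e-8)

    n = z_aug.shape[0]
    c = (z_aug.T @ z_orig) / n                      # d x d cross-correlation matrix

    on_diag = (torch.diagonal(c) - 1).pow(2).sum()  # pull diagonal entries toward 1
    off_diag = (c - torch.diag(torch.diagonal(c))).pow(2).sum()  # push the rest toward 0
    return on_diag + off_diag_weight * off_diag
```

In SIM, a loss of this kind is additionally converted into an intrinsic reward that augments the RL objective.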
3

Wimmer, G. Elliott, Nathaniel D. Daw, and Daphna Shohamy. "Generalization of value in reinforcement learning by humans." European Journal of Neuroscience 35, no. 7 (April 2012): 1092–104. http://dx.doi.org/10.1111/j.1460-9568.2012.08017.x.

4

Hashemzadeh, Maryam, Reshad Hosseini, and Majid Nili Ahmadabadi. "Clustering subspace generalization to obtain faster reinforcement learning." Evolving Systems 11, no. 1 (July 4, 2019): 89–103. http://dx.doi.org/10.1007/s12530-019-09290-9.

5

Gershman, Samuel J., and Yael Niv. "Novelty and Inductive Generalization in Human Reinforcement Learning." Topics in Cognitive Science 7, no. 3 (March 23, 2015): 391–415. http://dx.doi.org/10.1111/tops.12138.

6

Matiisen, Tambet, Aqeel Labash, Daniel Majoral, Jaan Aru, and Raul Vicente. "Do Deep Reinforcement Learning Agents Model Intentions?" Stats 6, no. 1 (December 28, 2022): 50–66. http://dx.doi.org/10.3390/stats6010004.

Abstract:
Inferring other agents’ mental states, such as their knowledge, beliefs and intentions, is thought to be essential for effective interactions with other agents. Recently, multi-agent systems trained via deep reinforcement learning have been shown to succeed in solving various tasks. Still, how each agent models or represents other agents in their environment remains unclear. In this work, we test whether deep reinforcement learning agents trained with the multi-agent deep deterministic policy gradient (MADDPG) algorithm explicitly represent other agents’ intentions (their specific aims or plans) during a task in which the agents have to coordinate the covering of different spots in a 2D environment. In particular, we tracked over time the performance of a linear decoder trained to predict the final targets of all agents from the hidden-layer activations of each agent’s neural network controller. We observed that the hidden layers of agents represented explicit information about other agents’ intentions, i.e., the target landmark the other agent ended up covering. We also performed a series of experiments in which some agents were replaced by others with fixed targets to test the levels of generalization of the trained agents. We noticed that during the training phase, the agents developed a preference for each landmark, which hindered generalization. To alleviate the above problem, we evaluated simple changes to the MADDPG training algorithm which lead to better generalization against unseen agents. Our method for confirming intention modeling in deep learning agents is simple to implement and can be used to improve the generalization of multi-agent systems in fields such as robotics, autonomous vehicles and smart cities.
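The decoding analysis described here amounts to training a linear probe on stored hidden activations. A rough sketch using scikit-learn is below; the array names and train/test split are illustrative assumptions rather than the authors' exact protocol.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def probe_intentions(hidden_acts, final_targets):
    """Fit a linear decoder that predicts the landmark each agent ends up
    covering from another agent's recorded hidden-layer activations.

    hidden_acts:   (n_episodes, hidden_dim) array of activations
    final_targets: (n_episodes,) array of landmark indices actually covered
    """
    X_tr, X_te, y_tr, y_te = train_test_split(
        hidden_acts, final_targets, test_size=0.25, random_state=0)
    decoder = LogisticRegression(max_iter=1000)
    decoder.fit(X_tr, y_tr)
    # Held-out accuracy above chance suggests intentions are linearly decodable.
    return decoder.score(X_te, y_te)
```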
7

Fang, Qiang, Wenzhuo Zhang, and Xitong Wang. "Visual Navigation Using Inverse Reinforcement Learning and an Extreme Learning Machine." Electronics 10, no. 16 (August 18, 2021): 1997. http://dx.doi.org/10.3390/electronics10161997.

Abstract:
In this paper, we focus on the challenges of training efficiency, the design of reward functions, and generalization in reinforcement learning for visual navigation, and propose a regularized extreme learning machine-based inverse reinforcement learning approach (RELM-IRL) to improve navigation performance. Our contributions are mainly three-fold: First, a framework combining an extreme learning machine with inverse reinforcement learning is presented. This framework improves sample efficiency, obtains the reward function directly from the image information observed by the agent, and improves generalization to new targets and new environments. Second, the extreme learning machine is regularized by multi-response sparse regression and the leave-one-out method, which further improves generalization ability. Third, simulation experiments in the AI-THOR environment showed that the proposed approach outperformed previous end-to-end approaches, demonstrating the effectiveness and efficiency of our approach.
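As background for the extreme learning machine component, the sketch below shows a generic ridge-regularized ELM regressor in Python. It is a simplified stand-in only and omits the multi-response sparse regression and leave-one-out regularization that RELM-IRL uses.

```python
import numpy as np

class RidgeELM:
    """Minimal extreme learning machine: fixed random hidden layer plus a
    ridge-regularized linear readout (illustrative, not the paper's RELM-IRL)."""

    def __init__(self, n_hidden=256, reg=1e-2, seed=0):
        self.n_hidden, self.reg = n_hidden, reg
        self.rng = np.random.default_rng(seed)

    def fit(self, X, Y):
        d = X.shape[1]
        self.W = self.rng.normal(size=(d, self.n_hidden))  # random weights, never trained
        self.b = self.rng.normal(size=self.n_hidden)
        H = np.tanh(X @ self.W + self.b)                   # hidden-layer features
        # Closed-form ridge solution for the output weights.
        A = H.T @ H + self.reg * np.eye(self.n_hidden)
        self.beta = np.linalg.solve(A, H.T @ Y)
        return self

    def predict(self, X):
        return np.tanh(X @ self.W + self.b) @ self.beta
```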
8

Hatcho, Yasuyo, Kiyohiko Hattori, and Keiki Takadama. "Time Horizon Generalization in Reinforcement Learning: Generalizing Multiple Q-Tables in Q-Learning Agents." Journal of Advanced Computational Intelligence and Intelligent Informatics 13, no. 6 (November 20, 2009): 667–74. http://dx.doi.org/10.20965/jaciii.2009.p0667.

Abstract:
This paper focuses on generalization in reinforcement learning from the time-horizon viewpoint, exploring a method that generalizes multiple Q-tables in the multiagent reinforcement learning domain. For this purpose, we propose time horizon generalization for reinforcement learning, which consists of (1) a Q-table selection method and (2) a Q-table merge timing method, enabling agents to (1) select which Q-tables can be generalized from among many Q-tables and (2) determine when the selected Q-tables should be generalized. Intensive simulations of the bargaining game as a sequential interaction game have revealed the following implications: (1) both the Q-table selection and merge timing methods help replicate the subject experimental results without ad hoc parameter settings; and (2) such replication succeeds with agents using the proposed methods with smaller numbers of Q-tables.
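The core operation of generalizing Q-tables can be pictured as selecting compatible tables and merging them. The sketch below uses greedy-policy agreement for selection and simple averaging for merging; both are illustrative stand-ins for the paper's selection and merge-timing methods, not their actual procedure.

```python
import numpy as np

def select_compatible(q_tables, min_agreement=0.8):
    """Keep only Q-tables whose greedy policies mostly agree with the first one."""
    ref = np.argmax(q_tables[0], axis=1)
    return [q for q in q_tables
            if np.mean(np.argmax(q, axis=1) == ref) >= min_agreement]

def merge_q_tables(q_tables, weights=None):
    """Generalize several Q-tables of shape (n_states, n_actions) by (weighted) averaging."""
    q = np.stack(q_tables)                       # (k, n_states, n_actions)
    return np.average(q, axis=0, weights=weights)
```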
9

Kaelbling, L. P., M. L. Littman, and A. W. Moore. "Reinforcement Learning: A Survey." Journal of Artificial Intelligence Research 4 (May 1, 1996): 237–85. http://dx.doi.org/10.1613/jair.301.

Abstract:
This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of the field and a broad selection of current work are summarized. Reinforcement learning is the problem faced by an agent that learns behavior through trial-and-error interactions with a dynamic environment. The work described here has a resemblance to work in psychology, but differs considerably in the details and in the use of the word "reinforcement." The paper discusses central issues of reinforcement learning, including trading off exploration and exploitation, establishing the foundations of the field via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state. It concludes with a survey of some implemented systems and an assessment of the practical utility of current methods for reinforcement learning.
10

Kim, Minbeom, Kyeongha Rho, Yong-duk Kim, and Kyomin Jung. "Action-driven contrastive representation for reinforcement learning." PLOS ONE 17, no. 3 (March 18, 2022): e0265456. http://dx.doi.org/10.1371/journal.pone.0265456.

Abstract:
In reinforcement learning, reward-driven feature learning directly from high-dimensional images faces two challenges: sample efficiency for solving control tasks and generalization to unseen observations. In prior works, these issues have been addressed by learning representations from pixel inputs. However, those representations were either vulnerable to the high diversity inherent in environments or failed to capture the characteristics needed for solving control tasks. To attenuate these problems, we propose a novel contrastive representation method, Action-Driven Auxiliary Task (ADAT), which forces a representation to concentrate on essential features for deciding actions and to ignore control-irrelevant details. In the augmented state-action dictionary of ADAT, the agent learns representations to maximize agreement between observations sharing the same actions. The proposed method significantly outperforms model-free and model-based algorithms in Atari and OpenAI ProcGen, widely used benchmarks for sample efficiency and generalization.
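Maximizing agreement between observations that share the same action can be written as a supervised-contrastive-style loss. The PyTorch sketch below is one plausible reading of that idea; the temperature, masking, and averaging details are assumptions, not the paper's exact objective.

```python
import torch
import torch.nn.functional as F

def action_driven_contrastive_loss(embeddings, actions, temperature=0.1):
    """Pull together embeddings of observations sharing a discrete action; push apart the rest.

    embeddings: (batch, dim) encoder outputs
    actions:    (batch,)     discrete actions taken at those observations
    """
    z = F.normalize(embeddings, dim=1)
    sim = z @ z.T / temperature                               # cosine similarities
    self_mask = torch.eye(len(z), dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, float("-inf"))           # exclude self-pairs

    positives = (actions.unsqueeze(0) == actions.unsqueeze(1)) & ~self_mask
    log_prob = F.log_softmax(sim, dim=1)
    pos_counts = positives.sum(1).clamp(min=1)
    # Average log-probability assigned to same-action pairs, per anchor.
    return -(log_prob.masked_fill(~positives, 0).sum(1) / pos_counts).mean()
```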

Dissertations / Theses on the topic "Reinforcement Learning Generalization"

1

Stanley, Kelly N. "The influence of training structure and instructions on generalized stimulus equivalence classes and typicality effects /." Electronic version (PDF), 2004. http://dl.uncw.edu/etd/2004/stanleyk/kellystanley.html.

2

Wilson, Jeanette E. "Training structure, naming and typicality effects in equivalence class formation /." Electronic version (PDF), 2006. http://dl.uncw.edu/etd/2006/wilsonj/jeanettewilson.pdf.

3

Böhmer, Wendelin. "Representation and Generalization in Autonomous Reinforcement Learning." PhD diss., Technische Universität Berlin, 2017. http://d-nb.info/1156183960/34.

4

Sansing, Elizabeth M. "Teaching Observational Learning to Children with Autism: An In-vivo and Video-Model Assessment." Thesis, University of North Texas, 2017. https://digital.library.unt.edu/ark:/67531/metadc1062891/.

Abstract:
Observational learning (OL) occurs when an individual contacts reinforcement as a direct result of discriminating the observed consequences of other individuals' responses. Individuals with autism spectrum disorder (ASD) may have deficits in observational learning and previous research has demonstrated that teaching a series of prerequisite skills (i.e., attending, imitation, delayed imitation, and consequence discrimination) can result in observational learning. We sequentially taught these prerequisite skills for three young children with ASD across three play-based tasks. We assessed the direct and indirect effects of training by assessing OL before and after instruction across tasks and task variations (for two participants) during both in-vivo and video-model probes using a concurrent multiple-probe design. All participants acquired the prerequisite skills and demonstrated observational learning during probes of directly-trained tasks. Generalization results varied across participants. Observational learning generalized to one untrained task for one participant. For the other two participants, observational learning generalized to variations of the trained tasks but not to untrained tasks. Generalization additionally occurred during the in-vivo probes for both participants for whom we assessed this response. Implications of these findings, as well as directions for future research, are discussed.
5

Leffler, Bethany R. "Perception-based generalization in model-based reinforcement learning." 2009. http://hdl.rutgers.edu/1782.2/rucore10001600001.ETD.000051041.

6

Mehta, Bhairav. "On learning and generalization in unstructured taskspaces." Thesis, 2020. http://hdl.handle.net/1866/24327.

Abstract:
Robotic learning holds incredible promise for embodied artificial intelligence, with reinforcement learning seemingly a strong candidate to be the \textit{software} of robots of the future: learning from experience, adapting on the fly, and generalizing to unseen scenarios. However, our current reality requires vast amounts of data to train the simplest of robotic reinforcement learning policies, leading to a surge of interest of training entirely in efficient physics simulators. As the goal is embodied intelligence, policies trained in simulation are transferred onto real hardware for evaluation; yet, as no simulation is a perfect model of the real world, transferred policies run into the sim2real transfer gap: the errors accrued when shifting policies from simulators to the real world due to unmodeled effects in inaccurate, approximate physics models. Domain randomization - the idea of randomizing all physical parameters in a simulator, forcing a policy to be robust to distributional shifts - has proven useful in transferring reinforcement learning policies onto real robots. In practice, however, the method involves a difficult, trial-and-error process, showing high variance in both convergence and performance. We introduce Active Domain Randomization, an algorithm that involves curriculum learning in unstructured task spaces (task spaces where a notion of difficulty - intuitively easy or hard tasks - is not readily available). Active Domain Randomization shows strong performance on zero-shot transfer on real robots. The thesis also introduces other variants of the algorithm, including one that allows for the incorporation of a safety prior and one that is applicable to the field of Meta-Reinforcement Learning. We also analyze curriculum learning from an optimization perspective and attempt to justify the benefit of the algorithm by studying gradient interference.
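For orientation, plain (uniform) domain randomization looks like the Python sketch below; Active Domain Randomization replaces the fixed uniform sampling with a learned curriculum over these parameters. The parameter names, ranges, and the env.set_physics call are hypothetical, and a classic Gym-style step API is assumed.

```python
import random

# Illustrative ranges for a hypothetical simulator; real ranges depend on the robot and task.
RANDOMIZATION_RANGES = {
    "friction":   (0.5, 1.5),
    "link_mass":  (0.8, 1.2),   # multiplier on nominal mass
    "motor_gain": (0.9, 1.1),
}

def sample_randomized_params():
    """Uniform domain randomization: draw one simulator configuration per episode."""
    return {name: random.uniform(lo, hi) for name, (lo, hi) in RANDOMIZATION_RANGES.items()}

def run_episode(env, policy):
    env.set_physics(**sample_randomized_params())   # hypothetical simulator API
    obs, done, ret = env.reset(), False, 0.0
    while not done:
        obs, reward, done, _ = env.step(policy(obs))
        ret += reward
    return ret
```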

Books on the topic "Reinforcement Learning Generalization"

1

Higa, Jennifer J. The effects of stimulus class on dimensional contrast. 1987.


Book chapters on the topic "Reinforcement Learning Generalization"

1

Fonteneau, Raphael, Susan A. Murphy, Louis Wehenkel, and Damien Ernst. "Towards Min Max Generalization in Reinforcement Learning." In Communications in Computer and Information Science, 61–77. Berlin, Heidelberg: Springer Berlin Heidelberg, 2011. http://dx.doi.org/10.1007/978-3-642-19890-8_5.

2

Gong, Xudong, Hongda Jia, Xing Zhou, Dawei Feng, Bo Ding, and Jie Xu. "Improving Policy Generalization for Teacher-Student Reinforcement Learning." In Knowledge Science, Engineering and Management, 39–47. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-55393-7_4.

3

Ponsen, Marc, Matthew E. Taylor, and Karl Tuyls. "Abstraction and Generalization in Reinforcement Learning: A Summary and Framework." In Adaptive and Learning Agents, 1–32. Berlin, Heidelberg: Springer Berlin Heidelberg, 2010. http://dx.doi.org/10.1007/978-3-642-11814-2_1.

4

Zholus, Artem, and Aleksandr I. Panov. "Case-Based Task Generalization in Model-Based Reinforcement Learning." In Artificial General Intelligence, 344–54. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-030-93758-4_35.

5

Qian, Yiming, Fangzhou Xiong, and Zhiyong Liu. "Intra-domain Knowledge Generalization in Cross-Domain Lifelong Reinforcement Learning." In Communications in Computer and Information Science, 386–94. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-63823-8_45.

6

Wan, Kejia, Xinhai Xu, and Yuan Li. "Improving Generalization of Reinforcement Learning for Multi-agent Combating Games." In Neural Information Processing, 64–74. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-92270-2_6.

7

Naruse, Keitarou, and Yukinori Kakazu. "Rule Generation and Generalization by Inductive Decision Tree and Reinforcement Learning." In Distributed Autonomous Robotic Systems, 91–98. Tokyo: Springer Japan, 1994. http://dx.doi.org/10.1007/978-4-431-68275-2_9.

8

Shibata, Takeshi, Ryo Yoshinaka, and Takashi Chikayama. "Probabilistic Generalization of Simple Grammars and Its Application to Reinforcement Learning." In Lecture Notes in Computer Science, 348–62. Berlin, Heidelberg: Springer Berlin Heidelberg, 2006. http://dx.doi.org/10.1007/11894841_28.

9

Zou, Qiming, and Einoshin Suzuki. "Contrastive Goal Grouping for Policy Generalization in Goal-Conditioned Reinforcement Learning." In Neural Information Processing, 240–53. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-92185-9_20.

10

Li, Jianghao, Weihong Bi, and Mingda Li. "Hybrid Reinforcement Learning and Uneven Generalization of Learning Space Method for Robot Obstacle Avoidance." In Lecture Notes in Electrical Engineering, 175–82. Berlin, Heidelberg: Springer Berlin Heidelberg, 2013. http://dx.doi.org/10.1007/978-3-642-38460-8_20.


Conference papers on the topic "Reinforcement Learning Generalization"

1

Hansen, Nicklas, and Xiaolong Wang. "Generalization in Reinforcement Learning by Soft Data Augmentation." In 2021 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2021. http://dx.doi.org/10.1109/icra48506.2021.9561103.

2

"A CAUTIOUS APPROACH TO GENERALIZATION IN REINFORCEMENT LEARNING." In 2nd International Conference on Agents and Artificial Intelligence. SciTePress - Science and and Technology Publications, 2010. http://dx.doi.org/10.5220/0002726900640073.

3

Liu, Yong, Chunwei Wu, Xidong Xi, Yan Li, Guitao Cao, Wenming Cao, and Hong Wang. "Adversarial Discriminative Feature Separation for Generalization in Reinforcement Learning." In 2022 International Joint Conference on Neural Networks (IJCNN). IEEE, 2022. http://dx.doi.org/10.1109/ijcnn55064.2022.9892539.

4

Xu, Yunqiu, Meng Fang, Ling Chen, Yali Du, and Chengqi Zhang. "Generalization in Text-based Games via Hierarchical Reinforcement Learning." In Findings of the Association for Computational Linguistics: EMNLP 2021. Stroudsburg, PA, USA: Association for Computational Linguistics, 2021. http://dx.doi.org/10.18653/v1/2021.findings-emnlp.116.

5

Ouyang, Wenbin, Yisen Wang, Shaochen Han, Zhejian Jin, and Paul Weng. "Improving Generalization of Deep Reinforcement Learning-based TSP Solvers." In 2021 IEEE Symposium Series on Computational Intelligence (SSCI). IEEE, 2021. http://dx.doi.org/10.1109/ssci50451.2021.9659970.

6

Kim, Kyungsoo, Jeongsoo Ha, and Yusung Kim. "Self-Predictive Dynamics for Generalization of Vision-based Reinforcement Learning." In Thirty-First International Joint Conference on Artificial Intelligence (IJCAI-22). California: International Joint Conferences on Artificial Intelligence Organization, 2022. http://dx.doi.org/10.24963/ijcai.2022/437.

Abstract:
Vision-based reinforcement learning requires efficient and robust representations of image-based observations, especially when the images contain distracting (task-irrelevant) elements such as shadows, clouds, and light. It becomes more important if those distractions are not exposed during training. We design a Self-Predictive Dynamics (SPD) method to extract task-relevant features efficiently, even in unseen observations after training. SPD uses weak and strong augmentations in parallel, and learns representations by predicting inverse and forward transitions across the two-way augmented versions. In a set of MuJoCo visual control tasks and an autonomous driving task (CARLA), SPD outperforms previous studies in complex observations, and significantly improves the generalization performance for unseen observations. Our code is available at https://github.com/unigary/SPD.
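A condensed sketch of the SPD training signal is given below, with module sizes, head architectures, and the continuous-action MSE losses as assumptions rather than the authors' exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfPredictiveDynamics(nn.Module):
    """Sketch of the SPD idea: encode weakly and strongly augmented observations,
    then predict (i) the action linking consecutive frames (inverse model) and
    (ii) the next latent state (forward model). Sizes are illustrative."""

    def __init__(self, encoder, latent_dim, action_dim):
        super().__init__()
        self.encoder = encoder
        self.inverse_head = nn.Linear(2 * latent_dim, action_dim)
        self.forward_head = nn.Linear(latent_dim + action_dim, latent_dim)

    def loss(self, obs_weak, next_obs_strong, action):
        # Continuous actions of shape (batch, action_dim) are assumed.
        z_t = self.encoder(obs_weak)            # weakly augmented current frame
        z_t1 = self.encoder(next_obs_strong)    # strongly augmented next frame

        # Inverse transition: which action connects z_t to z_t1?
        pred_action = self.inverse_head(torch.cat([z_t, z_t1], dim=1))
        inverse_loss = F.mse_loss(pred_action, action)

        # Forward transition: predict the next latent from current latent + action.
        pred_next = self.forward_head(torch.cat([z_t, action], dim=1))
        forward_loss = F.mse_loss(pred_next, z_t1.detach())

        return inverse_loss + forward_loss
```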
7

Wang, Tianying, Hao Zhang, Wei Qi Toh, Hongyuan Zhu, Cheston Tan, Yan Wu, Yong Liu, and Wei Jing. "Efficient Robotic Task Generalization Using Deep Model Fusion Reinforcement Learning." In 2019 IEEE International Conference on Robotics and Biomimetics (ROBIO). IEEE, 2019. http://dx.doi.org/10.1109/robio49542.2019.8961391.

8

Oonishi, Hiroya, and Hitoshi Iima. "Improving generalization ability in a puzzle game using reinforcement learning." In 2017 IEEE Conference on Computational Intelligence and Games (CIG). IEEE, 2017. http://dx.doi.org/10.1109/cig.2017.8080441.

9

Kanagawa, Yuji, and Tomoyuki Kaneko. "Rogue-Gym: A New Challenge for Generalization in Reinforcement Learning." In 2019 IEEE Conference on Games (CoG). IEEE, 2019. http://dx.doi.org/10.1109/cig.2019.8848075.

10

Fang, Fen, Wenyu Liang, Yan Wu, Qianli Xu, and Joo-Hwee Lim. "Improving Generalization of Reinforcement Learning Using a Bilinear Policy Network." In 2022 IEEE International Conference on Image Processing (ICIP). IEEE, 2022. http://dx.doi.org/10.1109/icip46576.2022.9897349.


Reports on the topic "Reinforcement Learning Generalization"

1

A Decision-Making Method for Connected Autonomous Driving Based on Reinforcement Learning. SAE International, December 2020. http://dx.doi.org/10.4271/2020-01-5154.

Abstract:
At present, with the development of Intelligent Vehicle Infrastructure Cooperative Systems (IVICS), decision-making for automated vehicles under connected environment conditions has attracted growing attention. Reliability, efficiency, and generalization performance are the basic requirements for a vehicle decision-making system. Therefore, this paper proposes a decision-making method for connected autonomous driving based on the Wasserstein Generative Adversarial Nets-Deep Deterministic Policy Gradient (WGAIL-DDPG) algorithm, in which the key component of the reinforcement learning (RL) model, the reward function, is designed from the aspect of vehicle serviceability, covering safety, ride comfort, and handling stability. To reduce the complexity of the proposed model, an imitation learning strategy is introduced to improve the RL training process. Meanwhile, a model training strategy based on cloud computing effectively solves the problem of insufficient computing resources in the vehicle-mounted system. Test results show that the proposed method improves the efficiency of the RL training process while delivering reliable decision-making performance and excellent generalization capability.
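The reward design described in the abstract can be read as a weighted combination of safety, ride-comfort, and handling-stability terms. The sketch below is a hypothetical illustration of that structure only; the terms, signals, and weights are assumptions, not the paper's actual reward function.

```python
def driving_reward(safety_margin, jerk, yaw_rate_error,
                   w_safety=1.0, w_comfort=0.3, w_handling=0.3):
    """Illustrative weighted-sum reward over the three serviceability aspects
    named in the abstract (hypothetical terms and weights)."""
    r_safety = min(safety_margin, 1.0)    # reward keeping distance to surrounding vehicles
    r_comfort = -abs(jerk)                # penalize abrupt longitudinal changes
    r_handling = -abs(yaw_rate_error)     # penalize deviation from stable yaw response
    return w_safety * r_safety + w_comfort * r_comfort + w_handling * r_handling
```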