Academic literature on the topic 'Improper reinforcement learning'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Improper reinforcement learning.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Journal articles on the topic "Improper reinforcement learning"

1

Dass, Shuvalaxmi, and Akbar Siami Namin. "Reinforcement Learning for Generating Secure Configurations." Electronics 10, no. 19 (2021): 2392. http://dx.doi.org/10.3390/electronics10192392.

Abstract:
Many security problems in software systems are because of vulnerabilities caused by improper configurations. A poorly configured software system leads to a multitude of vulnerabilities that can be exploited by adversaries. The problem becomes even more serious when the architecture of the underlying system is static and the misconfiguration remains for a longer period of time, enabling adversaries to thoroughly inspect the software system under attack during the reconnaissance stage. Employing diversification techniques such as Moving Target Defense (MTD) can minimize the risk of exposing vuln
2

Zhai, Peng, Jie Luo, Zhiyan Dong, Lihua Zhang, Shunli Wang, and Dingkang Yang. "Robust Adversarial Reinforcement Learning with Dissipation Inequation Constraint." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 5 (2022): 5431–39. http://dx.doi.org/10.1609/aaai.v36i5.20481.

Abstract:
Robust adversarial reinforcement learning is an effective method to train agents to manage uncertain disturbance and modeling errors in real environments. However, for systems that are sensitive to disturbances or those that are difficult to stabilize, it is easier to learn a powerful adversary than establish a stable control policy. An improper strong adversary can destabilize the system, introduce biases in the sampling process, make the learning process unstable, and even reduce the robustness of the policy. In this study, we consider the problem of ensuring system stability during training
3

Chen, Ya-Ling, Yan-Rou Cai, and Ming-Yang Cheng. "Vision-Based Robotic Object Grasping—A Deep Reinforcement Learning Approach." Machines 11, no. 2 (2023): 275. http://dx.doi.org/10.3390/machines11020275.

Abstract:
This paper focuses on developing a robotic object grasping approach that possesses the ability of self-learning, is suitable for small-volume large variety production, and has a high success rate in object grasping/pick-and-place tasks. The proposed approach consists of a computer vision-based object detection algorithm and a deep reinforcement learning algorithm with self-learning capability. In particular, the You Only Look Once (YOLO) algorithm is employed to detect and classify all objects of interest within the field of view of a camera. Based on the detection/localization and classificat
4

Bi, Yunrui, Qinglin Ding, Yijun Du, Di Liu, and Shuaihang Ren. "Intelligent Traffic Control Decision-Making Based on Type-2 Fuzzy and Reinforcement Learning." Electronics 13, no. 19 (2024): 3894. http://dx.doi.org/10.3390/electronics13193894.

Abstract:
Intelligent traffic control decision-making has long been a crucial issue for improving the efficiency and safety of the intelligent transportation system. The deficiencies of the Type-1 fuzzy traffic control system in dealing with uncertainty have led to a reduced ability to address traffic congestion. Therefore, this paper proposes a Type-2 fuzzy controller for a single intersection. Based on real-time traffic flow information, the green timing of each phase is dynamically determined to achieve the minimum average vehicle delay. Additionally, in traffic light control, various factors (such a
5

Hurtado-Gómez, Julián, Juan David Romo, Ricardo Salazar-Cabrera, Álvaro Pachón de la Cruz, and Juan Manuel Madrid Molina. "Traffic Signal Control System Based on Intelligent Transportation System and Reinforcement Learning." Electronics 10, no. 19 (2021): 2363. http://dx.doi.org/10.3390/electronics10192363.

Abstract:
Traffic congestion has several causes, including insufficient road capacity, unrestricted demand and improper scheduling of traffic signal phases. A great variety of efforts have been made to properly program such phases. Some of them are based on traditional transportation assumptions, and others are adaptive, allowing the system to learn the control law (signal program) from data obtained from different sources. Reinforcement Learning (RL) is a technique commonly used in previous research. However, properly determining the states and the reward is key to obtain good results and to have a rea
6

Pan, Ziwei. "Design of Interactive Cultural Brand Marketing System based on Cloud Service Platform." Journal of Internet Technology 23, no. 2 (2022): 321–34. http://dx.doi.org/10.53106/160792642022032302012.

Abstract:
Changes in the marketing environment and consumer behavior are the driving force for the development of online marketing. Although traditional marketing communication still exists, it has been unable to adapt to the marketing needs of modern cultural brands. On this basis, this paper combines the cloud service platform to design an interactive cultural brand marketing system. In view of the problems of improper task scheduling and resource waste in cloud platform resource scheduling in actual situations, a dynamic resource scheduling optimization model under the cloud platform environ
7

Kim, Byeongjun, Gunam Kwon, Chaneun Park, and Nam Kyu Kwon. "The Task Decomposition and Dedicated Reward-System-Based Reinforcement Learning Algorithm for Pick-and-Place." Biomimetics 8, no. 2 (2023): 240. http://dx.doi.org/10.3390/biomimetics8020240.

Abstract:
This paper proposes a task decomposition and dedicated reward-system-based reinforcement learning algorithm for the Pick-and-Place task, which is one of the high-level tasks of robot manipulators. The proposed method decomposes the Pick-and-Place task into three subtasks: two reaching tasks and one grasping task. One of the two reaching tasks is approaching the object, and the other is reaching the place position. These two reaching tasks are carried out using each optimal policy of the agents which are trained using Soft Actor-Critic (SAC). Different from the two reaching tasks, the grasping
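The decomposition described above (two reaching subtasks and one grasping subtask, each with its own trained policy) can be sketched as a simple sequencer over sub-policies. Everything below is an invented toy: the paper's policies are SAC-trained networks, whereas these stand-ins move a 1-D gripper coordinate.

```python
# Hypothetical stand-in "policies": each maps a 1-D gripper coordinate to a
# bounded move toward a subtask goal. In the paper these are SAC-trained
# networks; plain functions suffice to illustrate the decomposition.
def reach_object(pos, obj_pos):
    return max(-0.1, min(0.1, obj_pos - pos))    # bounded step toward the object

def grasp(pos, obj_pos):
    return 0.0                                    # hold position while closing the gripper

def reach_place(pos, place_pos):
    return max(-0.1, min(0.1, place_pos - pos))  # bounded step toward the place position

def pick_and_place(start, obj_pos, place_pos, tol=1e-6):
    """Sequence the three subtask policies: reach -> grasp -> reach."""
    pos, log = start, []
    while abs(obj_pos - pos) > tol:              # subtask 1: approach the object
        pos += reach_object(pos, obj_pos)
    log.append(("reached_object", round(pos, 3)))
    pos += grasp(pos, obj_pos)                   # subtask 2: grasp
    log.append(("grasped", round(pos, 3)))
    while abs(place_pos - pos) > tol:            # subtask 3: carry to the place position
        pos += reach_place(pos, place_pos)
    log.append(("placed", round(pos, 3)))
    return log

print(pick_and_place(start=0.0, obj_pos=0.5, place_pos=1.2))
```

The point of the decomposition is that each sub-policy only has to solve a short-horizon problem, which is what makes the per-subtask training tractable.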
8

Wang, Na. "Edge computing based English translation model using fuzzy semantic optimal control technique." PLOS ONE 20, no. 6 (2025): e0320481. https://doi.org/10.1371/journal.pone.0320481.

Abstract:
People’s need for English translation is gradually growing in the modern era of technological advancements, and a computer that can comprehend and interpret English is now more crucial than ever. Some issues, including ambiguity in English translation and improper word choice in translation techniques, must be addressed to enhance the quality of the English translation model and accuracy based on the corpus. Hence, an edge computing-based translation model (FSRL-P2O) is proposed to improve translation accuracy by using huge bilingual corpora, considering Fuzzy Semantic (FS) properties, and max
9

Zhu, Wangwang, Shuli Wen, Qiang Zhao, Bing Zhang, Yuqing Huang, and Miao Zhu. "Deep Reinforcement Learning Based Optimal Operation of Low-Carbon Island Microgrid with High Renewables and Hybrid Hydrogen–Energy Storage System." Journal of Marine Science and Engineering 13, no. 2 (2025): 225. https://doi.org/10.3390/jmse13020225.

Abstract:
Hybrid hydrogen–energy storage systems play a significant role in the operation of island microgrids with high renewable energy penetration: maintaining balance between the power supply and load demand. However, improper operation leads to undesirable costs and increases risks to voltage stability. Here, multi-time-scale scheduling is developed to reduce power costs and improve the operation performance of an island microgrid by integrating deep reinforcement learning with discrete wavelet transform to decompose and mitigate power fluctuations. Specifically, in the day-ahead stage, hydrogen pr
10

Ritonga, Mahyudin, and Fitria Sartika. "Muyûl al-Talâmidh fî Tadrîs al-Qirâ’ah." Jurnal Alfazuna : Jurnal Pembelajaran Bahasa Arab dan Kebahasaaraban 6, no. 1 (2021): 36–52. http://dx.doi.org/10.15642/alfazuna.v6i1.1715.

Abstract:
Purpose- This study aims to reveal the spirit and motivation of learners in studying Qiro'ah, specifically the study is focused on the description of the forms of motivation of learners, factors that affect the motivation of learners in learning qiro'ah, as well as the steps taken by teachers in improving the spirit of learners in learning qiro'ah. Design/Methodology/Approach- Research is carried out with a qualitative approach, data collection techniques are observation, interview and documentation studies. This approach was chosen considering that the research data found and analyzed is natu

Dissertations / Theses on the topic "Improper reinforcement learning"

1

Bruchon, Niky. "Feasibility Investigation on Several Reinforcement Learning Techniques to Improve the Performance of the FERMI Free-Electron Laser." Doctoral thesis, Università degli Studi di Trieste, 2021. http://hdl.handle.net/11368/2982117.

Abstract:
The research carried out in particle accelerator facilities does not concern only particle and condensed matter physics, although these are the main topics covered in the field. Indeed, since a particle accelerator is composed of many different sub-systems, its proper functioning depends both on each of these parts and their interconnection. It follows that the study, implementation, and improvement of the various sub-systems are fundamental points of investigation too. In particular, an interesting aspect for the automation engineering community is the control of such systems that usually are
2

Kreutmayr, Fabian, and Markus Imlauer. "Application of machine learning to improve the performance of a pressure-controlled system." Technische Universität Dresden, 2020. https://tud.qucosa.de/id/qucosa%3A71076.

Abstract:
Due to the robustness and flexibility of hydraulic components, hydraulic control systems are used in a wide range of applications under various environmental conditions. However, the coverage of this broad field of applications often comes with a loss of performance. Especially when conditions and working points change often, hydraulic control systems cannot work at their optimum. Flexible electronic controllers in combination with techniques from the field of machine learning have the potential to overcome these issues. By applying a reinforcement learning algorithm, this paper examines wheth
3

Zaki, Mohammadi. "Algorithms for Online Learning in Structured Environments." Thesis, 2022. https://etd.iisc.ac.in/handle/2005/6080.

Abstract:
Online learning deals with the study of making decisions sequentially using information gathered along the way. Typical goals of an online learning agent can be to maximize the reward gained during learning or to identify the best possible action to take with the maximum (expected) reward. We study this problem in the setting where the environment has some inbuilt structure. This structure can be exploited by the learning agent while making decisions to accelerate the process of learning from data. We study a number of such problems in this dissertation. We begin with regret minimizati
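The thesis above studies regret minimization in structured online learning. As generic background only (not the thesis's own algorithms), the classic UCB1 rule pulls the arm maximizing its empirical mean plus an exploration bonus; the arm parameters and horizon below are arbitrary.

```python
import math
import random

def ucb1(means, horizon, seed=0):
    """Standard UCB1 on a Bernoulli bandit: pull the arm with the highest
    empirical mean plus sqrt(2 ln t / n_a). `means` are the true arm
    parameters, used here only to simulate rewards."""
    rng = random.Random(seed)
    n_arms = len(means)
    counts = [0] * n_arms          # pulls per arm
    sums = [0.0] * n_arms          # cumulative reward per arm
    for t in range(1, horizon + 1):
        if t <= n_arms:            # initialization: play each arm once
            arm = t - 1
        else:
            arm = max(range(n_arms), key=lambda a: sums[a] / counts[a]
                      + math.sqrt(2 * math.log(t) / counts[a]))
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts

pulls = ucb1([0.2, 0.5, 0.8], horizon=2000)
print(pulls)  # the 0.8 arm should receive the large majority of pulls
```

Suboptimal arms are pulled only O(log T) times, which is the regret-minimization behavior the thesis's structured settings refine further.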
4

Chi, Lu-cheng, and 紀律呈. "An Improved Deep Reinforcement Learning with Sparse Rewards." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/eq94pr.

Abstract:
Master's thesis, National Sun Yat-sen University, Department of Electrical Engineering, ROC academic year 107. In reinforcement learning, how an agent explores in an environment with sparse rewards is a long-standing problem. The improved deep reinforcement learning described in this thesis encourages an agent to explore unvisited environmental states in an environment with sparse rewards. In deep reinforcement learning, an agent directly uses an image observation from the environment as an input to the neural network. However, some neglected observations from the environment, such as depth, might provide valuable information. The improved deep reinforcement learning described
5

Huang, Hsin-Jung. "Applying Reinforcement Learning to Improve NPC Game Character Intelligence." Thesis, 2007. http://ndltd.ncl.edu.tw/handle/38802886766630465543.

Abstract:
Master's thesis, Da-Yeh University, Department of Information Management, ROC academic year 95. Today, video games are the most popular entertainment for young people. With rapidly developing computer technology, the quality and complexity of AI (Artificial Intelligence) used in computer games are gradually increasing. Today, AI has become a vital element of computer games. Intelligent NPCs (Non-Player Characters) which can act as playmates are becoming an essential element of most video games. How to enhance the intelligence of game characters has become an important research topic. This study proposes a cooperative reinforcement learning structure of NPC
6

Chen, Chia-Hao. "Improve Top ASR Hypothesis with Re-correction by Reinforcement Learning." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/zde779.

Abstract:
Master's thesis, National Central University, Department of Computer Science and Information Engineering, ROC academic year 107. In real situations, utterances are transcribed by ASR (Automatic Speech Recognition) systems, which usually propose multiple candidate transcriptions (hypotheses). Most of the time, the first hypothesis is the best and most commonly used. But the first hypothesis of ASR in a noisy environment often misses some words that are important to LU (Language Understanding), and these words can be found in the second hypothesis. On the whole, however, the first ASR hypothesis is significantly better than the second ASR hypothesis. It is not the best choice if we abandon the
7

Hsu, Yung-Chi. "Improved Safe Reinforcement Learning Based Self Adaptive Evolutionary Algorithms for Neuro-Fuzzy Controller Design." Thesis, 2009. http://ndltd.ncl.edu.tw/handle/43659775487135397105.

Abstract:
Doctoral dissertation, National Chiao Tung University, Department of Electrical and Control Engineering, ROC academic year 97. In this dissertation, improved safe reinforcement learning based self-adaptive evolutionary algorithms (ISRL-SAEAs) are proposed for TSK-type neuro-fuzzy controller design. The ISRL-SAEAs improve not only the design of the reinforcement signal but also traditional evolutionary algorithms. There are two parts in the proposed ISRL-SAEAs. In the first part, the SAEAs are proposed to solve the following problems: 1) all the fuzzy rules are encoded into one chromosome; 2) the number of fuzzy rules has to be assigned in advance; and 3) the population cannot evaluate
8

Lin, Ching-Pin. "Using Reinforcement Learning to Improve a Simple Intra-day Trading System of Taiwan Stock Index Future." Thesis, 2009. http://ndltd.ncl.edu.tw/handle/34369847383488676186.

Abstract:
Master's thesis, National Taiwan University, Graduate Institute of Computer Science and Information Engineering, ROC academic year 97. This thesis applies the Q-learning algorithm of reinforcement learning to improve a simple intra-day trading system for Taiwan stock index futures. We simulate the performance of the original strategy by back-testing it with historical data. Furthermore, we use historical information as training data for reinforcement learning and examine the resulting improvement. The training data are the tick data of every trading day from 2003 to 2007, and the testing period is from January 2008 to May 2009. The original strategy is a trend-following channel breakout system. We ta
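The thesis applies Q-learning to a trading strategy; its actual states and actions come from channel-breakout signals. As a generic sketch of the update rule it relies on, here is tabular Q-learning on an invented 5-state toy chain (the chain, rewards, and hyperparameters are not from the thesis).

```python
import random

def q_learning(episodes=300, alpha=0.2, gamma=0.95, eps=0.1, seed=1):
    """Tabular Q-learning on a toy 5-state chain: action 1 moves right toward
    a terminal reward at the last state, action 0 stays put. This illustrates
    only the update rule Q(s,a) += alpha*(r + gamma*max_a' Q(s',a') - Q(s,a));
    the thesis's trading states and actions are far richer than this toy."""
    rng = random.Random(seed)
    n_states, goal = 5, 4
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        for _ in range(50):                       # cap episode length
            # epsilon-greedy with random tie-breaking
            explore = rng.random() < eps or q[s][0] == q[s][1]
            a = rng.randrange(2) if explore else (1 if q[s][1] > q[s][0] else 0)
            s_next = min(s + a, goal)
            done = s_next == goal
            r = 1.0 if done else 0.0
            target = r if done else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])  # the Q-learning update
            if done:
                break
            s = s_next
    return q

q = q_learning()
print([round(max(vals), 2) for vals in q])        # values grow toward the goal state
```

After training, the greedy policy takes action 1 in every state, and the state values decay by roughly a factor of gamma per step away from the reward.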

Books on the topic "Improper reinforcement learning"

1

Urtāns, Ēvalds. Function shaping in deep learning. RTU Press, 2021. http://dx.doi.org/10.7250/9789934226854.

Abstract:
This work describes the importance of loss functions and related methods for deep reinforcement learning and deep metric learning. A novel MDQN loss function outperformed the DDQN loss function in PLE computer game environments, and a novel Exponential Triplet loss function outperformed the Triplet loss function in the face re-identification task with the VGGFace2 dataset, reaching 85.7% accuracy in a zero-shot setting. This work also presents a novel UNet-RNN-Skip model to improve the performance of the value function for path planning tasks.
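For context on the metric-learning side: the Exponential Triplet loss is the book's own contribution and is not reproduced here, but the standard Triplet loss it is compared against can be stated in a few lines.

```python
def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss: pull the anchor toward the positive embedding
    and push it away from the negative by at least `margin`:
    L = max(0, d(a, p) - d(a, n) + margin), with squared Euclidean distance.
    (The book's Exponential Triplet loss is a variant of this baseline.)"""
    d_pos = sum((a - p) ** 2 for a, p in zip(anchor, positive))
    d_neg = sum((a - n) ** 2 for a, n in zip(anchor, negative))
    return max(0.0, d_pos - d_neg + margin)

# An easy triplet (negative far away) incurs zero loss; a hard one does not.
print(triplet_loss([0.0, 0.0], [0.1, 0.0], [1.0, 1.0]))  # 0.0
print(triplet_loss([0.0, 0.0], [0.5, 0.0], [0.4, 0.0]))  # positive loss
```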
2

Rohsenow, Damaris J., and Megan M. Pinkston-Camp. Cognitive-Behavioral Approaches. Edited by Kenneth J. Sher. Oxford University Press, 2014. http://dx.doi.org/10.1093/oxfordhb/9780199381708.013.010.

Abstract:
Cognitive-behavioral approaches to treatment are derived from learning principles underlying behavioral and/or cognitive therapy. Only evidence-based approaches are recommended for practice. Support for different approaches varies across substance use disorders. For alcohol use disorders, cognitive-behavioral coping skills training and cue-exposure treatment are beneficial when added to an integrated treatment program. For cocaine dependence, contingency management combined with coping skills training or community reinforcement, and coping skills training added to a full treatment program, pro
3

Carmo, Mafalda. Education Applications & Developments VI. inScience Press, 2021. http://dx.doi.org/10.36315/2021eadvi.

Abstract:
In this sixth volume, a dedicated set of authors explore the Education field, contributing to the frontlines of knowledge. Success depends on the participation of those who wish to find creative solutions and believe in their potential to change the world, altogether to increase public engagement and cooperation from communities. Part of our mission is to serve society with these initiatives and promote knowledge, therefore it requires the reinforcement of research efforts, education and science and cooperation between the most diverse studies and backgrounds. The contents of this 6th edition

Book chapters on the topic "Improper reinforcement learning"

1

Wang, Kunfu, Ruolin Xing, Wei Feng, and Baiqiao Huang. "A Method of UAV Formation Transformation Based on Reinforcement Learning Multi-agent." In Proceeding of 2021 International Conference on Wireless Communications, Networking and Applications. Springer Nature Singapore, 2022. http://dx.doi.org/10.1007/978-981-19-2456-9_20.

Abstract:
In the face of increasingly complex combat tasks and an unpredictable combat environment, a single UAV cannot meet operational requirements, so UAVs perform tasks cooperatively. In this paper, an improved heuristic reinforcement learning algorithm is proposed to solve the formation transformation problem of multiple UAVs by using a multi-agent reinforcement learning algorithm and a heuristic function. With the help of the heuristic back-propagation algorithm for formation transformation, the convergence efficiency of reinforcement learning is improved. Through the above reinforcement
2

Singh, Moirangthem Tiken, Aninda Chakrabarty, Bhargab Sarma, and Sourav Dutta. "An Improved On-Policy Reinforcement Learning Algorithm." In Advances in Intelligent Systems and Computing. Springer Singapore, 2020. http://dx.doi.org/10.1007/978-981-15-7394-1_30.

3

Saeed, Muhammad, Hassaan Muhammad, Narmeen Sabah, et al. "Reinforcement Learning to Improve Finite Element Simulations for Shaft and Hub Connections." In ARENA2036. Springer Nature Switzerland, 2025. https://doi.org/10.1007/978-3-031-88831-1_26.

Abstract:
Advancements in technology and numerical methods have shifted from slow, resource-intensive software to faster predictive solutions powered by artificial intelligence (AI). An exemplary case is the analysis of interference fit connections between a cylindrical shaft and hub, which has the potential to redefine optimal design, minimizing stress and maximizing torque transmission. Traditional experimental analysis using Finite Element Method (FEM) simulations is undeniably time-consuming, inefficient, and complex, thus necessitating the deployment of AI as a pivotal tool in industrial a
4

Ma, Ping, and Hong-Li Zhang. "Improved Artificial Bee Colony Algorithm Based on Reinforcement Learning." In Intelligent Computing Theories and Application. Springer International Publishing, 2016. http://dx.doi.org/10.1007/978-3-319-42294-7_64.

5

Dai, Zixiang, and Mingyan Jiang. "An Improved Lion Swarm Algorithm Based on Reinforcement Learning." In Advances in Intelligent Automation and Soft Computing. Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-81007-8_10.

6

Kim, Jongrae. "Improved Robustness Analysis of Reinforcement Learning Embedded Control Systems." In Robot Intelligence Technology and Applications 6. Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-030-97672-9_10.

7

Han, Hongwei, Guanghong Gong, and Ni Li. "Improved Priority-Based Hindsight Experience Replay in Reinforcement Learning." In Communications in Computer and Information Science. Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-7225-4_14.

8

Ghaffari, Mohsen, Mahsa Varshosaz, Einar Broch Johnsen, and Andrzej Wąsowski. "Symbolic State Partitioning for Reinforcement Learning." In Lecture Notes in Computer Science. Springer Nature Switzerland, 2025. https://doi.org/10.1007/978-3-031-90900-9_7.

Abstract:
Tabular reinforcement learning methods cannot operate directly on continuous state spaces. One solution to this problem is to partition the state space. A good partitioning enables generalization during learning and more efficient exploitation of prior experiences. Consequently, the learning process becomes faster and produces more reliable policies. However, partitioning introduces approximation, which is particularly harmful in the presence of nonlinear relations between state components. An ideal partition should be as coarse as possible, while capturing the key structure of the st
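The naive alternative that this chapter improves upon, uniform grid partitioning, can be sketched directly; the symbolic partitioning proposed in the chapter adapts cell boundaries to problem structure instead. The bounds and bin count below are illustrative, not from the chapter.

```python
def make_discretizer(lows, highs, bins):
    """Uniform grid partitioning of a continuous state space: each dimension
    is split into `bins` equal cells, and a state maps to a tuple of cell
    indices usable as a Q-table key. This is the naive baseline the chapter
    contrasts with; it ignores the structure between state components."""
    def discretize(state):
        idx = []
        for x, lo, hi in zip(state, lows, highs):
            # clamp into range, scale to [0, bins), floor to a cell index
            frac = (min(max(x, lo), hi) - lo) / (hi - lo)
            idx.append(min(int(frac * bins), bins - 1))
        return tuple(idx)
    return discretize

# e.g. a cart-pole-like 2-D state: position in [-2.4, 2.4], angle in [-0.2, 0.2]
disc = make_discretizer(lows=[-2.4, -0.2], highs=[2.4, 0.2], bins=10)
print(disc([0.0, 0.0]))    # -> (5, 5)
print(disc([-2.4, 0.19]))  # -> (0, 9)
```

Every state inside a cell shares one Q-table entry, which is exactly the approximation the abstract warns about when the partition cuts across nonlinear structure.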
9

Reid, Mark, and Malcolm Ryan. "Using ILP to Improve Planning in Hierarchical Reinforcement Learning." In Inductive Logic Programming. Springer Berlin Heidelberg, 2000. http://dx.doi.org/10.1007/3-540-44960-4_11.

10

Callegari, Daniel Antonio, and Flávio Moreira de Oliveira. "Applying Reinforcement Learning to Improve MCOE, an Intelligent Learning Environment for Ecology." In Lecture Notes in Computer Science. Springer Berlin Heidelberg, 2000. http://dx.doi.org/10.1007/10720076_26.


Conference papers on the topic "Improper reinforcement learning"

1

Santhiya, S. Anu, N. Janavee, B. Yazhini, and K. Subramanian. "Improved Reinforcement Learning Path Planning Algorithm." In 2025 Emerging Technologies for Intelligent Systems (ETIS). IEEE, 2025. https://doi.org/10.1109/etis64005.2025.10961154.

2

Kavedia, Manoj Sheshmal, and Jitendrakumar Namdeo Shinde. "Improved Reinforcement Learning Model for Traffic Management." In 2024 4th Asian Conference on Innovation in Technology (ASIANCON). IEEE, 2024. https://doi.org/10.1109/asiancon62057.2024.10837974.

3

Ghaffari, Mohsen, Cong Chen, Mahsa Varshosaz, Einar Broch Johnsen, and Andrzej Wąsowski. "Symbolic State Seeding Improves Coverage of Reinforcement Learning." In 2025 IEEE/ACM 20th Symposium on Software Engineering for Adaptive and Self-Managing Systems (SEAMS). IEEE, 2025. https://doi.org/10.1109/seams66627.2025.00009.

4

Zhang, Wen, Jing Wang, and Ning Wan. "Meta-reinforcement Learning Task Planning Based on Improved Curriculum Learning Sampling." In 2024 China Automation Congress (CAC). IEEE, 2024. https://doi.org/10.1109/cac63892.2024.10864762.

5

Shao, Yongqi, Renxin Xu, Cong Tan, Gaofeng Liu, Tao Fang, and Hong Huo. "Reinforcement Learning for Improved Alignment in Radiology Reports Generation." In 2024 China Automation Congress (CAC). IEEE, 2024. https://doi.org/10.1109/cac63892.2024.10864815.

6

Wang, Shuhai, Ruixiang Gao, and Yu Bai. "Reinforcement Learning-based Improved DDPG for Robotic Arm Grasping." In 2024 2nd International Conference on Artificial Intelligence and Automation Control (AIAC). IEEE, 2024. https://doi.org/10.1109/aiac63745.2024.10899651.

7

Liu, Zhenting, and Shan Liu. "Improved Residual Reinforcement Learning for Dynamic Obstacle Avoidance of Robotic Arm." In 2024 IEEE 13th Data Driven Control and Learning Systems Conference (DDCLS). IEEE, 2024. http://dx.doi.org/10.1109/ddcls61622.2024.10606847.

8

Wang, Xiaoyan, Yujuan Zhang, and Lan Huang. "Improve Robustness of Safe Reinforcement Learning Against Adversarial Attacks." In 2024 4th International Conference on Electronic Information Engineering and Computer Science (EIECS). IEEE, 2024. https://doi.org/10.1109/eiecs63941.2024.10800497.

9

Yan, Xishan, and Weiguang Han. "A Vehicular Networking Routing Protocol Based on Improved Reinforcement Learning." In 2024 16th International Conference on Communication Software and Networks (ICCSN). IEEE, 2024. https://doi.org/10.1109/iccsn63464.2024.10793278.

10

Li, Zhiling, Youcheng Wang, Qin Zhao, Ru Duan, and Xiaoxia Lu. "An AVC Optimization Strategy Based on Improved Deep Reinforcement Learning." In 2024 China International Conference on Electricity Distribution (CICED). IEEE, 2024. http://dx.doi.org/10.1109/ciced63421.2024.10753745.


Reports on the topic "Improper reinforcement learning"

1

Liu, Tairan. Addressing Urban Traffic Congestion: A Deep Reinforcement Learning-Based Approach. Mineta Transportation Institute, 2025. https://doi.org/10.31979/mti.2025.2322.

Abstract:
In an innovative venture, the research team embarked on a mission to redefine urban traffic flow by introducing an automated way to manage traffic light timings. This project integrates two critical technologies, Deep Q-Networks (DQN) and Auto-encoders, into reinforcement learning, with the goal of making traffic smoother and reducing the all-too-common road congestion in simulated city environments. Deep Q-Networks (DQN) are a form of reinforcement learning algorithms that learns the best actions to take in various situations through trial and error. Auto-encoders, on the other hand, are tool
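The report builds on Deep Q-Networks; a faithful DQN needs a deep-learning stack, replay buffer, and target network, so the sketch below shows only the two core ingredients named in the abstract: epsilon-greedy trial-and-error and the TD target r + gamma * max_a' Q(s', a'), using a linear Q-function. The two-queue "traffic" state, phase actions, and reward are invented for illustration.

```python
import random

class LinearQ:
    """Minimal sketch of DQN's core update on a linear Q-function (the
    report's actual model is a neural network with an auto-encoder front
    end; this stand-in keeps the same target and action-selection rule)."""
    def __init__(self, n_features, n_actions, lr=0.01, gamma=0.9, seed=0):
        self.w = [[0.0] * n_features for _ in range(n_actions)]
        self.lr, self.gamma = lr, gamma
        self.rng = random.Random(seed)

    def q(self, s, a):
        return sum(wi * xi for wi, xi in zip(self.w[a], s))

    def act(self, s, eps=0.1):
        if self.rng.random() < eps:                       # explore
            return self.rng.randrange(len(self.w))
        return max(range(len(self.w)), key=lambda a: self.q(s, a))

    def update(self, s, a, r, s_next, done):
        # TD target r + gamma * max_a' Q(s', a'), as in DQN; with done=True
        # (single-step episodes in this demo) it reduces to plain regression.
        target = r if done else r + self.gamma * max(
            self.q(s_next, a2) for a2 in range(len(self.w)))
        err = target - self.q(s, a)
        self.w[a] = [wi + self.lr * err * xi for wi, xi in zip(self.w[a], s)]

# Toy use: state = queue lengths on two approaches; actions = two signal phases.
agent = LinearQ(n_features=2, n_actions=2)
for _ in range(2000):
    s = [agent.rng.uniform(0, 10), agent.rng.uniform(0, 10)]
    a = agent.act(s)
    r = s[a] - s[1 - a]          # serving the longer queue is better (illustrative)
    agent.update(s, a, r, s_next=s, done=True)

print(agent.act([9.0, 1.0], eps=0.0), agent.act([1.0, 9.0], eps=0.0))
```

After training, the greedy policy serves whichever approach has the longer queue, mirroring (in miniature) what the report's DQN learns for signal timings.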
2

Pasupuleti, Murali Krishna. Stochastic Computation for AI: Bayesian Inference, Uncertainty, and Optimization. National Education Services, 2025. https://doi.org/10.62311/nesx/rriv325.

Abstract:
Stochastic computation is a fundamental approach in artificial intelligence (AI) that enables probabilistic reasoning, uncertainty quantification, and robust decision-making in complex environments. This research explores the theoretical foundations, computational techniques, and real-world applications of stochastic methods, focusing on Bayesian inference, Monte Carlo methods, stochastic optimization, and uncertainty-aware AI models. Key topics include probabilistic graphical models, Markov Chain Monte Carlo (MCMC), variational inference, stochastic gradient descent (SGD), and Bayes
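Among the surveyed topics, Markov Chain Monte Carlo is the easiest to sketch concretely: a random-walk Metropolis sampler targeting a standard normal (a minimal illustration, not the report's models; the step size and sample count are arbitrary).

```python
import math
import random

def metropolis(n_samples, step=1.0, seed=0):
    """Random-walk Metropolis targeting an unnormalized N(0, 1) density:
    propose x' = x + U(-step, step), accept with probability
    min(1, pi(x') / pi(x)), computed in log space for stability."""
    rng = random.Random(seed)
    log_pi = lambda x: -0.5 * x * x      # log of the unnormalized N(0,1) density
    x, samples = 0.0, []
    for _ in range(n_samples):
        prop = x + rng.uniform(-step, step)
        if rng.random() < math.exp(min(0.0, log_pi(prop) - log_pi(x))):
            x = prop                      # accept the proposal
        samples.append(x)                 # rejected proposals repeat the state
    return samples

s = metropolis(20000)
mean = sum(s) / len(s)
var = sum((v - mean) ** 2 for v in s) / len(s)
print(round(mean, 2), round(var, 2))      # should be near 0 and 1
```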
3

Cristia, Julian P., Santiago Cueto, Ofer Malamud, and Raphaëlle Aulagnon. Streaking to Success: The Effects of Highlighting Streaks on Student Effort and Achievement. Inter-American Development Bank, 2024. http://dx.doi.org/10.18235/0012912.

Abstract:
We examine whether highlighting streaks encourages 4th to 6th grade students in Peru to increase their use of an online math platform and improve learning. Sixty thousand students were randomly assigned to receive messages that i) highlighted streaks, ii) provided personalized reminders with positive reinforcement, or iii) provided generic reminders, while others were assigned to a control group. Highlighting streaks and providing personalized reminders significantly increased platform use compared to generic reminders and the control group, with streaks more effective on the intensive margin
4

Rinuado, Christina, William Leonard, Christopher Morey, Theresa Coumbe, Jaylen Hopson, and Robert Hilborn. Artificial intelligence (AI)–enabled wargaming agent training. Engineer Research and Development Center (U.S.), 2024. http://dx.doi.org/10.21079/11681/48419.

Abstract:
Fiscal Year 2021 (FY21) work from the Engineer Research and Development Center Institute for Systems Engineering Research lever-aged deep reinforcement learning to develop intelligent systems (red team agents) capable of exhibiting credible behavior within a military course of action wargaming maritime framework infrastructure. Building from the FY21 research, this research effort sought to explore options to improve upon the wargaming framework infrastructure and to investigate opportunities to improve artificial intelligence (AI) agent behavior. Wargaming framework infrastructure enhancement
5

Miles, Gaines E., Yael Edan, F. Tom Turpin, et al. Expert Sensor for Site Specification Application of Agricultural Chemicals. United States Department of Agriculture, 1995. http://dx.doi.org/10.32747/1995.7570567.bard.

Abstract:
In this work multispectral reflectance images are used in conjunction with a neural network classifier for the purpose of detecting and classifying weeds under real field conditions. Multispectral reflectance images which contained different combinations of weeds and crops were taken under actual field conditions. This multispectral reflectance information was used to develop algorithms that could segment the plants from the background as well as classify them into weeds or crops. In order to segment the plants from the background the multispectrial reflectance of plants and background were st
6

A Decision-Making Method for Connected Autonomous Driving Based on Reinforcement Learning. SAE International, 2020. http://dx.doi.org/10.4271/2020-01-5154.

Abstract:
At present, with the development of Intelligent Vehicle Infrastructure Cooperative Systems (IVICS), the decision-making for automated vehicle based on connected environment conditions has attracted more attentions. Reliability, efficiency and generalization performance are the basic requirements for the vehicle decision-making system. Therefore, this paper proposed a decision-making method for connected autonomous driving based on Wasserstein Generative Adversarial Nets-Deep Deterministic Policy Gradient (WGAIL-DDPG) algorithm. In which, the key components for reinforcement learning (RL) model