Academic literature on the topic 'Actor-critic algorithm'
Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Actor-critic algorithm.'
Journal articles on the topic "Actor-critic algorithm"
Wang, Jing, and Ioannis Ch Paschalidis. "An Actor-Critic Algorithm With Second-Order Actor and Critic." IEEE Transactions on Automatic Control 62, no. 6 (June 2017): 2689–703. http://dx.doi.org/10.1109/tac.2016.2616384.
Zheng, Liyuan, Tanner Fiez, Zane Alumbaugh, Benjamin Chasnov, and Lillian J. Ratliff. "Stackelberg Actor-Critic: Game-Theoretic Reinforcement Learning Algorithms." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 8 (June 28, 2022): 9217–24. http://dx.doi.org/10.1609/aaai.v36i8.20908.
Iwaki, Ryo, and Minoru Asada. "Implicit incremental natural actor critic algorithm." Neural Networks 109 (January 2019): 103–12. http://dx.doi.org/10.1016/j.neunet.2018.10.007.
Kim, Gi-Soo, Jane P. Kim, and Hyun-Joon Yang. "Robust Tests in Online Decision-Making." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 9 (June 28, 2022): 10016–24. http://dx.doi.org/10.1609/aaai.v36i9.21240.
Sergey, Denisov, and Jee-Hyong Lee. "Actor-Critic Algorithm with Transition Cost Estimation." International Journal of Fuzzy Logic and Intelligent Systems 16, no. 4 (December 25, 2016): 270–75. http://dx.doi.org/10.5391/ijfis.2016.16.4.270.
Ahmed, Ayman Elshabrawy M. "Controller parameter tuning using actor-critic algorithm." IOP Conference Series: Materials Science and Engineering 610 (October 11, 2019): 012054. http://dx.doi.org/10.1088/1757-899x/610/1/012054.
Ding, Siyuan, Shengxiang Li, Guangyi Liu, Ou Li, Ke Ke, Yijie Bai, and Weiye Chen. "Decentralized Multiagent Actor-Critic Algorithm Based on Message Diffusion." Journal of Sensors 2021 (December 8, 2021): 1–14. http://dx.doi.org/10.1155/2021/8739206.
Hafez, Muhammad Burhan, Cornelius Weber, Matthias Kerzel, and Stefan Wermter. "Deep intrinsically motivated continuous actor-critic for efficient robotic visuomotor skill learning." Paladyn, Journal of Behavioral Robotics 10, no. 1 (January 1, 2019): 14–29. http://dx.doi.org/10.1515/pjbr-2019-0005.
Zhang, Haifei, Jian Xu, Jian Zhang, and Quan Liu. "Network Architecture for Optimizing Deep Deterministic Policy Gradient Algorithms." Computational Intelligence and Neuroscience 2022 (November 18, 2022): 1–10. http://dx.doi.org/10.1155/2022/1117781.
Jain, Arushi, Gandharv Patil, Ayush Jain, Khimya Khetarpal, and Doina Precup. "Variance Penalized On-Policy and Off-Policy Actor-Critic." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 9 (May 18, 2021): 7899–907. http://dx.doi.org/10.1609/aaai.v35i9.16964.
Dissertations / Theses on the topic "Actor-critic algorithm"
Konda, Vijaymohan (Vijaymohan Gao) 1973. "Actor-critic algorithms." Thesis, Massachusetts Institute of Technology, 2002. http://hdl.handle.net/1721.1/8120.
Includes bibliographical references (leaves 143-147).
Many complex decision-making problems, like scheduling in manufacturing systems, portfolio management in finance, admission control in communication networks, etc., with clear and precise objectives, can be formulated as stochastic dynamic programming problems in which the objective of decision making is to maximize a single "overall" reward. In these formulations, finding an optimal decision policy involves computing a certain "value function" which assigns to each state the optimal reward one would obtain if the system were started from that state. This function then naturally prescribes the optimal policy, which is to take decisions that drive the system to states with maximum value. For many practical problems, the computation of the exact value function is intractable, analytically and numerically, due to the enormous size of the state space. Therefore one has to resort to one of the following approximation methods to find a good sub-optimal policy: (1) Approximate the value function. (2) Restrict the search for a good policy to a smaller family of policies. In this thesis, we propose and study actor-critic algorithms which combine the above two approaches with simulation to find the best policy among a parameterized class of policies. Actor-critic algorithms have two learning units: an actor and a critic. An actor is a decision maker with a tunable parameter. A critic is a function approximator. The critic tries to approximate the value function of the policy used by the actor, and the actor in turn tries to improve its policy based on the current approximation provided by the critic. Furthermore, the critic evolves on a faster time-scale than the actor.
We propose several variants of actor-critic algorithms. In all the variants, the critic uses Temporal Difference (TD) learning with linear function approximation. Some of the variants are inspired by a new geometric interpretation of the formula for the gradient of the overall reward with respect to the actor parameters. This interpretation suggests a natural set of basis functions for the critic, determined by the family of policies parameterized by the actor's parameters. We concentrate on the average expected reward criterion but we also show how the algorithms can be modified for other objective criteria. We prove convergence of the algorithms for problems with general (finite, countable, or continuous) state and decision spaces. To compute the rate of convergence (ROC) of our algorithms, we develop a general theory of the ROC of two-time-scale algorithms and we apply it to study our algorithms. In the process, we study the ROC of TD learning and compare it with related methods such as Least Squares TD (LSTD). We study the effect of the basis functions used for linear function approximation on the ROC of TD. We also show that the ROC of actor-critic algorithms does not depend on the actual basis functions used in the critic but depends only on the subspace spanned by them and study this dependence. Finally, we compare the performance of our algorithms with other algorithms that optimize over a parameterized family of policies. We show that when only the "natural" basis functions are used for the critic, the rate of convergence of the actor-critic algorithms is the same as that of certain stochastic gradient descent algorithms ...
by Vijaymohan Konda.
Ph.D.
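The two-time-scale structure described in the abstract above (a critic updated by TD learning with a larger step size, an actor updated more slowly from the critic's estimate, under the average reward criterion) can be illustrated with a minimal sketch. This is not one of the thesis's algorithms: it swaps in a state-value TD(0) critic with one-hot features (the tabular special case of linear function approximation) and a softmax actor. The two-state MDP, the function names `softmax`, `step`, and `train`, and the step sizes `alpha` and `beta` are all assumptions made for illustration.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into probabilities (numerically stable)."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def step(state, action):
    """Hypothetical 2-state MDP: action 0 stays, action 1 switches state.

    Only staying in state 1 pays a reward of 1; everything else pays 0.
    """
    next_state = state if action == 0 else 1 - state
    reward = 1.0 if (state == 1 and action == 0) else 0.0
    return next_state, reward


def train(steps=50_000, alpha=0.02, beta=0.1, seed=0):
    """Run the sketch; alpha is the (slow) actor step, beta the (fast) critic step."""
    rng = random.Random(seed)
    theta = [[0.0, 0.0], [0.0, 0.0]]  # actor: softmax preferences per state
    w = [0.0, 0.0]                    # critic: weights on one-hot state features
    rho = 0.0                         # running estimate of the average reward
    state = 0
    for _ in range(steps):
        probs = softmax(theta[state])
        action = 0 if rng.random() < probs[0] else 1
        next_state, reward = step(state, action)

        # Average-reward TD(0) error; with one-hot features V(s) = w[s].
        delta = reward - rho + w[next_state] - w[state]

        # Critic and average-reward estimate move on the faster time-scale.
        rho += beta * (reward - rho)
        w[state] += beta * delta

        # Actor moves on the slower time-scale along delta * grad log pi.
        for b in range(2):
            grad_log_pi = (1.0 if b == action else 0.0) - probs[b]
            theta[state][b] += alpha * delta * grad_log_pi

        state = next_state
    return [softmax(t) for t in theta], rho
```

Calling `policy, rho = train()` returns the learned action probabilities per state and the running average-reward estimate. In this toy problem the best policy is to switch out of state 0 and then stay in state 1, so `policy[0][1]` and `policy[1][0]` should both end up well above one half. Choosing `beta` larger than `alpha` is what lets the critic track the value of the actor's current policy.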
Saxena, Naman. "Average Reward Actor-Critic with Deterministic Policy Search." Thesis, 2023. https://etd.iisc.ac.in/handle/2005/6175.
Diddigi, Raghuram Bharadwaj. "Reinforcement Learning Algorithms for Off-Policy, Multi-Agent Learning and Applications to Smart Grids." Thesis, 2022. https://etd.iisc.ac.in/handle/2005/5673.
Lakshmanan, K. "Online Learning and Simulation Based Algorithms for Stochastic Optimization." Thesis, 2012. http://etd.iisc.ac.in/handle/2005/3245.
Lakshmanan, K. "Online Learning and Simulation Based Algorithms for Stochastic Optimization." Thesis, 2012. http://hdl.handle.net/2005/3245.
Book chapters on the topic "Actor-critic algorithm"
Kim, Chayoung, Jung-min Park, and Hye-young Kim. "An Actor-Critic Algorithm for SVM Hyperparameters." In Information Science and Applications 2018, 653–61. Singapore: Springer Singapore, 2018. http://dx.doi.org/10.1007/978-981-13-1056-0_64.
Zha, ZhongYi, XueSong Tang, and Bo Wang. "An Advanced Actor-Critic Algorithm for Training Video Game AI." In Neural Computing for Advanced Applications, 368–80. Singapore: Springer Singapore, 2020. http://dx.doi.org/10.1007/978-981-15-7670-6_31.
Melo, Francisco S., and Manuel Lopes. "Fitted Natural Actor-Critic: A New Algorithm for Continuous State-Action MDPs." In Machine Learning and Knowledge Discovery in Databases, 66–81. Berlin, Heidelberg: Springer Berlin Heidelberg, 2008. http://dx.doi.org/10.1007/978-3-540-87481-2_5.
Sun, Qifeng, Hui Ren, Youxiang Duan, and Yanan Yan. "The Adaptive PID Controlling Algorithm Using Asynchronous Advantage Actor-Critic Learning Method." In Simulation Tools and Techniques, 498–507. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-32216-8_48.
Liu, Guiliang, Xu Li, Miningming Sun, and Ping Li. "An Advantage Actor-Critic Algorithm with Confidence Exploration for Open Information Extraction." In Proceedings of the 2020 SIAM International Conference on Data Mining, 217–25. Philadelphia, PA: Society for Industrial and Applied Mathematics, 2020. http://dx.doi.org/10.1137/1.9781611976236.25.
Cheng, Yuhu, Huanting Feng, and Xuesong Wang. "Actor-Critic Algorithm Based on Incremental Least-Squares Temporal Difference with Eligibility Trace." In Advanced Intelligent Computing Theories and Applications. With Aspects of Artificial Intelligence, 183–88. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012. http://dx.doi.org/10.1007/978-3-642-25944-9_24.
Jiang, Haobo, Jianjun Qian, Jin Xie, and Jian Yang. "Episode-Experience Replay Based Tree-Backup Method for Off-Policy Actor-Critic Algorithm." In Pattern Recognition and Computer Vision, 562–73. Cham: Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-030-03398-9_48.
Chuyen, T. D., Dao Huy Du, N. D. Dien, R. V. Hoa, and N. V. Toan. "Building Intelligent Navigation System for Mobile Robots Based on the Actor – Critic Algorithm." In Advances in Engineering Research and Application, 227–38. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-030-92574-1_24.
Zhang, Huaqing, Hongbin Ma, and Ying Jin. "An Improved Off-Policy Actor-Critic Algorithm with Historical Behaviors Reusing for Robotic Control." In Intelligent Robotics and Applications, 449–58. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-13841-6_41.
Park, Jooyoung, Jongho Kim, and Daesung Kang. "An RLS-Based Natural Actor-Critic Algorithm for Locomotion of a Two-Linked Robot Arm." In Computational Intelligence and Security, 65–72. Berlin, Heidelberg: Springer Berlin Heidelberg, 2005. http://dx.doi.org/10.1007/11596448_9.
Conference papers on the topic "Actor-critic algorithm"
Wang, Jing, and Ioannis Ch Paschalidis. "A Hessian actor-critic algorithm." In 2014 IEEE 53rd Annual Conference on Decision and Control (CDC). IEEE, 2014. http://dx.doi.org/10.1109/cdc.2014.7039533.
Yaputra, Jordi, and Suyanto Suyanto. "The Effect of Discounting Actor-loss in Actor-Critic Algorithm." In 2021 4th International Seminar on Research of Information Technology and Intelligent Systems (ISRITI). IEEE, 2021. http://dx.doi.org/10.1109/isriti54043.2021.9702883.
Aleixo, Everton, Juan Colonna, and Raimundo Barreto. "SVC-A2C - Actor Critic Algorithm to Improve Smart Vacuum Cleaner." In IX Simpósio Brasileiro de Engenharia de Sistemas Computacionais. Sociedade Brasileira de Computação - SBC, 2019. http://dx.doi.org/10.5753/sbesc_estendido.2019.8637.
Prabuchandran K.J., Shalabh Bhatnagar, and Vivek S. Borkar. "An actor critic algorithm based on Grassmanian search." In 2014 IEEE 53rd Annual Conference on Decision and Control (CDC). IEEE, 2014. http://dx.doi.org/10.1109/cdc.2014.7039948.
Yang, Zhuoran, Kaiqing Zhang, Mingyi Hong, and Tamer Basar. "A Finite Sample Analysis of the Actor-Critic Algorithm." In 2018 IEEE Conference on Decision and Control (CDC). IEEE, 2018. http://dx.doi.org/10.1109/cdc.2018.8619440.
Vrushabh, D., Shalini K, and K. Sonam. "Actor-Critic Algorithm for Optimal Synchronization of Kuramoto Oscillator." In 2020 7th International Conference on Control, Decision and Information Technologies (CoDIT). IEEE, 2020. http://dx.doi.org/10.1109/codit49905.2020.9263785.
Paschalidis, Ioannis Ch, and Yingwei Lin. "Mobile agent coordination via a distributed actor-critic algorithm." In Automation (MED 2011). IEEE, 2011. http://dx.doi.org/10.1109/med.2011.5983038.
Diddigi, Raghuram Bharadwaj, Prateek Jain, Prabuchandran K. J, and Shalabh Bhatnagar. "Neural Network Compatible Off-Policy Natural Actor-Critic Algorithm." In 2022 International Joint Conference on Neural Networks (IJCNN). IEEE, 2022. http://dx.doi.org/10.1109/ijcnn55064.2022.9892303.
Liu, Bo, Yue Zhang, Shupo Fu, and Xuan Liu. "Reduce UAV Coverage Energy Consumption through Actor-Critic Algorithm." In 2019 15th International Conference on Mobile Ad-Hoc and Sensor Networks (MSN). IEEE, 2019. http://dx.doi.org/10.1109/msn48538.2019.00069.
Zhong, Shan, Quan Liu, Shengrong Gong, Qiming Fu, and Jin Xu. "Efficient actor-critic algorithm with dual piecewise model learning." In 2017 IEEE Symposium Series on Computational Intelligence (SSCI). IEEE, 2017. http://dx.doi.org/10.1109/ssci.2017.8280911.