Ready-made bibliography on the topic "Actor-critic algorithm"
Create a correct reference in APA, MLA, Chicago, Harvard, and many other styles
Browse lists of current articles, books, dissertations, abstracts, and other scholarly sources on the topic "Actor-critic algorithm".
An "Add to bibliography" button is available next to every work in the bibliography. Use it, and we will automatically create a bibliographic reference to the selected work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the scholarly publication as a ".pdf" file and read its abstract online, whenever the relevant details are available in the work's metadata.
Journal articles on the topic "Actor-critic algorithm"
Wang, Jing, and Ioannis Ch Paschalidis. "An Actor-Critic Algorithm With Second-Order Actor and Critic". IEEE Transactions on Automatic Control 62, no. 6 (June 2017): 2689–703. http://dx.doi.org/10.1109/tac.2016.2616384.
Zheng, Liyuan, Tanner Fiez, Zane Alumbaugh, Benjamin Chasnov, and Lillian J. Ratliff. "Stackelberg Actor-Critic: Game-Theoretic Reinforcement Learning Algorithms". Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 8 (June 28, 2022): 9217–24. http://dx.doi.org/10.1609/aaai.v36i8.20908.
Iwaki, Ryo, and Minoru Asada. "Implicit incremental natural actor critic algorithm". Neural Networks 109 (January 2019): 103–12. http://dx.doi.org/10.1016/j.neunet.2018.10.007.
Kim, Gi-Soo, Jane P. Kim, and Hyun-Joon Yang. "Robust Tests in Online Decision-Making". Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 9 (June 28, 2022): 10016–24. http://dx.doi.org/10.1609/aaai.v36i9.21240.
Sergey, Denisov, and Jee-Hyong Lee. "Actor-Critic Algorithm with Transition Cost Estimation". International Journal of Fuzzy Logic and Intelligent Systems 16, no. 4 (December 25, 2016): 270–75. http://dx.doi.org/10.5391/ijfis.2016.16.4.270.
Ahmed, Ayman Elshabrawy M. "Controller parameter tuning using actor-critic algorithm". IOP Conference Series: Materials Science and Engineering 610 (October 11, 2019): 012054. http://dx.doi.org/10.1088/1757-899x/610/1/012054.
Ding, Siyuan, Shengxiang Li, Guangyi Liu, Ou Li, Ke Ke, Yijie Bai, and Weiye Chen. "Decentralized Multiagent Actor-Critic Algorithm Based on Message Diffusion". Journal of Sensors 2021 (December 8, 2021): 1–14. http://dx.doi.org/10.1155/2021/8739206.
Hafez, Muhammad Burhan, Cornelius Weber, Matthias Kerzel, and Stefan Wermter. "Deep intrinsically motivated continuous actor-critic for efficient robotic visuomotor skill learning". Paladyn, Journal of Behavioral Robotics 10, no. 1 (January 1, 2019): 14–29. http://dx.doi.org/10.1515/pjbr-2019-0005.
Zhang, Haifei, Jian Xu, Jian Zhang, and Quan Liu. "Network Architecture for Optimizing Deep Deterministic Policy Gradient Algorithms". Computational Intelligence and Neuroscience 2022 (November 18, 2022): 1–10. http://dx.doi.org/10.1155/2022/1117781.
Jain, Arushi, Gandharv Patil, Ayush Jain, Khimya Khetarpal, and Doina Precup. "Variance Penalized On-Policy and Off-Policy Actor-Critic". Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 9 (May 18, 2021): 7899–907. http://dx.doi.org/10.1609/aaai.v35i9.16964.
Pełny tekst źródłaRozprawy doktorskie na temat "Actor-critic algorithm"
Konda, Vijaymohan (Vijaymohan Gao) 1973. "Actor-critic algorithms". Thesis, Massachusetts Institute of Technology, 2002. http://hdl.handle.net/1721.1/8120.
Includes bibliographical references (leaves 143-147).
Many complex decision making problems like scheduling in manufacturing systems, portfolio management in finance, admission control in communication networks etc., with clear and precise objectives, can be formulated as stochastic dynamic programming problems in which the objective of decision making is to maximize a single "overall" reward. In these formulations, finding an optimal decision policy involves computing a certain "value function" which assigns to each state the optimal reward one would obtain if the system was started from that state. This function then naturally prescribes the optimal policy, which is to take decisions that drive the system to states with maximum value. For many practical problems, the computation of the exact value function is intractable, analytically and numerically, due to the enormous size of the state space. Therefore one has to resort to one of the following approximation methods to find a good sub-optimal policy: (1) Approximate the value function. (2) Restrict the search for a good policy to a smaller family of policies. In this thesis, we propose and study actor-critic algorithms which combine the above two approaches with simulation to find the best policy among a parameterized class of policies. Actor-critic algorithms have two learning units: an actor and a critic. An actor is a decision maker with a tunable parameter. A critic is a function approximator. The critic tries to approximate the value function of the policy used by the actor, and the actor in turn tries to improve its policy based on the current approximation provided by the critic. Furthermore, the critic evolves on a faster time-scale than the actor.
We propose several variants of actor-critic algorithms. In all the variants, the critic uses Temporal Difference (TD) learning with linear function approximation. Some of the variants are inspired by a new geometric interpretation of the formula for the gradient of the overall reward with respect to the actor parameters. This interpretation suggests a natural set of basis functions for the critic, determined by the family of policies parameterized by the actor's parameters. We concentrate on the average expected reward criterion but we also show how the algorithms can be modified for other objective criteria. We prove convergence of the algorithms for problems with general (finite, countable, or continuous) state and decision spaces. To compute the rate of convergence (ROC) of our algorithms, we develop a general theory of the ROC of two-time-scale algorithms and we apply it to study our algorithms. In the process, we study the ROC of TD learning and compare it with related methods such as Least Squares TD (LSTD). We study the effect of the basis functions used for linear function approximation on the ROC of TD. We also show that the ROC of actor-critic algorithms does not depend on the actual basis functions used in the critic but depends only on the subspace spanned by them and study this dependence. Finally, we compare the performance of our algorithms with other algorithms that optimize over a parameterized family of policies. We show that when only the "natural" basis functions are used for the critic, the rate of convergence of the actor-critic algorithms is the same as that of certain stochastic gradient descent algorithms ...
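The Konda thesis abstract above is the one entry in this listing that spells out how an actor-critic method is put together: a parameterized actor, a critic that approximates the actor's value function with linear TD learning, and a critic that runs on a faster time-scale than the actor. The following is a minimal Python sketch of that scheme under the average-reward criterion; the toy MDP, one-hot features, step-size exponents, and every name in the code (P, R, features, policy_probs, and so on) are illustrative assumptions, not details taken from the thesis or from any work cited on this page.

```python
# Minimal average-reward actor-critic sketch (illustrative assumptions throughout:
# the toy MDP, features, and step sizes are not taken from the cited works).
import numpy as np

rng = np.random.default_rng(0)

# Toy MDP: a handful of states and actions with random dynamics and rewards.
n_states, n_actions = 5, 2
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # P[s, a] = next-state distribution
R = rng.uniform(size=(n_states, n_actions))                       # expected reward r(s, a)

def features(s):
    """Critic features phi(s); here simply a one-hot encoding of the state."""
    phi = np.zeros(n_states)
    phi[s] = 1.0
    return phi

def policy_probs(theta, s):
    """Actor: softmax (Gibbs) policy over actions, parameterized by theta."""
    prefs = theta[s] - theta[s].max()
    e = np.exp(prefs)
    return e / e.sum()

theta = np.zeros((n_states, n_actions))  # actor parameters
w = np.zeros(n_states)                   # critic weights (linear value-function approximation)
avg_reward = 0.0                         # running estimate of the average reward
s = 0

for t in range(1, 100_001):
    # Two time-scales: the critic step size beta decays more slowly than the
    # actor step size alpha, so the critic tracks the actor's current policy.
    beta = 1.0 / t ** 0.6
    alpha = 1.0 / t

    probs = policy_probs(theta, s)
    a = rng.choice(n_actions, p=probs)
    s_next = rng.choice(n_states, p=P[s, a])
    r = R[s, a]

    # Critic: average-reward TD(0) error and linear-function-approximation update.
    delta = r - avg_reward + w @ features(s_next) - w @ features(s)
    avg_reward += beta * (r - avg_reward)
    w += beta * delta * features(s)

    # Actor: policy-gradient step that uses the TD error as an advantage estimate.
    grad_log_pi = -probs
    grad_log_pi[a] += 1.0
    theta[s] += alpha * delta * grad_log_pi

    s = s_next

print("estimated average reward:", avg_reward)
```

The two step-size schedules are the point of the sketch: because beta decays more slowly than alpha, the critic and the average-reward estimate adapt quickly to whatever policy the actor currently follows, which is the two-time-scale structure the abstract describes.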
Saxena, Naman. "Average Reward Actor-Critic with Deterministic Policy Search". Thesis, 2023. https://etd.iisc.ac.in/handle/2005/6175.
Diddigi, Raghuram Bharadwaj. "Reinforcement Learning Algorithms for Off-Policy, Multi-Agent Learning and Applications to Smart Grids". Thesis, 2022. https://etd.iisc.ac.in/handle/2005/5673.
Lakshmanan, K. "Online Learning and Simulation Based Algorithms for Stochastic Optimization". Thesis, 2012. http://etd.iisc.ac.in/handle/2005/3245.
Pełny tekst źródłaCzęści książek na temat "Actor-critic algorithm"
Kim, Chayoung, Jung-min Park, and Hye-young Kim. "An Actor-Critic Algorithm for SVM Hyperparameters". In Information Science and Applications 2018, 653–61. Singapore: Springer Singapore, 2018. http://dx.doi.org/10.1007/978-981-13-1056-0_64.
Zha, ZhongYi, XueSong Tang, and Bo Wang. "An Advanced Actor-Critic Algorithm for Training Video Game AI". In Neural Computing for Advanced Applications, 368–80. Singapore: Springer Singapore, 2020. http://dx.doi.org/10.1007/978-981-15-7670-6_31.
Melo, Francisco S., and Manuel Lopes. "Fitted Natural Actor-Critic: A New Algorithm for Continuous State-Action MDPs". In Machine Learning and Knowledge Discovery in Databases, 66–81. Berlin, Heidelberg: Springer Berlin Heidelberg, 2008. http://dx.doi.org/10.1007/978-3-540-87481-2_5.
Sun, Qifeng, Hui Ren, Youxiang Duan, and Yanan Yan. "The Adaptive PID Controlling Algorithm Using Asynchronous Advantage Actor-Critic Learning Method". In Simulation Tools and Techniques, 498–507. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-32216-8_48.
Liu, Guiliang, Xu Li, Miningming Sun, and Ping Li. "An Advantage Actor-Critic Algorithm with Confidence Exploration for Open Information Extraction". In Proceedings of the 2020 SIAM International Conference on Data Mining, 217–25. Philadelphia, PA: Society for Industrial and Applied Mathematics, 2020. http://dx.doi.org/10.1137/1.9781611976236.25.
Cheng, Yuhu, Huanting Feng, and Xuesong Wang. "Actor-Critic Algorithm Based on Incremental Least-Squares Temporal Difference with Eligibility Trace". In Advanced Intelligent Computing Theories and Applications. With Aspects of Artificial Intelligence, 183–88. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012. http://dx.doi.org/10.1007/978-3-642-25944-9_24.
Jiang, Haobo, Jianjun Qian, Jin Xie, and Jian Yang. "Episode-Experience Replay Based Tree-Backup Method for Off-Policy Actor-Critic Algorithm". In Pattern Recognition and Computer Vision, 562–73. Cham: Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-030-03398-9_48.
Chuyen, T. D., Dao Huy Du, N. D. Dien, R. V. Hoa, and N. V. Toan. "Building Intelligent Navigation System for Mobile Robots Based on the Actor – Critic Algorithm". In Advances in Engineering Research and Application, 227–38. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-030-92574-1_24.
Zhang, Huaqing, Hongbin Ma, and Ying Jin. "An Improved Off-Policy Actor-Critic Algorithm with Historical Behaviors Reusing for Robotic Control". In Intelligent Robotics and Applications, 449–58. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-13841-6_41.
Park, Jooyoung, Jongho Kim, and Daesung Kang. "An RLS-Based Natural Actor-Critic Algorithm for Locomotion of a Two-Linked Robot Arm". In Computational Intelligence and Security, 65–72. Berlin, Heidelberg: Springer Berlin Heidelberg, 2005. http://dx.doi.org/10.1007/11596448_9.
Pełny tekst źródłaStreszczenia konferencji na temat "Actor-critic algorithm"
Wang, Jing, and Ioannis Ch Paschalidis. "A Hessian actor-critic algorithm". In 2014 IEEE 53rd Annual Conference on Decision and Control (CDC). IEEE, 2014. http://dx.doi.org/10.1109/cdc.2014.7039533.
Yaputra, Jordi, and Suyanto Suyanto. "The Effect of Discounting Actor-loss in Actor-Critic Algorithm". In 2021 4th International Seminar on Research of Information Technology and Intelligent Systems (ISRITI). IEEE, 2021. http://dx.doi.org/10.1109/isriti54043.2021.9702883.
Aleixo, Everton, Juan Colonna, and Raimundo Barreto. "SVC-A2C - Actor Critic Algorithm to Improve Smart Vacuum Cleaner". In IX Simpósio Brasileiro de Engenharia de Sistemas Computacionais. Sociedade Brasileira de Computação - SBC, 2019. http://dx.doi.org/10.5753/sbesc_estendido.2019.8637.
Prabuchandran K.J., Shalabh Bhatnagar, and Vivek S. Borkar. "An actor critic algorithm based on Grassmanian search". In 2014 IEEE 53rd Annual Conference on Decision and Control (CDC). IEEE, 2014. http://dx.doi.org/10.1109/cdc.2014.7039948.
Yang, Zhuoran, Kaiqing Zhang, Mingyi Hong, and Tamer Basar. "A Finite Sample Analysis of the Actor-Critic Algorithm". In 2018 IEEE Conference on Decision and Control (CDC). IEEE, 2018. http://dx.doi.org/10.1109/cdc.2018.8619440.
Vrushabh, D., Shalini K, and K. Sonam. "Actor-Critic Algorithm for Optimal Synchronization of Kuramoto Oscillator". In 2020 7th International Conference on Control, Decision and Information Technologies (CoDIT). IEEE, 2020. http://dx.doi.org/10.1109/codit49905.2020.9263785.
Paschalidis, Ioannis Ch, and Yingwei Lin. "Mobile agent coordination via a distributed actor-critic algorithm". In Automation (MED 2011). IEEE, 2011. http://dx.doi.org/10.1109/med.2011.5983038.
Diddigi, Raghuram Bharadwaj, Prateek Jain, Prabuchandran K. J, and Shalabh Bhatnagar. "Neural Network Compatible Off-Policy Natural Actor-Critic Algorithm". In 2022 International Joint Conference on Neural Networks (IJCNN). IEEE, 2022. http://dx.doi.org/10.1109/ijcnn55064.2022.9892303.
Liu, Bo, Yue Zhang, Shupo Fu, and Xuan Liu. "Reduce UAV Coverage Energy Consumption through Actor-Critic Algorithm". In 2019 15th International Conference on Mobile Ad-Hoc and Sensor Networks (MSN). IEEE, 2019. http://dx.doi.org/10.1109/msn48538.2019.00069.
Zhong, Shan, Quan Liu, Shengrong Gong, Qiming Fu, and Jin Xu. "Efficient actor-critic algorithm with dual piecewise model learning". In 2017 IEEE Symposium Series on Computational Intelligence (SSCI). IEEE, 2017. http://dx.doi.org/10.1109/ssci.2017.8280911.