Journal articles on the topic 'Policy gradient'
Create an accurate reference in APA, MLA, Chicago, Harvard, and other styles
Consult the top 50 journal articles for your research on the topic 'Policy gradient.'
Next to every source in the list of references there is an 'Add to bibliography' button. Press it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.
Browse journal articles on a wide variety of disciplines and organise your bibliography correctly.
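For orientation, most of the articles listed below develop variants of the basic policy-gradient (score-function, or REINFORCE) estimator. A minimal illustrative sketch in Python follows; the bandit setup, reward values, and all variable names are hypothetical and not taken from any cited article.

```python
import numpy as np

# REINFORCE on a toy 3-armed bandit with a softmax policy.
rng = np.random.default_rng(0)
true_rewards = np.array([0.2, 0.5, 0.9])  # hypothetical expected reward per arm
theta = np.zeros(3)                        # softmax policy parameters

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

alpha = 0.1  # learning rate
for _ in range(5000):
    p = softmax(theta)
    a = rng.choice(3, p=p)                         # sample an action
    r = true_rewards[a] + 0.1 * rng.standard_normal()  # noisy reward
    # For a softmax policy, grad log pi(a) = one_hot(a) - p.
    grad_log_pi = -p
    grad_log_pi[a] += 1.0
    theta += alpha * r * grad_log_pi               # REINFORCE update

final_policy = softmax(theta)
print(final_policy)  # probability mass should concentrate on the best arm
```

The update follows the score-function identity E[r * grad log pi(a)] = grad E[r]; the papers below refine this basic estimator with baselines, natural gradients, deterministic gradients, and variance-reduction schemes.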
Cai, Qingpeng, Ling Pan, and Pingzhong Tang. "Deterministic Value-Policy Gradients." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 3316–23. http://dx.doi.org/10.1609/aaai.v34i04.5732.
Peters, Jan. "Policy gradient methods." Scholarpedia 5, no. 11 (2010): 3698. http://dx.doi.org/10.4249/scholarpedia.3698.
Zhao, Tingting, Hirotaka Hachiya, Voot Tangkaratt, Jun Morimoto, and Masashi Sugiyama. "Efficient Sample Reuse in Policy Gradients with Parameter-Based Exploration." Neural Computation 25, no. 6 (June 2013): 1512–47. http://dx.doi.org/10.1162/neco_a_00452.
Baxter, J., P. L. Bartlett, and L. Weaver. "Experiments with Infinite-Horizon, Policy-Gradient Estimation." Journal of Artificial Intelligence Research 15 (November 1, 2001): 351–81. http://dx.doi.org/10.1613/jair.807.
Le, Hung, Majid Abdolshah, Thommen K. George, Kien Do, Dung Nguyen, and Svetha Venkatesh. "Episodic Policy Gradient Training." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 7 (June 28, 2022): 7317–25. http://dx.doi.org/10.1609/aaai.v36i7.20694.
Baxter, J., and P. L. Bartlett. "Infinite-Horizon Policy-Gradient Estimation." Journal of Artificial Intelligence Research 15 (November 1, 2001): 319–50. http://dx.doi.org/10.1613/jair.806.
Pajarinen, Joni, Hong Linh Thai, Riad Akrour, Jan Peters, and Gerhard Neumann. "Compatible natural gradient policy search." Machine Learning 108, no. 8-9 (May 20, 2019): 1443–66. http://dx.doi.org/10.1007/s10994-019-05807-0.
Buffet, Olivier, and Douglas Aberdeen. "The factored policy-gradient planner." Artificial Intelligence 173, no. 5-6 (April 2009): 722–47. http://dx.doi.org/10.1016/j.artint.2008.11.008.
Wang, Lin, Xingang Xu, Xuhui Zhao, Baozhu Li, Ruijuan Zheng, and Qingtao Wu. "A randomized block policy gradient algorithm with differential privacy in Content Centric Networks." International Journal of Distributed Sensor Networks 17, no. 12 (December 2021): 155014772110599. http://dx.doi.org/10.1177/15501477211059934.
Akella, Ravi Tej, Kamyar Azizzadenesheli, Mohammad Ghavamzadeh, Animashree Anandkumar, and Yisong Yue. "Deep Bayesian Quadrature Policy Optimization." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 8 (May 18, 2021): 6600–6608. http://dx.doi.org/10.1609/aaai.v35i8.16817.
Peters, Jan, Katharina Mulling, and Yasemin Altun. "Relative Entropy Policy Search." Proceedings of the AAAI Conference on Artificial Intelligence 24, no. 1 (July 5, 2010): 1607–12. http://dx.doi.org/10.1609/aaai.v24i1.7727.
Han, Shuai, Wenbo Zhou, Shuai Lü, and Jiayu Yu. "Regularly updated deterministic policy gradient algorithm." Knowledge-Based Systems 214 (February 2021): 106736. http://dx.doi.org/10.1016/j.knosys.2020.106736.
D'Oro, Pierluca, Alberto Maria Metelli, Andrea Tirinzoni, Matteo Papini, and Marcello Restelli. "Gradient-Aware Model-Based Policy Search." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 3801–8. http://dx.doi.org/10.1609/aaai.v34i04.5791.
Li, Luntong, Dazi Li, and Tianheng Song. "Feature selection in deterministic policy gradient." Journal of Engineering 2020, no. 13 (July 1, 2020): 403–6. http://dx.doi.org/10.1049/joe.2019.1193.
Zhang, Chuheng, Yuanqi Li, and Jian Li. "Policy Search by Target Distribution Learning for Continuous Control." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 6770–77. http://dx.doi.org/10.1609/aaai.v34i04.6156.
Zhang, Junzi, Jongho Kim, Brendan O'Donoghue, and Stephen Boyd. "Sample Efficient Reinforcement Learning with REINFORCE." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 12 (May 18, 2021): 10887–95. http://dx.doi.org/10.1609/aaai.v35i12.17300.
Jiang, Zhanhong, Xian Yeow Lee, Sin Yong Tan, Kai Liang Tan, Aditya Balu, Young M. Lee, Chinmay Hegde, and Soumik Sarkar. "MDPGT: Momentum-Based Decentralized Policy Gradient Tracking." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 9 (June 28, 2022): 9377–85. http://dx.doi.org/10.1609/aaai.v36i9.21169.
Gomoluch, Paweł, Dalal Alrajeh, and Alessandra Russo. "Learning Classical Planning Strategies with Policy Gradient." Proceedings of the International Conference on Automated Planning and Scheduling 29 (May 25, 2021): 637–45. http://dx.doi.org/10.1609/icaps.v29i1.3531.
El-Laham, Yousef, and Monica F. Bugallo. "Policy Gradient Importance Sampling for Bayesian Inference." IEEE Transactions on Signal Processing 69 (2021): 4245–56. http://dx.doi.org/10.1109/tsp.2021.3093792.
Yu, Hai-Tao, Degen Huang, Fuji Ren, and Lishuang Li. "Diagnostic Evaluation of Policy-Gradient-Based Ranking." Electronics 11, no. 1 (December 23, 2021): 37. http://dx.doi.org/10.3390/electronics11010037.
Zhao, Tingting, Hirotaka Hachiya, Gang Niu, and Masashi Sugiyama. "Analysis and improvement of policy gradient estimation." Neural Networks 26 (February 2012): 118–29. http://dx.doi.org/10.1016/j.neunet.2011.09.005.
Kamiński, Bogumił. "Refined knowledge-gradient policy for learning probabilities." Operations Research Letters 43, no. 2 (March 2015): 143–47. http://dx.doi.org/10.1016/j.orl.2015.01.001.
Cherubini, A., F. Giannone, L. Iocchi, D. Nardi, and P. F. Palamara. "Policy gradient learning for quadruped soccer robots." Robotics and Autonomous Systems 58, no. 7 (July 2010): 872–78. http://dx.doi.org/10.1016/j.robot.2010.03.008.
Pirotta, Matteo, Marcello Restelli, and Luca Bascetta. "Policy gradient in Lipschitz Markov Decision Processes." Machine Learning 100, no. 2-3 (March 3, 2015): 255–83. http://dx.doi.org/10.1007/s10994-015-5484-1.
Liu, Jian, and Liming Feng. "Diversity Evolutionary Policy Deep Reinforcement Learning." Computational Intelligence and Neuroscience 2021 (August 3, 2021): 1–11. http://dx.doi.org/10.1155/2021/5300189.
Yang, Long, Yu Zhang, Gang Zheng, Qian Zheng, Pengfei Li, Jianhang Huang, and Gang Pan. "Policy Optimization with Stochastic Mirror Descent." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 8 (June 28, 2022): 8823–31. http://dx.doi.org/10.1609/aaai.v36i8.20863.
Zhang, Chongjie, and Victor Lesser. "Multi-Agent Learning with Policy Prediction." Proceedings of the AAAI Conference on Artificial Intelligence 24, no. 1 (July 4, 2010): 927–34. http://dx.doi.org/10.1609/aaai.v24i1.7639.
Catling, PC, and RJ Burt. "Studies of the Ground-Dwelling Mammals of Eucalypt Forests in South-Eastern New South Wales: the Effect of Environmental Variables on Distribution and Abundance." Wildlife Research 22, no. 6 (1995): 669. http://dx.doi.org/10.1071/wr9950669.
Lee, Seunghyeon, Seongho Jin, Seonghyeon Hwang, and Inho Lee. "Learning Optimal Trajectory Generation for Low-Cost Redundant Manipulator using Deep Deterministic Policy Gradient (DDPG)." Journal of Korea Robotics Society 17, no. 1 (March 1, 2022): 58–67. http://dx.doi.org/10.7746/jkros.2022.17.1.058.
Zhang, Matthew S., Murat A. Erdogdu, and Animesh Garg. "Convergence and Optimality of Policy Gradient Methods in Weakly Smooth Settings." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 8 (June 28, 2022): 9066–73. http://dx.doi.org/10.1609/aaai.v36i8.20891.
Matsubara, Takamitsu, Jun Morimoto, Jun Nakanishi, Masa-Aki Sato, and Kenji Doya. "Learning a dynamic policy by using policy gradient: application to biped walking." Systems and Computers in Japan 38, no. 4 (2007): 25–38. http://dx.doi.org/10.1002/scj.20441.
Dharmavaram, Akshay, Matthew Riemer, and Shalabh Bhatnagar. "Hierarchical Average Reward Policy Gradient Algorithms (Student Abstract)." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 10 (April 3, 2020): 13777–78. http://dx.doi.org/10.1609/aaai.v34i10.7160.
L. A., Prashanth, and Michael C. Fu. "Risk-Sensitive Reinforcement Learning via Policy Gradient Search." Foundations and Trends® in Machine Learning 15, no. 5 (2022): 537–693. http://dx.doi.org/10.1561/2200000091.
Petrović, Andrija, Mladen Nikolić, Miloš Jovanović, Miloš Bijanić, and Boris Delibašić. "Fair classification via Monte Carlo policy gradient method." Engineering Applications of Artificial Intelligence 104 (September 2021): 104398. http://dx.doi.org/10.1016/j.engappai.2021.104398.
Wang, Yingfei, and Warren B. Powell. "Finite-Time Analysis for the Knowledge-Gradient Policy." SIAM Journal on Control and Optimization 56, no. 2 (January 2018): 1105–29. http://dx.doi.org/10.1137/16m1073388.
Xi-Ren Cao. "A basic formula for online policy gradient algorithms." IEEE Transactions on Automatic Control 50, no. 5 (May 2005): 696–99. http://dx.doi.org/10.1109/tac.2005.847037.
Shi, Haibo, Yaoru Sun, Guangyuan Li, Fang Wang, Daming Wang, and Jie Li. "Hierarchical Intermittent Motor Control With Deterministic Policy Gradient." IEEE Access 7 (2019): 41799–810. http://dx.doi.org/10.1109/access.2019.2904910.
Li, Xiaoguang, Xin Zhang, Lixin Wang, and Ge Yu. "Offline Multi-Policy Gradient for Latent Mixture Environments." IEEE Access 9 (2021): 801–12. http://dx.doi.org/10.1109/access.2020.3045300.
Frazier, Peter I., Warren B. Powell, and Savas Dayanik. "A Knowledge-Gradient Policy for Sequential Information Collection." SIAM Journal on Control and Optimization 47, no. 5 (January 2008): 2410–39. http://dx.doi.org/10.1137/070693424.
You, Shixun, Ming Diao, Lipeng Gao, Fulong Zhang, and Huan Wang. "Target tracking strategy using deep deterministic policy gradient." Applied Soft Computing 95 (October 2020): 106490. http://dx.doi.org/10.1016/j.asoc.2020.106490.
Frazier, Peter, Warren Powell, and Savas Dayanik. "The Knowledge-Gradient Policy for Correlated Normal Beliefs." INFORMS Journal on Computing 21, no. 4 (November 2009): 599–613. http://dx.doi.org/10.1287/ijoc.1080.0314.
Cherubini, A., F. Giannone, L. Iocchi, M. Lombardo, and G. Oriolo. "Policy gradient learning for a humanoid soccer robot." Robotics and Autonomous Systems 57, no. 8 (July 2009): 808–18. http://dx.doi.org/10.1016/j.robot.2009.03.006.
Zhang, Huaxiang, and Ying Fan. "An adaptive policy gradient in learning Nash equilibria." Neurocomputing 72, no. 1-3 (December 2008): 533–38. http://dx.doi.org/10.1016/j.neucom.2007.12.007.
Zhou, Chengmin, Bingding Huang, and Pasi Fränti. "A review of motion planning algorithms for intelligent robots." Journal of Intelligent Manufacturing 33, no. 2 (November 25, 2021): 387–424. http://dx.doi.org/10.1007/s10845-021-01867-z.
Yang, Lei, James Dankert, and Jennie Si. "A performance gradient perspective on gradient‐based policy iteration and a modified value iteration." International Journal of Intelligent Computing and Cybernetics 1, no. 4 (October 17, 2008): 509–20. http://dx.doi.org/10.1108/17563780810919096.
Gao, Binpin, Yingmei Wu, Chen Li, Kejun Zheng, Yan Wu, Mengjiao Wang, Xin Fan, and Shengya Ou. "Multi-Scenario Prediction of Landscape Ecological Risk in the Sichuan-Yunnan Ecological Barrier Based on Terrain Gradients." Land 11, no. 11 (November 18, 2022): 2079. http://dx.doi.org/10.3390/land11112079.
Chen, Qiulin, Karen Eggleston, Wei Zhang, Jiaying Zhao, and Sen Zhou. "The Educational Gradient in Health in China." China Quarterly 230 (May 15, 2017): 289–322. http://dx.doi.org/10.1017/s0305741017000613.
Persson, Bertil R. R., and Freddy Ståhlberg. "Safety Aspects of Magnetic Resonance Examinations." International Journal of Technology Assessment in Health Care 1, no. 3 (July 1985): 647–65. http://dx.doi.org/10.1017/s0266462300001549.
Cohen, Andrew, Xingye Qiao, Lei Yu, Elliot Way, and Xiangrong Tong. "Diverse Exploration via Conjugate Policies for Policy Gradient Methods." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 3404–11. http://dx.doi.org/10.1609/aaai.v33i01.33013404.
de Jesus, Junior Costa, Jair Augusto Bottega, Marco Antonio de Souza Leite Cuadros, and Daniel Fernando Tello Gamarra. "Deep Deterministic Policy Gradient for Navigation of Mobile Robots." Journal of Intelligent & Fuzzy Systems 40, no. 1 (January 4, 2021): 349–61. http://dx.doi.org/10.3233/jifs-191711.