Academic literature on the topic 'Policy gradient'
Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Policy gradient.'
Journal articles on the topic "Policy gradient"
Cai, Qingpeng, Ling Pan, and Pingzhong Tang. "Deterministic Value-Policy Gradients." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 3316–23. http://dx.doi.org/10.1609/aaai.v34i04.5732.
Peters, Jan. "Policy gradient methods." Scholarpedia 5, no. 11 (2010): 3698. http://dx.doi.org/10.4249/scholarpedia.3698.
Zhao, Tingting, Hirotaka Hachiya, Voot Tangkaratt, Jun Morimoto, and Masashi Sugiyama. "Efficient Sample Reuse in Policy Gradients with Parameter-Based Exploration." Neural Computation 25, no. 6 (June 2013): 1512–47. http://dx.doi.org/10.1162/neco_a_00452.
Baxter, J., P. L. Bartlett, and L. Weaver. "Experiments with Infinite-Horizon, Policy-Gradient Estimation." Journal of Artificial Intelligence Research 15 (November 1, 2001): 351–81. http://dx.doi.org/10.1613/jair.807.
Le, Hung, Majid Abdolshah, Thommen K. George, Kien Do, Dung Nguyen, and Svetha Venkatesh. "Episodic Policy Gradient Training." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 7 (June 28, 2022): 7317–25. http://dx.doi.org/10.1609/aaai.v36i7.20694.
Baxter, J., and P. L. Bartlett. "Infinite-Horizon Policy-Gradient Estimation." Journal of Artificial Intelligence Research 15 (November 1, 2001): 319–50. http://dx.doi.org/10.1613/jair.806.
Pajarinen, Joni, Hong Linh Thai, Riad Akrour, Jan Peters, and Gerhard Neumann. "Compatible natural gradient policy search." Machine Learning 108, no. 8-9 (May 20, 2019): 1443–66. http://dx.doi.org/10.1007/s10994-019-05807-0.
Buffet, Olivier, and Douglas Aberdeen. "The factored policy-gradient planner." Artificial Intelligence 173, no. 5-6 (April 2009): 722–47. http://dx.doi.org/10.1016/j.artint.2008.11.008.
Wang, Lin, Xingang Xu, Xuhui Zhao, Baozhu Li, Ruijuan Zheng, and Qingtao Wu. "A randomized block policy gradient algorithm with differential privacy in Content Centric Networks." International Journal of Distributed Sensor Networks 17, no. 12 (December 2021): 155014772110599. http://dx.doi.org/10.1177/15501477211059934.
Akella, Ravi Tej, Kamyar Azizzadenesheli, Mohammad Ghavamzadeh, Animashree Anandkumar, and Yisong Yue. "Deep Bayesian Quadrature Policy Optimization." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 8 (May 18, 2021): 6600–6608. http://dx.doi.org/10.1609/aaai.v35i8.16817.
Dissertations / Theses on the topic "Policy gradient"
Jacobzon, Gustaf, and Martin Larsson. "Generalizing Deep Deterministic Policy Gradient." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-239365.
Greensmith, Evan. "Policy Gradient Methods: Variance Reduction and Stochastic Convergence." The Australian National University. Research School of Information Sciences and Engineering, 2005. http://thesis.anu.edu.au./public/adt-ANU20060106.193712.
Greensmith, Evan. "Policy gradient methods: variance reduction and stochastic convergence." View thesis entry in Australian Digital Theses Program, 2005. http://thesis.anu.edu.au/public/adt-ANU20060106.193712/index.html.
Aberdeen, Douglas Alexander. "Policy-Gradient Algorithms for Partially Observable Markov Decision Processes." The Australian National University. Research School of Information Sciences and Engineering, 2003. http://thesis.anu.edu.au./public/adt-ANU20030410.111006.
Aberdeen, Douglas Alexander. "Policy-gradient algorithms for partially observable Markov decision processes." View thesis entry in Australian Digital Theses Program, 2003. http://thesis.anu.edu.au/public/adt-ANU20030410.111006/index.html.
Lidström, Christian, and Hannes Leskelä. "Learning for RoboCup Soccer: Policy Gradient Reinforcement Learning in multi-agent systems." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-157469.
RoboCup Soccer is an annual worldwide robotics competition in which teams of autonomous robot agents play soccer against each other. This report focuses on the 2D simulator, a variant in which no real robots are needed; instead, the player clients communicate with a server that keeps track of the game state. RoboCup Soccer 2D simulation has become a major research topic in artificial intelligence, cooperation and behaviour in multi-agent systems, and the learning thereof. Some form of machine learning is required to compete at the highest level, since the problem is too complex for the decision-making to be programmed manually. This report finds that PGRL is a common machine learning method in RoboCup teams and is used by some of the best teams in RoboCup. The report also finds that PGRL is an effective form of machine learning in terms of learning speed, but that many factors can affect this. Usually a trade-off must be made between learning speed and precision.
Gavelli, Viktor, and Alexander Gomez. "Multi-agent system with Policy Gradient Reinforcement Learning for RoboCup Soccer Simulator." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-157418.
RoboCup Soccer Simulator is a multi-agent soccer simulator used in competitions to simulate robots playing soccer. These competitions are held mainly to promote research in robotics and artificial intelligence by providing a cheap and easily accessible way to program robot-like agents. This report describes and tests an implementation of a multi-agent soccer team. Policy Gradient Reinforcement Learning (PGRL) is used to train and change the team's behaviour. The results show that PGRL improves the team's performance, but when the team's performance differs considerably from the opponent's, the result is incomplete.
Pianazzi, Enrico. "A deep reinforcement learning approach based on policy gradient for mobile robot navigation." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2022.
Poulin, Nolan. "Proactive Planning through Active Policy Inference in Stochastic Environments." Digital WPI, 2018. https://digitalcommons.wpi.edu/etd-theses/1267.
Fleming, Brian James. "The social gradient in health: trends in C20th ideas, Australian Health Policy 1970-1998, and a health equity policy evaluation of Australian aged care planning." Title page, abstract and table of contents only, 2003. http://web4.library.adelaide.edu.au/theses/09PH/09phf5971.pdf.
Books on the topic "Policy gradient"
Deyette, Jeff. Plugging in renewable energy: Grading the states. Cambridge, MA: Union of Concerned Scientists, 2003.
Olmstead, Alan L. Hog round marketing, seed quality, and government policy: Institutional change in U.S. cotton production, 1920-1960. Cambridge, Mass: National Bureau of Economic Research, 2003.
Harris, Ann. School-based assessment in GCE and CSE boards: A report on policy and practice. London: Secondary Examinations Council, 1986.
Diez, Lara. The use of call grading: How calls to the police are graded and resourced. London: Home Office Police Research Group, 1995.
Torres, Justin. Grading the systems: The guide to state standards, tests, and accountability policies. Washington, D.C: Thomas B. Fordham Foundation, 2004.
Grading the 44th president: A report card on Barack Obama's first term as a progressive leader. Santa Barbara, Calif: Praeger, 2012.
Delahanty, Julie. From social movements to social clauses: Grading strategies for improving conditions for women garment workers. Ottawa: North-South Institute, 1999.
Is Al-Qaeda winning?: Grading the Administration's counterterrorism policy : hearing before the Subcommittee on Terrorism, Nonproliferation, and Trade of the Committee on Foreign Affairs, House of Representatives, One Hundred Thirteenth Congress, second session, April 8, 2014. Washington: U.S. Government Printing Office, 2014.
A, Prashanth L., and Michael C. Fu. Risk-Sensitive Reinforcement Learning Via Policy Gradient Search. Now Publishers, 2022.
Gorard, Stephen. Education Policy. Policy Press, 2018. http://dx.doi.org/10.1332/policypress/9781447342144.001.0001.
Book chapters on the topic "Policy gradient"
Huang, Ruitong, Tianyang Yu, Zihan Ding, and Shanghang Zhang. "Policy Gradient." In Deep Reinforcement Learning, 161–212. Singapore: Springer Singapore, 2020. http://dx.doi.org/10.1007/978-981-15-4095-0_5.
Buffet, Olivier. "Policy-Gradient Algorithms." In Markov Decision Processes in Artificial Intelligence, 127–52. Hoboken, NJ USA: John Wiley & Sons, Inc., 2013. http://dx.doi.org/10.1002/9781118557426.ch5.
Zeugmann, Thomas, Pascal Poupart, James Kennedy, Xin Jin, Jiawei Han, Lorenza Saitta, Michele Sebag, et al. "Policy Gradient Methods." In Encyclopedia of Machine Learning, 774–76. Boston, MA: Springer US, 2011. http://dx.doi.org/10.1007/978-0-387-30164-8_640.
Sanghi, Nimish. "Policy Gradient Algorithms." In Deep Reinforcement Learning with Python, 207–49. Berkeley, CA: Apress, 2021. http://dx.doi.org/10.1007/978-1-4842-6809-4_7.
Peters, Jan, and J. Andrew Bagnell. "Policy Gradient Methods." In Encyclopedia of Machine Learning and Data Mining, 1–4. Boston, MA: Springer US, 2016. http://dx.doi.org/10.1007/978-1-4899-7502-7_646-1.
Peters, Jan, and J. Andrew Bagnell. "Policy Gradient Methods." In Encyclopedia of Machine Learning and Data Mining, 982–85. Boston, MA: Springer US, 2017. http://dx.doi.org/10.1007/978-1-4899-7687-1_646.
Rao, Ashwin, and Tikhon Jelvis. "Policy Gradient Algorithms." In Foundations of Reinforcement Learning with Applications in Finance, 381–408. Boca Raton: Chapman and Hall/CRC, 2022. http://dx.doi.org/10.1201/9781003229193-14.
Bono, Guillaume, Jilles Steeve Dibangoye, Laëtitia Matignon, Florian Pereyron, and Olivier Simonin. "Cooperative Multi-agent Policy Gradient." In Machine Learning and Knowledge Discovery in Databases, 459–76. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-10925-7_28.
Yan, Yan, and Quan Liu. "Policy Space Noise in Deep Deterministic Policy Gradient." In Neural Information Processing, 624–34. Cham: Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-030-04179-3_55.
Wang, Yixiang, and Feng Wu. "Policy Adaptive Multi-agent Deep Deterministic Policy Gradient." In PRIMA 2020: Principles and Practice of Multi-Agent Systems, 165–81. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-69322-0_11.
Conference papers on the topic "Policy gradient"
Maggipinto, Marco, Gian Antonio Susto, and Pratik Chaudhari. "Proximal Deterministic Policy Gradient." In 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2020. http://dx.doi.org/10.1109/iros45743.2020.9341559.
Nilsson, Olle, and Antoine Cully. "Policy gradient assisted MAP-Elites." In GECCO '21: Genetic and Evolutionary Computation Conference. New York, NY, USA: ACM, 2021. http://dx.doi.org/10.1145/3449639.3459304.
Peters, Jan, and Stefan Schaal. "Policy Gradient Methods for Robotics." In 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems. IEEE, 2006. http://dx.doi.org/10.1109/iros.2006.282564.
Tsourdos, Antonios, Ir Adhi Dharma Permana, Dewi H. Budiarti, Hyo-Sang Shin, and Chang-Hun Lee. "Developing Flight Control Policy Using Deep Deterministic Policy Gradient." In 2019 IEEE International Conference on Aerospace Electronics and Remote Sensing Technology (ICARES). IEEE, 2019. http://dx.doi.org/10.1109/icares.2019.8914343.
Bose, Sourabh, and Manfred Huber. "Training neural networks with policy gradient." In 2017 International Joint Conference on Neural Networks (IJCNN). IEEE, 2017. http://dx.doi.org/10.1109/ijcnn.2017.7966360.
Awate, Yogesh P. "Policy-Gradient Based Actor-Critic Algorithms." In 2009 WRI Global Congress on Intelligent Systems. IEEE, 2009. http://dx.doi.org/10.1109/gcis.2009.372.
Vien, Ngo Anh, and TaeChoong Chung. "Policy Gradient Semi-markov Decision Process." In 2008 20th IEEE International Conference on Tools with Artificial Intelligence (ICTAI). IEEE, 2008. http://dx.doi.org/10.1109/ictai.2008.51.
Banerjee, Bikramjit, and Jing Peng. "Adaptive policy gradient in multiagent learning." In the second international joint conference. New York, New York, USA: ACM Press, 2003. http://dx.doi.org/10.1145/860575.860686.
Xiao, Bo, Wuguannan Yao, and Xiang Zhou. "Optimal Option Hedging with Policy Gradient." In 2021 International Conference on Data Mining Workshops (ICDMW). IEEE, 2021. http://dx.doi.org/10.1109/icdmw53433.2021.00145.
Sun, Zhou. "Mutual Deep Deterministic Policy Gradient Learning." In 2022 International Conference on Big Data, Information and Computer Network (BDICN). IEEE, 2022. http://dx.doi.org/10.1109/bdicn55575.2022.00099.
Reports on the topic "Policy gradient"
Lleras-Muney, Adriana. Education and Income Gradients in Longevity: The Role of Policy. Cambridge, MA: National Bureau of Economic Research, January 2022. http://dx.doi.org/10.3386/w29694.
Umberger, Pierce. Experimental Evaluation of Dynamic Crack Branching in Poly(methyl methacrylate) (PMMA) Using the Method of Coherent Gradient Sensing. Fort Belvoir, VA: Defense Technical Information Center, February 2010. http://dx.doi.org/10.21236/ada518614.
A Decision-Making Method for Connected Autonomous Driving Based on Reinforcement Learning. SAE International, December 2020. http://dx.doi.org/10.4271/2020-01-5154.