A selection of scientific literature on the topic "Policy gradients"
Create a reference in APA, MLA, Chicago, Harvard, and other citation styles
Browse the lists of current articles, books, dissertations, conference papers, and other scholarly sources on the topic "Policy gradients".
Next to every work in the list there is an "Add to bibliography" button. Click it, and we will automatically generate a bibliographic reference to the selected work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of a publication as a .pdf file and read its abstract online, where these are available in the item's metadata.
Journal articles on the topic "Policy gradients"
Cai, Qingpeng, Ling Pan, and Pingzhong Tang. "Deterministic Value-Policy Gradients." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 3316–23. http://dx.doi.org/10.1609/aaai.v34i04.5732.
Wierstra, D., A. Forster, J. Peters, and J. Schmidhuber. "Recurrent policy gradients." Logic Journal of IGPL 18, no. 5 (September 9, 2009): 620–34. http://dx.doi.org/10.1093/jigpal/jzp049.
Sehnke, Frank, Christian Osendorfer, Thomas Rückstieß, Alex Graves, Jan Peters, and Jürgen Schmidhuber. "Parameter-exploring policy gradients." Neural Networks 23, no. 4 (May 2010): 551–59. http://dx.doi.org/10.1016/j.neunet.2009.12.004.
Zhao, Tingting, Hirotaka Hachiya, Voot Tangkaratt, Jun Morimoto, and Masashi Sugiyama. "Efficient Sample Reuse in Policy Gradients with Parameter-Based Exploration." Neural Computation 25, no. 6 (June 2013): 1512–47. http://dx.doi.org/10.1162/neco_a_00452.
Seno, Takuma, and Michita Imai. "Policy Gradients with Memory-Augmented Critic." Transactions of the Japanese Society for Artificial Intelligence 36, no. 1 (January 1, 2021): B-K71_1–8. http://dx.doi.org/10.1527/tjsai.36-1_b-k71.
Millidge, Beren. "Deep active inference as variational policy gradients." Journal of Mathematical Psychology 96 (June 2020): 102348. http://dx.doi.org/10.1016/j.jmp.2020.102348.
Catling, P. C., and R. J. Burt. "Studies of the Ground-Dwelling Mammals of Eucalypt Forests in South-Eastern New South Wales: the Effect of Environmental Variables on Distribution and Abundance." Wildlife Research 22, no. 6 (1995): 669. http://dx.doi.org/10.1071/wr9950669.
Baxter, J., P. L. Bartlett, and L. Weaver. "Experiments with Infinite-Horizon, Policy-Gradient Estimation." Journal of Artificial Intelligence Research 15 (November 1, 2001): 351–81. http://dx.doi.org/10.1613/jair.807.
Chen, Qiulin, Karen Eggleston, Wei Zhang, Jiaying Zhao, and Sen Zhou. "The Educational Gradient in Health in China." China Quarterly 230 (May 15, 2017): 289–322. http://dx.doi.org/10.1017/s0305741017000613.
Peters, Jan, and Stefan Schaal. "Reinforcement learning of motor skills with policy gradients." Neural Networks 21, no. 4 (May 2008): 682–97. http://dx.doi.org/10.1016/j.neunet.2008.02.003.
Повний текст джерелаДисертації з теми "Policy gradients"
Crowley, Mark. "Equilibrium policy gradients for spatiotemporal planning." Thesis, University of British Columbia, 2011. http://hdl.handle.net/2429/38971.
Sehnke, Frank. "Parameter Exploring Policy Gradients and their Implications." Doctoral thesis, supervised by Patrick van der Smagt and examined by Jürgen Schmidhuber. München: Universitätsbibliothek der TU München, 2012. http://d-nb.info/1030099820/34.
Tolman, Deborah A. "Environmental Gradients, Community Boundaries, and Disturbance: the Darlingtonia Fens of Southwestern Oregon." PDXScholar, 2004. https://pdxscholar.library.pdx.edu/open_access_etds/3013.
Masoudi, Mohammad Amin. "Robust Deep Reinforcement Learning for Portfolio Management." Thesis, Université d'Ottawa / University of Ottawa, 2021. http://hdl.handle.net/10393/42743.
Jacobzon, Gustaf, and Martin Larsson. "Generalizing Deep Deterministic Policy Gradient." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-239365.
Ковальов, Костянтин Миколайович. "Комп'ютерна система управління промисловим роботом" [Computer system for industrial robot control]. Bachelor's thesis, КПІ ім. Ігоря Сікорського, 2019. https://ela.kpi.ua/handle/123456789/28610.
The qualifying work includes an explanatory note (56 pages, 2 appendices). The object of study is reinforcement learning algorithms for controlling an industrial robotic arm. Continuous control of an industrial robotic arm for non-trivial tasks is too complicated, or even unsolvable, for classical methods of robotics. Reinforcement learning methods can be used in this case: they are relatively simple to implement, generalize to unseen cases, and learn from high-dimensional data. The work implements the deep deterministic policy gradient algorithm, which is suited to complex continuous control tasks. During the study:
• an analysis of existing classical methods for the problem of industrial robot control was conducted;
• an analysis of existing reinforcement learning algorithms and their use in robotics was conducted;
• the deep deterministic policy gradient algorithm was implemented;
• the implemented algorithm was tested on a simplified environment;
• a neural network architecture was proposed for solving the problem;
• the algorithm was tested on the training set of objects;
• the algorithm was tested for its generalization ability on the test set.
It was shown that the deep deterministic policy gradient algorithm with a neural network as the policy approximator is able to solve the problem with an image as input and to generalize to objects not seen before.
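For orientation, the deep deterministic policy gradient (DDPG) update referenced in the abstract above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation from the cited thesis: the state and action dimensions, network sizes, learning rates, and the GAMMA and TAU constants are assumed placeholder values, and a full agent would add a replay buffer and exploration noise during data collection.

# Minimal DDPG update sketch in PyTorch (illustrative; all hyperparameters are assumptions).
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 8, 2  # assumed toy dimensions, not those of the robotic-arm task

def mlp(in_dim, out_dim, out_act=None):
    """Small fully connected network; the hidden width of 64 is an arbitrary choice."""
    layers = [nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, out_dim)]
    if out_act is not None:
        layers.append(out_act)
    return nn.Sequential(*layers)

# Deterministic actor mu(s) with bounded output, critic Q(s, a), and slowly updated target copies.
actor = mlp(STATE_DIM, ACTION_DIM, nn.Tanh())
critic = mlp(STATE_DIM + ACTION_DIM, 1)
target_actor = mlp(STATE_DIM, ACTION_DIM, nn.Tanh())
target_critic = mlp(STATE_DIM + ACTION_DIM, 1)
target_actor.load_state_dict(actor.state_dict())
target_critic.load_state_dict(critic.state_dict())

actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)
GAMMA, TAU = 0.99, 0.005  # discount factor and Polyak averaging rate (assumed values)

def ddpg_update(s, a, r, s_next, done):
    """One gradient step on a minibatch of transitions; r and done have shape (batch, 1)."""
    # Critic step: regress Q(s, a) toward the bootstrapped target r + gamma * Q'(s', mu'(s')).
    with torch.no_grad():
        q_next = target_critic(torch.cat([s_next, target_actor(s_next)], dim=-1))
        q_target = r + GAMMA * (1.0 - done) * q_next
    critic_loss = nn.functional.mse_loss(critic(torch.cat([s, a], dim=-1)), q_target)
    critic_opt.zero_grad()
    critic_loss.backward()
    critic_opt.step()

    # Actor step: deterministic policy gradient, i.e. maximize Q(s, mu(s)) over actor parameters.
    actor_loss = -critic(torch.cat([s, actor(s)], dim=-1)).mean()
    actor_opt.zero_grad()
    actor_loss.backward()
    actor_opt.step()

    # Polyak-average the target networks toward the online networks.
    with torch.no_grad():
        for net, target in ((actor, target_actor), (critic, target_critic)):
            for p, tp in zip(net.parameters(), target.parameters()):
                tp.mul_(1.0 - TAU).add_(TAU * p)

if __name__ == "__main__":
    # Illustrative call on a random batch of 32 transitions.
    batch = (torch.randn(32, STATE_DIM), torch.rand(32, ACTION_DIM) * 2 - 1,
             torch.randn(32, 1), torch.randn(32, STATE_DIM), torch.zeros(32, 1))
    ddpg_update(*batch)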
Greensmith, Evan. "Policy Gradient Methods: Variance Reduction and Stochastic Convergence." The Australian National University, Research School of Information Sciences and Engineering, 2005. http://thesis.anu.edu.au./public/adt-ANU20060106.193712.
Greensmith, Evan. "Policy gradient methods: variance reduction and stochastic convergence." View thesis entry in Australian Digital Theses Program, 2005. http://thesis.anu.edu.au/public/adt-ANU20060106.193712/index.html.
Aberdeen, Douglas Alexander. "Policy-Gradient Algorithms for Partially Observable Markov Decision Processes." The Australian National University, Research School of Information Sciences and Engineering, 2003. http://thesis.anu.edu.au./public/adt-ANU20030410.111006.
Aberdeen, Douglas Alexander. "Policy-gradient algorithms for partially observable Markov decision processes." View thesis entry in Australian Digital Theses Program, 2003. http://thesis.anu.edu.au/public/adt-ANU20030410.111006/index.html.
Повний текст джерелаКниги з теми "Policy gradients"
Lapan, Maxim. Deep Reinforcement Learning Hands-On: Apply modern RL methods, with deep Q-networks, value iteration, policy gradients, TRPO, AlphaGo Zero and more. Packt Publishing, 2018.
Gorard, Stephen. Education Policy. Policy Press, 2018. http://dx.doi.org/10.1332/policypress/9781447342144.001.0001.
Olsen, Jan Abel. The social environment and health. Oxford University Press, 2017. http://dx.doi.org/10.1093/oso/9780198794837.003.0007.
Olsen, Jan Abel. Exogenous determinants of health. Oxford University Press, 2017. http://dx.doi.org/10.1093/oso/9780198794837.003.0006.
Egger, Eva-Maria, Aslihan Arslan, and Emanuele Zucchini. Does connectivity reduce gender gaps in off-farm employment? Evidence from 12 low- and middle-income countries. 3rd ed. UNU-WIDER, 2021. http://dx.doi.org/10.35188/unu-wider/2021/937-2.
Повний текст джерелаЧастини книг з теми "Policy gradients"
Sehnke, Frank, Christian Osendorfer, Jan Sölter, Jürgen Schmidhuber, and Ulrich Rührmair. "Policy Gradients for Cryptanalysis." In Artificial Neural Networks – ICANN 2010, 168–77. Berlin, Heidelberg: Springer Berlin Heidelberg, 2010. http://dx.doi.org/10.1007/978-3-642-15825-4_22.
McClarren, Ryan G. "Reinforcement Learning with Policy Gradients." In Machine Learning for Engineers, 219–37. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-70388-2_9.
Prashanth, L. A. "Policy Gradients for CVaR-Constrained MDPs." In Lecture Notes in Computer Science, 155–69. Cham: Springer International Publishing, 2014. http://dx.doi.org/10.1007/978-3-319-11662-4_12.
Tummon, Evan, Muhammad Adil Raja, and Conor Ryan. "Trading Cryptocurrency with Deep Deterministic Policy Gradients." In Lecture Notes in Computer Science, 245–56. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-62362-3_22.
Wierstra, Daan, Alexander Foerster, Jan Peters, and Jürgen Schmidhuber. "Solving Deep Memory POMDPs with Recurrent Policy Gradients." In Lecture Notes in Computer Science, 697–706. Berlin, Heidelberg: Springer Berlin Heidelberg, 2007. http://dx.doi.org/10.1007/978-3-540-74690-4_71.
Lach, Luca, Timo Korthals, Francesco Ferro, Helge Ritter, and Malte Schilling. "Guiding Representation Learning in Deep Generative Models with Policy Gradients." In Communications in Computer and Information Science, 115–31. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-85672-4_9.
Staroverov, Alexey, Vladislav Vetlin, Stepan Makarenko, Anton Naumov, and Aleksandr I. Panov. "Learning Embodied Agents with Policy Gradients to Navigate in Realistic Environments." In Advances in Neural Computation, Machine Learning, and Cognitive Research IV, 212–21. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-60577-3_24.
Liu, Chujun, Andrew G. Lonsberry, Mark J. Nandor, Musa L. Audu, and Roger D. Quinn. "Implementation of Deep Deterministic Policy Gradients for Controlling Dynamic Bipedal Walking." In Biomimetic and Biohybrid Systems, 276–87. Cham: Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-319-95972-6_29.
Sehnke, Frank, and Tingting Zhao. "Baseline-Free Sampling in Parameter Exploring Policy Gradients: Super Symmetric PGPE." In Springer Series in Bio-/Neuroinformatics, 271–93. Cham: Springer International Publishing, 2015. http://dx.doi.org/10.1007/978-3-319-09903-3_13.
Roy, Kaushik, Qi Zhang, Manas Gaur, and Amit Sheth. "Knowledge Infused Policy Gradients with Upper Confidence Bound for Relational Bandits." In Machine Learning and Knowledge Discovery in Databases. Research Track, 35–50. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-86486-6_3.
Повний текст джерелаТези доповідей конференцій з теми "Policy gradients"
Kersting, Kristian, and Kurt Driessens. "Non-parametric policy gradients." In Proceedings of the 25th International Conference on Machine Learning (ICML). New York, New York, USA: ACM Press, 2008. http://dx.doi.org/10.1145/1390156.1390214.
Sehnke, Frank, Alex Graves, Christian Osendorfer, and Jürgen Schmidhuber. "Multimodal Parameter-exploring Policy Gradients." In 2010 International Conference on Machine Learning and Applications (ICMLA). IEEE, 2010. http://dx.doi.org/10.1109/icmla.2010.24.
Pan, Feiyang, Qingpeng Cai, Pingzhong Tang, Fuzhen Zhuang, and Qing He. "Policy Gradients for Contextual Recommendations." In The World Wide Web Conference. New York, New York, USA: ACM Press, 2019. http://dx.doi.org/10.1145/3308558.3313616.
Theodorou, Evangelos A., Jiri Najemnik, and Emo Todorov. "Free energy based policy gradients." In 2013 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL). IEEE, 2013. http://dx.doi.org/10.1109/adprl.2013.6614998.
Theodorou, Evangelos A., Krishnamurthy Dvijotham, and Emo Todorov. "Time varying nonlinear Policy Gradients." In 2013 IEEE 52nd Annual Conference on Decision and Control (CDC). IEEE, 2013. http://dx.doi.org/10.1109/cdc.2013.6761122.
Do, Chau, Camilo Gordillo, and Wolfram Burgard. "Learning to Pour using Deep Deterministic Policy Gradients." In 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2018. http://dx.doi.org/10.1109/iros.2018.8593654.
Nguyen, Hung The, Tung Nguyen, Do-Van Nguyen, and Thanh-Ha Le. "A Hierarchical Deep Deterministic Policy Gradients for Swarm Navigation." In 2019 11th International Conference on Knowledge and Systems Engineering (KSE). IEEE, 2019. http://dx.doi.org/10.1109/kse.2019.8919269.
Mani, Kaustubh, Meha Kaushik, Nirvan Singhania, and K. Madhava Krishna. "Learning Adaptive Driving Behavior Using Recurrent Deterministic Policy Gradients." In 2019 IEEE International Conference on Robotics and Biomimetics (ROBIO). IEEE, 2019. http://dx.doi.org/10.1109/robio49542.2019.8961480.
Hegde, Shashank, Vishal Kumar, and Atul Singh. "Risk aware portfolio construction using deep deterministic policy gradients." In 2018 IEEE Symposium Series on Computational Intelligence (SSCI). IEEE, 2018. http://dx.doi.org/10.1109/ssci.2018.8628791.
Tahboub, Karim A. "Human-Machine Coadaptation Based on Reinforcement Learning with Policy Gradients." In 2019 8th International Conference on Systems and Control (ICSC). IEEE, 2019. http://dx.doi.org/10.1109/icsc47195.2019.8950660.
Повний текст джерелаЗвіти організацій з теми "Policy gradients"
Lleras-Muney, Adriana. Education and Income Gradients in Longevity: The Role of Policy. Cambridge, MA: National Bureau of Economic Research, January 2022. http://dx.doi.org/10.3386/w29694.
Umberger, Pierce. Experimental Evaluation of Dynamic Crack Branching in Poly(methyl methacrylate) (PMMA) Using the Method of Coherent Gradient Sensing. Fort Belvoir, VA: Defense Technical Information Center, February 2010. http://dx.doi.org/10.21236/ada518614.
A Decision-Making Method for Connected Autonomous Driving Based on Reinforcement Learning. SAE International, December 2020. http://dx.doi.org/10.4271/2020-01-5154.