A selection of scholarly literature on the topic "Off-Policy learning"
Browse the lists of current articles, books, dissertations, conference abstracts, and other scholarly sources on the topic "Off-Policy learning".
Journal articles on the topic "Off-Policy learning"
Meng, Wenjia, Qian Zheng, Gang Pan, and Yilong Yin. "Off-Policy Proximal Policy Optimization." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 8 (June 26, 2023): 9162–70. http://dx.doi.org/10.1609/aaai.v37i8.26099.
Schmitt, Simon, John Shawe-Taylor, and Hado van Hasselt. "Chaining Value Functions for Off-Policy Learning." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 8 (June 28, 2022): 8187–95. http://dx.doi.org/10.1609/aaai.v36i8.20792.
Xu, Da, Yuting Ye, Chuanwei Ruan, and Bo Yang. "Towards Robust Off-Policy Learning for Runtime Uncertainty." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 9 (June 28, 2022): 10101–9. http://dx.doi.org/10.1609/aaai.v36i9.21249.
Peters, James F., and Christopher Henry. "Approximation spaces in off-policy Monte Carlo learning." Engineering Applications of Artificial Intelligence 20, no. 5 (August 2007): 667–75. http://dx.doi.org/10.1016/j.engappai.2006.11.005.
Yu, Jiayu, Jingyao Li, Shuai Lü, and Shuai Han. "Mixed experience sampling for off-policy reinforcement learning." Expert Systems with Applications 251 (October 2024): 124017. http://dx.doi.org/10.1016/j.eswa.2024.124017.
Cetin, Edoardo, and Oya Celiktutan. "Learning Pessimism for Reinforcement Learning." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 6 (June 26, 2023): 6971–79. http://dx.doi.org/10.1609/aaai.v37i6.25852.
Kong, Seung-Hyun, I. Made Aswin Nahrendra, and Dong-Hee Paek. "Enhanced Off-Policy Reinforcement Learning With Focused Experience Replay." IEEE Access 9 (2021): 93152–64. http://dx.doi.org/10.1109/access.2021.3085142.
Li, Lihong. "A perspective on off-policy evaluation in reinforcement learning." Frontiers of Computer Science 13, no. 5 (June 17, 2019): 911–12. http://dx.doi.org/10.1007/s11704-019-9901-7.
Luo, Biao, Huai-Ning Wu, and Tingwen Huang. "Off-Policy Reinforcement Learning for $H_\infty$ Control Design." IEEE Transactions on Cybernetics 45, no. 1 (January 2015): 65–76. http://dx.doi.org/10.1109/tcyb.2014.2319577.
Sun, Mingfei, Sam Devlin, Katja Hofmann, and Shimon Whiteson. "Deterministic and Discriminative Imitation (D2-Imitation): Revisiting Adversarial Imitation for Sample Efficiency." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 8 (June 28, 2022): 8378–85. http://dx.doi.org/10.1609/aaai.v36i8.20813.
Dissertations on the topic "Off-Policy learning"
Hauser, Kristen. "Hyperparameter Tuning for Reinforcement Learning with Bandits and Off-Policy Sampling." Case Western Reserve University School of Graduate Studies / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=case1613034993418088.
Tosatto, Samuele [Verfasser], Jan [Akademischer Betreuer] Peters, and Martha [Akademischer Betreuer] White. "Off-Policy Reinforcement Learning for Robotics / Samuele Tosatto ; Jan Peters, Martha White." Darmstadt : Universitäts- und Landesbibliothek, 2021. http://d-nb.info/1227582293/34.
Sakhi, Otmane. "Offline Contextual Bandit : Theory and Large Scale Applications." Electronic Thesis or Diss., Institut polytechnique de Paris, 2023. http://www.theses.fr/2023IPPAG011.
This thesis presents contributions to the problem of learning from logged interactions using the offline contextual bandit framework. We are interested in two related topics: (1) offline policy learning with performance certificates, and (2) fast and efficient policy learning applied to large-scale, real-world recommendation. For (1), we first leverage results from the distributionally robust optimisation framework to construct asymptotic, variance-sensitive bounds to evaluate policies' performances. These bounds lead to new, more practical learning objectives thanks to their composite nature and straightforward calibration. We then analyse the problem from the PAC-Bayesian perspective and provide tighter, non-asymptotic bounds on the performance of policies. Our results motivate new strategies that offer performance certificates before deploying the policies online. The newly derived strategies rely on composite learning objectives that do not require additional tuning. For (2), we first propose a hierarchical Bayesian model that combines different signals to efficiently estimate the quality of recommendation. We provide proper computational tools to scale the inference to real-world problems, and demonstrate empirically the benefits of the approach in multiple scenarios. We then address the question of accelerating common policy optimisation approaches, particularly focusing on recommendation problems with catalogues of millions of items. We derive optimisation routines, based on new gradient approximations, computed in logarithmic time with respect to the catalogue size. Our approach improves on common, linear-time gradient computations, yielding fast optimisation with no loss in the quality of the learned policies.
Tosatto, Samuele. "Off-Policy Reinforcement Learning for Robotics." PhD thesis, 2021. https://tuprints.ulb.tu-darmstadt.de/17536/1/thesis.pdf.
Delp, Michael. "Experiments in off-policy reinforcement learning with the GQ(lambda) algorithm." Master's thesis, 2011. http://hdl.handle.net/10048/1762.
Diddigi, Raghuram Bharadwaj. "Reinforcement Learning Algorithms for Off-Policy, Multi-Agent Learning and Applications to Smart Grids." Thesis, 2022. https://etd.iisc.ac.in/handle/2005/5673.
Books on the topic "Off-Policy learning"
Kabay, Sarah. Access, Quality, and the Global Learning Crisis. Oxford University Press, 2021. http://dx.doi.org/10.1093/oso/9780192896865.001.0001.
Startz, Richard. Profit of Education. ABC-CLIO, LLC, 2010. http://dx.doi.org/10.5040/9798216001799.
Book chapters on the topic "Off-Policy learning"
Li, Jinna, Frank L. Lewis, and Jialu Fan. "Off-Policy Game Reinforcement Learning." In Reinforcement Learning, 185–232. Cham: Springer International Publishing, 2023. http://dx.doi.org/10.1007/978-3-031-28394-9_7.
Zhang, Li, Xin Li, Mingzhong Wang, and Andong Tian. "Off-Policy Differentiable Logic Reinforcement Learning." In Machine Learning and Knowledge Discovery in Databases. Research Track, 617–32. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-86520-7_38.
Cief, Matej, Jacek Golebiowski, Philipp Schmidt, Ziawasch Abedjan, and Artur Bekasov. "Learning Action Embeddings for Off-Policy Evaluation." In Lecture Notes in Computer Science, 108–22. Cham: Springer Nature Switzerland, 2024. http://dx.doi.org/10.1007/978-3-031-56027-9_7.
Klein, Edouard, Matthieu Geist, and Olivier Pietquin. "Batch, Off-Policy and Model-Free Apprenticeship Learning." In Lecture Notes in Computer Science, 285–96. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012. http://dx.doi.org/10.1007/978-3-642-29946-9_28.
Rak, Alexandra, Alexey Skrynnik, and Aleksandr I. Panov. "Flexible Data Augmentation in Off-Policy Reinforcement Learning." In Artificial Intelligence and Soft Computing, 224–35. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-87986-0_20.
Steckelmacher, Denis, Hélène Plisnier, Diederik M. Roijers, and Ann Nowé. "Sample-Efficient Model-Free Reinforcement Learning with Off-Policy Critics." In Machine Learning and Knowledge Discovery in Databases, 19–34. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-46133-1_2.
Roettger, Frederic. "Reviewing On-Policy/Off-Policy Critic Learning in the Context of Temporal Differences and Residual Learning." In Reinforcement Learning Algorithms: Analysis and Applications, 15–24. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-41188-6_2.
Zhang, Qichao, Dongbin Zhao, and Sibo Zhang. "Off-Policy Reinforcement Learning for Partially Unknown Nonzero-Sum Games." In Neural Information Processing, 822–30. Cham: Springer International Publishing, 2017. http://dx.doi.org/10.1007/978-3-319-70087-8_84.
Wei, Qinglai, Ruizhuo Song, Benkai Li, and Xiaofeng Lin. "Off-Policy IRL Optimal Tracking Control for Continuous-Time Chaotic Systems." In Self-Learning Optimal Control of Nonlinear Systems, 201–14. Singapore: Springer Singapore, 2017. http://dx.doi.org/10.1007/978-981-10-4080-1_9.
Conference papers on the topic "Off-Policy learning"
He, Li, Long Xia, Wei Zeng, Zhi-Ming Ma, Yihong Zhao, and Dawei Yin. "Off-policy Learning for Multiple Loggers." In KDD '19: The 25th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. New York, NY, USA: ACM, 2019. http://dx.doi.org/10.1145/3292500.3330864.
White, Adam, Joseph Modayil, and Richard S. Sutton. "Scaling life-long off-policy learning." In 2012 IEEE International Conference on Development and Learning and Epigenetic Robotics (ICDL). IEEE, 2012. http://dx.doi.org/10.1109/devlrn.2012.6400860.
Zhang, Yan, and Michael M. Zavlanos. "Distributed off-Policy Actor-Critic Reinforcement Learning with Policy Consensus." In 2019 IEEE 58th Conference on Decision and Control (CDC). IEEE, 2019. http://dx.doi.org/10.1109/cdc40024.2019.9029969.
Zheng, Bowen, and Ran Cheng. "Rethinking Population-assisted Off-policy Reinforcement Learning." In GECCO '23: Genetic and Evolutionary Computation Conference. New York, NY, USA: ACM, 2023. http://dx.doi.org/10.1145/3583131.3590512.
Cheng, Zhihao, Li Shen, and Dacheng Tao. "Off-policy Imitation Learning from Visual Inputs." In 2023 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2023. http://dx.doi.org/10.1109/icra48891.2023.10161566.
Miao, Dadong, Yanan Wang, Guoyu Tang, Lin Liu, Sulong Xu, Bo Long, Yun Xiao, Lingfei Wu, and Yunjiang Jiang. "Sequential Search with Off-Policy Reinforcement Learning." In CIKM '21: The 30th ACM International Conference on Information and Knowledge Management. New York, NY, USA: ACM, 2021. http://dx.doi.org/10.1145/3459637.3481954.
Jeunen, Olivier, Sean Murphy, and Ben Allison. "Off-Policy Learning-to-Bid with AuctionGym." In KDD '23: The 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. New York, NY, USA: ACM, 2023. http://dx.doi.org/10.1145/3580305.3599877.
Saito, Yuta, Himan Abdollahpouri, Jesse Anderton, Ben Carterette, and Mounia Lalmas. "Long-term Off-Policy Evaluation and Learning." In WWW '24: The ACM Web Conference 2024. New York, NY, USA: ACM, 2024. http://dx.doi.org/10.1145/3589334.3645446.
Joseph, Ajin George, and Shalabh Bhatnagar. "Bounds for off-policy prediction in reinforcement learning." In 2017 International Joint Conference on Neural Networks (IJCNN). IEEE, 2017. http://dx.doi.org/10.1109/ijcnn.2017.7966359.
Marvi, Zahra, and Bahare Kiumarsi. "Safe Off-policy Reinforcement Learning Using Barrier Functions." In 2020 American Control Conference (ACC). IEEE, 2020. http://dx.doi.org/10.23919/acc45564.2020.9147584.
Organizational reports on the topic "Off-Policy learning"
Private sector and food security. Commercial Agriculture for Smallholders and Agribusiness (CASA), 2023. http://dx.doi.org/10.1079/20240191178.