Journal articles on the topic "Off-Policy learning"
The top 50 journal articles for research on the topic "Off-Policy learning".
Meng, Wenjia, Qian Zheng, Gang Pan, and Yilong Yin. "Off-Policy Proximal Policy Optimization". Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 8 (26 June 2023): 9162–70. http://dx.doi.org/10.1609/aaai.v37i8.26099.
Schmitt, Simon, John Shawe-Taylor, and Hado van Hasselt. "Chaining Value Functions for Off-Policy Learning". Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 8 (28 June 2022): 8187–95. http://dx.doi.org/10.1609/aaai.v36i8.20792.
Xu, Da, Yuting Ye, Chuanwei Ruan, and Bo Yang. "Towards Robust Off-Policy Learning for Runtime Uncertainty". Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 9 (28 June 2022): 10101–9. http://dx.doi.org/10.1609/aaai.v36i9.21249.
Peters, James F., and Christopher Henry. "Approximation spaces in off-policy Monte Carlo learning". Engineering Applications of Artificial Intelligence 20, no. 5 (August 2007): 667–75. http://dx.doi.org/10.1016/j.engappai.2006.11.005.
Yu, Jiayu, Jingyao Li, Shuai Lü, and Shuai Han. "Mixed experience sampling for off-policy reinforcement learning". Expert Systems with Applications 251 (October 2024): 124017. http://dx.doi.org/10.1016/j.eswa.2024.124017.
Cetin, Edoardo, and Oya Celiktutan. "Learning Pessimism for Reinforcement Learning". Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 6 (26 June 2023): 6971–79. http://dx.doi.org/10.1609/aaai.v37i6.25852.
Kong, Seung-Hyun, I. Made Aswin Nahrendra, and Dong-Hee Paek. "Enhanced Off-Policy Reinforcement Learning With Focused Experience Replay". IEEE Access 9 (2021): 93152–64. http://dx.doi.org/10.1109/access.2021.3085142.
Li, Lihong. "A perspective on off-policy evaluation in reinforcement learning". Frontiers of Computer Science 13, no. 5 (17 June 2019): 911–12. http://dx.doi.org/10.1007/s11704-019-9901-7.
Luo, Biao, Huai-Ning Wu, and Tingwen Huang. "Off-Policy Reinforcement Learning for $H_\infty$ Control Design". IEEE Transactions on Cybernetics 45, no. 1 (January 2015): 65–76. http://dx.doi.org/10.1109/tcyb.2014.2319577.
Sun, Mingfei, Sam Devlin, Katja Hofmann, and Shimon Whiteson. "Deterministic and Discriminative Imitation (D2-Imitation): Revisiting Adversarial Imitation for Sample Efficiency". Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 8 (28 June 2022): 8378–85. http://dx.doi.org/10.1609/aaai.v36i8.20813.
Jain, Arushi, Gandharv Patil, Ayush Jain, Khimya Khetarpal, and Doina Precup. "Variance Penalized On-Policy and Off-Policy Actor-Critic". Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 9 (18 May 2021): 7899–907. http://dx.doi.org/10.1609/aaai.v35i9.16964.
Hao, Longyan, Chaoli Wang, and Yibo Shi. "Quadratic Tracking Control of Linear Stochastic Systems with Unknown Dynamics Using Average Off-Policy Q-Learning Method". Mathematics 12, no. 10 (14 May 2024): 1533. http://dx.doi.org/10.3390/math12101533.
Gelada, Carles, and Marc G. Bellemare. "Off-Policy Deep Reinforcement Learning by Bootstrapping the Covariate Shift". Proceedings of the AAAI Conference on Artificial Intelligence 33 (17 July 2019): 3647–55. http://dx.doi.org/10.1609/aaai.v33i01.33013647.
Xiao, Teng, and Suhang Wang. "Towards Off-Policy Learning for Ranking Policies with Logged Feedback". Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 8 (28 June 2022): 8700–8707. http://dx.doi.org/10.1609/aaai.v36i8.20849.
Li, Jinna, Hamidreza Modares, Tianyou Chai, Frank L. Lewis, and Lihua Xie. "Off-Policy Reinforcement Learning for Synchronization in Multiagent Graphical Games". IEEE Transactions on Neural Networks and Learning Systems 28, no. 10 (October 2017): 2434–45. http://dx.doi.org/10.1109/tnnls.2016.2609500.
Zhang, Hengrui, Youfang Lin, Shuo Shen, Sheng Han, and Kai Lv. "Enhancing Off-Policy Constrained Reinforcement Learning through Adaptive Ensemble C Estimation". Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 19 (24 March 2024): 21770–78. http://dx.doi.org/10.1609/aaai.v38i19.30177.
Zhang, Shangtong, Bo Liu, and Shimon Whiteson. "Mean-Variance Policy Iteration for Risk-Averse Reinforcement Learning". Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 12 (18 May 2021): 10905–13. http://dx.doi.org/10.1609/aaai.v35i12.17302.
Ali, Raja Farrukh, Kevin Duong, Nasik Muhammad Nafi, and William Hsu. "Multi-Horizon Learning in Procedurally-Generated Environments for Off-Policy Reinforcement Learning (Student Abstract)". Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 13 (26 June 2023): 16150–51. http://dx.doi.org/10.1609/aaai.v37i13.26935.
Tennenholtz, Guy, Uri Shalit, and Shie Mannor. "Off-Policy Evaluation in Partially Observable Environments". Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 06 (3 April 2020): 10276–83. http://dx.doi.org/10.1609/aaai.v34i06.6590.
Nakamura, Yutaka, Takeshi Mori, Yoichi Tokita, Tomohiro Shibata, and Shin Ishii. "Off-Policy Natural Policy Gradient Method for a Biped Walking Using a CPG Controller". Journal of Robotics and Mechatronics 17, no. 6 (20 December 2005): 636–44. http://dx.doi.org/10.20965/jrm.2005.p0636.
Wang, Mingyang, Zhenshan Bing, Xiangtong Yao, Shuai Wang, Huang Kai, Hang Su, Chenguang Yang, and Alois Knoll. "Meta-Reinforcement Learning Based on Self-Supervised Task Representation Learning". Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 8 (26 June 2023): 10157–65. http://dx.doi.org/10.1609/aaai.v37i8.26210.
Cao, Jiaqing, Quan Liu, Fei Zhu, Qiming Fu, and Shan Zhong. "Gradient temporal-difference learning for off-policy evaluation using emphatic weightings". Information Sciences 580 (November 2021): 311–30. http://dx.doi.org/10.1016/j.ins.2021.08.082.
Tian, Chang, An Liu, Guan Huang, and Wu Luo. "Successive Convex Approximation Based Off-Policy Optimization for Constrained Reinforcement Learning". IEEE Transactions on Signal Processing 70 (2022): 1609–24. http://dx.doi.org/10.1109/tsp.2022.3158737.
Karimpanal, Thommen George, and Erik Wilhelm. "Identification and off-policy learning of multiple objectives using adaptive clustering". Neurocomputing 263 (November 2017): 39–47. http://dx.doi.org/10.1016/j.neucom.2017.04.074.
Kiumarsi, Bahare, Frank L. Lewis, and Zhong-Ping Jiang. "H∞ control of linear discrete-time systems: Off-policy reinforcement learning". Automatica 78 (April 2017): 144–52. http://dx.doi.org/10.1016/j.automatica.2016.12.009.
Li, Jinna, Zhenfei Xiao, and Ping Li. "Discrete-Time Multi-Player Games Based on Off-Policy Q-Learning". IEEE Access 7 (2019): 134647–59. http://dx.doi.org/10.1109/access.2019.2939384.
Kiumarsi, Bahare, Wei Kang, and Frank L. Lewis. "H∞ Control of Nonaffine Aerial Systems Using Off-policy Reinforcement Learning". Unmanned Systems 04, no. 01 (January 2016): 51–60. http://dx.doi.org/10.1142/s2301385016400069.
Lian, Bosen, Wenqian Xue, Yijing Xie, Frank L. Lewis, and Ali Davoudi. "Off-policy inverse Q-learning for discrete-time antagonistic unknown systems". Automatica 155 (September 2023): 111171. http://dx.doi.org/10.1016/j.automatica.2023.111171.
Kim, Man-Je, Hyunsoo Park, and Chang Wook Ahn. "Nondominated Policy-Guided Learning in Multi-Objective Reinforcement Learning". Electronics 11, no. 7 (28 March 2022): 1069. http://dx.doi.org/10.3390/electronics11071069.
Chaudhari, Shreyas, David Arbour, Georgios Theocharous, and Nikos Vlassis. "Distributional Off-Policy Evaluation for Slate Recommendations". Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 8 (24 March 2024): 8265–73. http://dx.doi.org/10.1609/aaai.v38i8.28667.
Zhang, Ruiyi, Tong Yu, Yilin Shen, and Hongxia Jin. "Text-Based Interactive Recommendation via Offline Reinforcement Learning". Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 10 (28 June 2022): 11694–702. http://dx.doi.org/10.1609/aaai.v36i10.21424.
Xu, Z., L. Cao, and X. Chen. "Deep Reinforcement Learning with Adaptive Update Target Combination". Computer Journal 63, no. 7 (15 August 2019): 995–1003. http://dx.doi.org/10.1093/comjnl/bxz066.
Shahid, Asad Ali, Dario Piga, Francesco Braghin, and Loris Roveda. "Continuous control actions learning and adaptation for robotic manipulation through reinforcement learning". Autonomous Robots 46, no. 3 (9 February 2022): 483–98. http://dx.doi.org/10.1007/s10514-022-10034-z.
Hollenstein, Jakob, Georg Martius, and Justus Piater. "Colored Noise in PPO: Improved Exploration and Performance through Correlated Action Sampling". Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 11 (24 March 2024): 12466–72. http://dx.doi.org/10.1609/aaai.v38i11.29139.
Ren, He, Jing Dai, Huaguang Zhang, and Kun Zhang. "Off-policy integral reinforcement learning algorithm in dealing with nonzero sum game for nonlinear distributed parameter systems". Transactions of the Institute of Measurement and Control 42, no. 15 (6 July 2020): 2919–28. http://dx.doi.org/10.1177/0142331220932634.
Levine, Alexander, and Soheil Feizi. "Goal-Conditioned Q-learning as Knowledge Distillation". Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 7 (26 June 2023): 8500–8509. http://dx.doi.org/10.1609/aaai.v37i7.26024.
Yang, Hyunjun, Hyeonjun Park, and Kyungjae Lee. "A Selective Portfolio Management Algorithm with Off-Policy Reinforcement Learning Using Dirichlet Distribution". Axioms 11, no. 12 (23 November 2022): 664. http://dx.doi.org/10.3390/axioms11120664.
Suttle, Wesley, Zhuoran Yang, Kaiqing Zhang, Zhaoran Wang, Tamer Başar, and Ji Liu. "A Multi-Agent Off-Policy Actor-Critic Algorithm for Distributed Reinforcement Learning". IFAC-PapersOnLine 53, no. 2 (2020): 1549–54. http://dx.doi.org/10.1016/j.ifacol.2020.12.2021.
Stanković, Miloš S., Marko Beko, and Srdjan S. Stanković. "Distributed Gradient Temporal Difference Off-policy Learning With Eligibility Traces: Weak Convergence". IFAC-PapersOnLine 53, no. 2 (2020): 1563–68. http://dx.doi.org/10.1016/j.ifacol.2020.12.2184.
Li, Jinna, Zhenfei Xiao, Tianyou Chai, Frank L. Lewis, and Sarangapani Jagannathan. "Off-Policy Q-Learning for Anti-Interference Control of Multi-Player Systems". IFAC-PapersOnLine 53, no. 2 (2020): 9189–94. http://dx.doi.org/10.1016/j.ifacol.2020.12.2180.
Kim and Park. "Exploration with Multiple Random ε-Buffers in Off-Policy Deep Reinforcement Learning". Symmetry 11, no. 11 (1 November 2019): 1352. http://dx.doi.org/10.3390/sym11111352.
Chen, Ning, Shuhan Luo, Jiayang Dai, Biao Luo, and Weihua Gui. "Optimal Control of Iron-Removal Systems Based on Off-Policy Reinforcement Learning". IEEE Access 8 (2020): 149730–40. http://dx.doi.org/10.1109/access.2020.3015801.
Hachiya, Hirotaka, Takayuki Akiyama, Masashi Sugiyama, and Jan Peters. "Adaptive importance sampling for value function approximation in off-policy reinforcement learning". Neural Networks 22, no. 10 (December 2009): 1399–410. http://dx.doi.org/10.1016/j.neunet.2009.01.002.
Zuo, Guoyu, Qishen Zhao, Kexin Chen, Jiangeng Li, and Daoxiong Gong. "Off-policy adversarial imitation learning for robotic tasks with low-quality demonstrations". Applied Soft Computing 97 (December 2020): 106795. http://dx.doi.org/10.1016/j.asoc.2020.106795.
Givchi, Arash, and Maziar Palhang. "Off-policy temporal difference learning with distribution adaptation in fast mixing chains". Soft Computing 22, no. 3 (30 January 2017): 737–50. http://dx.doi.org/10.1007/s00500-017-2490-1.
Liu, Mushuang, Yan Wan, Frank L. Lewis, and Victor G. Lopez. "Adaptive Optimal Control for Stochastic Multiplayer Differential Games Using On-Policy and Off-Policy Reinforcement Learning". IEEE Transactions on Neural Networks and Learning Systems 31, no. 12 (December 2020): 5522–33. http://dx.doi.org/10.1109/tnnls.2020.2969215.
Pritchett, Lant, and Justin Sandefur. "Learning from Experiments when Context Matters". American Economic Review 105, no. 5 (1 May 2015): 471–75. http://dx.doi.org/10.1257/aer.p20151016.
Chen, Zaiwei. "A Unified Lyapunov Framework for Finite-Sample Analysis of Reinforcement Learning Algorithms". ACM SIGMETRICS Performance Evaluation Review 50, no. 3 (30 December 2022): 12–15. http://dx.doi.org/10.1145/3579342.3579346.
Narita, Yusuke, Kyohei Okumura, Akihiro Shimizu, and Kohei Yata. "Counterfactual Learning with General Data-Generating Policies". Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 8 (26 June 2023): 9286–93. http://dx.doi.org/10.1609/aaai.v37i8.26113.
Kim, MyeongSeop, Jung-Su Kim, Myoung-Su Choi, and Jae-Han Park. "Adaptive Discount Factor for Deep Reinforcement Learning in Continuing Tasks with Uncertainty". Sensors 22, no. 19 (25 September 2022): 7266. http://dx.doi.org/10.3390/s22197266.