Journal articles on the topic "Off-Policy learning"
Consult the top 50 journal articles for your research on the topic "Off-Policy learning".
Meng, Wenjia, Qian Zheng, Gang Pan, and Yilong Yin. "Off-Policy Proximal Policy Optimization". Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 8 (June 26, 2023): 9162–70. http://dx.doi.org/10.1609/aaai.v37i8.26099.
Schmitt, Simon, John Shawe-Taylor, and Hado van Hasselt. "Chaining Value Functions for Off-Policy Learning". Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 8 (June 28, 2022): 8187–95. http://dx.doi.org/10.1609/aaai.v36i8.20792.
Xu, Da, Yuting Ye, Chuanwei Ruan, and Bo Yang. "Towards Robust Off-Policy Learning for Runtime Uncertainty". Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 9 (June 28, 2022): 10101–9. http://dx.doi.org/10.1609/aaai.v36i9.21249.
Peters, James F., and Christopher Henry. "Approximation spaces in off-policy Monte Carlo learning". Engineering Applications of Artificial Intelligence 20, no. 5 (August 2007): 667–75. http://dx.doi.org/10.1016/j.engappai.2006.11.005.
Yu, Jiayu, Jingyao Li, Shuai Lü, and Shuai Han. "Mixed experience sampling for off-policy reinforcement learning". Expert Systems with Applications 251 (October 2024): 124017. http://dx.doi.org/10.1016/j.eswa.2024.124017.
Cetin, Edoardo, and Oya Celiktutan. "Learning Pessimism for Reinforcement Learning". Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 6 (June 26, 2023): 6971–79. http://dx.doi.org/10.1609/aaai.v37i6.25852.
Kong, Seung-Hyun, I. Made Aswin Nahrendra, and Dong-Hee Paek. "Enhanced Off-Policy Reinforcement Learning With Focused Experience Replay". IEEE Access 9 (2021): 93152–64. http://dx.doi.org/10.1109/access.2021.3085142.
Li, Lihong. "A perspective on off-policy evaluation in reinforcement learning". Frontiers of Computer Science 13, no. 5 (June 17, 2019): 911–12. http://dx.doi.org/10.1007/s11704-019-9901-7.
Luo, Biao, Huai-Ning Wu, and Tingwen Huang. "Off-Policy Reinforcement Learning for $H_\infty$ Control Design". IEEE Transactions on Cybernetics 45, no. 1 (January 2015): 65–76. http://dx.doi.org/10.1109/tcyb.2014.2319577.
Sun, Mingfei, Sam Devlin, Katja Hofmann, and Shimon Whiteson. "Deterministic and Discriminative Imitation (D2-Imitation): Revisiting Adversarial Imitation for Sample Efficiency". Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 8 (June 28, 2022): 8378–85. http://dx.doi.org/10.1609/aaai.v36i8.20813.
Jain, Arushi, Gandharv Patil, Ayush Jain, Khimya Khetarpal, and Doina Precup. "Variance Penalized On-Policy and Off-Policy Actor-Critic". Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 9 (May 18, 2021): 7899–907. http://dx.doi.org/10.1609/aaai.v35i9.16964.
Hao, Longyan, Chaoli Wang, and Yibo Shi. "Quadratic Tracking Control of Linear Stochastic Systems with Unknown Dynamics Using Average Off-Policy Q-Learning Method". Mathematics 12, no. 10 (May 14, 2024): 1533. http://dx.doi.org/10.3390/math12101533.
Gelada, Carles, and Marc G. Bellemare. "Off-Policy Deep Reinforcement Learning by Bootstrapping the Covariate Shift". Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 3647–55. http://dx.doi.org/10.1609/aaai.v33i01.33013647.
Xiao, Teng, and Suhang Wang. "Towards Off-Policy Learning for Ranking Policies with Logged Feedback". Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 8 (June 28, 2022): 8700–8707. http://dx.doi.org/10.1609/aaai.v36i8.20849.
Li, Jinna, Hamidreza Modares, Tianyou Chai, Frank L. Lewis, and Lihua Xie. "Off-Policy Reinforcement Learning for Synchronization in Multiagent Graphical Games". IEEE Transactions on Neural Networks and Learning Systems 28, no. 10 (October 2017): 2434–45. http://dx.doi.org/10.1109/tnnls.2016.2609500.
Zhang, Hengrui, Youfang Lin, Shuo Shen, Sheng Han, and Kai Lv. "Enhancing Off-Policy Constrained Reinforcement Learning through Adaptive Ensemble C Estimation". Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 19 (March 24, 2024): 21770–78. http://dx.doi.org/10.1609/aaai.v38i19.30177.
Zhang, Shangtong, Bo Liu, and Shimon Whiteson. "Mean-Variance Policy Iteration for Risk-Averse Reinforcement Learning". Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 12 (May 18, 2021): 10905–13. http://dx.doi.org/10.1609/aaai.v35i12.17302.
Ali, Raja Farrukh, Kevin Duong, Nasik Muhammad Nafi, and William Hsu. "Multi-Horizon Learning in Procedurally-Generated Environments for Off-Policy Reinforcement Learning (Student Abstract)". Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 13 (June 26, 2023): 16150–51. http://dx.doi.org/10.1609/aaai.v37i13.26935.
Tennenholtz, Guy, Uri Shalit, and Shie Mannor. "Off-Policy Evaluation in Partially Observable Environments". Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 06 (April 3, 2020): 10276–83. http://dx.doi.org/10.1609/aaai.v34i06.6590.
Nakamura, Yutaka, Takeshi Mori, Yoichi Tokita, Tomohiro Shibata, and Shin Ishii. "Off-Policy Natural Policy Gradient Method for a Biped Walking Using a CPG Controller". Journal of Robotics and Mechatronics 17, no. 6 (December 20, 2005): 636–44. http://dx.doi.org/10.20965/jrm.2005.p0636.
Wang, Mingyang, Zhenshan Bing, Xiangtong Yao, Shuai Wang, Huang Kai, Hang Su, Chenguang Yang, and Alois Knoll. "Meta-Reinforcement Learning Based on Self-Supervised Task Representation Learning". Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 8 (June 26, 2023): 10157–65. http://dx.doi.org/10.1609/aaai.v37i8.26210.
Cao, Jiaqing, Quan Liu, Fei Zhu, Qiming Fu, and Shan Zhong. "Gradient temporal-difference learning for off-policy evaluation using emphatic weightings". Information Sciences 580 (November 2021): 311–30. http://dx.doi.org/10.1016/j.ins.2021.08.082.
Tian, Chang, An Liu, Guan Huang, and Wu Luo. "Successive Convex Approximation Based Off-Policy Optimization for Constrained Reinforcement Learning". IEEE Transactions on Signal Processing 70 (2022): 1609–24. http://dx.doi.org/10.1109/tsp.2022.3158737.
Karimpanal, Thommen George, and Erik Wilhelm. "Identification and off-policy learning of multiple objectives using adaptive clustering". Neurocomputing 263 (November 2017): 39–47. http://dx.doi.org/10.1016/j.neucom.2017.04.074.
Kiumarsi, Bahare, Frank L. Lewis, and Zhong-Ping Jiang. "H∞ control of linear discrete-time systems: Off-policy reinforcement learning". Automatica 78 (April 2017): 144–52. http://dx.doi.org/10.1016/j.automatica.2016.12.009.
Li, Jinna, Zhenfei Xiao, and Ping Li. "Discrete-Time Multi-Player Games Based on Off-Policy Q-Learning". IEEE Access 7 (2019): 134647–59. http://dx.doi.org/10.1109/access.2019.2939384.
Kiumarsi, Bahare, Wei Kang, and Frank L. Lewis. "H∞ Control of Nonaffine Aerial Systems Using Off-policy Reinforcement Learning". Unmanned Systems 04, no. 01 (January 2016): 51–60. http://dx.doi.org/10.1142/s2301385016400069.
Lian, Bosen, Wenqian Xue, Yijing Xie, Frank L. Lewis, and Ali Davoudi. "Off-policy inverse Q-learning for discrete-time antagonistic unknown systems". Automatica 155 (September 2023): 111171. http://dx.doi.org/10.1016/j.automatica.2023.111171.
Kim, Man-Je, Hyunsoo Park, and Chang Wook Ahn. "Nondominated Policy-Guided Learning in Multi-Objective Reinforcement Learning". Electronics 11, no. 7 (March 28, 2022): 1069. http://dx.doi.org/10.3390/electronics11071069.
Chaudhari, Shreyas, David Arbour, Georgios Theocharous, and Nikos Vlassis. "Distributional Off-Policy Evaluation for Slate Recommendations". Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 8 (March 24, 2024): 8265–73. http://dx.doi.org/10.1609/aaai.v38i8.28667.
Zhang, Ruiyi, Tong Yu, Yilin Shen, and Hongxia Jin. "Text-Based Interactive Recommendation via Offline Reinforcement Learning". Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 10 (June 28, 2022): 11694–702. http://dx.doi.org/10.1609/aaai.v36i10.21424.
Xu, Z., L. Cao, and X. Chen. "Deep Reinforcement Learning with Adaptive Update Target Combination". Computer Journal 63, no. 7 (August 15, 2019): 995–1003. http://dx.doi.org/10.1093/comjnl/bxz066.
Shahid, Asad Ali, Dario Piga, Francesco Braghin, and Loris Roveda. "Continuous control actions learning and adaptation for robotic manipulation through reinforcement learning". Autonomous Robots 46, no. 3 (February 9, 2022): 483–98. http://dx.doi.org/10.1007/s10514-022-10034-z.
Hollenstein, Jakob, Georg Martius, and Justus Piater. "Colored Noise in PPO: Improved Exploration and Performance through Correlated Action Sampling". Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 11 (March 24, 2024): 12466–72. http://dx.doi.org/10.1609/aaai.v38i11.29139.
Ren, He, Jing Dai, Huaguang Zhang, and Kun Zhang. "Off-policy integral reinforcement learning algorithm in dealing with nonzero sum game for nonlinear distributed parameter systems". Transactions of the Institute of Measurement and Control 42, no. 15 (July 6, 2020): 2919–28. http://dx.doi.org/10.1177/0142331220932634.
Levine, Alexander, and Soheil Feizi. "Goal-Conditioned Q-learning as Knowledge Distillation". Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 7 (June 26, 2023): 8500–8509. http://dx.doi.org/10.1609/aaai.v37i7.26024.
Yang, Hyunjun, Hyeonjun Park, and Kyungjae Lee. "A Selective Portfolio Management Algorithm with Off-Policy Reinforcement Learning Using Dirichlet Distribution". Axioms 11, no. 12 (November 23, 2022): 664. http://dx.doi.org/10.3390/axioms11120664.
Suttle, Wesley, Zhuoran Yang, Kaiqing Zhang, Zhaoran Wang, Tamer Başar, and Ji Liu. "A Multi-Agent Off-Policy Actor-Critic Algorithm for Distributed Reinforcement Learning". IFAC-PapersOnLine 53, no. 2 (2020): 1549–54. http://dx.doi.org/10.1016/j.ifacol.2020.12.2021.
Stanković, Miloš S., Marko Beko, and Srdjan S. Stanković. "Distributed Gradient Temporal Difference Off-policy Learning With Eligibility Traces: Weak Convergence". IFAC-PapersOnLine 53, no. 2 (2020): 1563–68. http://dx.doi.org/10.1016/j.ifacol.2020.12.2184.
Li, Jinna, Zhenfei Xiao, Tianyou Chai, Frank L. Lewis, and Sarangapani Jagannathan. "Off-Policy Q-Learning for Anti-Interference Control of Multi-Player Systems". IFAC-PapersOnLine 53, no. 2 (2020): 9189–94. http://dx.doi.org/10.1016/j.ifacol.2020.12.2180.
Kim and Park. "Exploration with Multiple Random ε-Buffers in Off-Policy Deep Reinforcement Learning". Symmetry 11, no. 11 (November 1, 2019): 1352. http://dx.doi.org/10.3390/sym11111352.
Chen, Ning, Shuhan Luo, Jiayang Dai, Biao Luo, and Weihua Gui. "Optimal Control of Iron-Removal Systems Based on Off-Policy Reinforcement Learning". IEEE Access 8 (2020): 149730–40. http://dx.doi.org/10.1109/access.2020.3015801.
Hachiya, Hirotaka, Takayuki Akiyama, Masashi Sugiyama, and Jan Peters. "Adaptive importance sampling for value function approximation in off-policy reinforcement learning". Neural Networks 22, no. 10 (December 2009): 1399–410. http://dx.doi.org/10.1016/j.neunet.2009.01.002.
Zuo, Guoyu, Qishen Zhao, Kexin Chen, Jiangeng Li, and Daoxiong Gong. "Off-policy adversarial imitation learning for robotic tasks with low-quality demonstrations". Applied Soft Computing 97 (December 2020): 106795. http://dx.doi.org/10.1016/j.asoc.2020.106795.
Givchi, Arash, and Maziar Palhang. "Off-policy temporal difference learning with distribution adaptation in fast mixing chains". Soft Computing 22, no. 3 (January 30, 2017): 737–50. http://dx.doi.org/10.1007/s00500-017-2490-1.
Liu, Mushuang, Yan Wan, Frank L. Lewis, and Victor G. Lopez. "Adaptive Optimal Control for Stochastic Multiplayer Differential Games Using On-Policy and Off-Policy Reinforcement Learning". IEEE Transactions on Neural Networks and Learning Systems 31, no. 12 (December 2020): 5522–33. http://dx.doi.org/10.1109/tnnls.2020.2969215.
Pritchett, Lant, and Justin Sandefur. "Learning from Experiments when Context Matters". American Economic Review 105, no. 5 (May 1, 2015): 471–75. http://dx.doi.org/10.1257/aer.p20151016.
Chen, Zaiwei. "A Unified Lyapunov Framework for Finite-Sample Analysis of Reinforcement Learning Algorithms". ACM SIGMETRICS Performance Evaluation Review 50, no. 3 (December 30, 2022): 12–15. http://dx.doi.org/10.1145/3579342.3579346.
Narita, Yusuke, Kyohei Okumura, Akihiro Shimizu, and Kohei Yata. "Counterfactual Learning with General Data-Generating Policies". Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 8 (June 26, 2023): 9286–93. http://dx.doi.org/10.1609/aaai.v37i8.26113.
Kim, MyeongSeop, Jung-Su Kim, Myoung-Su Choi, and Jae-Han Park. "Adaptive Discount Factor for Deep Reinforcement Learning in Continuing Tasks with Uncertainty". Sensors 22, no. 19 (September 25, 2022): 7266. http://dx.doi.org/10.3390/s22197266.