Journal articles on the topic "Reinforcement Learning, Multi-armed Bandits"
Create an accurate citation in APA, MLA, Chicago, Harvard, and other styles
Consult the top 50 journal articles for your research on the topic "Reinforcement Learning, Multi-armed Bandits".
Next to every source in the list of references there is an "Add to bibliography" button. Press it, and we will automatically generate the bibliographic reference for the chosen work in the citation style you need: APA, MLA, Harvard, Vancouver, Chicago, etc.
You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.
Browse journal articles on a wide variety of disciplines and organize your bibliography correctly.
Wan, Zongqi, Zhijie Zhang, Tongyang Li, Jialin Zhang, and Xiaoming Sun. "Quantum Multi-Armed Bandits and Stochastic Linear Bandits Enjoy Logarithmic Regrets". Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 8 (June 26, 2023): 10087–94. http://dx.doi.org/10.1609/aaai.v37i8.26202.
Ciucanu, Radu, Pascal Lafourcade, Gael Marcadet, and Marta Soare. "SAMBA: A Generic Framework for Secure Federated Multi-Armed Bandits". Journal of Artificial Intelligence Research 73 (February 23, 2022): 737–65. http://dx.doi.org/10.1613/jair.1.13163.
Huanca-Anquise, Candy A., Ana Lúcia Cetertich Bazzan, and Anderson R. Tavares. "Multi-Objective, Multi-Armed Bandits: Algorithms for Repeated Games and Application to Route Choice". Revista de Informática Teórica e Aplicada 30, no. 1 (January 30, 2023): 11–23. http://dx.doi.org/10.22456/2175-2745.122929.
Giachino, Chiara, Luigi Bollani, Alessandro Bonadonna, and Marco Bertetti. "Reinforcement learning for content's customization: a first step of experimentation in Skyscanner". Industrial Management & Data Systems 121, no. 6 (January 15, 2021): 1417–34. http://dx.doi.org/10.1108/imds-12-2019-0722.
Noothigattu, Ritesh, Tom Yan, and Ariel D. Procaccia. "Inverse Reinforcement Learning From Like-Minded Teachers". Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 10 (May 18, 2021): 9197–204. http://dx.doi.org/10.1609/aaai.v35i10.17110.
Xiong, Guojun, Jian Li, and Rahul Singh. "Reinforcement Learning Augmented Asymptotically Optimal Index Policy for Finite-Horizon Restless Bandits". Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 8 (June 28, 2022): 8726–34. http://dx.doi.org/10.1609/aaai.v36i8.20852.
Huo, Xiaoguang, and Feng Fu. "Risk-aware multi-armed bandit problem with application to portfolio selection". Royal Society Open Science 4, no. 11 (November 2017): 171377. http://dx.doi.org/10.1098/rsos.171377.
Nobari, Sadegh. "DBA: Dynamic Multi-Armed Bandit Algorithm". Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 9869–70. http://dx.doi.org/10.1609/aaai.v33i01.33019869.
Esfandiari, Hossein, MohammadTaghi HajiAghayi, Brendan Lucier, and Michael Mitzenmacher. "Online Pandora’s Boxes and Bandits". Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 1885–92. http://dx.doi.org/10.1609/aaai.v33i01.33011885.
Lefebvre, Germain, Christopher Summerfield, and Rafal Bogacz. "A Normative Account of Confirmation Bias During Reinforcement Learning". Neural Computation 34, no. 2 (January 14, 2022): 307–37. http://dx.doi.org/10.1162/neco_a_01455.
Koulouriotis, D. E., and A. Xanthopoulos. "Reinforcement learning and evolutionary algorithms for non-stationary multi-armed bandit problems". Applied Mathematics and Computation 196, no. 2 (March 2008): 913–22. http://dx.doi.org/10.1016/j.amc.2007.07.043.
Elizarov, Artem Aleksandrovich, and Evgenii Viktorovich Razinkov. "Image Classification Using Reinforcement Learning". Russian Digital Libraries Journal 23, no. 6 (May 12, 2020): 1172–91. http://dx.doi.org/10.26907/1562-5419-2020-23-6-1172-1191.
Morimoto, Juliano. "Foraging decisions as multi-armed bandit problems: Applying reinforcement learning algorithms to foraging data". Journal of Theoretical Biology 467 (April 2019): 48–56. http://dx.doi.org/10.1016/j.jtbi.2019.02.002.
Askhedkar, Anjali R., and Bharat S. Chaudhari. "Multi-Armed Bandit Algorithm Policy for LoRa Network Performance Enhancement". Journal of Sensor and Actuator Networks 12, no. 3 (May 4, 2023): 38. http://dx.doi.org/10.3390/jsan12030038.
Espinosa-Leal, Leonardo, Anthony Chapman, and Magnus Westerlund. "Autonomous Industrial Management via Reinforcement Learning". Journal of Intelligent & Fuzzy Systems 39, no. 6 (December 4, 2020): 8427–39. http://dx.doi.org/10.3233/jifs-189161.
Teymuri, Benyamin, Reza Serati, Nikolaos Athanasios Anagnostopoulos, and Mehdi Rasti. "LP-MAB: Improving the Energy Efficiency of LoRaWAN Using a Reinforcement-Learning-Based Adaptive Configuration Algorithm". Sensors 23, no. 4 (February 20, 2023): 2363. http://dx.doi.org/10.3390/s23042363.
Varatharajah, Yogatheesan, and Brent Berry. "A Contextual-Bandit-Based Approach for Informed Decision-Making in Clinical Trials". Life 12, no. 8 (August 21, 2022): 1277. http://dx.doi.org/10.3390/life12081277.
Zhou, Jinkai, Xuebo Lai, and Joseph Y. J. Chow. "Multi-Armed Bandit On-Time Arrival Algorithms for Sequential Reliable Route Selection under Uncertainty". Transportation Research Record: Journal of the Transportation Research Board 2673, no. 10 (June 2, 2019): 673–82. http://dx.doi.org/10.1177/0361198119850457.
Dai, Yue, Jiangang Lu, Zhiwen Yu, and Ruifeng Zhao. "High-Precision Timing Method of BeiDou-3 System Based on Reinforcement Learning". Journal of Physics: Conference Series 2401, no. 1 (December 1, 2022): 012093. http://dx.doi.org/10.1088/1742-6596/2401/1/012093.
Dunne, Simon, Arun D'Souza, and John P. O'Doherty. "The involvement of model-based but not model-free learning signals during observational reward learning in the absence of choice". Journal of Neurophysiology 115, no. 6 (June 1, 2016): 3195–203. http://dx.doi.org/10.1152/jn.00046.2016.
Kessler, Samuel, Jack Parker-Holder, Philip Ball, Stefan Zohren, and Stephen J. Roberts. "Same State, Different Task: Continual Reinforcement Learning without Interference". Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 7 (June 28, 2022): 7143–51. http://dx.doi.org/10.1609/aaai.v36i7.20674.
Li, Xinbin, Xianglin Xu, Lei Yan, Haihong Zhao, and Tongwei Zhang. "Energy-Efficient Data Collection Using Autonomous Underwater Glider: A Reinforcement Learning Formulation". Sensors 20, no. 13 (July 4, 2020): 3758. http://dx.doi.org/10.3390/s20133758.
Yu, Junpu. "Thompson ε-Greedy Algorithm: An Improvement to the Regret of Thompson Sampling and ε-Greedy on Multi-Armed Bandit Problems". Applied and Computational Engineering 8, no. 1 (August 1, 2023): 525–34. http://dx.doi.org/10.54254/2755-2721/8/20230264.
Botchkaryov, Alexey. "Task sequence planning by intelligent agent with context awareness". Computer systems and network 4, no. 1 (December 16, 2022): 12–20. http://dx.doi.org/10.23939/csn2022.01.012.
Amirizadeh, Khosrow, and Rajeswari Mandava. "Fast Iterative model for Sequential-Selection-Based Applications". INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY 12, no. 7 (February 14, 2014): 3689–96. http://dx.doi.org/10.24297/ijct.v12i7.3092.
Kamikokuryo, Kenta, Takumi Haga, Gentiane Venture, and Vincent Hernandez. "Adversarial Autoencoder and Multi-Armed Bandit for Dynamic Difficulty Adjustment in Immersive Virtual Reality for Rehabilitation: Application to Hand Movement". Sensors 22, no. 12 (June 14, 2022): 4499. http://dx.doi.org/10.3390/s22124499.
Moy, Christophe, Lilian Besson, Guillaume Delbarre, and Laurent Toutain. "Decentralized spectrum learning for radio collision mitigation in ultra-dense IoT networks: LoRaWAN case study and experiments". Annals of Telecommunications 75, no. 11-12 (August 27, 2020): 711–27. http://dx.doi.org/10.1007/s12243-020-00795-y.
Shi, Chengshuai, and Cong Shen. "Federated Multi-Armed Bandits". Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 11 (May 18, 2021): 9603–11. http://dx.doi.org/10.1609/aaai.v35i11.17156.
Chai, Chengliang, Jiabin Liu, Nan Tang, Guoliang Li, and Yuyu Luo. "Selective data acquisition in the wild for model charging". Proceedings of the VLDB Endowment 15, no. 7 (March 2022): 1466–78. http://dx.doi.org/10.14778/3523210.3523223.
Sankararaman, Abishek, Ayalvadi Ganesh, and Sanjay Shakkottai. "Social Learning in Multi Agent Multi Armed Bandits". Proceedings of the ACM on Measurement and Analysis of Computing Systems 3, no. 3 (December 17, 2019): 1–35. http://dx.doi.org/10.1145/3366701.
Sankararaman, Abishek, Ayalvadi Ganesh, and Sanjay Shakkottai. "Social Learning in Multi Agent Multi Armed Bandits". ACM SIGMETRICS Performance Evaluation Review 48, no. 1 (July 8, 2020): 29–30. http://dx.doi.org/10.1145/3410048.3410065.
Flynn, Hamish, David Reeb, Melih Kandemir, and Jan Peters. "PAC-Bayesian lifelong learning for multi-armed bandits". Data Mining and Knowledge Discovery 36, no. 2 (March 2022): 841–76. http://dx.doi.org/10.1007/s10618-022-00825-4.
Ameen, Salem, and Sunil Vadera. "Pruning Neural Networks Using Multi-Armed Bandits". Computer Journal 63, no. 7 (September 26, 2019): 1099–108. http://dx.doi.org/10.1093/comjnl/bxz078.
Xu, Xiao, and Qing Zhao. "Memory-Constrained No-Regret Learning in Adversarial Multi-Armed Bandits". IEEE Transactions on Signal Processing 69 (2021): 2371–82. http://dx.doi.org/10.1109/tsp.2021.3070201.
Wang, Kai, Lily Xu, Aparna Taneja, and Milind Tambe. "Optimistic Whittle Index Policy: Online Learning for Restless Bandits". Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 8 (June 26, 2023): 10131–39. http://dx.doi.org/10.1609/aaai.v37i8.26207.
Zhao, Qing. "Multi-Armed Bandits: Theory and Applications to Online Learning in Networks". Synthesis Lectures on Communication Networks 12, no. 1 (November 20, 2019): 1–165. http://dx.doi.org/10.2200/s00941ed2v01y201907cnt022.
Weinstein, Ari, and Michael Littman. "Bandit-Based Planning and Learning in Continuous-Action Markov Decision Processes". Proceedings of the International Conference on Automated Planning and Scheduling 22 (May 14, 2012): 306–14. http://dx.doi.org/10.1609/icaps.v22i1.13507.
Kaibel, Chris, and Torsten Biemann. "Rethinking the Gold Standard With Multi-armed Bandits: Machine Learning Allocation Algorithms for Experiments". Organizational Research Methods 24, no. 1 (June 11, 2019): 78–103. http://dx.doi.org/10.1177/1094428119854153.
Lumbreras, Josep, Erkka Haapasalo, and Marco Tomamichel. "Multi-armed quantum bandits: Exploration versus exploitation when learning properties of quantum states". Quantum 6 (June 29, 2022): 749. http://dx.doi.org/10.22331/q-2022-06-29-749.
Vial, Daniel, Sanjay Shakkottai, and R. Srikant. "Robust Multi-Agent Bandits Over Undirected Graphs". ACM SIGMETRICS Performance Evaluation Review 51, no. 1 (June 26, 2023): 67–68. http://dx.doi.org/10.1145/3606376.3593567.
Ben-Porat, Omer, Lee Cohen, Liu Leqi, Zachary C. Lipton, and Yishay Mansour. "Modeling Attrition in Recommender Systems with Departing Bandits". Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 6 (June 28, 2022): 6072–79. http://dx.doi.org/10.1609/aaai.v36i6.20554.
Vial, Daniel, Sanjay Shakkottai, and R. Srikant. "Robust Multi-Agent Bandits Over Undirected Graphs". Proceedings of the ACM on Measurement and Analysis of Computing Systems 6, no. 3 (December 2022): 1–57. http://dx.doi.org/10.1145/3570614.
Pacchiano, Aldo, Heinrich Jiang, and Michael I. Jordan. "Robustness Guarantees for Mode Estimation with an Application to Bandits". Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 10 (May 18, 2021): 9277–84. http://dx.doi.org/10.1609/aaai.v35i10.17119.
Yeh, Yi-Liang, and Po-Kai Yang. "Design and Comparison of Reinforcement-Learning-Based Time-Varying PID Controllers with Gain-Scheduled Actions". Machines 9, no. 12 (November 26, 2021): 319. http://dx.doi.org/10.3390/machines9120319.
Garcelon, Evrard, Mohammad Ghavamzadeh, Alessandro Lazaric, and Matteo Pirotta. "Improved Algorithms for Conservative Exploration in Bandits". Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 3962–69. http://dx.doi.org/10.1609/aaai.v34i04.5812.
Du, Yihan, Siwei Wang, and Longbo Huang. "A One-Size-Fits-All Solution to Conservative Bandit Problems". Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 8 (May 18, 2021): 7254–61. http://dx.doi.org/10.1609/aaai.v35i8.16891.
Li, Yang, Jiawei Jiang, Jinyang Gao, Yingxia Shao, Ce Zhang, and Bin Cui. "Efficient Automatic CASH via Rising Bandits". Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 4763–71. http://dx.doi.org/10.1609/aaai.v34i04.5910.
Wang, Kai, Shresth Verma, Aditya Mate, Sanket Shah, Aparna Taneja, Neha Madhiwalla, Aparna Hegde, and Milind Tambe. "Scalable Decision-Focused Learning in Restless Multi-Armed Bandits with Application to Maternal and Child Health". Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 10 (June 26, 2023): 12138–46. http://dx.doi.org/10.1609/aaai.v37i10.26431.
Guo, Han, Ramakanth Pasunuru, and Mohit Bansal. "Multi-Source Domain Adaptation for Text Classification via DistanceNet-Bandits". Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 05 (April 3, 2020): 7830–38. http://dx.doi.org/10.1609/aaai.v34i05.6288.
Lupu, Andrei, Audrey Durand, and Doina Precup. "Leveraging Observations in Bandits: Between Risks and Benefits". Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 6112–19. http://dx.doi.org/10.1609/aaai.v33i01.33016112.