Journal articles on the topic 'Reinforcement Learning, Multi-armed Bandits'
Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles
Consult the top 50 journal articles for your research on the topic 'Reinforcement Learning, Multi-armed Bandits.'
Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.
Browse journal articles in a wide variety of disciplines and organise your bibliography correctly.
Wan, Zongqi, Zhijie Zhang, Tongyang Li, Jialin Zhang, and Xiaoming Sun. "Quantum Multi-Armed Bandits and Stochastic Linear Bandits Enjoy Logarithmic Regrets." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 8 (June 26, 2023): 10087–94. http://dx.doi.org/10.1609/aaai.v37i8.26202.
Ciucanu, Radu, Pascal Lafourcade, Gael Marcadet, and Marta Soare. "SAMBA: A Generic Framework for Secure Federated Multi-Armed Bandits." Journal of Artificial Intelligence Research 73 (February 23, 2022): 737–65. http://dx.doi.org/10.1613/jair.1.13163.
Huanca-Anquise, Candy A., Ana Lúcia Cetertich Bazzan, and Anderson R. Tavares. "Multi-Objective, Multi-Armed Bandits: Algorithms for Repeated Games and Application to Route Choice." Revista de Informática Teórica e Aplicada 30, no. 1 (January 30, 2023): 11–23. http://dx.doi.org/10.22456/2175-2745.122929.
Giachino, Chiara, Luigi Bollani, Alessandro Bonadonna, and Marco Bertetti. "Reinforcement learning for content's customization: a first step of experimentation in Skyscanner." Industrial Management & Data Systems 121, no. 6 (January 15, 2021): 1417–34. http://dx.doi.org/10.1108/imds-12-2019-0722.
Noothigattu, Ritesh, Tom Yan, and Ariel D. Procaccia. "Inverse Reinforcement Learning From Like-Minded Teachers." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 10 (May 18, 2021): 9197–204. http://dx.doi.org/10.1609/aaai.v35i10.17110.
Xiong, Guojun, Jian Li, and Rahul Singh. "Reinforcement Learning Augmented Asymptotically Optimal Index Policy for Finite-Horizon Restless Bandits." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 8 (June 28, 2022): 8726–34. http://dx.doi.org/10.1609/aaai.v36i8.20852.
Huo, Xiaoguang, and Feng Fu. "Risk-aware multi-armed bandit problem with application to portfolio selection." Royal Society Open Science 4, no. 11 (November 2017): 171377. http://dx.doi.org/10.1098/rsos.171377.
Nobari, Sadegh. "DBA: Dynamic Multi-Armed Bandit Algorithm." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 9869–70. http://dx.doi.org/10.1609/aaai.v33i01.33019869.
Esfandiari, Hossein, MohammadTaghi HajiAghayi, Brendan Lucier, and Michael Mitzenmacher. "Online Pandora’s Boxes and Bandits." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 1885–92. http://dx.doi.org/10.1609/aaai.v33i01.33011885.
Lefebvre, Germain, Christopher Summerfield, and Rafal Bogacz. "A Normative Account of Confirmation Bias During Reinforcement Learning." Neural Computation 34, no. 2 (January 14, 2022): 307–37. http://dx.doi.org/10.1162/neco_a_01455.
Koulouriotis, D. E., and A. Xanthopoulos. "Reinforcement learning and evolutionary algorithms for non-stationary multi-armed bandit problems." Applied Mathematics and Computation 196, no. 2 (March 2008): 913–22. http://dx.doi.org/10.1016/j.amc.2007.07.043.
Elizarov, Artem Aleksandrovich, and Evgenii Viktorovich Razinkov. "Image Classification Using Reinforcement Learning." Russian Digital Libraries Journal 23, no. 6 (May 12, 2020): 1172–91. http://dx.doi.org/10.26907/1562-5419-2020-23-6-1172-1191.
Morimoto, Juliano. "Foraging decisions as multi-armed bandit problems: Applying reinforcement learning algorithms to foraging data." Journal of Theoretical Biology 467 (April 2019): 48–56. http://dx.doi.org/10.1016/j.jtbi.2019.02.002.
Askhedkar, Anjali R., and Bharat S. Chaudhari. "Multi-Armed Bandit Algorithm Policy for LoRa Network Performance Enhancement." Journal of Sensor and Actuator Networks 12, no. 3 (May 4, 2023): 38. http://dx.doi.org/10.3390/jsan12030038.
Espinosa-Leal, Leonardo, Anthony Chapman, and Magnus Westerlund. "Autonomous Industrial Management via Reinforcement Learning." Journal of Intelligent & Fuzzy Systems 39, no. 6 (December 4, 2020): 8427–39. http://dx.doi.org/10.3233/jifs-189161.
Teymuri, Benyamin, Reza Serati, Nikolaos Athanasios Anagnostopoulos, and Mehdi Rasti. "LP-MAB: Improving the Energy Efficiency of LoRaWAN Using a Reinforcement-Learning-Based Adaptive Configuration Algorithm." Sensors 23, no. 4 (February 20, 2023): 2363. http://dx.doi.org/10.3390/s23042363.
Varatharajah, Yogatheesan, and Brent Berry. "A Contextual-Bandit-Based Approach for Informed Decision-Making in Clinical Trials." Life 12, no. 8 (August 21, 2022): 1277. http://dx.doi.org/10.3390/life12081277.
Zhou, Jinkai, Xuebo Lai, and Joseph Y. J. Chow. "Multi-Armed Bandit On-Time Arrival Algorithms for Sequential Reliable Route Selection under Uncertainty." Transportation Research Record: Journal of the Transportation Research Board 2673, no. 10 (June 2, 2019): 673–82. http://dx.doi.org/10.1177/0361198119850457.
Dai, Yue, Jiangang Lu, Zhiwen Yu, and Ruifeng Zhao. "High-Precision Timing Method of BeiDou-3 System Based on Reinforcement Learning." Journal of Physics: Conference Series 2401, no. 1 (December 1, 2022): 012093. http://dx.doi.org/10.1088/1742-6596/2401/1/012093.
Dunne, Simon, Arun D'Souza, and John P. O'Doherty. "The involvement of model-based but not model-free learning signals during observational reward learning in the absence of choice." Journal of Neurophysiology 115, no. 6 (June 1, 2016): 3195–203. http://dx.doi.org/10.1152/jn.00046.2016.
Kessler, Samuel, Jack Parker-Holder, Philip Ball, Stefan Zohren, and Stephen J. Roberts. "Same State, Different Task: Continual Reinforcement Learning without Interference." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 7 (June 28, 2022): 7143–51. http://dx.doi.org/10.1609/aaai.v36i7.20674.
Li, Xinbin, Xianglin Xu, Lei Yan, Haihong Zhao, and Tongwei Zhang. "Energy-Efficient Data Collection Using Autonomous Underwater Glider: A Reinforcement Learning Formulation." Sensors 20, no. 13 (July 4, 2020): 3758. http://dx.doi.org/10.3390/s20133758.
Yu, Junpu. "Thompson ε-Greedy Algorithm: An Improvement to the Regret of Thompson Sampling and ε-Greedy on Multi-Armed Bandit Problems." Applied and Computational Engineering 8, no. 1 (August 1, 2023): 525–34. http://dx.doi.org/10.54254/2755-2721/8/20230264.
Botchkaryov, Alexey. "Task sequence planning by intelligent agent with context awareness." Computer Systems and Network 4, no. 1 (December 16, 2022): 12–20. http://dx.doi.org/10.23939/csn2022.01.012.
Amirizadeh, Khosrow, and Rajeswari Mandava. "Fast Iterative model for Sequential-Selection-Based Applications." International Journal of Computers & Technology 12, no. 7 (February 14, 2014): 3689–96. http://dx.doi.org/10.24297/ijct.v12i7.3092.
Kamikokuryo, Kenta, Takumi Haga, Gentiane Venture, and Vincent Hernandez. "Adversarial Autoencoder and Multi-Armed Bandit for Dynamic Difficulty Adjustment in Immersive Virtual Reality for Rehabilitation: Application to Hand Movement." Sensors 22, no. 12 (June 14, 2022): 4499. http://dx.doi.org/10.3390/s22124499.
Moy, Christophe, Lilian Besson, Guillaume Delbarre, and Laurent Toutain. "Decentralized spectrum learning for radio collision mitigation in ultra-dense IoT networks: LoRaWAN case study and experiments." Annals of Telecommunications 75, no. 11-12 (August 27, 2020): 711–27. http://dx.doi.org/10.1007/s12243-020-00795-y.
Shi, Chengshuai, and Cong Shen. "Federated Multi-Armed Bandits." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 11 (May 18, 2021): 9603–11. http://dx.doi.org/10.1609/aaai.v35i11.17156.
Chai, Chengliang, Jiabin Liu, Nan Tang, Guoliang Li, and Yuyu Luo. "Selective data acquisition in the wild for model charging." Proceedings of the VLDB Endowment 15, no. 7 (March 2022): 1466–78. http://dx.doi.org/10.14778/3523210.3523223.
Sankararaman, Abishek, Ayalvadi Ganesh, and Sanjay Shakkottai. "Social Learning in Multi Agent Multi Armed Bandits." Proceedings of the ACM on Measurement and Analysis of Computing Systems 3, no. 3 (December 17, 2019): 1–35. http://dx.doi.org/10.1145/3366701.
Sankararaman, Abishek, Ayalvadi Ganesh, and Sanjay Shakkottai. "Social Learning in Multi Agent Multi Armed Bandits." ACM SIGMETRICS Performance Evaluation Review 48, no. 1 (July 8, 2020): 29–30. http://dx.doi.org/10.1145/3410048.3410065.
Flynn, Hamish, David Reeb, Melih Kandemir, and Jan Peters. "PAC-Bayesian lifelong learning for multi-armed bandits." Data Mining and Knowledge Discovery 36, no. 2 (March 2022): 841–76. http://dx.doi.org/10.1007/s10618-022-00825-4.
Ameen, Salem, and Sunil Vadera. "Pruning Neural Networks Using Multi-Armed Bandits." Computer Journal 63, no. 7 (September 26, 2019): 1099–108. http://dx.doi.org/10.1093/comjnl/bxz078.
Xu, Xiao, and Qing Zhao. "Memory-Constrained No-Regret Learning in Adversarial Multi-Armed Bandits." IEEE Transactions on Signal Processing 69 (2021): 2371–82. http://dx.doi.org/10.1109/tsp.2021.3070201.
Wang, Kai, Lily Xu, Aparna Taneja, and Milind Tambe. "Optimistic Whittle Index Policy: Online Learning for Restless Bandits." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 8 (June 26, 2023): 10131–39. http://dx.doi.org/10.1609/aaai.v37i8.26207.
Zhao, Qing. "Multi-Armed Bandits: Theory and Applications to Online Learning in Networks." Synthesis Lectures on Communication Networks 12, no. 1 (November 20, 2019): 1–165. http://dx.doi.org/10.2200/s00941ed2v01y201907cnt022.
Weinstein, Ari, and Michael Littman. "Bandit-Based Planning and Learning in Continuous-Action Markov Decision Processes." Proceedings of the International Conference on Automated Planning and Scheduling 22 (May 14, 2012): 306–14. http://dx.doi.org/10.1609/icaps.v22i1.13507.
Kaibel, Chris, and Torsten Biemann. "Rethinking the Gold Standard With Multi-armed Bandits: Machine Learning Allocation Algorithms for Experiments." Organizational Research Methods 24, no. 1 (June 11, 2019): 78–103. http://dx.doi.org/10.1177/1094428119854153.
Lumbreras, Josep, Erkka Haapasalo, and Marco Tomamichel. "Multi-armed quantum bandits: Exploration versus exploitation when learning properties of quantum states." Quantum 6 (June 29, 2022): 749. http://dx.doi.org/10.22331/q-2022-06-29-749.
Vial, Daniel, Sanjay Shakkottai, and R. Srikant. "Robust Multi-Agent Bandits Over Undirected Graphs." ACM SIGMETRICS Performance Evaluation Review 51, no. 1 (June 26, 2023): 67–68. http://dx.doi.org/10.1145/3606376.3593567.
Ben-Porat, Omer, Lee Cohen, Liu Leqi, Zachary C. Lipton, and Yishay Mansour. "Modeling Attrition in Recommender Systems with Departing Bandits." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 6 (June 28, 2022): 6072–79. http://dx.doi.org/10.1609/aaai.v36i6.20554.
Vial, Daniel, Sanjay Shakkottai, and R. Srikant. "Robust Multi-Agent Bandits Over Undirected Graphs." Proceedings of the ACM on Measurement and Analysis of Computing Systems 6, no. 3 (December 2022): 1–57. http://dx.doi.org/10.1145/3570614.
Pacchiano, Aldo, Heinrich Jiang, and Michael I. Jordan. "Robustness Guarantees for Mode Estimation with an Application to Bandits." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 10 (May 18, 2021): 9277–84. http://dx.doi.org/10.1609/aaai.v35i10.17119.
Yeh, Yi-Liang, and Po-Kai Yang. "Design and Comparison of Reinforcement-Learning-Based Time-Varying PID Controllers with Gain-Scheduled Actions." Machines 9, no. 12 (November 26, 2021): 319. http://dx.doi.org/10.3390/machines9120319.
Garcelon, Evrard, Mohammad Ghavamzadeh, Alessandro Lazaric, and Matteo Pirotta. "Improved Algorithms for Conservative Exploration in Bandits." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 3962–69. http://dx.doi.org/10.1609/aaai.v34i04.5812.
Du, Yihan, Siwei Wang, and Longbo Huang. "A One-Size-Fits-All Solution to Conservative Bandit Problems." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 8 (May 18, 2021): 7254–61. http://dx.doi.org/10.1609/aaai.v35i8.16891.
Li, Yang, Jiawei Jiang, Jinyang Gao, Yingxia Shao, Ce Zhang, and Bin Cui. "Efficient Automatic CASH via Rising Bandits." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 4763–71. http://dx.doi.org/10.1609/aaai.v34i04.5910.
Wang, Kai, Shresth Verma, Aditya Mate, Sanket Shah, Aparna Taneja, Neha Madhiwalla, Aparna Hegde, and Milind Tambe. "Scalable Decision-Focused Learning in Restless Multi-Armed Bandits with Application to Maternal and Child Health." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 10 (June 26, 2023): 12138–46. http://dx.doi.org/10.1609/aaai.v37i10.26431.
Guo, Han, Ramakanth Pasunuru, and Mohit Bansal. "Multi-Source Domain Adaptation for Text Classification via DistanceNet-Bandits." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 05 (April 3, 2020): 7830–38. http://dx.doi.org/10.1609/aaai.v34i05.6288.
Lupu, Andrei, Audrey Durand, and Doina Precup. "Leveraging Observations in Bandits: Between Risks and Benefits." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 6112–19. http://dx.doi.org/10.1609/aaai.v33i01.33016112.