Journal articles on the topic "Reinforcement Learning, Multi-armed Bandits"
Format your source in APA, MLA, Chicago, Harvard, and other citation styles
Browse the top 50 journal articles for research on the topic "Reinforcement Learning, Multi-armed Bandits".
Next to each work in the list of references there is an "Add to bibliography" button. Click it, and we will automatically generate a bibliographic reference to the selected work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, and others.
You can also download the full text of the scholarly publication as a .pdf and read its abstract online, when these details are available in the metadata.
Browse journal articles from a wide range of disciplines and compile your bibliography correctly.
Wan, Zongqi, Zhijie Zhang, Tongyang Li, Jialin Zhang, and Xiaoming Sun. "Quantum Multi-Armed Bandits and Stochastic Linear Bandits Enjoy Logarithmic Regrets." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 8 (June 26, 2023): 10087–94. http://dx.doi.org/10.1609/aaai.v37i8.26202.
Ciucanu, Radu, Pascal Lafourcade, Gael Marcadet, and Marta Soare. "SAMBA: A Generic Framework for Secure Federated Multi-Armed Bandits." Journal of Artificial Intelligence Research 73 (February 23, 2022): 737–65. http://dx.doi.org/10.1613/jair.1.13163.
Huanca-Anquise, Candy A., Ana Lúcia Cetertich Bazzan, and Anderson R. Tavares. "Multi-Objective, Multi-Armed Bandits: Algorithms for Repeated Games and Application to Route Choice." Revista de Informática Teórica e Aplicada 30, no. 1 (January 30, 2023): 11–23. http://dx.doi.org/10.22456/2175-2745.122929.
Giachino, Chiara, Luigi Bollani, Alessandro Bonadonna, and Marco Bertetti. "Reinforcement learning for content's customization: a first step of experimentation in Skyscanner." Industrial Management & Data Systems 121, no. 6 (January 15, 2021): 1417–34. http://dx.doi.org/10.1108/imds-12-2019-0722.
Noothigattu, Ritesh, Tom Yan, and Ariel D. Procaccia. "Inverse Reinforcement Learning From Like-Minded Teachers." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 10 (May 18, 2021): 9197–204. http://dx.doi.org/10.1609/aaai.v35i10.17110.
Xiong, Guojun, Jian Li, and Rahul Singh. "Reinforcement Learning Augmented Asymptotically Optimal Index Policy for Finite-Horizon Restless Bandits." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 8 (June 28, 2022): 8726–34. http://dx.doi.org/10.1609/aaai.v36i8.20852.
Huo, Xiaoguang, and Feng Fu. "Risk-aware multi-armed bandit problem with application to portfolio selection." Royal Society Open Science 4, no. 11 (November 2017): 171377. http://dx.doi.org/10.1098/rsos.171377.
Nobari, Sadegh. "DBA: Dynamic Multi-Armed Bandit Algorithm." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 9869–70. http://dx.doi.org/10.1609/aaai.v33i01.33019869.
Esfandiari, Hossein, MohammadTaghi HajiAghayi, Brendan Lucier, and Michael Mitzenmacher. "Online Pandora’s Boxes and Bandits." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 1885–92. http://dx.doi.org/10.1609/aaai.v33i01.33011885.
Lefebvre, Germain, Christopher Summerfield, and Rafal Bogacz. "A Normative Account of Confirmation Bias During Reinforcement Learning." Neural Computation 34, no. 2 (January 14, 2022): 307–37. http://dx.doi.org/10.1162/neco_a_01455.
Koulouriotis, D. E., and A. Xanthopoulos. "Reinforcement learning and evolutionary algorithms for non-stationary multi-armed bandit problems." Applied Mathematics and Computation 196, no. 2 (March 2008): 913–22. http://dx.doi.org/10.1016/j.amc.2007.07.043.
Elizarov, Artem Aleksandrovich, and Evgenii Viktorovich Razinkov. "Image Classification Using Reinforcement Learning." Russian Digital Libraries Journal 23, no. 6 (May 12, 2020): 1172–91. http://dx.doi.org/10.26907/1562-5419-2020-23-6-1172-1191.
Morimoto, Juliano. "Foraging decisions as multi-armed bandit problems: Applying reinforcement learning algorithms to foraging data." Journal of Theoretical Biology 467 (April 2019): 48–56. http://dx.doi.org/10.1016/j.jtbi.2019.02.002.
Askhedkar, Anjali R., and Bharat S. Chaudhari. "Multi-Armed Bandit Algorithm Policy for LoRa Network Performance Enhancement." Journal of Sensor and Actuator Networks 12, no. 3 (May 4, 2023): 38. http://dx.doi.org/10.3390/jsan12030038.
Espinosa-Leal, Leonardo, Anthony Chapman, and Magnus Westerlund. "Autonomous Industrial Management via Reinforcement Learning." Journal of Intelligent & Fuzzy Systems 39, no. 6 (December 4, 2020): 8427–39. http://dx.doi.org/10.3233/jifs-189161.
Teymuri, Benyamin, Reza Serati, Nikolaos Athanasios Anagnostopoulos, and Mehdi Rasti. "LP-MAB: Improving the Energy Efficiency of LoRaWAN Using a Reinforcement-Learning-Based Adaptive Configuration Algorithm." Sensors 23, no. 4 (February 20, 2023): 2363. http://dx.doi.org/10.3390/s23042363.
Varatharajah, Yogatheesan, and Brent Berry. "A Contextual-Bandit-Based Approach for Informed Decision-Making in Clinical Trials." Life 12, no. 8 (August 21, 2022): 1277. http://dx.doi.org/10.3390/life12081277.
Zhou, Jinkai, Xuebo Lai, and Joseph Y. J. Chow. "Multi-Armed Bandit On-Time Arrival Algorithms for Sequential Reliable Route Selection under Uncertainty." Transportation Research Record: Journal of the Transportation Research Board 2673, no. 10 (June 2, 2019): 673–82. http://dx.doi.org/10.1177/0361198119850457.
Dai, Yue, Jiangang Lu, Zhiwen Yu, and Ruifeng Zhao. "High-Precision Timing Method of BeiDou-3 System Based on Reinforcement Learning." Journal of Physics: Conference Series 2401, no. 1 (December 1, 2022): 012093. http://dx.doi.org/10.1088/1742-6596/2401/1/012093.
Dunne, Simon, Arun D'Souza, and John P. O'Doherty. "The involvement of model-based but not model-free learning signals during observational reward learning in the absence of choice." Journal of Neurophysiology 115, no. 6 (June 1, 2016): 3195–203. http://dx.doi.org/10.1152/jn.00046.2016.
Kessler, Samuel, Jack Parker-Holder, Philip Ball, Stefan Zohren, and Stephen J. Roberts. "Same State, Different Task: Continual Reinforcement Learning without Interference." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 7 (June 28, 2022): 7143–51. http://dx.doi.org/10.1609/aaai.v36i7.20674.
Li, Xinbin, Xianglin Xu, Lei Yan, Haihong Zhao, and Tongwei Zhang. "Energy-Efficient Data Collection Using Autonomous Underwater Glider: A Reinforcement Learning Formulation." Sensors 20, no. 13 (July 4, 2020): 3758. http://dx.doi.org/10.3390/s20133758.
Yu, Junpu. "Thompson ε-Greedy Algorithm: An Improvement to the Regret of Thompson Sampling and ε-Greedy on Multi-Armed Bandit Problems." Applied and Computational Engineering 8, no. 1 (August 1, 2023): 525–34. http://dx.doi.org/10.54254/2755-2721/8/20230264.
Botchkaryov, Alexey. "Task sequence planning by intelligent agent with context awareness." Computer Systems and Network 4, no. 1 (December 16, 2022): 12–20. http://dx.doi.org/10.23939/csn2022.01.012.
Amirizadeh, Khosrow, and Rajeswari Mandava. "Fast Iterative model for Sequential-Selection-Based Applications." International Journal of Computers & Technology 12, no. 7 (February 14, 2014): 3689–96. http://dx.doi.org/10.24297/ijct.v12i7.3092.
Kamikokuryo, Kenta, Takumi Haga, Gentiane Venture, and Vincent Hernandez. "Adversarial Autoencoder and Multi-Armed Bandit for Dynamic Difficulty Adjustment in Immersive Virtual Reality for Rehabilitation: Application to Hand Movement." Sensors 22, no. 12 (June 14, 2022): 4499. http://dx.doi.org/10.3390/s22124499.
Moy, Christophe, Lilian Besson, Guillaume Delbarre, and Laurent Toutain. "Decentralized spectrum learning for radio collision mitigation in ultra-dense IoT networks: LoRaWAN case study and experiments." Annals of Telecommunications 75, no. 11–12 (August 27, 2020): 711–27. http://dx.doi.org/10.1007/s12243-020-00795-y.
Shi, Chengshuai, and Cong Shen. "Federated Multi-Armed Bandits." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 11 (May 18, 2021): 9603–11. http://dx.doi.org/10.1609/aaai.v35i11.17156.
Chai, Chengliang, Jiabin Liu, Nan Tang, Guoliang Li, and Yuyu Luo. "Selective data acquisition in the wild for model charging." Proceedings of the VLDB Endowment 15, no. 7 (March 2022): 1466–78. http://dx.doi.org/10.14778/3523210.3523223.
Sankararaman, Abishek, Ayalvadi Ganesh, and Sanjay Shakkottai. "Social Learning in Multi Agent Multi Armed Bandits." Proceedings of the ACM on Measurement and Analysis of Computing Systems 3, no. 3 (December 17, 2019): 1–35. http://dx.doi.org/10.1145/3366701.
Sankararaman, Abishek, Ayalvadi Ganesh, and Sanjay Shakkottai. "Social Learning in Multi Agent Multi Armed Bandits." ACM SIGMETRICS Performance Evaluation Review 48, no. 1 (July 8, 2020): 29–30. http://dx.doi.org/10.1145/3410048.3410065.
Flynn, Hamish, David Reeb, Melih Kandemir, and Jan Peters. "PAC-Bayesian lifelong learning for multi-armed bandits." Data Mining and Knowledge Discovery 36, no. 2 (March 2022): 841–76. http://dx.doi.org/10.1007/s10618-022-00825-4.
Ameen, Salem, and Sunil Vadera. "Pruning Neural Networks Using Multi-Armed Bandits." Computer Journal 63, no. 7 (September 26, 2019): 1099–108. http://dx.doi.org/10.1093/comjnl/bxz078.
Xu, Xiao, and Qing Zhao. "Memory-Constrained No-Regret Learning in Adversarial Multi-Armed Bandits." IEEE Transactions on Signal Processing 69 (2021): 2371–82. http://dx.doi.org/10.1109/tsp.2021.3070201.
Wang, Kai, Lily Xu, Aparna Taneja, and Milind Tambe. "Optimistic Whittle Index Policy: Online Learning for Restless Bandits." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 8 (June 26, 2023): 10131–39. http://dx.doi.org/10.1609/aaai.v37i8.26207.
Zhao, Qing. "Multi-Armed Bandits: Theory and Applications to Online Learning in Networks." Synthesis Lectures on Communication Networks 12, no. 1 (November 20, 2019): 1–165. http://dx.doi.org/10.2200/s00941ed2v01y201907cnt022.
Weinstein, Ari, and Michael Littman. "Bandit-Based Planning and Learning in Continuous-Action Markov Decision Processes." Proceedings of the International Conference on Automated Planning and Scheduling 22 (May 14, 2012): 306–14. http://dx.doi.org/10.1609/icaps.v22i1.13507.
Kaibel, Chris, and Torsten Biemann. "Rethinking the Gold Standard With Multi-armed Bandits: Machine Learning Allocation Algorithms for Experiments." Organizational Research Methods 24, no. 1 (June 11, 2019): 78–103. http://dx.doi.org/10.1177/1094428119854153.
Lumbreras, Josep, Erkka Haapasalo, and Marco Tomamichel. "Multi-armed quantum bandits: Exploration versus exploitation when learning properties of quantum states." Quantum 6 (June 29, 2022): 749. http://dx.doi.org/10.22331/q-2022-06-29-749.
Vial, Daniel, Sanjay Shakkottai, and R. Srikant. "Robust Multi-Agent Bandits Over Undirected Graphs." ACM SIGMETRICS Performance Evaluation Review 51, no. 1 (June 26, 2023): 67–68. http://dx.doi.org/10.1145/3606376.3593567.
Ben-Porat, Omer, Lee Cohen, Liu Leqi, Zachary C. Lipton, and Yishay Mansour. "Modeling Attrition in Recommender Systems with Departing Bandits." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 6 (June 28, 2022): 6072–79. http://dx.doi.org/10.1609/aaai.v36i6.20554.
Vial, Daniel, Sanjay Shakkottai, and R. Srikant. "Robust Multi-Agent Bandits Over Undirected Graphs." Proceedings of the ACM on Measurement and Analysis of Computing Systems 6, no. 3 (December 2022): 1–57. http://dx.doi.org/10.1145/3570614.
Pacchiano, Aldo, Heinrich Jiang, and Michael I. Jordan. "Robustness Guarantees for Mode Estimation with an Application to Bandits." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 10 (May 18, 2021): 9277–84. http://dx.doi.org/10.1609/aaai.v35i10.17119.
Yeh, Yi-Liang, and Po-Kai Yang. "Design and Comparison of Reinforcement-Learning-Based Time-Varying PID Controllers with Gain-Scheduled Actions." Machines 9, no. 12 (November 26, 2021): 319. http://dx.doi.org/10.3390/machines9120319.
Garcelon, Evrard, Mohammad Ghavamzadeh, Alessandro Lazaric, and Matteo Pirotta. "Improved Algorithms for Conservative Exploration in Bandits." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 3962–69. http://dx.doi.org/10.1609/aaai.v34i04.5812.
Du, Yihan, Siwei Wang, and Longbo Huang. "A One-Size-Fits-All Solution to Conservative Bandit Problems." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 8 (May 18, 2021): 7254–61. http://dx.doi.org/10.1609/aaai.v35i8.16891.
Li, Yang, Jiawei Jiang, Jinyang Gao, Yingxia Shao, Ce Zhang, and Bin Cui. "Efficient Automatic CASH via Rising Bandits." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 4763–71. http://dx.doi.org/10.1609/aaai.v34i04.5910.
Wang, Kai, Shresth Verma, Aditya Mate, Sanket Shah, Aparna Taneja, Neha Madhiwalla, Aparna Hegde, and Milind Tambe. "Scalable Decision-Focused Learning in Restless Multi-Armed Bandits with Application to Maternal and Child Health." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 10 (June 26, 2023): 12138–46. http://dx.doi.org/10.1609/aaai.v37i10.26431.
Guo, Han, Ramakanth Pasunuru, and Mohit Bansal. "Multi-Source Domain Adaptation for Text Classification via DistanceNet-Bandits." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 05 (April 3, 2020): 7830–38. http://dx.doi.org/10.1609/aaai.v34i05.6288.
Lupu, Andrei, Audrey Durand, and Doina Precup. "Leveraging Observations in Bandits: Between Risks and Benefits." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 6112–19. http://dx.doi.org/10.1609/aaai.v33i01.33016112.