Journal articles on the topic "Reinforcement Learning, Multi-armed Bandits"
Create accurate references in APA, MLA, Chicago, Harvard, and many other styles
Check the top 50 journal articles on the topic "Reinforcement Learning, Multi-armed Bandits".
Next to every work in the bibliography there is an "Add to bibliography" button. Press it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the academic publication as a ".pdf" file and read the abstract of the work online, provided the corresponding parameters are available in the metadata.
Browse journal articles from a wide range of disciplines and compile your bibliography correctly.
Wan, Zongqi, Zhijie Zhang, Tongyang Li, Jialin Zhang, and Xiaoming Sun. "Quantum Multi-Armed Bandits and Stochastic Linear Bandits Enjoy Logarithmic Regrets". Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 8 (June 26, 2023): 10087–94. http://dx.doi.org/10.1609/aaai.v37i8.26202.
Ciucanu, Radu, Pascal Lafourcade, Gael Marcadet, and Marta Soare. "SAMBA: A Generic Framework for Secure Federated Multi-Armed Bandits". Journal of Artificial Intelligence Research 73 (February 23, 2022): 737–65. http://dx.doi.org/10.1613/jair.1.13163.
Huanca-Anquise, Candy A., Ana Lúcia Cetertich Bazzan, and Anderson R. Tavares. "Multi-Objective, Multi-Armed Bandits: Algorithms for Repeated Games and Application to Route Choice". Revista de Informática Teórica e Aplicada 30, no. 1 (January 30, 2023): 11–23. http://dx.doi.org/10.22456/2175-2745.122929.
Giachino, Chiara, Luigi Bollani, Alessandro Bonadonna, and Marco Bertetti. "Reinforcement learning for content's customization: a first step of experimentation in Skyscanner". Industrial Management & Data Systems 121, no. 6 (January 15, 2021): 1417–34. http://dx.doi.org/10.1108/imds-12-2019-0722.
Noothigattu, Ritesh, Tom Yan, and Ariel D. Procaccia. "Inverse Reinforcement Learning From Like-Minded Teachers". Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 10 (May 18, 2021): 9197–204. http://dx.doi.org/10.1609/aaai.v35i10.17110.
Xiong, Guojun, Jian Li, and Rahul Singh. "Reinforcement Learning Augmented Asymptotically Optimal Index Policy for Finite-Horizon Restless Bandits". Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 8 (June 28, 2022): 8726–34. http://dx.doi.org/10.1609/aaai.v36i8.20852.
Huo, Xiaoguang, and Feng Fu. "Risk-aware multi-armed bandit problem with application to portfolio selection". Royal Society Open Science 4, no. 11 (November 2017): 171377. http://dx.doi.org/10.1098/rsos.171377.
Nobari, Sadegh. "DBA: Dynamic Multi-Armed Bandit Algorithm". Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 9869–70. http://dx.doi.org/10.1609/aaai.v33i01.33019869.
Esfandiari, Hossein, MohammadTaghi HajiAghayi, Brendan Lucier, and Michael Mitzenmacher. "Online Pandora’s Boxes and Bandits". Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 1885–92. http://dx.doi.org/10.1609/aaai.v33i01.33011885.
Lefebvre, Germain, Christopher Summerfield, and Rafal Bogacz. "A Normative Account of Confirmation Bias During Reinforcement Learning". Neural Computation 34, no. 2 (January 14, 2022): 307–37. http://dx.doi.org/10.1162/neco_a_01455.
Koulouriotis, D. E., and A. Xanthopoulos. "Reinforcement learning and evolutionary algorithms for non-stationary multi-armed bandit problems". Applied Mathematics and Computation 196, no. 2 (March 2008): 913–22. http://dx.doi.org/10.1016/j.amc.2007.07.043.
Elizarov, Artem Aleksandrovich, and Evgenii Viktorovich Razinkov. "Image Classification Using Reinforcement Learning". Russian Digital Libraries Journal 23, no. 6 (May 12, 2020): 1172–91. http://dx.doi.org/10.26907/1562-5419-2020-23-6-1172-1191.
Morimoto, Juliano. "Foraging decisions as multi-armed bandit problems: Applying reinforcement learning algorithms to foraging data". Journal of Theoretical Biology 467 (April 2019): 48–56. http://dx.doi.org/10.1016/j.jtbi.2019.02.002.
Askhedkar, Anjali R., and Bharat S. Chaudhari. "Multi-Armed Bandit Algorithm Policy for LoRa Network Performance Enhancement". Journal of Sensor and Actuator Networks 12, no. 3 (May 4, 2023): 38. http://dx.doi.org/10.3390/jsan12030038.
Espinosa-Leal, Leonardo, Anthony Chapman, and Magnus Westerlund. "Autonomous Industrial Management via Reinforcement Learning". Journal of Intelligent & Fuzzy Systems 39, no. 6 (December 4, 2020): 8427–39. http://dx.doi.org/10.3233/jifs-189161.
Teymuri, Benyamin, Reza Serati, Nikolaos Athanasios Anagnostopoulos, and Mehdi Rasti. "LP-MAB: Improving the Energy Efficiency of LoRaWAN Using a Reinforcement-Learning-Based Adaptive Configuration Algorithm". Sensors 23, no. 4 (February 20, 2023): 2363. http://dx.doi.org/10.3390/s23042363.
Varatharajah, Yogatheesan, and Brent Berry. "A Contextual-Bandit-Based Approach for Informed Decision-Making in Clinical Trials". Life 12, no. 8 (August 21, 2022): 1277. http://dx.doi.org/10.3390/life12081277.
Zhou, Jinkai, Xuebo Lai, and Joseph Y. J. Chow. "Multi-Armed Bandit On-Time Arrival Algorithms for Sequential Reliable Route Selection under Uncertainty". Transportation Research Record: Journal of the Transportation Research Board 2673, no. 10 (June 2, 2019): 673–82. http://dx.doi.org/10.1177/0361198119850457.
Dai, Yue, Jiangang Lu, Zhiwen Yu, and Ruifeng Zhao. "High-Precision Timing Method of BeiDou-3 System Based on Reinforcement Learning". Journal of Physics: Conference Series 2401, no. 1 (December 1, 2022): 012093. http://dx.doi.org/10.1088/1742-6596/2401/1/012093.
Dunne, Simon, Arun D'Souza, and John P. O'Doherty. "The involvement of model-based but not model-free learning signals during observational reward learning in the absence of choice". Journal of Neurophysiology 115, no. 6 (June 1, 2016): 3195–203. http://dx.doi.org/10.1152/jn.00046.2016.
Kessler, Samuel, Jack Parker-Holder, Philip Ball, Stefan Zohren, and Stephen J. Roberts. "Same State, Different Task: Continual Reinforcement Learning without Interference". Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 7 (June 28, 2022): 7143–51. http://dx.doi.org/10.1609/aaai.v36i7.20674.
Li, Xinbin, Xianglin Xu, Lei Yan, Haihong Zhao, and Tongwei Zhang. "Energy-Efficient Data Collection Using Autonomous Underwater Glider: A Reinforcement Learning Formulation". Sensors 20, no. 13 (July 4, 2020): 3758. http://dx.doi.org/10.3390/s20133758.
Yu, Junpu. "Thompson ε-Greedy Algorithm: An Improvement to the Regret of Thompson Sampling and ε-Greedy on Multi-Armed Bandit Problems". Applied and Computational Engineering 8, no. 1 (August 1, 2023): 525–34. http://dx.doi.org/10.54254/2755-2721/8/20230264.
Botchkaryov, Alexey. "Task sequence planning by intelligent agent with context awareness". Computer systems and network 4, no. 1 (December 16, 2022): 12–20. http://dx.doi.org/10.23939/csn2022.01.012.
Amirizadeh, Khosrow, and Rajeswari Mandava. "Fast Iterative model for Sequential-Selection-Based Applications". INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY 12, no. 7 (February 14, 2014): 3689–96. http://dx.doi.org/10.24297/ijct.v12i7.3092.
Kamikokuryo, Kenta, Takumi Haga, Gentiane Venture, and Vincent Hernandez. "Adversarial Autoencoder and Multi-Armed Bandit for Dynamic Difficulty Adjustment in Immersive Virtual Reality for Rehabilitation: Application to Hand Movement". Sensors 22, no. 12 (June 14, 2022): 4499. http://dx.doi.org/10.3390/s22124499.
Moy, Christophe, Lilian Besson, Guillaume Delbarre, and Laurent Toutain. "Decentralized spectrum learning for radio collision mitigation in ultra-dense IoT networks: LoRaWAN case study and experiments". Annals of Telecommunications 75, no. 11-12 (August 27, 2020): 711–27. http://dx.doi.org/10.1007/s12243-020-00795-y.
Shi, Chengshuai, and Cong Shen. "Federated Multi-Armed Bandits". Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 11 (May 18, 2021): 9603–11. http://dx.doi.org/10.1609/aaai.v35i11.17156.
Chai, Chengliang, Jiabin Liu, Nan Tang, Guoliang Li, and Yuyu Luo. "Selective data acquisition in the wild for model charging". Proceedings of the VLDB Endowment 15, no. 7 (March 2022): 1466–78. http://dx.doi.org/10.14778/3523210.3523223.
Sankararaman, Abishek, Ayalvadi Ganesh, and Sanjay Shakkottai. "Social Learning in Multi Agent Multi Armed Bandits". Proceedings of the ACM on Measurement and Analysis of Computing Systems 3, no. 3 (December 17, 2019): 1–35. http://dx.doi.org/10.1145/3366701.
Sankararaman, Abishek, Ayalvadi Ganesh, and Sanjay Shakkottai. "Social Learning in Multi Agent Multi Armed Bandits". ACM SIGMETRICS Performance Evaluation Review 48, no. 1 (July 8, 2020): 29–30. http://dx.doi.org/10.1145/3410048.3410065.
Flynn, Hamish, David Reeb, Melih Kandemir, and Jan Peters. "PAC-Bayesian lifelong learning for multi-armed bandits". Data Mining and Knowledge Discovery 36, no. 2 (March 2022): 841–76. http://dx.doi.org/10.1007/s10618-022-00825-4.
Ameen, Salem, and Sunil Vadera. "Pruning Neural Networks Using Multi-Armed Bandits". Computer Journal 63, no. 7 (September 26, 2019): 1099–108. http://dx.doi.org/10.1093/comjnl/bxz078.
Xu, Xiao, and Qing Zhao. "Memory-Constrained No-Regret Learning in Adversarial Multi-Armed Bandits". IEEE Transactions on Signal Processing 69 (2021): 2371–82. http://dx.doi.org/10.1109/tsp.2021.3070201.
Wang, Kai, Lily Xu, Aparna Taneja, and Milind Tambe. "Optimistic Whittle Index Policy: Online Learning for Restless Bandits". Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 8 (June 26, 2023): 10131–39. http://dx.doi.org/10.1609/aaai.v37i8.26207.
Zhao, Qing. "Multi-Armed Bandits: Theory and Applications to Online Learning in Networks". Synthesis Lectures on Communication Networks 12, no. 1 (November 20, 2019): 1–165. http://dx.doi.org/10.2200/s00941ed2v01y201907cnt022.
Weinstein, Ari, and Michael Littman. "Bandit-Based Planning and Learning in Continuous-Action Markov Decision Processes". Proceedings of the International Conference on Automated Planning and Scheduling 22 (May 14, 2012): 306–14. http://dx.doi.org/10.1609/icaps.v22i1.13507.
Kaibel, Chris, and Torsten Biemann. "Rethinking the Gold Standard With Multi-armed Bandits: Machine Learning Allocation Algorithms for Experiments". Organizational Research Methods 24, no. 1 (June 11, 2019): 78–103. http://dx.doi.org/10.1177/1094428119854153.
Lumbreras, Josep, Erkka Haapasalo, and Marco Tomamichel. "Multi-armed quantum bandits: Exploration versus exploitation when learning properties of quantum states". Quantum 6 (June 29, 2022): 749. http://dx.doi.org/10.22331/q-2022-06-29-749.
Vial, Daniel, Sanjay Shakkottai, and R. Srikant. "Robust Multi-Agent Bandits Over Undirected Graphs". ACM SIGMETRICS Performance Evaluation Review 51, no. 1 (June 26, 2023): 67–68. http://dx.doi.org/10.1145/3606376.3593567.
Ben-Porat, Omer, Lee Cohen, Liu Leqi, Zachary C. Lipton, and Yishay Mansour. "Modeling Attrition in Recommender Systems with Departing Bandits". Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 6 (June 28, 2022): 6072–79. http://dx.doi.org/10.1609/aaai.v36i6.20554.
Vial, Daniel, Sanjay Shakkottai, and R. Srikant. "Robust Multi-Agent Bandits Over Undirected Graphs". Proceedings of the ACM on Measurement and Analysis of Computing Systems 6, no. 3 (December 2022): 1–57. http://dx.doi.org/10.1145/3570614.
Pacchiano, Aldo, Heinrich Jiang, and Michael I. Jordan. "Robustness Guarantees for Mode Estimation with an Application to Bandits". Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 10 (May 18, 2021): 9277–84. http://dx.doi.org/10.1609/aaai.v35i10.17119.
Yeh, Yi-Liang, and Po-Kai Yang. "Design and Comparison of Reinforcement-Learning-Based Time-Varying PID Controllers with Gain-Scheduled Actions". Machines 9, no. 12 (November 26, 2021): 319. http://dx.doi.org/10.3390/machines9120319.
Garcelon, Evrard, Mohammad Ghavamzadeh, Alessandro Lazaric, and Matteo Pirotta. "Improved Algorithms for Conservative Exploration in Bandits". Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 3962–69. http://dx.doi.org/10.1609/aaai.v34i04.5812.
Du, Yihan, Siwei Wang, and Longbo Huang. "A One-Size-Fits-All Solution to Conservative Bandit Problems". Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 8 (May 18, 2021): 7254–61. http://dx.doi.org/10.1609/aaai.v35i8.16891.
Li, Yang, Jiawei Jiang, Jinyang Gao, Yingxia Shao, Ce Zhang, and Bin Cui. "Efficient Automatic CASH via Rising Bandits". Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 4763–71. http://dx.doi.org/10.1609/aaai.v34i04.5910.
Wang, Kai, Shresth Verma, Aditya Mate, Sanket Shah, Aparna Taneja, Neha Madhiwalla, Aparna Hegde, and Milind Tambe. "Scalable Decision-Focused Learning in Restless Multi-Armed Bandits with Application to Maternal and Child Health". Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 10 (June 26, 2023): 12138–46. http://dx.doi.org/10.1609/aaai.v37i10.26431.
Guo, Han, Ramakanth Pasunuru, and Mohit Bansal. "Multi-Source Domain Adaptation for Text Classification via DistanceNet-Bandits". Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 05 (April 3, 2020): 7830–38. http://dx.doi.org/10.1609/aaai.v34i05.6288.
Lupu, Andrei, Audrey Durand, and Doina Precup. "Leveraging Observations in Bandits: Between Risks and Benefits". Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 6112–19. http://dx.doi.org/10.1609/aaai.v33i01.33016112.