A selection of scholarly literature on the topic "Offline Contextual Bandit"
Browse lists of current articles, books, dissertations, conference papers, and other scholarly sources on the topic "Offline Contextual Bandit".
Journal articles on the topic "Offline Contextual Bandit"
Huang, Wen, and Xintao Wu. "Robustly Improving Bandit Algorithms with Confounded and Selection Biased Offline Data: A Causal Approach." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 18 (March 24, 2024): 20438–46. http://dx.doi.org/10.1609/aaai.v38i18.30027.
Narita, Yusuke, Shota Yasui, and Kohei Yata. "Efficient Counterfactual Learning from Bandit Feedback." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 4634–41. http://dx.doi.org/10.1609/aaai.v33i01.33014634.
Degroote, Hans, Patrick De Causmaecker, Bernd Bischl, and Lars Kotthoff. "A Regression-Based Methodology for Online Algorithm Selection." Proceedings of the International Symposium on Combinatorial Search 9, no. 1 (September 1, 2021): 37–45. http://dx.doi.org/10.1609/socs.v9i1.18458.
Li, Zhao, Junshuai Song, Zehong Hu, Zhen Wang, and Jun Gao. "Constrained Dual-Level Bandit for Personalized Impression Regulation in Online Ranking Systems." ACM Transactions on Knowledge Discovery from Data 16, no. 2 (July 21, 2021): 1–23. http://dx.doi.org/10.1145/3461340.
Vera, Alberto, Siddhartha Banerjee, and Itai Gurvich. "Online Allocation and Pricing: Constant Regret via Bellman Inequalities." Operations Research 69, no. 3 (May 2021): 821–40. http://dx.doi.org/10.1287/opre.2020.2061.
Ayle, Morgane, Jimmy Tekli, Julia El-Zini, Boulos El-Asmar, and Mariette Awad. "BAR — A Reinforcement Learning Agent for Bounding-Box Automated Refinement." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 03 (April 3, 2020): 2561–68. http://dx.doi.org/10.1609/aaai.v34i03.5639.
Simchi-Levi, David, and Yunzong Xu. "Bypassing the Monster: A Faster and Simpler Optimal Algorithm for Contextual Bandits Under Realizability." Mathematics of Operations Research, December 9, 2021. http://dx.doi.org/10.1287/moor.2021.1193.
Soemers, Dennis, Tim Brys, Kurt Driessens, Mark Winands, and Ann Nowé. "Adapting to Concept Drift in Credit Card Transaction Data Streams Using Contextual Bandits and Decision Trees." Proceedings of the AAAI Conference on Artificial Intelligence 32, no. 1 (April 27, 2018). http://dx.doi.org/10.1609/aaai.v32i1.11411.
Cao, Junyu, and Wei Sun. "Tiered Assortment: Optimization and Online Learning." Management Science, October 4, 2023. http://dx.doi.org/10.1287/mnsc.2023.4940.
Zeng, Yingyan, Xiaoyu Chen, and Ran Jin. "Ensemble Active Learning by Contextual Bandits for AI Incubation in Manufacturing." ACM Transactions on Intelligent Systems and Technology, October 25, 2023. http://dx.doi.org/10.1145/3627821.
Dissertations on the topic "Offline Contextual Bandit"
Sakhi, Otmane. "Offline Contextual Bandit : Theory and Large Scale Applications." Electronic Thesis or Diss., Institut polytechnique de Paris, 2023. http://www.theses.fr/2023IPPAG011.
This thesis presents contributions to the problem of learning from logged interactions using the offline contextual bandit framework. We are interested in two related topics: (1) offline policy learning with performance certificates, and (2) fast and efficient policy learning applied to large-scale, real-world recommendation. For (1), we first leverage results from the distributionally robust optimisation framework to construct asymptotic, variance-sensitive bounds to evaluate policies' performance. These bounds lead to new, more practical learning objectives thanks to their composite nature and straightforward calibration. We then analyse the problem from the PAC-Bayesian perspective, and provide tighter, non-asymptotic bounds on the performance of policies. Our results motivate new strategies that offer performance certificates before deploying the policies online. The newly derived strategies rely on composite learning objectives that do not require additional tuning. For (2), we first propose a hierarchical Bayesian model that combines different signals to efficiently estimate the quality of recommendation. We provide proper computational tools to scale the inference to real-world problems, and demonstrate empirically the benefits of the approach in multiple scenarios. We then address the question of accelerating common policy optimisation approaches, particularly focusing on recommendation problems with catalogues of millions of items. We derive optimisation routines, based on new gradient approximations, computed in logarithmic time with respect to the catalogue size. Our approach improves on common, linear-time gradient computations, yielding fast optimisation with no loss in the quality of the learned policies.
Conference papers on the topic "Offline Contextual Bandit"
Li, Lihong, Wei Chu, John Langford, and Xuanhui Wang. "Unbiased offline evaluation of contextual-bandit-based news article recommendation algorithms." In Proceedings of the Fourth ACM International Conference on Web Search and Data Mining. New York, New York, USA: ACM Press, 2011. http://dx.doi.org/10.1145/1935826.1935878.
Bouneffouf, Djallel, Srinivasan Parthasarathy, Horst Samulowitz, and Martin Wistuba. "Optimal Exploitation of Clustering and History Information in Multi-armed Bandit." In Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19). California: International Joint Conferences on Artificial Intelligence Organization, 2019. http://dx.doi.org/10.24963/ijcai.2019/279.
Degroote, Hans. "Online Algorithm Selection." In Twenty-Sixth International Joint Conference on Artificial Intelligence. California: International Joint Conferences on Artificial Intelligence Organization, 2017. http://dx.doi.org/10.24963/ijcai.2017/746.
Januszewski, Piotr, Dominik Grzegorzek, and Paweł Czarnul. "Dataset Characteristics and Their Impact on Offline Policy Learning of Contextual Multi-Armed Bandits." In 16th International Conference on Agents and Artificial Intelligence. SCITEPRESS - Science and Technology Publications, 2024. http://dx.doi.org/10.5220/0012311000003636.
Ameko, Mawulolo K., Miranda L. Beltzer, Lihua Cai, Mehdi Boukhechba, Bethany A. Teachman, and Laura E. Barnes. "Offline Contextual Multi-armed Bandits for Mobile Health Interventions: A Case Study on Emotion Regulation." In RecSys '20: Fourteenth ACM Conference on Recommender Systems. New York, NY, USA: ACM, 2020. http://dx.doi.org/10.1145/3383313.3412244.