A selection of academic literature on the topic "Multi-Objective Reinforcement Learning"
Browse lists of current articles, books, dissertations, conference abstracts, and other academic sources on the topic "Multi-Objective Reinforcement Learning".
Journal articles on the topic "Multi-Objective Reinforcement Learning"
Horie, Naoto, Tohgoroh Matsui, Koichi Moriyama, Atsuko Mutoh, and Nobuhiro Inuzuka. "Multi-objective safe reinforcement learning: the relationship between multi-objective reinforcement learning and safe reinforcement learning." Artificial Life and Robotics 24, no. 3 (February 8, 2019): 352–59. http://dx.doi.org/10.1007/s10015-019-00523-3.
Kim, Man-Je, Hyunsoo Park, and Chang Wook Ahn. "Nondominated Policy-Guided Learning in Multi-Objective Reinforcement Learning." Electronics 11, no. 7 (March 28, 2022): 1069. http://dx.doi.org/10.3390/electronics11071069.
Drugan, Madalina, Marco Wiering, Peter Vamplew, and Madhu Chetty. "Special issue on multi-objective reinforcement learning." Neurocomputing 263 (November 2017): 1–2. http://dx.doi.org/10.1016/j.neucom.2017.06.020.
Perez, Julien, Cécile Germain-Renaud, Balazs Kégl, and Charles Loomis. "Multi-objective Reinforcement Learning for Responsive Grids." Journal of Grid Computing 8, no. 3 (June 8, 2010): 473–92. http://dx.doi.org/10.1007/s10723-010-9161-0.
Nguyen, Thanh Thi, Ngoc Duy Nguyen, Peter Vamplew, Saeid Nahavandi, Richard Dazeley, and Chee Peng Lim. "A multi-objective deep reinforcement learning framework." Engineering Applications of Artificial Intelligence 96 (November 2020): 103915. http://dx.doi.org/10.1016/j.engappai.2020.103915.
García, Javier, Rubén Majadas, and Fernando Fernández. "Learning adversarial attack policies through multi-objective reinforcement learning." Engineering Applications of Artificial Intelligence 96 (November 2020): 104021. http://dx.doi.org/10.1016/j.engappai.2020.104021.
Yamamoto, Hiroyuki, Tomohiro Hayashida, Ichiro Nishizaki, and Shinya Sekizaki. "Hypervolume-Based Multi-Objective Reinforcement Learning: Interactive Approach." Advances in Science, Technology and Engineering Systems Journal 4, no. 1 (2019): 93–100. http://dx.doi.org/10.25046/aj040110.
García, Javier, Roberto Iglesias, Miguel A. Rodríguez, and Carlos V. Regueiro. "Incremental reinforcement learning for multi-objective robotic tasks." Knowledge and Information Systems 51, no. 3 (September 22, 2016): 911–40. http://dx.doi.org/10.1007/s10115-016-0992-2.
Schneider, Stefan, Ramin Khalili, Adnan Manzoor, Haydar Qarawlus, Rafael Schellenberg, Holger Karl, and Artur Hecker. "Self-Learning Multi-Objective Service Coordination Using Deep Reinforcement Learning." IEEE Transactions on Network and Service Management 18, no. 3 (September 2021): 3829–42. http://dx.doi.org/10.1109/tnsm.2021.3076503.
Ferreira, Leonardo Anjoletto, Carlos Henrique Costa Ribeiro, and Reinaldo Augusto da Costa Bianchi. "Heuristically accelerated reinforcement learning modularization for multi-agent multi-objective problems." Applied Intelligence 41, no. 2 (May 1, 2014): 551–62. http://dx.doi.org/10.1007/s10489-014-0534-0.
Повний текст джерелаДисертації з теми "Multi-Objective Reinforcement Learning"
Pinder, J. M. "Multi-objective reinforcement learning framework for unknown stochastic & uncertain environments." Thesis, University of Salford, 2016. http://usir.salford.ac.uk/39978/.
Wang, Weijia. "Multi-objective sequential decision making." PhD thesis, Université Paris Sud - Paris XI, 2014. http://tel.archives-ouvertes.fr/tel-01057079.
Bouzid, Salah Eddine. "Optimisation multicritères des performances de réseau d’objets communicants par méta-heuristiques hybrides et apprentissage par renforcement." Thesis, Le Mans, 2020. http://cyberdoc-int.univ-lemans.fr/Theses/2020/2020LEMA1026.pdf.
Повний текст джерелаThe deployment of Communicating Things Networks (CTNs), with continuously increasing densities, needs to be optimal in terms of quality of service, energy consumption and lifetime. Determining the optimal placement of the nodes of these networks, relative to the different quality criteria, is an NP-Hard problem. Faced to this NP-Hardness, especially for indoor environments, existing approaches focus on the optimization of one single objective while neglecting the other criteria, or adopt an expensive manual solution. Finding new approaches to solve this problem is required. Accordingly, in this thesis, we propose a new approach which automatically generates the deployment that guarantees optimality in terms of performance and robustness related to possible topological failures and instabilities. The proposed approach is based, on the first hand, on the modeling of the deployment problem as a multi-objective optimization problem under constraints, and its resolution using a hybrid algorithm combining genetic multi-objective optimization with weighted sum optimization and on the other hand, the integration of reinforcement learning to guarantee the optimization of energy consumption and the extending the network lifetime. To apply this approach, two tools are developed. A first called MOONGA (Multi-Objective Optimization of wireless Network approach based on Genetic Algorithm) which automatically generates the placement of nodes while optimizing the metrics that define the QoS of the CTN: connectivity, m-connectivity, coverage, k-coverage, coverage redundancy and cost. MOONGA tool considers constraints related to the architecture of the deployment space, the network topology, the specifies of the application and the preferences of the network designer. 
The second optimization tool is named R2LTO (Reinforcement Learning for Life-Time Optimization), which is a new routing protocol for CTNs, based on distributed reinforcement learning that allows to determine the optimal rooting path in order to guarantee energy-efficiency and to extend the network lifetime while maintaining the required QoS
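To illustrate the kind of distributed reinforcement learning used in energy-aware routing protocols of this sort, the following is a minimal Python sketch of Q-routing with an energy cost. The class name, reward shape and all parameters are illustrative assumptions, not the thesis implementation.

```python
import random

# Hypothetical sketch of distributed, energy-aware Q-routing in the spirit
# of an R2LTO-style protocol. Each node keeps a cost estimate per neighbor.
class EnergyAwareQRouter:
    def __init__(self, neighbors, alpha=0.5, gamma=0.9):
        # Estimated cost of reaching the sink via each neighbor.
        self.q = {n: 0.0 for n in neighbors}
        self.alpha = alpha  # learning rate
        self.gamma = gamma  # discount factor

    def choose_next_hop(self, epsilon=0.1):
        # Epsilon-greedy: usually forward to the lowest-cost neighbor.
        if random.random() < epsilon:
            return random.choice(list(self.q))
        return min(self.q, key=self.q.get)

    def update(self, neighbor, link_energy_cost, neighbor_best_cost):
        # Q-routing backup: local transmission energy plus the neighbor's
        # own best estimated cost toward the sink.
        target = link_energy_cost + self.gamma * neighbor_best_cost
        self.q[neighbor] += self.alpha * (target - self.q[neighbor])

# Example: after one update each, neighbor "B" looks cheaper than "C".
router = EnergyAwareQRouter(["B", "C"])
router.update("B", link_energy_cost=1.0, neighbor_best_cost=2.0)
router.update("C", link_energy_cost=3.0, neighbor_best_cost=2.0)
assert router.choose_next_hop(epsilon=0.0) == "B"
```

Because each node only needs its neighbors' cost estimates, such an update rule can run fully distributed, which is what makes it attractive for network-lifetime optimization.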
Ho, Dinh Khanh. "Gestion des ressources et de l’énergie orientée qualité de service pour les systèmes robotiques mobiles autonomes." Thesis, Université Côte d'Azur, 2020. http://www.theses.fr/2020COAZ4000.
Повний текст джерелаMobile robotic systems are becoming more and more complex with the integration of advanced sensing and acting components and functionalities to perform the real required missions. For these technical systems, the requirements are divided into two categories: functional and non-functional requirements. While functional requirements represent what the robot must do to accomplish the mission, non-functional requirements represent how the robot performs the mission. Thus, the quality of service and energy efficiency of a robotic mission are classified in this category. The autonomy of these systems is fully achieved when both functional and non-functional requirements are guaranteed without any human intervention or any external control. However, these mobile systems are naturally confronted with resource availability and energy capacity constraints, particularly in the context of long-term missions, these constraints become more critical. In addition, the performance of these systems is also influenced by unexpected and unstructured environmental conditions in which they interact. The management of resources and energy during operation is therefore a challenge for autonomous mobile robots in order to guarantee the desired performance objectives while respecting constraints. In this context, the ability of the robotic system to become aware of its own internal behaviors and physical environment and to adapt to these dynamic circumstances becomes important.This thesis focuses on the quality of service and energy efficiency of mobile robotic systems and proposes a hierarchical run-time management in order to guarantee these non-functional objectives of each robotic mission. 
At the local management level of each robotic mission, a Mission Manager employs a reinforcement learning-based decision-making mechanism to automatically reconfigure certain key mission-specific parameters to minimize the level of violation of required performance and energy objectives. At the global management level of the whole system, a Multi-Mission Manager leveraged rule-based decision-making and case-based reasoning techniques monitors the system's resources and the responses of Mission Managers in order to decide to reallocate the energy budget, regulate the quality of service and trigger the online learning for each robotic mission.The proposed methodology has been successfully prototyped and validated in a simulation environment and the run-time management framework is also integrated into our real mobile robotic system based on a Pioneer-3DX mobile base equipped with an embedded NVIDIA Jetson Xavier platform
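The local decision-making loop described above can be sketched as tabular Q-learning over a small set of candidate parameter configurations, with a reward that penalizes objective violations. This is an illustrative sketch under assumed names and parameters, not the thesis code.

```python
import random
from collections import defaultdict

# Illustrative Mission Manager sketch: tabular Q-learning picks a mission
# parameter configuration; the reward penalizes violation of the
# performance and energy objectives. All names and values are assumptions.
class MissionManager:
    def __init__(self, configs, alpha=0.3, gamma=0.9, epsilon=0.1):
        self.configs = configs                 # candidate reconfigurations
        self.q = defaultdict(float)            # Q[(state, config)]
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def act(self, state):
        # Epsilon-greedy choice among configurations.
        if random.random() < self.epsilon:
            return random.choice(self.configs)
        return max(self.configs, key=lambda c: self.q[(state, c)])

    def learn(self, state, config, perf_violation, energy_violation, next_state):
        # Reward is the negated total level of objective violation.
        reward = -(perf_violation + energy_violation)
        best_next = max(self.q[(next_state, c)] for c in self.configs)
        td_target = reward + self.gamma * best_next
        self.q[(state, config)] += self.alpha * (td_target - self.q[(state, config)])

mgr = MissionManager(["low_power", "high_qos"])
mgr.learn("nominal", "high_qos", perf_violation=0.0, energy_violation=0.5,
          next_state="nominal")
mgr.learn("nominal", "low_power", perf_violation=0.2, energy_violation=0.0,
          next_state="nominal")
```

After these two updates the manager prefers the configuration with the smaller accumulated violation, which is the behavior a run-time reconfiguration loop needs.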
Pereira, Tiago Oliveira. "Multi-Objective Deep Reinforcement Learning in Drug Discovery." Master's thesis, 2020. http://hdl.handle.net/10316/92570.
Повний текст джерелаO longo período de tempo, os enormes custos financeiros inerentes à introdução de um novo medicamento no mercado e a incerteza em relação à possibilidade de este vir a ser ou não aceite pelas autoridades responsáveis são claros obstáculos ao desenvolvimento de novos fármacos. A aplicação de técnicas de aprendizagem profunda em fases precoces do processo de descoberta de fármacos pode contribuir para facilitar a identificação de potenciais fármacos com propriedades biológicas promissoras. Nesse sentido, ao utilizar métodos computacionais, é possível reduzir o enorme espaço de pesquisa de possíveis fármacos e minimizar os problemas inerentes às fases subsequentes do processo. Não obstante, a maioria dos estudos que aplicam estas técnicas têm-se focado na otimização de apenas uma propriedade específica das moléculas, o que é insuficiente para o desenvolvimento de fármacos, uma vez que este é um problema que requer uma solução mais abrangente.Este trabalho propõe uma estratégia para a geração orientada de moléculas com o intuito de otimizar propriedades biológicas e físico-químicas. O propósito é gerar um conjunto promissor de moléculas que consigam desempenhar a função biológica desejada e ter efeitos inócuos para o organismo, para posteriormente ser investigada a possibilidade de encontrar possíveis fármacos. O modelo gerador computacional foi conseguido através da implementação de uma rede neuronal recorrente, por sua vez, contendo células de memória de longa duração. Este modelo foi treinado para aprender as regras fundamentais de construção de moléculas através de SMILES. O modelo gerador é depois treinado novamente através de aprendizagem por reforço para produzir moléculas com propriedades previamente determinadas. Para avaliar as novas moléculas geradas, é implementado um modelo regressivo que relaciona matematicamente a estrutura das moléculas com a sua atividade biológica em estudo. 
A novidade introduzida neste trabalho é a estratégia exploratória que garante, durante o processo de treino, um compromisso entre a necessidade de descobrir todo o espaço químico mais detalhadamente e a necessidade de utilizar a informação previamente aprendida para a construção de moléculas que otimizem a propriedade em estudo. Para demonstrar a eficácia deste método, o modelo gerador foi modificado para abordar objetivos individuais como, por exemplo, a afinidade da ligação entre o fármaco-recetor, e a estimativa quantitativa de um conjunto de propriedades típicas de fármacos. Os resultados demonstram a versatilidade do modelo uma vez que este garante a otimização de diferentes propriedades, mantendo as percentagens de diversidade e validade química nas moléculas geradas a níveis aceitáveis. Para além disso, o modelo gerador foi posteriormente melhorado através do seu alargamento à otimização simultânea de mais do que uma propriedade. Para fazer isso, foram exploradas diversas técnicas para implementar a otimização multiobjectivo com o intuito de aumentar a aplicabilidade dos novos potenciais fármacos através da otimização das suas propriedades físicas, químicas e biológicas. No contexto de aprendizagem por reforço, a abordagem geral foi combinar diferentes recompensas num único valor de recompensa. Neste sentido, foram aplicados diferentes métodos de escalarização para obter uma única recompensa que ponderasse os diferentes objetivos. Os resultados mostram que é possível encontrar moléculas que satisfaçam ambas as propriedades e, simultaneamente, com percentagens de validade a rondar os 90\%.
The long development time, the enormous financial cost of bringing a new drug to market, and the uncertainty about whether it will be accepted by the responsible authorities are clear obstacles to the development of new drugs. Applying deep learning techniques in the early stages of the drug discovery process can facilitate the identification of drug candidates with promising biological properties. By employing computational methods, it is possible to reduce the enormous search space of drug-like compounds and minimize the issues inherent in the subsequent stages of the process. Nevertheless, most studies that employ these techniques focus on optimizing a single molecular property, which is insufficient for drug development, since this is a problem that requires a more far-reaching solution. This work proposes a framework for the targeted generation of molecules designed to optimize biological and physicochemical properties. The purpose is to create a promising set of molecules that can perform the desired biological function and have harmless effects on the organism, to be further investigated as candidate drugs. The generative model was implemented as a recurrent neural network containing long short-term memory (LSTM) cells. This model was trained to learn the building rules of valid molecules represented as SMILES strings. The generator model is then re-trained through reinforcement learning to produce molecules with bespoke properties. To evaluate the newly generated molecules, a structure-activity relationship model is implemented that maps the molecular structure to the desired biological property. The novelty of this approach is the exploratory strategy that ensures, throughout the training process, a compromise between the need to explore the entire chemical space in more detail and the need to exploit the already learned information when constructing molecules that optimize the property under study.
To demonstrate the effectiveness of the method, the generator model was biased to address single objectives, such as drug-target binding affinity or the quantitative estimate of drug-likeness. The results show the versatility of the proposed model, since it guaranteed the optimization of different properties while keeping the diversity and validity of the generated molecules at acceptable levels. Furthermore, we improved the generative model by extending the framework to optimize more than one objective simultaneously. To do so, different techniques for multi-objective optimization were explored, with the goal of increasing the applicability of new potential drugs through the optimization of their physical, chemical and biological properties. Our general approach combines the different rewards into a single reward: different scalarization methods were applied to obtain a single reward that weights the different objectives. The results demonstrate that it is possible to find molecules that satisfy both proposed objectives and, simultaneously, achieve synthesizability rates of approximately 90%.
Other: This research was funded by the Portuguese research agency FCT, through D4 - Deep Drug Discovery and Deployment (CENTRO-01-0145-FEDER-029266). This work is funded by national funds through the FCT - Foundation for Science and Technology, I.P., within the scope of the project CISUC - UID/CEC/00326/2020, and by the European Social Fund through the Regional Operational Program Centro 2020.
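The scalarization step mentioned in the abstract above, combining several reward signals into one scalar for the reinforcement learning agent, can be sketched in a few lines. The weights, utopia point and example reward values below are illustrative assumptions, not values from the thesis.

```python
# Minimal sketch of two common scalarization schemes for collapsing a
# reward vector into a single scalar reward. Weights and utopia point
# are illustrative assumptions.

def weighted_sum(rewards, weights):
    # Linear scalarization: simple, but cannot reach solutions lying on
    # non-convex regions of the Pareto front.
    return sum(w * r for w, r in zip(weights, rewards))

def chebyshev(rewards, weights, utopia):
    # Chebyshev scalarization: penalizes the largest weighted distance to
    # a utopia (ideal) point; negated so that larger is better.
    return -max(w * abs(u - r) for w, r, u in zip(weights, rewards, utopia))

# E.g. combining a binding-affinity reward with a drug-likeness reward:
rewards = (0.8, 0.6)
linear = weighted_sum(rewards, (0.5, 0.5))          # ~0.7
cheb = chebyshev(rewards, (0.5, 0.5), (1.0, 1.0))   # ~-0.2
```

The choice of scalarization matters: a weighted sum is easy to tune but misses non-convex trade-offs, while Chebyshev-style scalarization can reach them at the cost of a reference point to maintain.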
Hasan, Md Mahmudul. "An Intelligent Decision-making Scheme in a Dynamic Multi-objective Environment using Deep Reinforcement Learning." Thesis, 2020. https://arro.anglia.ac.uk/id/eprint/705890/1/Hasan_2020.pdf.
Book chapters on the topic "Multi-Objective Reinforcement Learning"
Van Moffaert, Kristof, Madalina M. Drugan, and Ann Nowé. "Hypervolume-Based Multi-Objective Reinforcement Learning." In Lecture Notes in Computer Science, 352–66. Berlin, Heidelberg: Springer Berlin Heidelberg, 2013. http://dx.doi.org/10.1007/978-3-642-37140-0_28.
Moustafa, Ahmed, and Minjie Zhang. "Multi-Objective Service Composition Using Reinforcement Learning." In Service-Oriented Computing, 298–312. Berlin, Heidelberg: Springer Berlin Heidelberg, 2013. http://dx.doi.org/10.1007/978-3-642-45005-1_21.
Méndez-Hernández, Beatriz M., Erick D. Rodríguez-Bazan, Yailen Martinez-Jimenez, Pieter Libin, and Ann Nowé. "A Multi-objective Reinforcement Learning Algorithm for JSSP." In Artificial Neural Networks and Machine Learning – ICANN 2019: Theoretical Neural Computation, 567–84. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-30487-4_44.
Videau, Mathurin, Alessandro Leite, Olivier Teytaud, and Marc Schoenauer. "Multi-objective Genetic Programming for Explainable Reinforcement Learning." In Lecture Notes in Computer Science, 278–93. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-02056-8_18.
Xu, Jiangjiao, Ke Li, and Mohammad Abusara. "Multi-objective Reinforcement Learning Based Multi-microgrid System Optimisation Problem." In Lecture Notes in Computer Science, 684–96. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-72062-9_54.
Yamaguchi, Tomohiro, Shota Nagahama, Yoshihiro Ichikawa, and Keiki Takadama. "Model-Based Multi-objective Reinforcement Learning with Unknown Weights." In Human Interface and the Management of Information. Information in Intelligent Systems, 311–21. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-22649-7_25.
Yu, Yemin, Kun Kuang, Jiangchao Yang, Zeke Wang, Kunyang Jia, Weiming Lu, Hongxia Yang, and Fei Wu. "Multi-objective Meta-return Reinforcement Learning for Sequential Recommendation." In Artificial Intelligence, 95–111. Cham: Springer Nature Switzerland, 2022. http://dx.doi.org/10.1007/978-3-031-20500-2_8.
Iwamura, Koji, and Nobuhiro Sugimura. "Distributed Real-Time Scheduling by Using Multi-agent Reinforcement Learning." In Multi-objective Evolutionary Optimisation for Product Design and Manufacturing, 325–42. London: Springer London, 2011. http://dx.doi.org/10.1007/978-0-85729-652-8_11.
Yan, Jiaxin, Hua Wang, Xiaole Li, Shanwen Yi, and Yao Qin. "Multi-objective Disaster Backup in Inter-datacenter Using Reinforcement Learning." In Wireless Algorithms, Systems, and Applications, 590–601. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-59016-1_49.
Liu, Jun, Yi Zhou, Yimin Qiu, and Zhongfeng Li. "An Improved Multi-objective Optimization Algorithm Based on Reinforcement Learning." In Lecture Notes in Computer Science, 501–13. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-09677-8_42.
Повний текст джерелаТези доповідей конференцій з теми "Multi-Objective Reinforcement Learning"
Skalse, Joar, Lewis Hammond, Charlie Griffin, and Alessandro Abate. "Lexicographic Multi-Objective Reinforcement Learning." In Thirty-First International Joint Conference on Artificial Intelligence {IJCAI-22}. California: International Joint Conferences on Artificial Intelligence Organization, 2022. http://dx.doi.org/10.24963/ijcai.2022/476.
Chen, Xi, Ali Ghadirzadeh, Marten Bjorkman, and Patric Jensfelt. "Meta-Learning for Multi-objective Reinforcement Learning." In 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2019. http://dx.doi.org/10.1109/iros40897.2019.8968092.
Wiering, Marco A., Maikel Withagen, and Madalina M. Drugan. "Model-based multi-objective reinforcement learning." In 2014 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL). IEEE, 2014. http://dx.doi.org/10.1109/adprl.2014.7010622.
Liao, H. L., and Q. H. Wu. "Multi-objective optimisation by reinforcement learning." In 2010 IEEE Congress on Evolutionary Computation (CEC). IEEE, 2010. http://dx.doi.org/10.1109/cec.2010.5585972.
Ferreira, Leonardo A., Reinaldo A. C. Bianchi, and Carlos H. C. Ribeiro. "Multi-agent Multi-objective Learning Using Heuristically Accelerated Reinforcement Learning." In 2012 Brazilian Robotics Symposium and Latin American Robotics Symposium (SBR-LARS). IEEE, 2012. http://dx.doi.org/10.1109/sbr-lars.2012.10.
Yahyaa, Saba Q., Madalina M. Drugan, and Bernard Manderick. "Annealing-pareto multi-objective multi-armed bandit algorithm." In 2014 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL). IEEE, 2014. http://dx.doi.org/10.1109/adprl.2014.7010619.
Ravichandran, Naresh Balaji, Fangkai Yang, Christopher Peters, Anders Lansner, and Pawel Herman. "Pedestrian simulation as multi-objective reinforcement learning." In IVA '18: International Conference on Intelligent Virtual Agents. New York, NY, USA: ACM, 2018. http://dx.doi.org/10.1145/3267851.3267914.
Van Moffaert, Kristof, Tim Brys, and Ann Nowe. "Risk-sensitivity through multi-objective reinforcement learning." In 2015 IEEE Congress on Evolutionary Computation (CEC). IEEE, 2015. http://dx.doi.org/10.1109/cec.2015.7257098.
Van Moffaert, Kristof, Madalina M. Drugan, and Ann Nowe. "Scalarized multi-objective reinforcement learning: Novel design techniques." In 2013 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL). IEEE, 2013. http://dx.doi.org/10.1109/adprl.2013.6615007.
Liu, Fei-Yu, and Chao Qian. "Prediction Guided Meta-Learning for Multi-Objective Reinforcement Learning." In 2021 IEEE Congress on Evolutionary Computation (CEC). IEEE, 2021. http://dx.doi.org/10.1109/cec45853.2021.9504972.