Dissertations on the topic "Multi-Objective Markov Decision Processes"
Browse the top 27 dissertations for research on the topic "Multi-Objective Markov Decision Processes".
Pratikakis, Nikolaos. "Multistage decisions and risk in Markov decision processes towards effective approximate dynamic programming architectures." Diss., Atlanta, Ga. : Georgia Institute of Technology, 2008. http://hdl.handle.net/1853/31654.
Committee Chair: Jay H. Lee; Committee Member: Martha Grover; Committee Member: Matthew J. Realff; Committee Member: Shabbir Ahmed; Committee Member: Stylianos Kavadias. Part of the SMARTech Electronic Thesis and Dissertation Collection.
Dorff, Rebecca. "Modelling Infertility with Markov Chains." BYU ScholarsArchive, 2013. https://scholarsarchive.byu.edu/etd/4070.
Chen, Yu Fan Ph D. Massachusetts Institute of Technology. "Hierarchical decomposition of multi-agent Markov decision processes with application to health aware planning." Thesis, Massachusetts Institute of Technology, 2014. http://hdl.handle.net/1721.1/93795.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 99-104).
Multi-agent robotic systems have attracted the interests of both researchers and practitioners because they provide more capabilities and afford greater flexibility than single-agent systems. Coordination of individual agents within large teams is often challenging because of the combinatorial nature of such problems. In particular, the number of possible joint configurations is the product of that of every agent. Further, real world applications often contain various sources of uncertainties. This thesis investigates techniques to address the scalability issue of multi-agent planning under uncertainties. This thesis develops a novel hierarchical decomposition approach (HD-MMDP) for solving Multi-agent Markov Decision Processes (MMDPs), which is a natural framework for formulating stochastic sequential decision-making problems. In particular, the HD-MMDP algorithm builds a decomposition structure by exploiting coupling relationships in the reward function. A number of smaller subproblems are formed and are solved individually. The planning spaces of each subproblem are much smaller than that of the original problem, which improves the computational efficiency, and the solutions to the subproblems can be combined to form a solution (policy) to the original problem. The HD-MMDP algorithm is applied on a ten agent persistent search and track (PST) mission and shows more than 35% improvement over an existing algorithm developed specifically for this domain. This thesis also contributes to the development of the software infrastructure that enables hardware experiments involving multiple robots. In particular, the thesis presents a novel optimization based multi-agent path planning algorithm, which was tested in simulation and hardware (quadrotor) experiment. The HD-MMDP algorithm is also used to solve a multi-agent intruder monitoring mission implemented using real robots.
by Yu Fan Chen.
S.M.
Omidshafiei, Shayegan. "Decentralized control of multi-robot systems using partially observable Markov Decision Processes and belief space macro-actions." Thesis, Massachusetts Institute of Technology, 2015. http://hdl.handle.net/1721.1/101447.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 129-139).
Planning, control, perception, and learning for multi-robot systems present significant challenges. Transition dynamics of the robots may be stochastic, making it difficult to select the best action each robot should take at a given time. The observation model, a function of the robots' sensors, may be noisy or partial, meaning that deterministic knowledge of the team's state is often impossible to attain. Robots designed for real-world applications require careful consideration of such sources of uncertainty. This thesis contributes a framework for multi-robot planning in continuous spaces with partial observability. Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs) are general models for multi-robot coordination problems. However, representing and solving Dec-POMDPs is often intractable for large problems. This thesis extends the Dec-POMDP framework to the Decentralized Partially Observable Semi-Markov Decision Process (Dec-POSMDP), taking advantage of high-level representations that are natural for multi-robot problems. Dec-POSMDPs allow asynchronous decision-making, which is crucial in multi-robot domains. This thesis also presents algorithms for solving Dec-POSMDPs, which are more scalable than previous methods due to use of closed-loop macro-actions in planning. The proposed framework's performance is evaluated in a constrained multi-robot package delivery domain, showing its ability to provide high-quality solutions for large problems. Due to the probabilistic nature of state transitions and observations, robots operate in belief space, the space of probability distributions over all of their possible states. This thesis also contributes a hardware platform called Measurable Augmented Reality for Prototyping Cyber-Physical Systems (MAR-CPS). MAR-CPS allows real-time visualization of the belief space in laboratory settings.
by Shayegan Omidshafiei.
S.M.
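The belief space described in the Omidshafiei abstract above can be illustrated with a discrete Bayes filter step. This is a minimal sketch, not the thesis's method: it assumes a finite state set and a single robot, whereas the thesis works in continuous spaces with multiple robots. The function name `belief_update` and the toy numbers are hypothetical.

```python
def belief_update(belief, transition, observation_likelihood):
    """One step of a discrete Bayes filter (illustrative assumption, finite states).

    belief[s]                   -- current probability of being in state s
    transition[s][s2]           -- P(s2 | s) under the action just taken
    observation_likelihood[s2]  -- P(o | s2) for the observation just received
    """
    n = len(belief)
    # Prediction: push the belief through the stochastic transition model.
    predicted = [sum(belief[s] * transition[s][s2] for s in range(n))
                 for s2 in range(n)]
    # Correction: weight each state by how well it explains the observation.
    unnormalized = [observation_likelihood[s2] * predicted[s2] for s2 in range(n)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under the predicted belief")
    return [p / total for p in unnormalized]


# Toy two-state example with made-up numbers.
belief = [0.5, 0.5]
transition = [[0.9, 0.1], [0.2, 0.8]]
likelihood = [0.7, 0.1]
new_belief = belief_update(belief, transition, likelihood)
```

The returned list is again a probability distribution over states, which is the sense in which a planner "operates in belief space" rather than on a known state.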
Dorini, Gianluca. "The neighbour search approach for solving multi-objective Markov Decision Processes, and the application in reservoirs operation planning." Thesis, University of Exeter, 2007. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.445450.
Fowler, Michael C. "Intelligent Knowledge Distribution for Multi-Agent Communication, Planning, and Learning." Diss., Virginia Tech, 2020. http://hdl.handle.net/10919/97996.
Doctor of Philosophy
This dissertation addresses a fundamental question that arises when multiple autonomous systems, like drone swarms, in the field need to coordinate and share data: what information should be sent to whom and when, with the limited resources available to each agent? Intelligent Knowledge Distribution is a framework that answers these questions. Communication requirements for multi-agent systems can be rather high when an accurate picture of the environment and the state of other agents must be maintained. To reduce the impact of multi-agent coordination on networked systems, e.g., power and bandwidth, this dissertation introduces new concepts to enable Intelligent Knowledge Distribution (IKD), including Constrained-action POMDPs and concurrent decentralized (CoDec) POMDPs for an agnostic plug-and-play capability for fully autonomous systems. The IKD model was able to demonstrate its validity as a "plug-and-play" library that manages communications between agents and ensures the right information is being transmitted at the right time to the right agent to ensure mission success.
Leung, Hiu-lan, and 梁曉蘭. "Wandering ideal point models for single or multi-attribute ranking data: a Bayesian approach." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2003. http://hub.hku.hk/bib/B29552357.
Murugesan, Sugumar. "Opportunistic Scheduling Using Channel Memory in Markov-modeled Wireless Networks." The Ohio State University, 2010. http://rave.ohiolink.edu/etdc/view?acc_num=osu1282065836.
Raffensperger, Peter Abraham. "Measuring and Influencing Sequential Joint Agent Behaviours." Thesis, University of Canterbury. Electrical and Computer Engineering, 2013. http://hdl.handle.net/10092/7472.
Lafleur, Jarret Marshall. "A Markovian state-space framework for integrating flexibility into space system design decisions." Diss., Georgia Institute of Technology, 2011. http://hdl.handle.net/1853/43749.
Riauke, Jelena. "SPEA2-based safety system multi-objective optimization." Thesis, Loughborough University, 2009. https://dspace.lboro.ac.uk/2134/5514.
Reynolds, Toby J. "Bayesian modelling of integrated data and its application to seabird populations." Thesis, University of St Andrews, 2010. http://hdl.handle.net/10023/1635.
Astaraky, Davood. "A Simulation Based Approximate Dynamic Programming Approach to Multi-class, Multi-resource Surgical Scheduling." Thèse, Université d'Ottawa / University of Ottawa, 2013. http://hdl.handle.net/10393/23622.
Pinheiro, Paulo Gurgel 1983. "Localização multirrobo cooperativa com planejamento." [s.n.], 2009. http://repositorio.unicamp.br/jspui/handle/REPOSIP/276155.
Dissertation (master's) - Universidade Estadual de Campinas, Instituto de Computação
Resumo: Em um problema de localização multirrobô cooperativa, um grupo de robôs encontra-se em um determinado ambiente, cuja localização exata de cada um dos robôs é desconhecida. Neste cenário, uma distribuição de probabilidades aponta as chances de um robô estar em um determinado estado. É necessário então, que os robôs se movimentem pelo ambiente e gerem novas observações que serão compartilhadas, para calcular novas estimativas. Nos últimos anos, muitos trabalhos têm focado no estudo de técnicas probabilísticas, modelos de comunicação e modelos de detecções, para resolver o problema de localização. No entanto, a movimentação dos robôs é, em geral, definida por ações aleatórias. Ações aleatórias geram observações que podem ser inúteis para a melhoria da estimativa. Este trabalho apresenta uma proposta de localização com suporte a planejamento de ações. O objetivo é apresentar um modelo cujas ações realizadas pelos robôs são definidas por políticas. Escolhendo a melhor ação a ser realizada, é possível receber informações mais úteis dos sensores internos e externos e estimar as posturas mais rapidamente. O modelo proposto, denominado Modelo de Localização Planejada - MLP, utiliza POMDPs para modelar os problemas de localização e algoritmos específicos de geração de políticas. Foi utilizada a localização de Markov como técnica probabilística de localização e implementadas versões de modelos de detecção e propagação de informação. Neste trabalho, um simulador de problemas de localização multirrobô foi desenvolvido, no qual foram realizados experimentos em que o modelo proposto foi comparado a um modelo que não faz uso de planejamento de ações. Os resultados obtidos apontam que o modelo proposto é capaz de estimar as posturas dos robôs com uma menor quantidade de passos, sendo significativamente mais eficiente do que o modelo comparado sem planejamento.
Abstract: In a cooperative multi-robot localization problem, a group of robots is in a certain environment, where the exact location of each robot is unknown. In this scenario, there is only a probability distribution indicating the chance of a robot being in a particular state. The robots must move through the environment and generate new observations, which are shared to calculate new estimates. In recent years, many studies have focused on probabilistic techniques, communication models and detection models to solve the localization problem. However, the movement of robots is generally defined by random actions. Random actions generate observations that can be useless for improving the estimate. This work describes a proposal for multi-robot localization with support for action planning. The objective is to describe a model in which the actions performed by robots are defined by policies. By choosing the best action to perform, the robot gets more useful information from internal and external sensors and estimates its posture more quickly. The proposed model, called Model of Planned Localization - MPL, uses POMDPs to model the localization problems and specific algorithms to generate policies. Markov localization was used as the probabilistic localization technique, and versions of the detection and information propagation models were implemented. In this work, a simulator for multi-robot localization problems was developed, in which experiments were performed. The proposed model was compared to a model that does not make use of action planning. The results showed that the proposed model is able to estimate the positions of robots in fewer steps, being more efficient than the compared model.
Master's degree
Artificial Intelligence
Master of Computer Science
Ozcan-Deniz, Gulbin. "An Integrated Multi-Agent Framework for Optimizing Time, Cost and Environmental Impact of Construction Processes." FIU Digital Commons, 2011. http://digitalcommons.fiu.edu/etd/455.
Paniah, Crédo. "Approche multi-agents pour la gestion des fermes éoliennes offshore." Thesis, Paris 11, 2015. http://www.theses.fr/2015PA112067/document.
Renewable Energy Sources (RES) have grown remarkably in the last few decades. Compared to conventional energy sources, renewable generation is more available, sustainable and environment-friendly - for example, there is no greenhouse gas emission during energy generation. However, while electrical network stability requires production and consumption equality and the electricity market constrains producers to contract future production a priori and respect their supply commitments or pay substantial penalties, RES are mainly uncontrollable and their behavior is difficult to forecast accurately. De facto, they jeopardize the stability of the physical network and renewable producers' competitiveness in the market. The Winpower project aims to design realistic, robust and stable control strategies for offshore networks connecting renewable sources and controllable storage devices owned by different autonomous actors to the main electricity system. Each actor must embed its own local physical device control strategy but a global network management mechanism, jointly decided between connected actors, should be designed as well. We assume a market participation of the actors as a unique entity (the coalition of actors connected by the Winpower network) allowing the coalition to facilitate the network management through resources aggregation, renewable producers to take advantage of controllable sources flexibility to handle market penalties risks, as well as storage devices owners to leverage their resources on the market and/or with the management of renewable imbalances. This work tackles the market participation of the coalition as a Cooperative Virtual Power Plant.
For this purpose, we describe a multi-agent architecture through the definition of intelligent agents managing and operating actors' resources and the description of these agents' interactions; it allows the alliance of local constraints and objectives and the global network management objective. We formalize the aggregation and planning of resources utilization as a Markov Decision Process (MDP), a formal model suited for sequential decision making in uncertain environments. Its aim is to define the sequence of actions which maximize expected actual incomes of the market participation, while decisions over controllable resources have uncertain outcomes. However, market participation decision is prior to the actual operation when renewable generation still is uncertain. Thus, the Markov Decision Process is intractable as its state in each decision time-slot is not fully observable. To solve such a Partially Observable MDP (POMDP), we decompose it into a classical MDP and an information state (a probability distribution over renewable generation errors). The Information State MDP (IS-MDP) obtained is solved with an adaptation of the Backwards Induction, a classical MDP resolution algorithm. Then, we describe a common simulation framework to compare our proposed methodology to some other strategies, including the state of the art in renewable generation market participation. Simulations results validate the resources aggregation strategy and confirm that cooperation is beneficial to renewable producers and storage devices owners when they participate in electricity market. The proposed architecture is designed to allow the distribution of the decision making between the coalition's actors, through the implementation of a suitable coordination mechanism. We propose some distribution methodologies to this end.
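The Paniah abstract above names backward induction as the classical algorithm it adapts. The following is a minimal sketch of plain finite-horizon backward induction on a fully observable MDP, not the thesis's IS-MDP adaptation. The function name, the array layout `P[a][s][s2]` and `R[a][s]`, and the two-state toy problem are all assumptions made for illustration.

```python
def backward_induction(n_states, n_actions, horizon, P, R):
    """Finite-horizon backward induction (illustrative layout, not from the thesis).

    P[a][s][s2] -- probability of moving from s to s2 under action a
    R[a][s]     -- expected immediate reward for taking action a in state s
    Returns (values, policy) with values[t][s] and policy[t][s].
    """
    # Terminal values are zero; fill earlier stages from the end backwards.
    values = [[0.0] * n_states for _ in range(horizon + 1)]
    policy = [[0] * n_states for _ in range(horizon)]
    for t in range(horizon - 1, -1, -1):
        for s in range(n_states):
            best_a, best_q = 0, float("-inf")
            for a in range(n_actions):
                # Immediate reward plus expected value of the next stage.
                q = R[a][s] + sum(P[a][s][s2] * values[t + 1][s2]
                                  for s2 in range(n_states))
                if q > best_q:
                    best_a, best_q = a, q
            values[t][s] = best_q
            policy[t][s] = best_a
    return values, policy


# Toy problem: action 0 stays put, action 1 switches state.
# Staying in state 1 pays 1; everything else pays 0.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch
]
R = [
    [0.0, 1.0],  # action 0
    [0.0, 0.0],  # action 1
]
values, policy = backward_induction(2, 2, 3, P, R)
```

With three decision stages, the first-stage policy switches out of state 0 and stays in state 1, so `values[0]` is `[2.0, 3.0]`.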
El, Helou Melhem. "Radio Access Technology Selection in Heterogeneous Wireless Networks." Thesis, Rennes 1, 2014. http://www.theses.fr/2014REN1S086/document.
To cope with the rapid growth of mobile broadband traffic, various radio access technologies (e.g., HSPA, LTE, WiFi, and WiMAX) are being integrated and jointly managed. Radio Access Technology (RAT) selection, which decides which RAT mobiles should connect to, is a key functionality to improve network performance and user experience. When intelligence is pushed to the network edge, mobiles make autonomous decisions regarding selection of their most appropriate RAT. They aim to selfishly maximize their utility. However, because mobiles have no information on network load conditions, their decisions may lead to performance inefficiency. Moreover, delegating decisions to the network optimizes overall performance, but at the cost of increased network complexity, signaling, and processing load. In this thesis, instead of favoring either of these decision-making approaches, we propose a hybrid decision framework: the network provides information for the mobiles to make robust RAT selections. More precisely, mobile users select their RAT depending on their individual needs and preferences, as well as on the monetary cost and QoS parameters signaled by the network. By appropriately tuning network information, user decisions are globally expected to meet operator objectives, avoiding undesirable network states. We first introduce our hybrid decision framework. Decision-making on both the network and user sides is investigated. To maximize user experience, we present a satisfaction-based Multi-Criteria Decision-Making (MCDM) method. In addition to their radio conditions, mobile users consider the cost and QoS parameters, signaled by the network, to evaluate serving RATs. In comparison with existing MCDM solutions, our algorithm meets user needs (e.g., traffic class, throughput demand, cost tolerance), avoiding inadequate decisions.
Particular attention is then paid to the network to make sure it broadcasts suitable decisional information, so as to better exploit its radio resources while mobiles maximize their own utility. We present two heuristic methods to dynamically derive what to signal to mobiles. While QoS parameters are modulated as a function of the load conditions, radio resources are shown to be efficiently exploited. Moreover, we focus on optimizing network information. Deriving QoS parameters is formulated as a semi-Markov decision process, and optimal policies are computed using the Policy Iteration algorithm. Also, and since network parameters may not be easily obtained, a reinforcement learning approach is introduced to derive what to signal to mobiles. The performances of optimal, learning-based, and heuristic policies are analyzed. When thresholds are pertinently set, our heuristic method provides performance very close to the optimal solution. Moreover, although lower performances are observed, our learning-based algorithm has the crucial advantage of requiring no prior parameterization.
Teng, Sin Yong. "Intelligent Energy-Savings and Process Improvement Strategies in Energy-Intensive Industries." Doctoral thesis, Vysoké učení technické v Brně. Fakulta strojního inženýrství, 2020. http://www.nusl.cz/ntk/nusl-433427.
Marre, Jean-Baptiste. "L'évaluation économique des services écosystémiques marins et côtiers et son utilisation dans la prise de décision : cas d'étude en Nouvelle-Calédonie et en Australie." Thesis, Brest, 2014. http://www.theses.fr/2014BRES0087/document.
Coastal and marine ecosystems are some of the most heavily exploited with increasing degradation. This alarming situation appeals for urgent and effective actions. The optimal balance between use and conservation of ecosystems theoretically requires all costs and benefits to be considered in decision-making, including intangible costs and benefits such as non-market use and non-use values. The broad aim of this PhD is to examine how these economic values associated with coastal and marine ecosystem services can be measured, and how the economic valuation exercise may be considered and influence management decision-making. The first analytical part of the thesis focuses on assessing non-market use and non-use values, through econometric methods. The characterization and estimation of non-use values are complex and controversial, especially when the valuation exercise is focusing on individuals who are users of the ecosystem services being considered. An original approach based on a stated preference method, namely choice experiments, is developed then empirically applied in quantifying non-market values for marine and coastal ecosystems in two areas in New Caledonia. It allows the estimation of non-use values for populations of users in an implicit way. An in-depth analysis of the individuals' choice heuristics during the valuation exercise is also conducted, with a focus on payment non-attendance.
This issue is dealt with by comparing multiple modelling approaches in terms of: (1) inferred attendance, in relation to stated attendance; (2) attendance distribution according to several socio-economic variables; and (3) welfare estimates. After noting that the potential influence of economic valuation in decision making is unclear and largely unexplored in the literature, the second major component of this PhD aims to examine if, how and to what extent the economic valuation of ecosystem services, including measures of non-market values, influence decision-making regarding coastal and marine ecosystems management in Australia. Based on two nation-wide surveys, the perceived usefulness of the economic valuation of ecosystem services by the general public and decision-makers is studied, and the reasons why decision-makers may or may not fully consider economic values are elicited. Using a multi-criteria analysis, a part of the surveys also aims at examining the relative importance of different evaluation criteria (ecological, social and economic) when assessing the consequences of a hypothetical coastal development project on commercial activities, recreational activities and marine biodiversity.
Saha, Subhamay. "Single and Multi-player Stochastic Dynamic Optimization." Thesis, 2013. http://etd.iisc.ernet.in/2005/3357.
Wang, Jue. "Multi-state Bayesian Process Control." Thesis, 2013. http://hdl.handle.net/1807/43750.
"TaxiWorld: Developing and Evaluating Solution Methods for Multi-Agent Planning Domains." Master's thesis, 2011. http://hdl.handle.net/2286/R.I.9358.
Dissertation/Thesis
M.S. Computer Science 2011
Royden-Turner, Stuart Jack. "Asset allocation in wealth management using stochastic models." Diss., 2016. http://hdl.handle.net/10500/22129.
Operations Management
M. Sc. (Operations Research)
Abed-Alguni, Bilal Hashem Kalil. "Cooperative reinforcement learning for independent learners." Thesis, 2014. http://hdl.handle.net/1959.13/1052917.
Machine learning in multi-agent domains poses several research challenges. One challenge is how to model cooperation between reinforcement learners. Cooperation between independent reinforcement learners is known to accelerate convergence to optimal solutions. In large state space problems, independent reinforcement learners normally cooperate to accelerate the learning process using decomposition techniques or knowledge sharing strategies. This thesis presents two techniques for multi-agent reinforcement learning and a comparison study. The first technique is a formal decomposition model and an algorithm for distributed systems. The second technique is a cooperative Q-learning algorithm for multi-goal decomposable systems. The comparison study compares the performance of some of the best known cooperative Q-learning algorithms for independent learners. Distributed systems are normally organised into two levels: system and subsystem levels. This thesis presents a formal solution for decomposition of Markov Decision Processes (MDPs) in distributed systems that takes advantage of the organisation of distributed systems and provides support for migration of learners. This is accomplished by two proposals: a Distributed, Hierarchical Learning Model (DHLM) and an Intelligent Distributed Q-Learning algorithm (IDQL) that are based on three specialisations of agents: workers, tutors and consultants. Worker agents are the actual learners and performers of tasks, while tutor agents and consultant agents are coordinators at the subsystem level and the system level, respectively. A main duty of consultant and tutor agents is the assignment of problem space to worker agents. The experimental results in a distributed hunter prey problem suggest that IDQL converges to a solution faster than the single agent Q-learning approach. An important feature of DHLM is that it provides a solution for migration of agents.
This feature provides support for the IDQL algorithm where the problem space of each worker agent can change dynamically. Other hierarchical RL models do not cover this issue. Problems that have multiple goal-states can be decomposed into sub-problems by taking advantage of the loosely-coupled bonds among the goal states. In such problems, each goal state and its problem space form a sub-problem. This thesis introduces Q-learning with Aggregation algorithm (QA-learning), an algorithm for problems with multiple goal-states that is based on two roles: learner and tutor. A learner is an agent that learns and uses the knowledge of its neighbours (tutors) to construct its Q-table. A tutor is a learner that is ready to share its Q-table with its neighbours (learners). These roles are based on the concept of learners reusing tutors' sub-solutions. This algorithm provides solutions to problems with multiple goal-states. In this algorithm, each learner incorporates its tutors' knowledge into its own Q-table calculations. A comprehensive solution can then be obtained by combining these partial solutions. The experimental results in an instance of the shortest path problem suggest that the output of QA-learning is comparable to the output of a single Q-learner whose problem space is the whole system. But the QA-learning algorithm converges to a solution faster than a single learner approach. Cooperative Q-learning algorithms for independent learners accelerate the learning process of individual learners. In this type of Q-learning, independent learners share and update their Q-values by following a sharing strategy after some episodes learning independently. This thesis presents a comparison study of the performance of some famous cooperative Q-learning algorithms (BEST-Q, AVE-Q, PSO-Q, and WSS) as well as an algorithm that aggregates their results. These algorithms are compared in two cases: equal experience and different experiences cases. 
In the first case, the learners have equal learning time, while in the second case, the learners have different learning times. The comparison study also examines the effects of the frequency of Q-value sharing on the learning speed of independent learners. The experimental results in the equal experience case indicate that sharing of Q-values is not beneficial and produces similar results to single agent Q-learning. By contrast, the experimental results in the different experiences case suggest that each of the cooperative Q-learning algorithms performs similarly, but better than single agent Q-learning. In both cases, high-frequency sharing of Q-values accelerates the convergence to optimal solutions compared to low-frequency sharing. Low-frequency Q-value sharing degrades the performance of the cooperative Q-learning algorithms in the equal experience and different experiences cases.
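The Abed-Alguni abstract above describes independent learners that periodically share and update their Q-values. The sketch below shows one simple sharing strategy, element-wise averaging of Q-tables, next to the standard tabular Q-learning update. It is an assumption that averaging resembles the AVE-Q strategy named in the abstract; the abstract does not define AVE-Q, and the function names here are hypothetical.

```python
def q_update(q, s, a, reward, s_next, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update, applied in place to table q."""
    target = reward + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])


def average_q_tables(q_tables):
    """Element-wise mean of several Q-tables with identical shape.

    Hypothetical sharing step: every learner would replace its table
    with this average after a batch of independent episodes.
    """
    n_learners = len(q_tables)
    n_states = len(q_tables[0])
    n_actions = len(q_tables[0][0])
    return [[sum(q[s][a] for q in q_tables) / n_learners
             for a in range(n_actions)]
            for s in range(n_states)]


# Two learners with made-up Q-values over 2 states and 2 actions.
q1 = [[1.0, 0.0], [0.0, 2.0]]
q2 = [[3.0, 0.0], [0.0, 4.0]]
shared = average_q_tables([q1, q2])

# One independent learning step before sharing, with made-up experience.
q = [[0.0, 0.0], [0.0, 1.0]]
q_update(q, s=0, a=1, reward=1.0, s_next=1, alpha=0.5, gamma=0.5)
```

How often the sharing step runs corresponds to the sharing frequency whose effect on learning speed the comparison study examines.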
"A MULTI-FUNCTIONAL PROVENANCE ARCHITECTURE: CHALLENGES AND SOLUTIONS." Thesis, 2013. http://hdl.handle.net/10388/ETD-2013-12-1419.
Karim, Mohammad Shahedul. "Instantly Decodable Network Coding: From Point to Multi-Point to Device-to-Device Communications." Phd thesis, 2016. http://hdl.handle.net/1885/118239.
Kumar, Sandip. "Generalized Sampling-Based Feedback Motion Planners." Thesis, 2011. http://hdl.handle.net/1969.1/ETD-TAMU-2011-12-10663.