Dissertations / Theses on the topic 'POMDP'
Consult the top 50 dissertations / theses for your research on the topic 'POMDP.'
Folsom-Kovarik, Jeremiah. "Leveraging Help Requests in POMDP Intelligent Tutors." Doctoral diss., University of Central Florida, 2012. http://digital.library.ucf.edu/cdm/ref/collection/ETD/id/5210.
Ph.D.
Doctorate
Computer Science
Engineering and Computer Science
Computer Science
Kaplow, Robert. "Point-based POMDP solvers: survey and comparative analysis." Thesis, McGill University, 2010. http://digitool.Library.McGill.CA:8881/R/?func=dbin-jump-full&object_id=92275.
Png, ShaoWei. "Bayesian reinforcement learning for POMDP-based dialogue systems." Thesis, McGill University, 2011. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=104830.
Dialogue systems have become increasingly popular as speech recognition technology has improved. These dialogue systems can be modelled effectively using partially observable Markov decision processes (POMDPs). However, previous research generally assumes that the model parameters are known. Model-based Bayesian reinforcement learning, which offers a rich framework for simultaneous learning and planning, can remove the need for this assumption. Because of the high complexity of the framework, developing these algorithms for complex dialogue systems is a major challenge. In this document, we show that by exploiting certain known properties of the system, such as symmetries, and by using an approximate online planning algorithm, we are able to apply Bayesian reinforcement learning techniques to several realistic dialogue domains. We consider several experimental domains. The first comprises synthetic data used to illustrate several properties of our approach. The second is a dialogue manager based on the SACTI1 corpus, which contains 144 dialogues between 36 users and 12 experts. The third dialogue manager helps patients with dementia in their daily lives. Finally, we consider a large dialogue manager that assists patients in manoeuvring an automated wheelchair.
Chinaei, Hamid Reza. "Learning Dialogue POMDP Model Components from Expert Dialogues." Thesis, Université Laval, 2013. http://www.theses.ulaval.ca/2013/29690/29690.pdf.
Spoken dialogue systems should realize the user intentions and maintain a natural and efficient dialogue with users. This is however a difficult task as spoken language is naturally ambiguous and uncertain, and further the automatic speech recognition (ASR) output is noisy. In addition, the human user may change his intention during the interaction with the machine. To tackle this difficult task, the partially observable Markov decision process (POMDP) framework has been applied in dialogue systems as a formal framework to represent uncertainty explicitly while supporting automated policy solving. In this context, estimating the dialogue POMDP model components is a significant challenge as they have a direct impact on the optimized dialogue POMDP policy. This thesis proposes methods for learning dialogue POMDP model components using noisy and unannotated dialogues. Specifically, we introduce techniques to learn the set of possible user intentions from dialogues, use them as the dialogue POMDP states, and learn a maximum likelihood POMDP transition model from data. Since it is crucial to reduce the observation state size, we then propose two observation models: the keyword model and the intention model. Using these two models, the number of observations is reduced significantly while the POMDP performance remains high, particularly in the intention POMDP. In addition to these model components, POMDPs also require a reward function. So, we propose new algorithms for learning the POMDP reward model from dialogues based on inverse reinforcement learning (IRL). In particular, we propose the POMDP-IRL-BT algorithm (BT for belief transition) that works on the belief states available in the dialogues. This algorithm learns the reward model by estimating a belief transition model, similar to MDP (Markov decision process) transition models.
Ultimately, we apply the proposed methods on a healthcare domain and learn a dialogue POMDP essentially from real unannotated and noisy dialogues.
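The maximum likelihood transition model mentioned in the abstract above can be illustrated with a short count-and-normalise sketch. It assumes that every dialogue turn has already been labelled with a user-intention state, whereas the thesis learns those intentions from unannotated dialogues; the function name `ml_transition_model`, the triple layout and the optional `smoothing` parameter are illustrative assumptions and are not taken from the thesis.

```python
from collections import defaultdict


def ml_transition_model(trajectories, smoothing=0.0):
    """Estimate P(next_state | state, action) from counted transitions.

    trajectories: iterable of dialogues, each a list of
        (state, action, next_state) triples (hypothetical data layout).
    smoothing: optional additive (Laplace) constant; 0.0 gives the plain
        maximum likelihood estimate.
    """
    counts = defaultdict(lambda: defaultdict(float))
    states = set()
    for trajectory in trajectories:
        for state, action, next_state in trajectory:
            counts[(state, action)][next_state] += 1.0
            states.update((state, next_state))

    model = {}
    for key, next_counts in counts.items():
        # Normalise the counts of each (state, action) pair into a distribution.
        total = sum(next_counts.values()) + smoothing * len(states)
        model[key] = {
            s: (next_counts.get(s, 0.0) + smoothing) / total for s in states
        }
    return model
```

Only (state, action) pairs that occur in the data receive a distribution; with `smoothing=0.0`, unseen successor states get probability zero, which is one reason a small positive smoothing value is often preferred on sparse dialogue corpora.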
Li, Xin. "POMDP compression and decomposition via belief state analysis." HKBU Institutional Repository, 2009. http://repository.hkbu.edu.hk/etd_ra/1012.
Zheltova, Ludmila. "STRUCTURED MAINTENANCE POLICIES ON INTERIOR SAMPLE PATHS." Case Western Reserve University School of Graduate Studies / OhioLINK, 2010. http://rave.ohiolink.edu/etdc/view?acc_num=case1264627939.
Memarzadeh, Milad. "System-Level Adaptive Monitoring and Control of Infrastructures: A POMDP-Based Framework." Research Showcase @ CMU, 2015. http://repository.cmu.edu/dissertations/664.
Pinheiro, Paulo Gurgel 1983. "Planning for mobile robot localization using architectural design features on a hierarchical POMDP approach = Planejamento para localização de robôs móveis utilizando padrões arquitetônicos em um modelo hierárquico de POMDP." [s.n.], 2013. http://repositorio.unicamp.br/jspui/handle/REPOSIP/275601.
Thesis (doctorate) - Universidade Estadual de Campinas, Instituto de Computação
Summary: Mobile robot localization is one of the most explored areas of robotics because of its importance for solving problems such as navigation, mapping and SLAM. Many works have presented solutions involving cooperation, communication and exploration of the environment, where localization is generally obtained through random actions or actions driven purely by the belief state. This thesis presents a planning model for localization using POMDPs and Markov Localization that indicates the best action for the robot to take at each moment, with the aim of reducing the number of steps. The focus is mainly on: i) hard localization problems, where there is no landmark or extra information in the environment to help the robot; ii) performance-critical situations, where the robot must avoid random steps and wasted energy; and iii) situations with multiple missions. Since a robot is designed to carry out missions, this work proposes a model in which these missions are considered in parallel with localization. Planning for scenarios with multiple environments is a challenge because of the large number of states that must be handled. For this type of problem, a map compression model is presented that uses architectural and design features, such as the number of doors, the number of walls or the total area of an environment, to condense information that may be redundant. The model relies on the similarity of design features to group similar environments and combine them, generating a single representative map with fewer states than the total of all the states of the environments in the group. POMDP plans are generated only for the representatives and not for the whole map. Finally, the hierarchical model is presented, in which localization is executed in two layers.
In the upper layer, the robot uses the POMDP plans and the compact maps to obtain a coarse estimate of its location and, in the lower layer, uses POMDP or Markov Localization to obtain a more precise pose. The hierarchical model was demonstrated in experiments using the V-REP simulator and the Pioneer 3-DX robot. Comparative results showed that the robot using the proposed model was able to carry out the localization process in scenarios with multiple environments and complete the mission, maintaining accuracy with a significant reduction in the number of steps taken.
Abstract: Mobile robot localization is one of the most explored areas in robotics due to its importance for solving problems such as navigation, mapping and SLAM. In this work, we are interested in solving global localization problems, where the initial pose of the robot is completely unknown. Several works have proposed solutions for localization focusing on robot cooperation, communication or environment exploration, where the robot's pose is often found through a number of random actions or actions driven by the belief state. In order to decrease the total number of steps performed, we introduce a planning model for localization using POMDPs and Markov Localization that indicates the optimal action to be taken by the robot at each decision time. Our focus is on i) hard localization problems, where there are no special landmarks or extra features in the environment to help the robot, ii) performance-critical situations, where the robot is required to avoid random actions and the waste of energy from roaming over the environment, and iii) multiple-mission situations. Since the robot is designed to perform missions, we propose a model that runs the missions and the localization process simultaneously. Also, since the robot can have different missions, the model computes the planning for localization as an offline process but loads the missions at runtime. Planning for multiple environments is a challenge due to the number of states we must consider. Thus, we also propose a solution to compress the original map, creating a smaller topological representation for which plans are easier and cheaper to compute. The map compression takes advantage of the similarity of rooms found especially in offices and residential environments. Similar rooms have similar architectural design features that can be shared.
To deal with the compressed map, we propose a hierarchical approach that uses light POMDP plans and the compressed map on the higher layer to find the coarse pose and, on the lower layer, decomposed maps to find the precise pose. We have demonstrated the hierarchical approach with the map compression using both the V-REP simulator and a Pioneer 3-DX robot. Compared with other active localization models, the results show that our approach allowed the robot to perform both localization and the mission in a multiple-room environment with a significant reduction in the number of steps while keeping the pose accuracy.
Doctorate
Computer Science
Doctor of Computer Science
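Both POMDP planning and Markov Localization, as referred to in the entry above, rest on the same discrete Bayes filter: predict the next-state distribution from the action, then weight it by the likelihood of the observation. The sketch below shows that standard belief update only; it is not the thesis's hierarchical planner, and the dictionary-based layout of `transition` and `observation_model` is an assumption made for the example.

```python
def belief_update(belief, action, observation, transition, observation_model):
    """Standard discrete POMDP belief update (Bayes filter).

    belief: dict mapping state -> probability.
    transition: dict mapping (state, action) -> {next_state: probability}.
    observation_model: dict mapping (next_state, action) ->
        {observation: probability}.
    All three layouts are hypothetical choices for this illustration.
    """
    states = list(belief)

    # Prediction step: push the current belief through the transition model.
    predicted = {
        s2: sum(transition[(s, action)].get(s2, 0.0) * belief[s] for s in states)
        for s2 in states
    }

    # Correction step: weight each predicted state by the observation likelihood.
    unnormalised = {
        s2: observation_model[(s2, action)].get(observation, 0.0) * predicted[s2]
        for s2 in states
    }

    norm = sum(unnormalised.values())
    if norm == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return {s2: p / norm for s2, p in unnormalised.items()}
```

For a robot that is equally likely to be in either of two rooms and senses a door that is seen with probability 0.8 in the first room and 0.2 in the second, one update moves the belief in the first room from 0.5 to 0.8.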
Saldaña, Gadea Santiago Jesús. "The effectiveness of social plan sharing in online planning in POMDP-type domains." Winston-Salem, NC : Wake Forest University, 2009. http://dspace.zsr.wfu.edu/jspui/handle/10339/44699.
Title from electronic thesis title page. Thesis advisor: William H. Turkett Jr. Vita. Includes bibliographical references (p. 47-48).
BRAVO, RAISSA ZURLI BITTENCOURT. "THE USE OF UAVS IN HUMANITARIAN RELIEF: A POMDP BASED METHODOLOGY FOR FINDING VICTIMS." PONTIFÍCIA UNIVERSIDADE CATÓLICA DO RIO DE JANEIRO, 2016. http://www.maxwell.vrac.puc-rio.br/Busca_etds.php?strSecao=resultado&nrSeq=30364@1.
COORDENAÇÃO DE APERFEIÇOAMENTO DO PESSOAL DE ENSINO SUPERIOR
PROGRAMA DE SUPORTE À PÓS-GRADUAÇÃO DE INSTS. DE ENSINO
The use of Unmanned Aerial Vehicles (UAVs) in humanitarian relief has been proposed by researchers to locate victims in disaster-affected areas. The urgency of this kind of operation is to find affected people as quickly as possible, which means that determining the optimal routing for the UAVs is very important for saving lives. Since the UAVs have to cover the whole affected area to find victims, the routing operation becomes equivalent to a coverage problem. In this work, a methodology for solving the coverage problem is proposed, based on the Partially Observable Markov Decision Process (POMDP) heuristic, in which the observations made by the UAVs are taken into account. This heuristic chooses actions based on the available information, namely the previous actions and observations. The formulation of the UAV routing is based on the idea of giving higher priorities to the areas most likely to contain victims. To apply this technique to real cases, a methodology consisting of four steps was created. First, the problem is modelled in terms of the affected area, the type of drone to be used, the camera resolution, the average flight height, the starting or take-off point, and the size and priority of the states. Next, in order to test the efficiency of the algorithm through simulations, groups of victims are distributed over the area to be overflown. Then the algorithm is started and the drone, at each iteration, changes state according to the POMDP heuristic until it has covered the whole affected area. Finally, the efficiency of the algorithm is tested using four statistics: distance travelled, operation time, coverage percentage and time to find groups of victims.
This methodology was applied to two illustrative examples: a tornado in Xanxerê, Brazil, a sudden-onset disaster in April 2015, and a refugee camp in South Sudan, a slow-onset disaster that began in 2013. After running simulations, it was shown that the solution covers the whole disaster-affected area in a reasonable period of time. The distance travelled by the UAV and the duration of the operation, which depend on the number of states, did not have a significant standard deviation across simulations, which means that, even though several paths are possible because of tied priorities, the algorithm gives homogeneous results. The time to find groups of victims, and therefore the success of the rescue operation, depends on the definition of the state priorities, set by a specialist. If the priorities are poorly defined, the UAV will begin by flying over areas without victims, which will lead to the failure of the rescue operation, since the algorithm will not be saving lives as quickly as possible. The proposed algorithm was also compared with the greedy method. Initially, that method did not cover 100 percent of the affected area, which made the comparison unfair. To get around this problem, the greedy algorithm was forced to cover 100 percent of the affected area, and the results show that the POMDP performs better with respect to the time to save victims. With respect to distance travelled and operation time, the results are equal or better for the POMDP. This is because the greedy algorithm is biased towards optimizing distance travelled and hence optimizes operation time. The POMDP, in this dissertation, aims to save lives and does so dynamically, updating its probability distribution with each observation made.
The novelty of this methodology is highlighted in Chapter 3, where more than 139 works were read and classified in order to show what the applications of drones in humanitarian logistics are, how the POMDP is used with drones and how simulation is used in humanitarian logistics. Only one article proposes the u
The use of Unmanned Aerial Vehicles (UAVs) in humanitarian relief has been proposed by researchers for searching for victims in disaster-affected areas. The urgency of this type of operation is to find the affected people as soon as possible, which means that determining the optimal flight path for UAVs is very important for saving lives. Since the UAVs have to search through the entire affected area to find victims, the path planning operation becomes equivalent to an area coverage problem. In this study, a methodology to solve the coverage problem is proposed, based on a Partially Observable Markov Decision Process (POMDP) heuristic, which considers the observations made from UAVs. The formulation of the UAV path planning is based on the idea of assigning higher priorities to the areas which are more likely to contain victims. The methodology was applied in two illustrative examples: a tornado in Xanxerê, Brazil, which was a rapid-onset disaster in April 2015, and a refugee camp in South Sudan, a slow-onset disaster that started in 2013. After simulations, it is demonstrated that this solution achieves full coverage of disaster-affected areas in a reasonable time span. The traveled distance and the operation's duration, which depend on the number of states, did not have a significant standard deviation between the simulations. This means that even if there were many possible paths, due to tied priorities, the algorithm has homogeneous results. The time to find groups of victims, and so the success of the search and rescue operation, depends on the specialist's definition of the state priorities. A comparison with a greedy algorithm showed that the POMDP is faster at finding victims, while the greedy algorithm's performance focuses on minimizing the traveled distance. Future research points to a practical application of the proposed methodology.
Corona, Gabriel. "Utilisation de croyances heuristiques pour la planification multi-agent dans le cadre des Dec-POMDP." Phd thesis, Université Henri Poincaré - Nancy I, 2011. http://tel.archives-ouvertes.fr/tel-00598689.
Corona, Gabriel. "Utilisation de croyances heuristiques pour la planification multi-agent dans le cadre des Dec-POMDP." Electronic Thesis or Diss., Nancy 1, 2011. http://www.theses.fr/2011NAN10026.
In this thesis, we focus on planning for decentralised sequential decision making under uncertainty. In the centralised case, the MDP and POMDP frameworks lead to efficient planning algorithms. The Dec-POMDP framework is used to model decentralised problems. This kind of problem is in a higher complexity class than the centralised problem. For this reason, until recently, only very small problems could be solved, and only for very small horizons. Recently, some heuristic algorithms have been proposed to handle larger problems, but there is no theoretical proof of the solution quality. In this thesis, we show how to use heuristic information about the problem, modelled as a probability distribution over the centralised beliefs, to guide the search for a good approximate policy. Using this heuristic information, we formulate each time step of the planning procedure as a combinatorial optimisation problem. This formulation leads to policies of better quality than previously existing approaches.
Morere, Philippe. "Bayesian Optimisation for Planning And Reinforcement Learning." Thesis, The University of Sydney, 2019. https://hdl.handle.net/2123/21230.
Marchant, Matus Roman. "Bayesian Optimisation for Planning in Dynamic Environments." Thesis, The University of Sydney, 2015. http://hdl.handle.net/2123/14497.
Aberdeen, Douglas Alexander. "Policy-Gradient Algorithms for Partially Observable Markov Decision Processes." The Australian National University. Research School of Information Sciences and Engineering, 2003. http://thesis.anu.edu.au./public/adt-ANU20030410.111006.
Ferrari, Fabio Valerio. "Cooperative POMDPs for human-Robot joint activities." Thesis, Normandie, 2017. http://www.theses.fr/2017NORMC257/document.
This thesis presents a novel method for ensuring cooperation between humans and robots in public spaces, under the constraint of human behavior uncertainty. The thesis introduces a hierarchical and flexible framework based on POMDPs. The framework partitions the overall joint activity into independent planning modules, each dealing with a specific aspect of the joint activity: either ensuring the human-robot cooperation, or proceeding with the task to achieve. The cooperation part can be solved independently from the task and executed as a finite state machine in order to contain online planning effort. In order to do so, we introduce a belief shift function and describe how to use it to transform a POMDP policy into an executable finite state machine. The developed framework has been implemented in a real application scenario as part of the COACHES project. The thesis describes the Escort mission used as a testbed application and the details of the implementation on the real robots. This scenario has also been used to carry out several experiments and to evaluate our contributions.
Pinault, Florian. "Apprentissage par renforcement pour la généralisation des approches automatiques dans la conception des systèmes de dialogue oral." Phd thesis, Université d'Avignon, 2011. http://tel.archives-ouvertes.fr/tel-00933937.
Habachi, Oussama. "Optimisation des Systèmes Partiellement Observables dans les Réseaux Sans-fil : Théorie des jeux, Auto-adaptation et Apprentissage." Phd thesis, Université d'Avignon, 2012. http://tel.archives-ouvertes.fr/tel-00799903.
Vanegas, Alvarez Fernando. "Uncertainty based online planning for UAV missions in GPS-denied and cluttered environments." Thesis, Queensland University of Technology, 2017. https://eprints.qut.edu.au/103846/1/Fernando_Vanegas%20Alvarez_Thesis.pdf.
Raiss, El Fenni Mohammed. "Opportunistic spectrum usage and optimal control in heterogeneous wireless networks." Phd thesis, Université d'Avignon, 2012. http://tel.archives-ouvertes.fr/tel-00907120.
Zhang, Zhao. "Learning Path Recommendation : A Sequential Decision Process." Electronic Thesis or Diss., Université de Lorraine, 2022. http://www.theses.fr/2022LORR0108.
Over the past couple of decades, there has been an increasing adoption of Internet technology in the e-learning domain, associated with the availability of an increasing number of educational resources. Effective systems are thus needed to help learners find useful and adequate resources, among which recommender systems play an important role. In particular, learning path recommender systems, which recommend sequences of educational resources, are highly valuable for improving learners' learning experiences. In this context, this PhD thesis focuses on the field of learning path recommender systems and the associated offline evaluation of these systems. This PhD thesis views the learning path recommendation task as a sequential decision problem and considers the partially observable Markov decision process (POMDP) as an adequate approach. In the field of education, the learners' memory strength is a very important factor, and several models of learners' memory strength have been proposed in the literature and used to promote review in recommendations. However, little work has been conducted on POMDP-based recommendations, and the models proposed are complex and data-intensive. This PhD thesis proposes POMDP-based recommendation models that manage learners' memory strength while limiting the increase in complexity and data required. Although recommending useful and effective learning paths to learners is becoming more and more popular, evaluating the effectiveness of these recommended learning paths is still a challenging task that is not often addressed in the literature. Online evaluation is highly popular, but it relies on recommending paths to actual learners, which may have dramatic implications if the recommendations are not accurate. Offline evaluation relies on static datasets of learners' learning activities and simulates learning path recommendations.
Although offline evaluation is easier to run, it is difficult to accurately evaluate the effectiveness of a learning path recommendation with it. This tends to explain the lack of literature on this topic. To tackle this issue, this PhD thesis also proposes offline evaluation measures that are designed to be simple and usable in most application cases. The recommendation models and evaluation measures that we propose are evaluated on two real learning datasets. The experiments confirm that the proposed recommendation models outperform the models from the literature, with a limited increase in complexity, including for a medium-size dataset.
Ponzoni, Carvalho Chanel Caroline. "Planification de perception et de mission en environnement incertain : Application à la détection et à la reconnaissance de cibles par un hélicoptère autonome." Thesis, Toulouse, ISAE, 2013. http://www.theses.fr/2013ESAE0011/document.
Mobile and aerial robots face the need to plan actions with incomplete information about the state of the world. In this context, this thesis proposes a modeling and resolution framework for perception and mission planning problems where an autonomous helicopter must detect and recognize targets in an uncertain and partially observable environment. We founded our work on Partially Observable Markov Decision Processes (POMDPs), because they offer a general optimization framework for perception and decision tasks over a long-term horizon. Special attention is given to the outputs of the image processing algorithm in order to model its uncertain behavior as a probabilistic observation function. A critical study of the POMDP model and its optimization criterion is also conducted. In order to respect the safety constraints of aerial robots, we then propose an approach to properly handle action feasibility constraints in partially observable domains: the AC-POMDP model, which distinguishes between the verification of environmental properties and the information about the targets' nature. Furthermore, we propose a framework to optimize and execute POMDP policies in parallel under time constraints. This framework is based on anticipated and probabilistic optimization of future execution states of the system. Finally, we embedded this algorithmic framework on board Onera's autonomous helicopters and performed real flight experiments for multi-target detection and recognition missions.
Olafsson, Björgvin. "Partially Observable Markov Decision Processes for Faster Object Recognition." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-198632.
Hudson, Joshua. "A Partially Observable Markov Decision Process for Breast Cancer Screening." Thesis, Linköpings universitet, Statistik och maskininlärning, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-154437.
Dutech, Alain. "Apprentissage par Renforcement : Au delà des Processus Décisionnels de Markov (Vers la cognition incarnée)." Habilitation à diriger des recherches, Université Nancy II, 2010. http://tel.archives-ouvertes.fr/tel-00549108.
Pradhan, Neil. "Deep Reinforcement Learning for Autonomous Highway Driving Scenario." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-289444.
We present an autonomous driving agent in a simulated highway scenario with vehicles such as cars and trucks moving with stochastically variable velocity profiles. The focus of the simulated environment is to test tactical decision making in highway scenarios. When an agent (vehicle) maintains an optimal speed range, this is beneficial both for energy efficiency and for a greener environment. To maintain an optimal speed range, I propose in this thesis two new reward structures: (a) a Gaussian reward structure and (b) an exponential rise and fall reward structure. I trained two deep reinforcement learning agents, one for each structure, to study their differences and evaluate their performance on a set of parameters that are most relevant in highway scenarios. The algorithm implemented in this thesis is a double-dueling deep Q-network with a prioritized replay buffer. Experiments were carried out by adding noise to the inputs, simulating a partially observable Markov decision process, in order to obtain a reliability comparison between the different reward structures. A velocity occupancy grid was found to be better than a binary occupancy grid as input to the algorithm. In addition, a methodology for generating fuel-efficient policies is discussed and demonstrated with an example.
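A Gaussian reward structure of the kind named in the abstract above can be pictured as a bell-shaped function of speed that peaks at the preferred value. The sketch below is only an illustration of that general shape: the function name, the unnormalised Gaussian form, and the default `target_speed` and `sigma` values are assumptions, since the abstract does not give the thesis's actual formula or constants.

```python
import math


def gaussian_speed_reward(speed, target_speed=25.0, sigma=3.0):
    """Bell-shaped reward that peaks at 1.0 when speed equals target_speed.

    target_speed and sigma (both in m/s) are hypothetical values chosen
    for this example, not parameters reported in the thesis.
    """
    deviation = speed - target_speed
    return math.exp(-(deviation ** 2) / (2.0 * sigma ** 2))
```

With these assumed defaults the reward is symmetric around 25 m/s and decays smoothly on both sides, so an agent is penalised equally for driving too slowly or too fast; `sigma` controls how wide the tolerated speed band is.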
Drougard, Nicolas. "Exploiting imprecise information sources in sequential decision making problems under uncertainty." Thesis, Toulouse, ISAE, 2015. http://www.theses.fr/2015ESAE0037/document.
Partially Observable Markov Decision Processes (POMDPs) define a useful formalism to express probabilistic sequential decision problems under uncertainty. When this model is used for a robotic mission, the system is defined as the features of the robot and its environment needed to express the mission. The system state is not directly seen by the agent (the robot). Solving a POMDP thus consists in computing a strategy which, on average, best achieves the mission, i.e. a function mapping the information known by the agent to an action. Some practical issues of the POMDP model are first highlighted in the robotic context: they concern the modeling of the agent's ignorance, the imprecision of the observation model and the complexity of solving real-world problems. A counterpart of the POMDP model, called pi-POMDP, simplifies uncertainty representation with a qualitative evaluation of event plausibilities. It comes from Qualitative Possibility Theory, which provides the means to model imprecision and ignorance. After a formal presentation of the POMDP and pi-POMDP models, an update of the possibilistic model is proposed. Next, the study of factored pi-POMDPs makes it possible to set up an algorithm named PPUDD, which uses Algebraic Decision Diagrams to solve large structured planning problems. Strategies computed by PPUDD, which have been tested in the context of the IPPC 2014 competition, can be more efficient than those produced by probabilistic solvers when the model is imprecise or for high-dimensional problems. This thesis proposes some ways of using Qualitative Possibility Theory to improve computation time and uncertainty modeling in practice.
Araya-López, Mauricio. "Des algorithmes presque optimaux pour les problèmes de décision séquentielle à des fins de collecte d'information." Phd thesis, Université de Lorraine, 2013. http://tel.archives-ouvertes.fr/tel-00943513.
Pokharel, Gaurab. "Increasing the Value of Information During Planning in Uncertain Environments." Oberlin College Honors Theses / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=oberlin1624976272271825.
Ibrahim, Rita. "Utilisation des communications Device-to-Device pour améliorer l'efficacité des réseaux cellulaires." Thesis, Université Paris-Saclay (ComUE), 2019. http://www.theses.fr/2019SACLC002/document.
This thesis considers Device-to-Device (D2D) communications as a promising technique for enhancing future cellular networks. Modeling, evaluating and optimizing D2D features are the fundamental goals of this thesis and are mainly achieved using the following mathematical tools: queuing theory, Lyapunov optimization and Partially Observed Markov Decision Process (POMDP). The findings of this study are presented in three parts. In the first part, we investigate a D2D mode selection scheme. We derive the queuing stability regions of both scenarios: pure cellular networks and D2D-enabled cellular networks. Comparing both scenarios leads us to elaborate a D2D vs cellular mode selection design that improves the capacity of the network. In the second part, we develop a D2D resource allocation algorithm. We observe that D2D users are able to estimate their local Channel State Information (CSI), whereas the base station needs some signaling exchange to acquire this information. Based on the D2D users' knowledge of their local CSI, we provide an energy-efficient resource allocation framework that shows how distributed scheduling outperforms the centralized one. In the distributed approach, collisions may occur between the different CSI reports; thus, we propose a collision reduction algorithm. Moreover, we give a detailed description of how both centralized and distributed algorithms can be implemented in practice. In the third part, we propose a mobile relay selection policy in a D2D relay-aided network. Relays' mobility appears as a crucial challenge for defining the strategy of selecting the optimal D2D relays. The problem is formulated as a constrained POMDP which captures the dynamism of the relays and aims to find the optimal relay selection policy that maximizes the performance of the network under cost constraints.
Allen, Martin William. "Agent interactions in decentralized environments." Amherst, Mass. : University of Massachusetts Amherst, 2009. http://scholarworks.umass.edu/open_access_dissertations/1.
Hänsel, Rosemarie. "Abschied von Ingeborg Pomp." Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2007. http://nbn-resolving.de/urn:nbn:de:swb:14-1184240704601-52673.
Aras, Raghav. "Mathematical programming methods for decentralized POMDPs." Thesis, Nancy 1, 2008. http://www.theses.fr/2008NAN10092/document.
In this thesis, we study the problem of the optimal decentralized control of a partially observed Markov process over a finite horizon. The mathematical model corresponding to the problem is a decentralized POMDP (DEC-POMDP). Many practical problems from the domains of artificial intelligence and operations research can be modeled as DEC-POMDPs. However, solving a DEC-POMDP exactly is intractable (NEXP-hard). The development of exact algorithms is necessary in order to guide the development of approximate algorithms that can scale to practical-sized problems. Existing algorithms are mainly inspired by POMDP research (dynamic programming and forward search) and require an inordinate amount of time for even very small DEC-POMDPs. In this thesis, we develop a new mathematical programming based approach for exactly solving a finite horizon DEC-POMDP. We use the sequence form of a control policy in this approach. Using the sequence form, we show how the problem can be formulated as a mathematical program with a nonlinear objective and linear constraints. We then show how this nonlinear program can be linearized to a 0-1 mixed integer linear program (MIP). We present two different 0-1 MIPs based on two different properties of a DEC-POMDP. Computational experience with the mathematical programs presented in the thesis on four benchmark problems (MABC, MA-Tiger, Grid Meeting, Fire Fighting) shows that the time taken to find an optimal joint policy is one or two orders of magnitude less than that of existing exact algorithms. In the problems tested, the time taken drops from several hours to a few seconds or minutes.
Aras, Raghav Charpillet François Dutech Alain. "Mathematical programming methods for decentralized POMDPs." S. l. : Nancy 1, 2008. http://www.scd.uhp-nancy.fr/docnum/SCD_T_2008_0092_ARAS.pdf.
Endo, Yoichiro. "Countering Murphys law the use of anticipation and improvisation via an episodic memory in support of intelligent robot behavior /." Diss., Atlanta, Ga. : Georgia Institute of Technology, 2008. http://hdl.handle.net/1853/26466.
Committee Chair: Arkin, Ronald; Committee Member: Balch, Tucker; Committee Member: Dellaert, Frank; Committee Member: Potter, Steve; Committee Member: Ram, Ashwin. Part of the SMARTech Electronic Thesis and Dissertation Collection.
Ortiz, Olga L. "Stochastic inventory control with partial demand observability." Diss., Atlanta, Ga. : Georgia Institute of Technology, 2008. http://hdl.handle.net/1853/22551.
Committee Co-Chair: Alan L Erera; Committee Co-Chair: Chelsea C, White III; Committee Member: Julie Swann; Committee Member: Paul Griffin; Committee Member: Soumen Ghosh.
Brooks, Alex. "Parametric POMDPs for planning in continuous state spaces." University of Sydney, 2007. http://hdl.handle.net/2123/1861.
This thesis is concerned with planning and acting under uncertainty in partially-observable continuous domains. In particular, it focusses on the problem of mobile robot navigation given a known map. The dominant paradigm for robot localisation is to use Bayesian estimation to maintain a probability distribution over possible robot poses. In contrast, control algorithms often base their decisions on the assumption that a single state, such as the mode of this distribution, is correct. In scenarios involving significant uncertainty, this can lead to serious control errors. It is generally agreed that the reliability of navigation in uncertain environments would be greatly improved by the ability to consider the entire distribution when acting, rather than the single most likely state. The framework adopted in this thesis for modelling navigation problems mathematically is the Partially Observable Markov Decision Process (POMDP). An exact solution to a POMDP problem provides the optimal balance between reward-seeking behaviour and information-seeking behaviour, in the presence of sensor and actuation noise. Unfortunately, previous exact and approximate solution methods have had difficulty scaling to real applications. The contribution of this thesis is the formulation of an approach to planning in the space of continuous parameterised approximations to probability distributions. Theoretical and practical results are presented which show that, when compared with similar methods from the literature, this approach is capable of scaling to larger and more realistic problems. In order to apply the solution algorithm to real-world problems, a number of novel improvements are proposed. Specifically, Monte Carlo methods are employed to estimate distributions over future parameterised beliefs, improving planning accuracy without a loss of efficiency. Conditional independence assumptions are exploited to simplify the problem, reducing computational requirements.
Scalability is further increased by focussing computation on likely beliefs, using metric indexing structures for efficient function approximation. Local online planning is incorporated to assist global offline planning, allowing the precision of the latter to be decreased without adversely affecting solution quality. Finally, the algorithm is implemented and demonstrated during real-time control of a mobile robot in a challenging navigation task. We argue that this task is substantially more challenging and realistic than previous problems to which POMDP solution methods have been applied. Results show that POMDP planning, which considers the evolution of the entire probability distribution over robot poses, produces significantly more robust behaviour when compared with a heuristic planner which considers only the most likely states and outcomes.
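The contrast drawn in the abstract above, between acting on the full distribution over robot poses and acting on its single most likely state, can be illustrated with a minimal discrete Bayes-filter sketch. This is not the thesis's parametric method; the four-cell corridor, the motion model, and the sensor likelihoods are all hypothetical values chosen for illustration.

```python
def belief_update(belief, transition, likelihood):
    """One Bayes-filter step over a discrete state space.

    belief[s]         : prior probability of state s
    transition[s][s2] : P(s2 | s, action) for the action just taken
    likelihood[s2]    : P(observation | s2) for the reading just received
    """
    n = len(belief)
    # Prediction: push the prior through the motion model.
    predicted = [sum(transition[s][s2] * belief[s] for s in range(n))
                 for s2 in range(n)]
    # Correction: weight each predicted state by the observation likelihood.
    unnormalised = [likelihood[s2] * predicted[s2] for s2 in range(n)]
    total = sum(unnormalised)
    return [p / total for p in unnormalised]


# Hypothetical corridor with four cells; "move right" succeeds with probability 0.8.
transition = [
    [0.2, 0.8, 0.0, 0.0],
    [0.0, 0.2, 0.8, 0.0],
    [0.0, 0.0, 0.2, 0.8],
    [0.0, 0.0, 0.0, 1.0],
]
likelihood = [0.1, 0.4, 0.4, 0.1]  # hypothetical sensor model for one reading
belief = [0.5, 0.5, 0.0, 0.0]      # robot starts in cell 0 or cell 1

new_belief = belief_update(belief, transition, likelihood)
# The mode alone discards how much probability mass sits elsewhere.
most_likely_state = max(range(len(new_belief)), key=new_belief.__getitem__)
```

In this toy case the most likely cell holds only slightly more probability than its neighbour, which is the kind of ambiguity a planner that keeps only the mode would ignore.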
Brooks, Alex M. "Parametric POMDPs for planning in continuous state spaces." Connect to full text, 2007. http://hdl.handle.net/2123/1861.
Title from title screen (viewed 15 January 2009). Submitted in fulfilment of the requirements for the degree of Doctor of Philosophy to the Australian Centre for Field Robotics, School of Aerospace, Mechanical and Mechatronic Engineering. Includes bibliographical references. Also available in print form.
Parikh, Rachel. "Persian pomp, Indian circumstance : the Khalili Falnama." Thesis, University of Cambridge, 2014. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.648619.
Jarzyna, Tomasz. "Modelowanie i analiza dynamiczna pionowych pomp diagonalnych." Rozprawa doktorska, [Nakł.aut.], 2011. http://dlibra.utp.edu.pl/Content/272.
Atrash, Amin. "A Bayesian Framework for Online Parameter Learning in POMDPs." Thesis, McGill University, 2011. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=104587.
As the number of autonomous and semi-autonomous agents in our society continues to grow, decision-making under uncertainty has become a critical problem. Despite the uncertainty and ambiguity inherent in their environments, these agents must remain robust in carrying out their tasks. Partially observable Markov decision processes (POMDPs) offer a mathematical framework for modelling agents and their environments. These models can capture the uncertainty due to sensor noise and imprecise actuators, and consequently allow decisions to be made that take the agents' imperfect knowledge into account. To date, POMDPs have been used successfully in a range of domains, from robotics to dialogue management to medicine. Much research has focused on methods for optimizing POMDPs. However, these methods usually require an environment model that is known in advance. In this thesis, a Bayesian reinforcement learning method is presented with which the parameters of the POMDP model can be learned during execution. This method takes advantage of cooperation with an operator who can guide learning by revealing certain optimal data. With the help of Bayesian reinforcement learning, the agent can learn during execution, incorporate new data immediately, and benefit from prior knowledge, so that it can ultimately adapt its decision policy to that of the operator. The methodology described is validated using data produced by the interaction manager of an autonomous wheelchair. This manager takes the form of an intelligent interface between the robot and the user, allowing the user to issue high-level commands in a natural way, for example by speaking aloud. The manager's functions are carried out using a POMDP and constitute an ideal learning scenario, in which the agent must progressively adjust to the user's needs.
Skoglund, Caroline. "Risk-aware Autonomous Driving Using POMDPs and Responsibility-Sensitive Safety." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-300909.
Autonomous vehicles are predicted to play a major role in the future, with the goals of improving the efficiency and safety of road transport. Although several examples of autonomous vehicles have appeared on the roads in recent years, the question of how safety can be guaranteed remains a challenging problem. This thesis has studied this question by developing a framework for risk-aware decision-making. The dynamics of the autonomous vehicle and the unpredictable environment are modelled as a Partially Observable Markov Decision Process (POMDP). A risk measure is proposed based on a safety distance, Responsibility-Sensitive Safety (RSS), which quantifies the minimum distance to other vehicles for guaranteed safety. The risk measure is integrated into the reward function of the POMDP model to produce risk-aware behaviours. The proposed risk-aware POMDP model is evaluated in two case studies. In a scenario where the ego vehicle follows another vehicle on a single-lane road, we show that the ego vehicle can avoid a collision when the vehicle ahead brakes to a standstill. In a scenario where the ego vehicle merges onto a main road from a ramp, we show that this is done with a satisfactory distance to other vehicles. The conclusion is that the risk-aware POMDP model succeeds in realizing a trade-off between safety and usability by keeping a reasonable safety distance and adapting to the behaviour of other vehicles.
Wright, Allan. "Frank Zappa's orchestral works art music or "bogus pomp"? /." Connect to e-thesis, 2007. http://theses.gla.ac.uk/492/.
Cohen, Jonathan. "Formation dynamique d'équipes dans les DEC-POMDPS ouverts à base de méthodes Monte-Carlo." Thesis, Normandie, 2019. http://www.theses.fr/2019NORMC225/document.
This thesis addresses the problem where a team of cooperative and autonomous agents, working in a stochastic and partially observable environment towards solving a complex task, needs to dynamically modify its structure during the process execution, so as to adapt to the evolution of the task. It is a problem that has seldom been studied in the field of multi-agent planning. However, there are many situations where the team of agents is likely to evolve over time. We are particularly interested in the case where the agents can decide for themselves to leave or join the operational team. Sometimes, using few agents can be for the greater good. Conversely, it can sometimes be useful to call on more agents if the situation gets worse and the skills of some agents turn out to be valuable assets. In order to propose a decision model that can represent those situations, we build upon decentralized partially observable Markov decision processes, the standard model for planning under uncertainty in decentralized multi-agent settings. We extend this model to allow agents to enter and exit the system. This is what is called agent openness. We then present two planning algorithms based on the popular Monte-Carlo Tree Search methods. The first algorithm builds separable joint policies by computing series of best-response individual policies, while the second algorithm builds non-separable joint policies by ranking the teams in each situation via an Elo rating system. We evaluate our methods on new benchmarks that make it possible to highlight some interesting features of open systems.
Paulmann, Johannes. "Pomp und Politik : Monarchenbegegnungen in Europa zwischen Ancien Régime und Erstem Weltkrieg /." Paderborn [u.a.] : Schöningh, 2000. http://www.gbv.de/dms/bs/toc/31944564x.pdf.
Błaszczyk, Andrzej. "Metoda projektowania pomp o specjalnych wymaganiach eksploatacyjno-ruchowych z wykorzystaniem numerycznej analizy przepływów trójwymiarowych /." Łódź : Wydawn. Politechn, 2003. http://www.gbv.de/dms/goettingen/372713971.pdf.
Pomp, Sarah [Verfasser]. "The role of depressive symptoms in the process of health behavior change / Sarah Pomp." Berlin : Freie Universität Berlin, 2012. http://d-nb.info/1027815340/34.
Fricke, Benjamin. "Lokalisation, Isolierung und in vitro Generierung von Assemblierungsintermediaten des humanen 20S Proteasoms." Doctoral thesis, Humboldt-Universität zu Berlin, Mathematisch-Naturwissenschaftliche Fakultät I, 2006. http://dx.doi.org/10.18452/15533.
The 20S proteasome is the protein-degrading part of the ubiquitin-proteasome system and therefore participates in important cellular processes such as gene expression, cell cycle control, apoptosis, peptide generation for MHC class I presentation and degradation of misfolded proteins. The individual steps of 20S proteasome biogenesis in eukaryotes are so far only partly understood. This work examines the subunit composition of assembly intermediates and their subcellular localisation and organisation in eukaryotic cells. Distinct assembly intermediates of human proteasomes have been generated by establishing an in vitro system. As an earlier intermediate in the in vitro system an (-ring could be identified. In vivo experiments using radioactively labelled total lysates of HeLa cells shed light on the subsequent sequence of assembly. Thus newly synthesised and finally incorporated subunits could be identified. Two of these subunits are (1 and (7, which could force the dimerisation of two half-proteasome precursors by their trans-acting C-terminal extensions. Furthermore, the (1 subunit has been identified as the (-ring completing subunit in the precursor complex. In addition, it was possible to detect proteasomal assembly intermediates on the ER of human cell lines through immunocytochemical and biochemical methods. The assembly factor POMP plays a key role here, as it is what allows the precursor to associate with the ER in the first place. This work clarifies further steps of the complex process of biogenesis of constitutive 20S proteasomes in human cell lines and allows the characterisation of the subcellular localisation of assembly intermediates in human cells for the first time.
Keryell, Ronan. "Pomp : d'un petit ordinateur massivement parallele smid a base de processeurs risc concepts, etude et realisation." Paris 11, 1992. http://www.theses.fr/1992PA112499.
Lusena, Christopher. "Finite Memory Policies for Partially Observable Markov Decision Proesses." UKnowledge, 2001. http://uknowledge.uky.edu/gradschool_diss/323.