Dissertations / Theses on the topic 'Markov processes'
Consult the top 50 dissertations / theses for your research on the topic 'Markov processes.'
Desharnais, Josée. "Labelled Markov processes." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1999. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape3/PQDD_0031/NQ64546.pdf.
Balan, Raluca M. "Set-Markov processes." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 2001. http://www.collectionscanada.ca/obj/s4/f2/dsk3/ftp04/NQ66119.pdf.
Eltannir, Akram A. "Markov interactive processes." Diss., Georgia Institute of Technology, 1993. http://hdl.handle.net/1853/30745.
Haugomat, Tristan. "Localisation en espace de la propriété de Feller avec application aux processus de type Lévy." Thesis, Rennes 1, 2018. http://www.theses.fr/2018REN1S046/document.
In this PhD thesis, we give a space localisation for the theory of Feller processes. A first objective is to obtain simple and precise results on the convergence of Markov processes. A second objective is to study the link between the notions of Feller property, martingale problem and Skorokhod topology. First we give a localised version of the Skorokhod topology. We study the notions of compactness and tightness for this topology. We make the connection between the localised and unlocalised Skorokhod topologies by using the notion of time change. In a second step, using the localised Skorokhod topology and the time change, we study martingale problems. We show the equivalence between, on the one hand, being the solution of a well-posed martingale problem, on the other hand, satisfying a localised version of the Feller property, and finally, being a Markov process weakly continuous with respect to the initial condition. We characterise weak convergence for solutions of martingale problems in terms of convergence of the associated operators and give a similar result for discrete-time approximations. Finally, we apply the theory of locally Feller processes to some examples. We first apply it to Lévy-type processes and obtain convergence results for discrete- and continuous-time processes, including simulation methods and Euler schemes. We then apply the same theory to one-dimensional diffusions in a potential and obtain convergence results for diffusions or random walks towards singular diffusions. As a consequence, we deduce the convergence of random walks in random environment towards diffusions in a random potential.
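The abstract above mentions Euler schemes for Lévy-type processes. As a rough, generic illustration (not taken from the thesis), the sketch below simulates a one-dimensional diffusion, a simple special case of a Lévy-type process, with an Euler-Maruyama discretisation; the drift and volatility functions are illustrative placeholders.

```python
import numpy as np

def euler_maruyama(b, sigma, x0, T, n_steps, rng=None):
    """Euler scheme for dX_t = b(X_t) dt + sigma(X_t) dW_t on [0, T]."""
    rng = rng or np.random.default_rng(0)
    dt = T / n_steps
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        dw = rng.normal(0.0, np.sqrt(dt))          # Brownian increment
        x[k + 1] = x[k] + b(x[k]) * dt + sigma(x[k]) * dw
    return x

# Illustrative coefficients: an Ornstein-Uhlenbeck-type diffusion.
path = euler_maruyama(b=lambda x: -x, sigma=lambda x: 1.0, x0=0.0, T=1.0, n_steps=1000)
print(path[-1])
```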
莊競誠 and King-sing Chong. "Explorations in Markov processes." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1997. http://hub.hku.hk/bib/B31235682.
James, Huw William. "Transient Markov decision processes." Thesis, University of Bristol, 2006. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.430192.
Ku, Ho Ming. "Interacting Markov branching processes." Thesis, University of Liverpool, 2014. http://livrepository.liverpool.ac.uk/2002759/.
Chong, King-sing. "Explorations in Markov processes /." Hong Kong : University of Hong Kong, 1997. http://sunzi.lib.hku.hk/hkuto/record.jsp?B18736105.
Full textPötzelberger, Klaus. "On the Approximation of finite Markov-exchangeable processes by mixtures of Markov Processes." Department of Statistics and Mathematics, WU Vienna University of Economics and Business, 1991. http://epub.wu.ac.at/526/1/document.pdf.
Full textSeries: Forschungsberichte / Institut für Statistik
Ferns, Norman Francis. "Metrics for Markov decision processes." Thesis, McGill University, 2003. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=80263.
Full textChaput, Philippe. "Approximating Markov processes by averaging." Thesis, McGill University, 2009. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=66654.
Full textNous reconsidérons les processus de Markov étiquetés sous une nouvelle approche, dans un certain sens "dual'' au point de vue usuel. Au lieu de considérer les transitions d'état en état en tant qu'une collection de distributions de sous-probabilités sur l'espace d'états, nous les regardons en tant que transformations de fonctions réelles. En généralisant l'opération d'espérance conditionelle, nous construisons une catégorie où les objets sont des processus de Markov étiquetés regardés en tant qu'un rassemblement d'opérateurs; les flèches de cette catégorie se comportent comme des projections sur un espace d'états plus petit. Nous définissons une notion d'équivalence pour de tels processus, que l'on appelle bisimulation, qui est intimement liée avec la définition usuelle pour les processus probabilistes. Nous démontrons que nous pouvons construire, d'une manière catégorique, le plus petit processus bisimilaire à un processus donné, et que ce plus petit object est lié à une logique modale bien connue. Nous développons une méthode d'approximation basée sur cette logique, où l'espace d'états des processus approximatifs est fini; de plus, nous démontrons que ces processus approximatifs convergent, d'une manière catégorique, au plus petit processus bisimilaire.
Baxter, Martin William. "Discounted functionals of Markov processes." Thesis, University of Cambridge, 1993. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.309008.
Furloni, Walter. "Controle em horizonte finito com restriçoes de sistemas lineares discretos com saltos markovianos." [s.n.], 2009. http://repositorio.unicamp.br/jspui/handle/REPOSIP/259271.
Full textDissertação (mestrado) - Universidade Estadual de Campinas, Faculdade de Engenharia Eletrica e de Computação
Abstract: The purpose of this work is to propose and solve the constrained control problem, over a finite horizon, for Markovian Jump Discrete Linear Systems (MJDLS) driven by noise. The constraints on the state and control vectors are not rigid and are expressed as limits on their first and second moments. The controller is based on a linear state-feedback structure and must minimize a quadratic cost function. Two cases regarding the available information on the Markov chain are considered: in the first, the Markov chain state is known at each step; in the second, only its initial probability distribution is available. A deterministic formulation of the stochastic problem is developed so that the proposed necessary optimality conditions and the constraints can easily be included using Linear Matrix Inequalities (LMIs). The consideration of constraints constitutes the main contribution, since they are pertinent to several application fields such as the chemical industry, mass transportation and economics. Two applications are presented for illustration: one concerning traffic regulation on metro lines and the other concerning portfolio selection in financial investment.
Master's degree
Automation
Master in Electrical Engineering
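As a minimal illustration of the kind of system treated in the Furloni abstract above, the sketch below simulates a discrete-time Markov jump linear system x_{k+1} = A_{mode} x_k + w_k, where the mode follows a Markov chain; the matrices, transition probabilities and noise level are invented for the example, and the constrained LMI-based controller of the thesis is not reproduced.

```python
import numpy as np

def simulate_mjls(A_modes, P, x0, horizon, noise_std=0.1, rng=None):
    """Simulate x_{k+1} = A[mode_k] @ x_k + w_k with a Markov-switching mode."""
    rng = rng or np.random.default_rng(1)
    n_modes = len(A_modes)
    mode = 0
    x = np.array(x0, dtype=float)
    trajectory = [x.copy()]
    for _ in range(horizon):
        x = A_modes[mode] @ x + rng.normal(0.0, noise_std, size=x.shape)
        mode = rng.choice(n_modes, p=P[mode])       # draw the next Markov mode
        trajectory.append(x.copy())
    return np.array(trajectory)

# Two illustrative modes: one stable, one mildly unstable.
A = [np.array([[0.9, 0.1], [0.0, 0.8]]), np.array([[1.05, 0.0], [0.2, 0.95]])]
P = np.array([[0.9, 0.1], [0.3, 0.7]])              # mode transition matrix
traj = simulate_mjls(A, P, x0=[1.0, 0.0], horizon=50)
print(traj[-1])
```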
Pinheiro, Maicon Aparecido. "Processos pontuais no modelo de Guiol-Machado-Schinazi de sobrevivência de espécies." Universidade de São Paulo, 2015. http://www.teses.usp.br/teses/disponiveis/45/45133/tde-01062016-191528/.
Recently, Guiol, Machado and Schinazi proposed a stochastic model for species evolution. In this model, births and deaths of species occur with intensities that are invariant over time. Moreover, at the time of birth of a new species, it is labeled with a random number sampled from an absolutely continuous distribution. Each time there is an extinction event, exactly one existing species disappears: the one with the smallest number. When the birth rate is greater than the extinction rate, there is a critical value f_c such that every species arriving with a number smaller than f_c will almost surely die after a finite random time, while those with numbers larger than f_c survive forever with positive probability. However, less fit species continue to appear during the evolutionary process, and there is no guarantee of the emergence of an immortal species. We consider a particular case of the Guiol-Machado-Schinazi model and address these last two points. We characterize the limit point process linked to species in the subcritical phase of the model and discuss the existence of immortal species.
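The birth-and-death dynamics described in this abstract are simple to simulate. The sketch below is a toy implementation (the rates and run length are arbitrary choices, not values from the thesis): new species arrive with a uniform fitness label, and each extinction event removes the species with the smallest label.

```python
import heapq
import random

def simulate_gms(birth_rate, death_rate, n_events, seed=0):
    """Toy Guiol-Machado-Schinazi dynamics: births add a uniform label,
    deaths remove the currently smallest label (a birth is forced when no
    species exist, a simplification for this illustration)."""
    rng = random.Random(seed)
    species = []                                    # min-heap of fitness labels
    p_birth = birth_rate / (birth_rate + death_rate)
    for _ in range(n_events):
        if not species or rng.random() < p_birth:
            heapq.heappush(species, rng.random())   # birth with uniform label
        else:
            heapq.heappop(species)                  # extinction of weakest species
    return species

surviving = simulate_gms(birth_rate=2.0, death_rate=1.0, n_events=10_000)
print(len(surviving), min(surviving) if surviving else None)
```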
Pra, Paolo Dai, Pierre-Yves Louis, and Ida G. Minelli. "Complete monotone coupling for Markov processes." Universität Potsdam, 2008. http://opus.kobv.de/ubp/volltexte/2008/1828/.
Full textDe, Stavola Bianca Lucia. "Multistate Markov processes with incomplete information." Thesis, Imperial College London, 1985. http://hdl.handle.net/10044/1/37672.
Full textCarpio, Kristine Joy Espiritu, and kjecarpio@lycos com. "Long-Range Dependence of Markov Processes." The Australian National University. School of Mathematical Sciences, 2006. http://thesis.anu.edu.au./public/adt-ANU20061024.131933.
Full textCastro, Rivadeneira Pablo Samuel. "Bayesian exploration in Markov decision processes." Thesis, McGill University, 2007. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=18479.
Full textLes processus de décision Markoviens sont des modèles mathématiques fréquemment utilisés pour résoudre des problèmes d'optimisation stochastique et de contrôle. L'apprentissage par renforcement est une branche de l'intelligence artificielle qui s'intéresse aux environnements stochastiques où la dynamique du système est inconnue. Un problème majeur des algorithmes d'apprentissage est de bien balancer l'exploration de l'environnement, pour acquérir de nouvelles connaissances, et l'exploitation des connaissances acquises. Nous présentons trois méthodes pour obtenir de bons compromis exploration-exploitation dans les processus de décision Markoviens. L'approche adoptée est Bayésienne, en ce sens où nous utilisons et maintenons une estimation du modèle. L'existence d'une politique optimale pour l'exploration Bayésienne a été démontrée, mais elle est impossible à calculer efficacement. Nous présentons trois approximations de la politique optimale qui utilise l'échantillonnage statistique. \\ La première approche utilise une combinaison de programmation linéaire et de l'algorithme "Q-Learning". Nous présentons des résultats empiriques qui démontrent sa performance. La deuxième approche est une extension de cette idée, et nous démontrons des garanties théoriques de son efficacité, confirmées par des résultats empiriques. Finalement, nous présentons un algorithme qui s'adapte efficacement au temps alloué pour le calcul de la politique. Cette idée est présentée comme une approximation d'un programme linéaire à dimension infini; nous garantissons sa convergence et démontrons une dualité forte.
Propp, Michael Benjamin. "The thermodynamic properties of Markov processes." Thesis, Massachusetts Institute of Technology, 1985. http://hdl.handle.net/1721.1/17193.
Full textMICROFICHE COPY AVAILABLE IN ARCHIVES AND ENGINEERING.
Includes glossary.
Bibliography: leaves 87-91.
by Michael Benjamin Propp.
Ph.D.
Korpas, Agata K. "Occupation Times of Continuous Markov Processes." Bowling Green State University / OhioLINK, 2006. http://rave.ohiolink.edu/etdc/view?acc_num=bgsu1151347146.
Full textChu, Shanyun. "Some contributions to Markov decision processes." Thesis, University of Liverpool, 2015. http://livrepository.liverpool.ac.uk/2038000/.
Full textCarpio, Kristine Joy Espiritu. "Long-range dependence of Markov processes /." View thesis entry in Australian Digital Theses Program, 2006. http://thesis.anu.edu.au/public/adt-ANU20061024.131933/index.html.
Full textLIKA, ADA. "MARKOV PROCESSES IN FINANCE AND INSURANCE." Doctoral thesis, Università degli Studi di Cagliari, 2017. http://hdl.handle.net/11584/249618.
Full textDurrell, Fernando. "Constrained portfolio selection with Markov and non-Markov processes and insiders." Doctoral thesis, University of Cape Town, 2007. http://hdl.handle.net/11427/4379.
Full textWright, James M. "Stable processes with opposing drifts /." Thesis, Connect to this title online; UW restricted, 1996. http://hdl.handle.net/1773/5807.
Full textWerner, Ivan. "Contractive Markov systems." Thesis, University of St Andrews, 2004. http://hdl.handle.net/10023/15173.
Full textElsayad, Amr Lotfy. "Numerical solution of Markov Chains." CSUSB ScholarWorks, 2002. https://scholarworks.lib.csusb.edu/etd-project/2056.
Full textBartholme, Carine. "Self-similarity and exponential functionals of Lévy processes." Doctoral thesis, Universite Libre de Bruxelles, 2014. http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/209256.
In the first part, the main object of interest is the so-called exponential functional of Lévy processes. The law of this random variable plays a central role in many areas, both theoretical and applied. Doney derived a factorization of the arc-sine law in terms of the suprema of independent stable processes of the same index. A similar factorization of the arc-sine law in terms of last passage times at level 1 of Bessel processes can also be established using a result due to Getoor. Analogous factorizations of a Pareto variable in terms of the same objects can be obtained as well. The aim of this part is to give a unified proof and a generalization of these factorizations, which at first sight seem unrelated. Even though there appears to be no connection between the supremum of a stable process and the last passage time of a Bessel process, it can be shown that these random variables are related to exponential functionals of specific Lévy processes. Our main contribution in this part, also at the level of characterizations of the law of the exponential functional, consists of factorizations of the arc-sine law and of generalized Pareto variables. Our proof relies on a recent Wiener-Hopf factorization of Patie and Savov.
In the second part, motivated by the fact that the Caputo fractional derivative and other classical fractional operators coincide with the generator of particular positive self-similar Markov processes, we introduce generalized Caputo operators and study some of their properties. We are particularly interested in conditions under which these operators coincide with the infinitesimal generators of general positive self-similar Markov processes. In that case, we study the invariant functions of these operators, which admit a power series representation. We point out that this class of functions contains the modified Bessel functions, the Mittag-Leffler functions and several hypergeometric functions. We propose a unified and in-depth study of this class of functions.
Doctorate in Sciences
Dendievel, Sarah. "Skip-free Markov processes: analysis of regular perturbations." Doctoral thesis, Universite Libre de Bruxelles, 2015. http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/209050.
This thesis focuses on a class of methods, called matrix analytic methods, that have gained much interest because of their good computational properties for the analysis of a large family of stochastic processes. These methods are used in this work in order i) to analyze the effect of regular perturbations of the transition matrix on the stationary distribution of skip-free Markov processes; ii) to determine transient distributions of skip-free Markov processes by performing regular perturbations.
In the class of skip-free Markov processes, we focus in particular on quasi-birth-and-death (QBD) processes and Markov modulated fluid models.
We first determine the first order derivative of the stationary distribution - a key vector in Markov models - of a QBD for which we slightly perturb the transition matrix. This leads us to the study of Poisson equations that we analyze for finite and infinite QBDs. The infinite case has to be treated with more caution therefore, we first analyze it using probabilistic arguments based on a decomposition through first passage times to lower levels. Then, we use general algebraic arguments and use the repetitive block structure of the transition matrix to obtain all the solutions of the equation. The solutions of the Poisson equation need a generalized inverse called the deviation matrix. We develop a recursive formula for the computation of this matrix for the finite case and we derive an explicit expression for the elements of this matrix for the infinite case.
Then, we analyze the first order derivative of the stationary distribution of a Markov modulated fluid model. This leads to the analysis of the matrix of first return times to the initial level, a characteristic matrix of Markov modulated fluid models.
Finally, we study the cumulative distribution function of the level in finite time and joint distribution functions (such as the level at a given finite time and the maximum level reached over a finite time interval). We show that our technique gives good approximations and allows those distribution functions to be computed efficiently.
A Markov process is defined by its transition matrix. A skip-free Markov process is a Markov process whose level can change by at most one unit at a time, either up or down. A regular perturbation is a modification of one or more parameters that is small enough not to change the model qualitatively.
Doctorate in Sciences
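For a finite ergodic chain, the deviation matrix mentioned in the Dendievel abstract above has a closed form, D = (I - P + 1*pi)^{-1} - 1*pi, and it yields solutions of the Poisson equation. The sketch below checks this on a small, arbitrary transition matrix; the recursive formulas and the QBD structure developed in the thesis are not reproduced here.

```python
import numpy as np

def deviation_matrix(P):
    """Deviation matrix D = (I - P + 1*pi)^{-1} - 1*pi of a finite ergodic chain."""
    n = P.shape[0]
    # Stationary distribution: left eigenvector of P for eigenvalue 1, normalised.
    A = np.vstack([P.T - np.eye(n), np.ones(n)])
    b = np.append(np.zeros(n), 1.0)
    pi = np.linalg.lstsq(A, b, rcond=None)[0]
    one_pi = np.outer(np.ones(n), pi)
    return np.linalg.inv(np.eye(n) - P + one_pi) - one_pi, pi

P = np.array([[0.5, 0.5, 0.0],
              [0.2, 0.5, 0.3],
              [0.0, 0.4, 0.6]])
D, pi = deviation_matrix(P)
g = np.array([1.0, 0.0, 2.0])                  # an arbitrary cost vector
x = D @ g                                      # solves (I - P) x = g - (pi @ g) * 1
print(np.allclose((np.eye(3) - P) @ x, g - pi @ g))
```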
Manstavicius, Martynas. "The p-variation of strong Markov processes /." Thesis, Connect to Dissertations & Theses @ Tufts University, 2003.
Advisers: Richard M. Dudley; Marjorie G. Hahn. Submitted to the Dept. of Mathematics. Includes bibliographical references (leaves 109-113). Access restricted to members of the Tufts University community. Also available via the World Wide Web.
葉錦元 and Kam-yuen William Yip. "Simulation and inference of aggregated Markov processes." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1993. http://hub.hku.hk/bib/B31977546.
Full textYip, Kam-yuen William. "Simulation and inference of aggregated Markov processes." [Hong Kong : University of Hong Kong], 1994. http://sunzi.lib.hku.hk/hkuto/record.jsp?B13787391.
Full textPatrascu, Relu-Eugen. "Linear approximations from factored Markov Dicision Processes." Waterloo, Ont. : University of Waterloo, 2004. http://etd.uwaterloo.ca/etd/rpatrasc2004.pdf.
"A thesis presented to the University of Waterloo in fulfillment of the thesis requirement for the degree of Doctor of Philosophy in Computer Science". Includes bibliographical references.
Patrascu, Relu-Eugen. "Linear Approximations For Factored Markov Decision Processes." Thesis, University of Waterloo, 2004. http://hdl.handle.net/10012/1171.
Full textCheng, Hsien-Te. "Algorithms for partially observable Markov decision processes." Thesis, University of British Columbia, 1988. http://hdl.handle.net/2429/29073.
Full textBusiness, Sauder School of
Graduate
Mundt, André Philipp. "Dynamic risk management with Markov decision processes." Karlsruhe Univ.-Verl. Karlsruhe, 2007. http://d-nb.info/987216511/04.
Full textMundt, André Philipp. "Dynamic risk management with Markov decision processes." Karlsruhe, Baden : Universitätsverl. Karlsruhe, 2008. http://www.uvka.de/univerlag/volltexte/2008/294/.
Full textSaeedi, Ardavan. "Nonparametric Bayesian models for Markov jump processes." Thesis, University of British Columbia, 2012. http://hdl.handle.net/2429/42963.
Full textHuang, Wenzong. "Spatial queueing systems and reversible markov processes." Diss., Georgia Institute of Technology, 1996. http://hdl.handle.net/1853/24871.
Full textPaduraru, Cosmin. "Off-policy evaluation in Markov decision processes." Thesis, McGill University, 2013. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=117008.
Full textCette thèse se situe dans le contexte d'un cadre largement utilisé pour formaliser les méchanismes autonomes de décision, à savoir les processus de décision markoviens (MDP). L'un des principaux problèmes qui se posent dans les MDP est celui de l'évaluation d'une stratégie de prise de décision, généralement appelée une politique. C'est souvent le cas qu'obtenir des données recueillies dans le cadre de la politique qu'on souhaite évaluer est difficile, ou même impossible. Dans ce cas, des données recueillies sous une autre politique doivent être utilisées, une situation appelée "évaluation hors-politique". L'objectif principal de cette thèse est de proposer un nouvel éclairage sur les propriétés des méthodes pour l'évaluation hors-politique. Ce résultat est obtenu grâce à une série de nouveaux résultats théoriques et illustrations empiriques. La première série de résultats concerne des problèmes de type bandit (des MDP avec un seul état et une seule étape de décision). Dans cette configuration, le biais et la variance de divers estimateurs hors-politique peuvent être calculés sous forme fermée sans avoir recours à des approximations. Nous comparons également le compromis biais-variance pour les différents estimateurs, du point de vue théorique et empirique. Dans le cadre séquentiel (plus d'une étape de décision), une étude empirique comparative des différents estimateurs hors-politique pour les MDP avec des états et des actions discrètes est menée. Les méthodes comparées sont trois estimateurs existants, ainsi que deux nouveaux proposés dans cette thèse. Tous ces estimateurs se sont avérés convergents et asymptotiquement normaux. L'étude empirique montre comment le comportement relatif des estimateurs est affecté par des changements aux paramètres du problème. L'analyse des MDP discrets est complétée par des formules récursives pour le biais et la variance pour l'estimateur basé sur le modèle. Ce sont les premières formules analytiques pour les MDP à horizon fini, et on montre qu'ils produisent des résultats plus précis que les estimations "bootstrap".La contribution finale consiste à introduire un nouveau cadre pour délimiter le retour d'une politique. Le cadre peut être utilisé chaque fois que des bornes sur le prochain état et la récompense sont disponibles, indépendamment du fait que les espaces d'état et d'action soient discrètes ou continues. Si les limites du prochain état sont calculées en supposant la continuité Lipschitz de la fonction de transition et en utilisant un échantillon de transitions, notre cadre peut conduire à des bornes plus strictes que celles qui sont proposées dans des travaux antérieurs.Tout au long de cette thèse, la performance empirique des estimateurs étudiés est illustrée sur plusieurs problèmes de durabilité: un modèle de calcul des émissions de gaz à effet de serre associées à la consommation de nourriture, un modèle dynamique de la population des mallards, et un domaine de gestion de la pêche.
Dai, Peng. "Faster Dynamic Programming for Markov Decision Processes." UKnowledge, 2007. http://uknowledge.uky.edu/gradschool_theses/428.
Black, Mary. "Applying Markov decision processes in asset management." Thesis, University of Salford, 2003. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.400817.
Full textNieto-Barajas, Luis E. "Bayesian nonparametric survival analysis via Markov processes." Thesis, University of Bath, 2001. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.343767.
Full textMarbach, Peter 1966. "Simulation-based optimization of Markov decision processes." Thesis, Massachusetts Institute of Technology, 1998. http://hdl.handle.net/1721.1/9660.
Full textIncludes bibliographical references (p. 127-129).
Markov decision processes have been a popular paradigm for sequential decision making under uncertainty. Dynamic programming provides a framework for studying such problems, as well as for devising algorithms to compute an optimal control policy. Dynamic programming methods rely on a suitably defined value function that has to be computed for every state in the state space. However, many interesting problems involve very large state spaces (the "curse of dimensionality"), which prohibits the application of dynamic programming. In addition, dynamic programming assumes the availability of an exact model, in the form of transition probabilities (the "curse of modeling"). In many situations, such a model is not available and one must resort to simulation or experimentation with an actual system. For all of these reasons, dynamic programming in its pure form may be inapplicable. In this thesis we study an approach for overcoming these difficulties where we use (a) compact (parametric) representations of the control policy, thus avoiding the curse of dimensionality, and (b) simulation to estimate quantities of interest, thus avoiding model-based computations and the curse of modeling. Furthermore, our approach is not limited to Markov decision processes, but applies to general Markov reward processes for which the transition probabilities and the one-stage rewards depend on a tunable parameter vector θ. We propose gradient-type algorithms for updating θ based on the simulation of a single sample path, so as to improve a given performance measure. As possible performance measures, we consider the weighted reward-to-go and the average reward. The corresponding algorithms (a) can be implemented online and update the parameter vector either at visits to a certain state or at every time step, and (b) have the property that the gradient (with respect to θ) of the performance measure converges to 0 with probability 1. This is the strongest possible result for gradient-related stochastic approximation algorithms.
by Peter Marbach.
Ph.D.
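The simulation-based gradient idea summarised in the Marbach abstract above can be illustrated, in its simplest form, by a likelihood-ratio estimate of the gradient of the expected reward with respect to a policy parameter. The toy two-action problem and softmax parameterisation below are invented for the example; the thesis's online single-sample-path algorithms are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(0)
REWARDS = np.array([1.0, 2.0])                     # illustrative per-action rewards

def policy(theta):
    """Softmax policy over two actions, parameterised by a scalar theta."""
    e = np.exp(np.array([0.0, theta]) - max(0.0, theta))
    return e / e.sum()

def gradient_estimate(theta, n_samples=5000):
    """Likelihood-ratio estimate of d/dtheta E[reward] from simulated actions."""
    probs = policy(theta)
    actions = rng.choice(2, size=n_samples, p=probs)
    rewards = REWARDS[actions]
    score = (actions == 1) - probs[1]              # d log pi(a)/dtheta for this softmax
    return float(np.mean(rewards * score))

theta = 0.0
for _ in range(200):                               # simple stochastic gradient ascent
    theta += 0.5 * gradient_estimate(theta)
print(theta, policy(theta))                        # probability mass moves to the better action
```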
Winder, Lee F. (Lee Francis) 1973. "Hazard avoidance alerting with Markov decision processes." Thesis, Massachusetts Institute of Technology, 2004. http://hdl.handle.net/1721.1/28860.
Full textIncludes bibliographical references (p. 123-125).
This thesis describes an approach to designing hazard avoidance alerting systems based on a Markov decision process (MDP) model of the alerting process, and shows its benefits over standard design methods. One benefit of the MDP method is that it accounts for future decision opportunities when choosing whether or not to alert, or in determining resolution guidance. Another benefit is that it provides a means of modeling uncertain state information, such as unmeasurable mode variables, so that decisions are more informed. A mode variable is an index for distinct types of behavior that a system exhibits at different times. For example, in many situations normal system behavior tends to be safe, but rare deviations from the normal increase the likelihood of a harmful incident. Accurate modeling of mode information is needed to minimize alerting system errors such as unnecessary or late alerts. The benefits of the method are illustrated with two alerting scenarios where a pair of aircraft must avoid collisions when passing one another. The first scenario has a fully observable state and the second includes an uncertain mode describing whether an intruder aircraft levels off safely above the evader or is in a hazardous blunder mode. In MDP theory, outcome preferences are described in terms of utilities of different state trajectories. In keeping with this, alerting system requirements are stated in the form of a reward function. This is then used with probabilistic dynamic and sensor models to compute an alerting logic (policy) that maximizes expected utility. Performance comparisons are made between the MDP-based logics and alternate logics generated with current methods. It is found that in terms of traditional performance measures (incident rate and unnecessary alert rate), the MDP-based logic can meet or exceed that of alternate logics.
by Lee F. Winder.
Ph.D.
Vera, Ruiz Victor. "Recoding of Markov Processes in Phylogenetic Models." Thesis, The University of Sydney, 2014. http://hdl.handle.net/2123/13433.
Full textYu, Huizhen Ph D. Massachusetts Institute of Technology. "Approximate solution methods for partially observable Markov and semi-Markov decision processes." Thesis, Massachusetts Institute of Technology, 2006. http://hdl.handle.net/1721.1/35299.
Full textThis electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Includes bibliographical references (p. 165-169).
We consider approximation methods for discrete-time infinite-horizon partially observable Markov and semi-Markov decision processes (POMDP and POSMDP). One of the main contributions of this thesis is a lower cost approximation method for finite-space POMDPs with the average cost criterion, and its extensions to semi-Markov partially observable problems and constrained POMDP problems, as well as to problems with the undiscounted total cost criterion. Our method is an extension of several lower cost approximation schemes, proposed individually by various authors, for discounted POMDP problems. We introduce a unified framework for viewing all of these schemes together with some new ones. In particular, we establish that due to the special structure of hidden states in a POMDP, there is a class of approximating processes, which are either POMDPs or belief MDPs, that provide lower bounds to the optimal cost function of the original POMDP problem. Theoretically, POMDPs with the long-run average cost criterion are still not fully understood.
The major difficulties relate to the structure of the optimal solutions, such as conditions for a constant optimal cost function, the existence of solutions to the optimality equations, and the existence of optimal policies that are stationary and deterministic. Thus, our lower bound result is useful not only in providing a computational method, but also in characterizing the optimal solution. We show that regardless of these theoretical difficulties, lower bounds of the optimal liminf average cost function can be computed efficiently by solving modified problems using multichain MDP algorithms, and the approximating cost functions can also be used to obtain suboptimal stationary control policies. We prove the asymptotic convergence of the lower bounds under certain assumptions. For semi-Markov problems and total cost problems, we show that the same method can be applied for computing lower bounds of the optimal cost function. For constrained average cost POMDPs, we show that lower bounds of the constrained optimal cost function can be computed by solving finite-dimensional LPs. We also consider reinforcement learning methods for POMDPs and MDPs. We propose an actor-critic type policy gradient algorithm that uses a structured policy known as a finite-state controller.
We thus provide an alternative to the earlier actor-only algorithm GPOMDP. Our work also clarifies the relationship between the reinforcement learning methods for POMDPs and those for MDPs. For average cost MDPs, we provide a convergence and convergence rate analysis for a least squares temporal difference (TD) algorithm, called LSPE, previously proposed for discounted problems. We use this algorithm in the critic portion of the policy gradient algorithm for POMDPs with finite-state controllers. Finally, we investigate the properties of the limsup and liminf average cost functions of various types of policies. We show various convexity and concavity properties of these cost functions, and we give a new necessary condition for the optimal liminf average cost to be constant. Based on this condition, we prove the near-optimality of the class of finite-state controllers under the assumption of a constant optimal liminf average cost. This result provides a theoretical guarantee for the finite-state controller approach.
by Huizhen Yu.
Ph.D.
Ciolek, Gabriela. "Bootstrap and uniform bounds for Harris Markov chains." Thesis, Université Paris-Saclay (ComUE), 2018. http://www.theses.fr/2018SACLT024/document.
This thesis concentrates on some extensions of empirical process theory when the data are Markovian. More specifically, we focus on developments of the bootstrap, robustness and statistical learning theory in a Harris recurrent framework. Our approach relies on regenerative methods, which boil down to dividing sample paths of the regenerative Markov chain under study into independent and identically distributed (i.i.d.) blocks of observations. These regeneration blocks correspond to path segments between random times of visits to a well-chosen set (the atom) forming a renewal sequence. In the first part of the thesis we derive uniform bootstrap central limit theorems for Harris recurrent Markov chains over uniformly bounded classes of functions. We show that the result can be generalized to the unbounded case. We use the aforementioned results to obtain uniform bootstrap central limit theorems for Fréchet differentiable functionals of Harris Markov chains. Motivated by a wide range of applications, we discuss how to extend some concepts of robustness from the i.i.d. framework to a Markovian setting. In particular, we consider the case when the data are piecewise-deterministic Markov processes. Next, we propose residual and wild bootstrap procedures for periodically autoregressive processes and show their consistency. In the second part of the thesis we establish maximal versions of Bernstein, Hoeffding and polynomial tail concentration inequalities. We obtain the inequalities as functions of covering numbers and moments of the return times and of the blocks. Finally, we use those tail inequalities to derive generalization bounds for minimum volume set estimation for regenerative Markov chains.
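The regenerative approach described in this abstract splits a Markov chain trajectory at successive visits to an atom and resamples the resulting blocks. The sketch below does exactly that for a small finite-state chain; the chain, the atom and the statistic are arbitrary illustrations, not the estimators studied in the thesis.

```python
import numpy as np

def regeneration_blocks(path, atom):
    """Cut a trajectory into blocks between successive visits to the atom state."""
    hits = [i for i, s in enumerate(path) if s == atom]
    return [path[hits[k]:hits[k + 1]] for k in range(len(hits) - 1)]

def block_bootstrap_mean(path, atom, f, n_boot=1000, rng=None):
    """Bootstrap the mean of f over the chain by resampling regeneration blocks."""
    rng = rng or np.random.default_rng(0)
    blocks = regeneration_blocks(path, atom)
    estimates = []
    for _ in range(n_boot):
        resampled = [blocks[i] for i in rng.integers(0, len(blocks), size=len(blocks))]
        flat = [s for block in resampled for s in block]
        estimates.append(np.mean([f(s) for s in flat]))
    return np.array(estimates)

# Simulate a small 3-state chain to produce a path.
P = np.array([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.3, 0.3, 0.4]])
rng = np.random.default_rng(1)
state, path = 0, [0]
for _ in range(5000):
    state = rng.choice(3, p=P[state])
    path.append(int(state))

boot = block_bootstrap_mean(path, atom=0, f=lambda s: float(s == 2))
print(boot.mean(), boot.std())                      # bootstrap distribution of a visit frequency
```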
Wortman, M. A. "Vacation queues with Markov schedules." Diss., Virginia Polytechnic Institute and State University, 1988. http://hdl.handle.net/10919/54468.
Full textPh. D.
Dassios, Angelos. "Insurance, storage and point processes : an approach via piecewise deterministic Markov processes." Thesis, Imperial College London, 1987. http://hdl.handle.net/10044/1/38278.