Academic literature on the topic 'Markov decision theory'
Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles
Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Markov decision theory.'
Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the academic publication as a PDF and read its abstract online whenever they are available in the metadata.
Journal articles on the topic "Markov decision theory"
Weng, Paul, and Olivier Spanjaard. "Functional Reward Markov Decision Processes: Theory and Applications." International Journal on Artificial Intelligence Tools 26, no. 03 (June 2017): 1760014. http://dx.doi.org/10.1142/s0218213017600144.
Buchholz, Peter. "Bounding reward measures of Markov models using the Markov decision processes." Numerical Linear Algebra with Applications 18, no. 6 (October 18, 2011): 919–30. http://dx.doi.org/10.1002/nla.792.
Ortega-Gutiérrez, R. Israel, and H. Cruz-Suárez. "A Moreau-Yosida regularization for Markov decision processes." Proyecciones (Antofagasta) 40, no. 1 (February 1, 2020): 117–37. http://dx.doi.org/10.22199/issn.0717-6279-2021-01-0008.
Koole, Ger. "Monotonicity in Markov Reward and Decision Chains: Theory and Applications." Foundations and Trends® in Stochastic Systems 1, no. 1 (2006): 1–76. http://dx.doi.org/10.1561/0900000002.
Cai, Lin. "Research of Optimizing Computer Network Based on Dynamism Theory." Applied Mechanics and Materials 556-562 (May 2014): 5356–58. http://dx.doi.org/10.4028/www.scientific.net/amm.556-562.5356.
Brázdil, Tomáš, Václav Brožek, Vojtěch Forejt, and Antonín Kučera. "Reachability in recursive Markov decision processes." Information and Computation 206, no. 5 (May 2008): 520–37. http://dx.doi.org/10.1016/j.ic.2007.09.002.
Barker, Richard J., and Matthew R. Schofield. "Putting Markov Chains Back into Markov Chain Monte Carlo." Journal of Applied Mathematics and Decision Sciences 2007 (October 30, 2007): 1–13. http://dx.doi.org/10.1155/2007/98086.
Barkalov, S. A., A. V. Ananiev, K. S. Ivannikov, and S. I. Moiseev. "Algorithm and methods for management decision-making based on the theory of latent variables under time conditions." Bulletin of the South Ural State University. Ser. Computer Technologies, Automatic Control & Radioelectronics 22, no. 3 (2022): 106–16. http://dx.doi.org/10.14529/ctcr220310.
Kadota, Yoshinobu, Masami Kurano, and Masami Yasuda. "Discounted Markov decision processes with utility constraints." Computers & Mathematics with Applications 51, no. 2 (January 2006): 279–84. http://dx.doi.org/10.1016/j.camwa.2005.11.013.
Full textDissertations / Theses on the topic "Markov decision theory"
Winkelmann, Stefanie [Verfasser]. "Markov Decision Processes with Information Costs : Theory and Application / Stefanie Winkelmann." Berlin : Freie Universität Berlin, 2013. http://d-nb.info/1037343131/34.
Koh, You Beng, and 辜有明. "Bayesian analysis in Markov regime-switching models." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2012. http://hub.hku.hk/bib/B48521644.
Statistics and Actuarial Science
Doctoral
Doctor of Philosophy
Lusena, Christopher. "Finite Memory Policies for Partially Observable Markov Decision Processes." UKnowledge, 2001. http://uknowledge.uky.edu/gradschool_diss/323.
Chuang, Dong-ming. "Risk-sensitive control of discrete-time partially observed Markov decision processes /." Digital version accessible at:, 1999. http://wwwlib.umi.com/cr/utexas/main.
Van Gael, Jurgen. "Bayesian nonparametric hidden Markov models." Thesis, University of Cambridge, 2012. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.610196.
Hudson, Joshua. "A Partially Observable Markov Decision Process for Breast Cancer Screening." Thesis, Linköpings universitet, Statistik och maskininlärning, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-154437.
Pellegrini, Jerônimo. "Processo de decisão de Markov limitados por linguagem." [s.n.], 2006. http://repositorio.unicamp.br/jspui/handle/REPOSIP/276256.
Full textTese (doutorado) - Universidade Estadual de Campinas, Instituto de Computação
Abstract: Markov decision processes (MDPs) are used to model situations where one needs to execute sequences of actions under uncertainty. This work defines a new formulation of Markov decision processes that adds the possibility of restricting the actions and observations to be considered at each decision epoch. These restrictions are described as a finite automaton, so the sequence of possible actions (and observations) considered during the search for an optimal policy is a regular language. We call these language-limited Markov decision processes (LL-MDPs and LL-POMDPs). The use of automata for specifying restrictions makes the modeling process easier. We present different approaches to solving these problems and compare their performance, showing that the solution is feasible, and we also show that in some situations the restrictions can be used to speed up the search for a solution. In addition, we present a modification of LL-POMDPs that makes it possible to specify probabilistic discrete durations for actions and observations.
Doctoral
Information Systems
Doctor of Computer Science
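The restriction mechanism described in the abstract above can be sketched as value iteration on the product of an MDP with a finite automaton over action symbols. Everything in this sketch (the states, rewards, transition probabilities, and the automaton) is an illustrative assumption of mine, not taken from the thesis:

```python
# Illustrative sketch (not from the thesis): value iteration on the
# product of a toy MDP with a DFA that restricts which actions are
# admissible at each decision epoch, in the spirit of LL-MDPs.

GAMMA = 0.9  # discount factor

STATES = ["s0", "s1"]
ACTIONS = ["a", "b"]

# Transition model: P[(state, action)] = [(next_state, probability), ...]
P = {
    ("s0", "a"): [("s0", 0.5), ("s1", 0.5)],
    ("s0", "b"): [("s1", 1.0)],
    ("s1", "a"): [("s0", 1.0)],
    ("s1", "b"): [("s1", 1.0)],
}
R = {("s0", "a"): 1.0, ("s0", "b"): 0.0, ("s1", "a"): 0.0, ("s1", "b"): 2.0}

# DFA over action symbols: after playing "b" we must play "a" before
# "b" becomes admissible again (the language forbids the factor "bb").
DFA = {("q0", "a"): "q0", ("q0", "b"): "q1", ("q1", "a"): "q0"}
DFA_STATES = ["q0", "q1"]

def allowed(q):
    """Actions with a defined DFA transition from automaton state q."""
    return [a for a in ACTIONS if (q, a) in DFA]

def value_iteration(n_iters=200):
    """Optimal values over the product space (MDP state, DFA state)."""
    V = {(s, q): 0.0 for s in STATES for q in DFA_STATES}
    for _ in range(n_iters):
        V = {
            (s, q): max(
                R[(s, a)]
                + GAMMA * sum(p * V[(s2, DFA[(q, a)])] for s2, p in P[(s, a)])
                for a in allowed(q)
            )
            for (s, q) in V
        }
    return V

if __name__ == "__main__":
    V = value_iteration()
    # The restriction bites: at s1 in DFA state q1 the rewarding action
    # "b" is forbidden, so the value is strictly lower than in q0.
    print(V[("s1", "q0")], V[("s1", "q1")])
```

The automaton also illustrates the speed-up the abstract mentions: at each product state only the automaton-admissible actions enter the Bellman maximization, so forbidden actions are never evaluated.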
El Khalfi, Zeineb. "Lexicographic refinements in possibilistic sequential decision-making models." Thesis, Toulouse 3, 2017. http://www.theses.fr/2017TOU30269/document.
This work contributes to possibilistic decision theory, and more specifically to sequential decision-making under possibilistic uncertainty, at both the theoretical and practical levels. Although appealing for its ability to handle qualitative decision problems, possibilistic decision theory suffers from an important drawback: qualitative possibilistic utility criteria compare acts through min and max operators, which leads to a drowning effect. To overcome this lack of decision power, several refinements have been proposed in the literature. Lexicographic refinements are particularly appealing since they allow one to benefit from the expected utility background while remaining "qualitative". However, these refinements are defined only for non-sequential decision problems. In this thesis, we present results on the extension of lexicographic preference relations to sequential decision problems, in particular to possibilistic decision trees and Markov decision processes. This leads to new planning algorithms that are more "decisive" than their original possibilistic counterparts. We first present optimistic and pessimistic lexicographic preference relations between policies, with and without intermediate utilities, that refine the optimistic and pessimistic qualitative utilities respectively. We prove that these new criteria satisfy the principle of Pareto efficiency as well as the property of strict monotonicity. The latter guarantees that dynamic programming can be used to compute lexicographic optimal policies. Considering the problem of policy optimization in possibilistic decision trees and finite-horizon Markov decision processes, we provide adaptations of the dynamic programming algorithm that compute a lexicographic optimal policy in polynomial time. These algorithms are based on the lexicographic comparison of the matrices of trajectories associated with the sub-policies. This algorithmic work is completed by an experimental study that shows the feasibility and the interest of the proposed approach. We then prove that the lexicographic criteria still benefit from an expected utility grounding and can be represented by infinitesimal expected utilities. The last part of our work is devoted to policy optimization in (possibly infinite) stationary Markov decision processes. We propose a value iteration algorithm for the computation of lexicographic optimal policies and extend these results to the infinite-horizon case. Since the size of the matrices increases exponentially (which is especially problematic in the infinite-horizon case), we propose an approximation algorithm that keeps the most interesting part of each matrix of trajectories, namely the first lines and columns. Finally, we report experimental results that show the effectiveness of the algorithms based on truncating the matrices.
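The drowning effect this abstract describes, and the way a lexicographic (leximin) ordering refines the min-based pessimistic criterion, can be illustrated with a minimal sketch. The utility vectors are toy numbers of my own; the thesis applies the comparison to full matrices of trajectories, which this one-dimensional sketch does not reproduce:

```python
# Illustrative sketch (toy numbers): a leximin ordering that refines
# the pessimistic, min-based possibilistic criterion. Sorting a
# utility vector ascending and comparing lexicographically first
# compares the worst outcome, then the second worst, and so on.

def leximin_key(utilities):
    """Key whose lexicographic order refines the min criterion."""
    return sorted(utilities)

# Two acts with the same worst outcome: min() cannot discriminate
# between them (the "drowning effect"), but leximin can.
act_a = [1, 3, 3]
act_b = [1, 4, 5]

assert min(act_a) == min(act_b)                  # tie under min
assert leximin_key(act_b) > leximin_key(act_a)   # leximin prefers act_b
```

Because Python compares lists element by element, `sorted` plus the built-in `>` already implements the leximin order; the optimistic (leximax) counterpart would sort descending instead.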
Ignatieva, Ekaterina. "Adaptive Bayesian sampling with application to 'bubbles'." Connect to e-thesis, 2008. http://theses.gla.ac.uk/356/.
MSc(R) thesis submitted to the Department of Mathematics, Faculty of Information and Mathematical Sciences, University of Glasgow, 2008. Includes bibliographical references.
Wang, Jiahui. "Three essays on econometrics /." Thesis, Connect to this title online; UW restricted, 1997. http://hdl.handle.net/1773/7477.
Books on the topic "Markov decision theory"
Chang, Hyeong Soo. Simulation-Based Algorithms for Markov Decision Processes. 2nd ed. London: Springer London, 2013.
Rieder, Ulrich, and SpringerLink (Online service), eds. Markov Decision Processes with Applications to Finance. Berlin, Heidelberg: Springer-Verlag Berlin Heidelberg, 2011.
Koole, G. M. Monotonicity in Markov reward and decision chains: Theory and applications. Boston: Now Publishers, 2007.
Feinberg, Eugene A. Handbook of Markov Decision Processes: Methods and Applications. Boston, MA: Springer US, 2002.
Ching, Wai-Ki. Markov Chains: Models, Algorithms and Applications. 2nd ed. Boston, MA: Springer US, 2013.
Gamerman, Dani. Markov chain Monte Carlo: Stochastic simulation for Bayesian inference. London: Chapman & Hall, 1997.
Rachev, Svetlozar T. Bayesian Methods in Finance. New York: John Wiley & Sons, Ltd., 2008.
Lopes, Hedibert Freitas, ed. Markov chain Monte Carlo: Stochastic simulation for Bayesian inference. 2nd ed. Boca Raton: Taylor & Francis, 2006.
Book chapters on the topic "Markov decision theory"
Ogryczak, Wlodzimierz, Patrice Perny, and Paul Weng. "On Minimizing Ordered Weighted Regrets in Multiobjective Markov Decision Processes." In Algorithmic Decision Theory, 190–204. Berlin, Heidelberg: Springer Berlin Heidelberg, 2011. http://dx.doi.org/10.1007/978-3-642-24873-3_15.
Brázdil, Tomáš, Václav Brožek, Vojtěch Forejt, and Antonín Kučera. "Reachability in Recursive Markov Decision Processes." In CONCUR 2006 – Concurrency Theory, 358–74. Berlin, Heidelberg: Springer Berlin Heidelberg, 2006. http://dx.doi.org/10.1007/11817949_24.
Doyen, Laurent, Thierry Massart, and Mahsa Shirmohammadi. "Robust Synchronization in Markov Decision Processes." In CONCUR 2014 – Concurrency Theory, 234–48. Berlin, Heidelberg: Springer Berlin Heidelberg, 2014. http://dx.doi.org/10.1007/978-3-662-44584-6_17.
Borda, Monica, Romulus Terebes, Raul Malutan, Ioana Ilea, Mihaela Cislariu, Andreia Miclea, and Stefania Barburiceanu. "Markov Systems." In Randomness and Elements of Decision Theory Applied to Signals, 79–88. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-90314-5_6.
Dodson, Thomas, Nicholas Mattei, and Judy Goldsmith. "A Natural Language Argumentation Interface for Explanation Generation in Markov Decision Processes." In Algorithmic Decision Theory, 42–55. Berlin, Heidelberg: Springer Berlin Heidelberg, 2011. http://dx.doi.org/10.1007/978-3-642-24873-3_4.
Even-Dar, Eyal, and Yishay Mansour. "Approximate Equivalence of Markov Decision Processes." In Learning Theory and Kernel Machines, 581–94. Berlin, Heidelberg: Springer Berlin Heidelberg, 2003. http://dx.doi.org/10.1007/978-3-540-45167-9_42.
Doyen, Laurent, and Marie van den Bogaard. "Bounds for Synchronizing Markov Decision Processes." In Computer Science – Theory and Applications, 133–51. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-09574-0_9.
Delahaye, Benoît, Kim G. Larsen, Axel Legay, Mikkel L. Pedersen, and Andrzej Wąsowski. "Decision Problems for Interval Markov Chains." In Language and Automata Theory and Applications, 274–85. Berlin, Heidelberg: Springer Berlin Heidelberg, 2011. http://dx.doi.org/10.1007/978-3-642-21254-3_21.
Tewari, Ambuj, and Peter L. Bartlett. "Bounded Parameter Markov Decision Processes with Average Reward Criterion." In Learning Theory, 263–77. Berlin, Heidelberg: Springer Berlin Heidelberg, 2007. http://dx.doi.org/10.1007/978-3-540-72927-3_20.
Bäuerle, Nicole, and Ulrich Rieder. "Theory of Finite Horizon Markov Decision Processes." In Universitext, 13–57. Berlin, Heidelberg: Springer Berlin Heidelberg, 2011. http://dx.doi.org/10.1007/978-3-642-18324-9_2.
Conference papers on the topic "Markov decision theory"
Neely, Michael J. "Asynchronous control for coupled Markov decision systems." In 2012 IEEE Information Theory Workshop (ITW 2012). IEEE, 2012. http://dx.doi.org/10.1109/itw.2012.6404677.
Ramos, J. A., and E. I. Verriest. "A 2-D realization theory for Markov chains." In 29th IEEE Conference on Decision and Control. IEEE, 1990. http://dx.doi.org/10.1109/cdc.1990.203709.
Ni, Chengzhuo, and Mengdi Wang. "Maximum Likelihood Tensor Decomposition of Markov Decision Process." In 2019 IEEE International Symposium on Information Theory (ISIT). IEEE, 2019. http://dx.doi.org/10.1109/isit.2019.8849765.
Petreczky, Mihaly, and Rene Vidal. "Realization theory of stochastic jump-Markov linear systems." In 2007 46th IEEE Conference on Decision and Control. IEEE, 2007. http://dx.doi.org/10.1109/cdc.2007.4434509.
Ross, Keith W., and Ravi Varadarajan. "A sample path theory for time-average Markov decision processes." In 26th IEEE Conference on Decision and Control. IEEE, 1987. http://dx.doi.org/10.1109/cdc.1987.272945.
Wen, Ru, Kai Chen, Yilin Zhang, Wenmin Huang, Jiyuan Tian, Kuan Xu, and Jiang Wu. "A model of music perceptual theory based on Markov chains." In 2018 Chinese Control And Decision Conference (CCDC). IEEE, 2018. http://dx.doi.org/10.1109/ccdc.2018.8407293.
Debras, Guillaume, Abdel-Illah Mouaddib, Laurent Jean Pierre, and Simon Le Gloannec. "Dealing With Groups of Actions in Multiagent Markov Decision Processes." In 8th International Conference on Evolutionary Computation Theory and Applications. SCITEPRESS - Science and Technology Publications, 2016. http://dx.doi.org/10.5220/0006048000490058.
Tembine, Hamidou, Jean-Yves Le Boudec, Rachid El-Azouzi, and Eitan Altman. "Mean field asymptotics of Markov Decision Evolutionary Games and teams." In 2009 International Conference on Game Theory for Networks (GameNets). IEEE, 2009. http://dx.doi.org/10.1109/gamenets.2009.5137395.
Tang, Ying, Min Huang, and Wai-Ki Ching. "Performance analysis based Markov theory for Hybrid control serial production lines." In 2010 Chinese Control and Decision Conference (CCDC). IEEE, 2010. http://dx.doi.org/10.1109/ccdc.2010.5498688.
Gatimu, Kevin, and Ben Lee. "qMDP: DASH Adaptation using Queueing Theory within a Markov Decision Process." In 2021 IEEE 18th Annual Consumer Communications & Networking Conference (CCNC). IEEE, 2021. http://dx.doi.org/10.1109/ccnc49032.2021.9369481.
Reports on the topic "Markov decision theory"
Marold, Juliane, Ruth Wagner, Markus Schöbel, and Dietrich Manzey. Decision-making in groups under uncertainty. Fondation pour une culture de sécurité industrielle, February 2012. http://dx.doi.org/10.57071/361udm.
Soloviev, Vladimir N., Andrii O. Bielinskyi, and Natalia A. Kharadzjan. Coverage of the Coronavirus Pandemic through Entropy Measures. CEUR Workshop Proceedings, March 2021. http://dx.doi.org/10.31812/123456789/4427.
Mai Phuong, Nguyen, Hanna North, Duong Minh Tuan, and Nguyen Manh Cuong. Assessment of women’s benefits and constraints in participating in agroforestry exemplar landscapes. World Agroforestry, 2021. http://dx.doi.org/10.5716/wp21015.pdf.
Vingre, Anete, Peter Kolarz, and Billy Bryan. On your marks, get set, fund! Rapid responses to the Covid-19 pandemic. Fteval - Austrian Platform for Research and Technology Policy Evaluation, April 2022. http://dx.doi.org/10.22163/fteval.2022.538.
Rieger, Oya Y., Roger Schonfeld, and Liam Sweeney. The Effectiveness and Durability of Digital Preservation and Curation Systems. Ithaka S+R, July 2022. http://dx.doi.org/10.18665/sr.316990.
AbuMezied, Asmaa, and Rahhal Rahhal. Towards a Gender-Sensitive Private Sector in the OPT. Oxfam, April 2021. http://dx.doi.org/10.21201/2021.7338.
Finkelstain, Israel, Steven Buccola, and Ziv Bar-Shira. Pooling and Pricing Schemes for Marketing Agricultural Products. United States Department of Agriculture, August 1993. http://dx.doi.org/10.32747/1993.7568099.bard.
Stewart, Alastair, and Miranda Morgan. A Final Evaluation of Oxfam's Gendered Enterprise and Markets Programme (2014-18): Summary of findings. Oxfam GB, December 2019. http://dx.doi.org/10.21201/2019.5358.
Findlay, Trevor. The Role of International Organizations in WMD Compliance and Enforcement: Autonomy, Agency, and Influence. The United Nations Institute for Disarmament Research, December 2020. http://dx.doi.org/10.37559/wmd/20/wmdce9.
Morgan, Miranda, and Alastair Stewart. Making Market Systems Work for Women Farmers in Tajikistan: A final evaluation of Oxfam's Gendered Enterprise and Markets programme in Tajikistan. Oxfam GB, December 2019. http://dx.doi.org/10.21201/2019.5372.