Journal articles on the topic 'Structured continuous time Markov decision processes'

To see the other types of publications on this topic, follow the link: Structured continuous time Markov decision processes.

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Consult the top 50 journal articles for your research on the topic 'Structured continuous time Markov decision processes.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate a bibliographic reference for the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse journal articles in a wide variety of disciplines and organise your bibliography correctly.

1

Shelton, C. R., and G. Ciardo. "Tutorial on Structured Continuous-Time Markov Processes." Journal of Artificial Intelligence Research 51 (December 23, 2014): 725–78. http://dx.doi.org/10.1613/jair.4415.

Abstract:
A continuous-time Markov process (CTMP) is a collection of variables indexed by a continuous quantity, time. It obeys the Markov property that the distribution over a future variable is independent of past variables given the state at the present time. We introduce continuous-time Markov process representations and algorithms for filtering, smoothing, expected sufficient statistics calculations, and model estimation, assuming no prior knowledge of continuous-time processes but some basic knowledge of probability and statistics. We begin by describing "flat" or unstructured Markov processes and then move to structured Markov processes (those arising from state spaces consisting of assignments to variables) including Kronecker, decision-diagram, and continuous-time Bayesian network representations. We provide the first connection between decision-diagrams and continuous-time Bayesian networks.
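For readers new to the formalism, a minimal simulation sketch may make this concrete. The following Python snippet is illustrative only, not code from the paper; the three-state rate matrix and all names are invented. It samples one trajectory of a flat CTMP from its rate matrix using exponential holding times and the embedded jump chain.

import numpy as np

# Hypothetical 3-state rate matrix Q: off-diagonal entries are jump rates,
# each row sums to zero, and the diagonal holds the negative total exit rate.
Q = np.array([[-0.7,  0.5,  0.2],
              [ 0.3, -0.4,  0.1],
              [ 0.0,  0.6, -0.6]])

def simulate_ctmp(Q, x0, t_end, seed=0):
    """Sample one trajectory (list of (time, state) pairs) of a flat CTMP."""
    rng = np.random.default_rng(seed)
    t, x, path = 0.0, x0, [(0.0, x0)]
    while True:
        exit_rate = -Q[x, x]
        if exit_rate <= 0.0:                    # absorbing state: stay forever
            break
        t += rng.exponential(1.0 / exit_rate)   # holding time ~ Exp(exit_rate)
        if t >= t_end:
            break
        jump = Q[x].copy()
        jump[x] = 0.0
        jump /= exit_rate                       # embedded jump-chain distribution
        x = int(rng.choice(len(jump), p=jump))
        path.append((t, x))
    return path

print(simulate_ctmp(Q, x0=0, t_end=10.0))

The structured representations surveyed in the paper (Kronecker, decision-diagram, and continuous-time Bayesian network forms) are ways of avoiding the explicit construction of Q when the state space is a product of variables.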
2

D'Amico, Guglielmo, Jacques Janssen, and Raimondo Manca. "Monounireducible Nonhomogeneous Continuous Time Semi-Markov Processes Applied to Rating Migration Models." Advances in Decision Sciences 2012 (October 16, 2012): 1–12. http://dx.doi.org/10.1155/2012/123635.

Abstract:
Monounireducible nonhomogeneous semi-Markov processes are defined and investigated. The monounireducible topological structure is a sufficient condition that guarantees the absorption of the semi-Markov process in a state of the process. This situation is of fundamental importance in the modelling of credit rating migrations because it permits the derivation of the distribution function of the time of default. An application in credit rating modelling is given in order to illustrate the results.
3

Beutler, Frederick J., and Keith W. Ross. "Uniformization for semi-Markov decision processes under stationary policies." Journal of Applied Probability 24, no. 3 (September 1987): 644–56. http://dx.doi.org/10.2307/3214096.

Abstract:
Uniformization permits the replacement of a semi-Markov decision process (SMDP) by a Markov chain exhibiting the same average rewards for simple (non-randomized) policies. It is shown that various anomalies may occur, especially for stationary (randomized) policies; uniformization introduces virtual jumps with concomitant action changes not present in the original process. Since these lead to discrepancies in the average rewards for stationary processes, uniformization can be accepted as valid only for simple policies. We generalize uniformization to yield consistent results for stationary policies also. These results are applied to constrained optimization of SMDP, in which stationary (randomized) policies appear naturally. The structure of optimal constrained SMDP policies can then be elucidated by studying the corresponding controlled Markov chains. Moreover, constrained SMDP optimal policy computations can be more easily implemented in discrete time, the generalized uniformization being employed to relate discrete- and continuous-time optimal constrained policies.
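For context, the basic uniformization construction the abstract refers to can be stated in one line; this is the standard textbook form, not the authors' generalization for stationary randomized policies. Given a conservative rate matrix Q and any constant Λ with Λ ≥ max_i |q_{ii}|, the uniformized discrete-time chain has transition matrix

\[
P \;=\; I + \tfrac{1}{\Lambda} Q,
\qquad
e^{Qt} \;=\; \sum_{n=0}^{\infty} e^{-\Lambda t}\,\frac{(\Lambda t)^n}{n!}\, P^n ,
\]

so the continuous-time chain is recovered by running P at the jump times of a Poisson(Λ) clock; the "virtual jumps" mentioned above are exactly the self-transitions that P introduces.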
4

Beutler, Frederick J., and Keith W. Ross. "Uniformization for semi-Markov decision processes under stationary policies." Journal of Applied Probability 24, no. 03 (September 1987): 644–56. http://dx.doi.org/10.1017/s0021900200031375.

Abstract:
Uniformization permits the replacement of a semi-Markov decision process (SMDP) by a Markov chain exhibiting the same average rewards for simple (non-randomized) policies. It is shown that various anomalies may occur, especially for stationary (randomized) policies; uniformization introduces virtual jumps with concomitant action changes not present in the original process. Since these lead to discrepancies in the average rewards for stationary processes, uniformization can be accepted as valid only for simple policies. We generalize uniformization to yield consistent results for stationary policies also. These results are applied to constrained optimization of SMDP, in which stationary (randomized) policies appear naturally. The structure of optimal constrained SMDP policies can then be elucidated by studying the corresponding controlled Markov chains. Moreover, constrained SMDP optimal policy computations can be more easily implemented in discrete time, the generalized uniformization being employed to relate discrete- and continuous-time optimal constrained policies.
5

Dibangoye, Jilles Steeve, Christopher Amato, Olivier Buffet, and François Charpillet. "Optimally Solving Dec-POMDPs as Continuous-State MDPs." Journal of Artificial Intelligence Research 55 (February 24, 2016): 443–97. http://dx.doi.org/10.1613/jair.4623.

Abstract:
Decentralized partially observable Markov decision processes (Dec-POMDPs) provide a general model for decision-making under uncertainty in decentralized settings, but are difficult to solve optimally (NEXP-Complete). As a new way of solving these problems, we introduce the idea of transforming a Dec-POMDP into a continuous-state deterministic MDP with a piecewise-linear and convex value function. This approach makes use of the fact that planning can be accomplished in a centralized offline manner, while execution can still be decentralized. This new Dec-POMDP formulation, which we call an occupancy MDP, allows powerful POMDP and continuous-state MDP methods to be used for the first time. To provide scalability, we refine this approach by combining heuristic search and compact representations that exploit the structure present in multi-agent domains, without losing the ability to converge to an optimal solution. In particular, we introduce a feature-based heuristic search value iteration (FB-HSVI) algorithm that relies on feature-based compact representations, point-based updates and efficient action selection. A theoretical analysis demonstrates that FB-HSVI terminates in finite time with an optimal solution. We include an extensive empirical analysis using well-known benchmarks, thereby demonstrating that our approach provides significant scalability improvements compared to the state of the art.
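Paraphrasing the Dec-POMDP literature rather than quoting the paper, the occupancy state that drives this reformulation is, roughly, the distribution

\[
\xi_t(s, \theta) \;=\; \Pr\!\left(s_t = s,\; \theta_t = \theta \mid b_0, \pi_{0:t-1}\right),
\]

where s_t is the hidden system state and θ_t the joint action-observation history of the agents. The optimal value function is piecewise-linear and convex in ξ_t, which is what allows POMDP-style machinery such as heuristic search value iteration to be reused in the decentralized setting.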
6

Pazis, Jason, and Ronald Parr. "Sample Complexity and Performance Bounds for Non-Parametric Approximate Linear Programming." Proceedings of the AAAI Conference on Artificial Intelligence 27, no. 1 (June 30, 2013): 782–88. http://dx.doi.org/10.1609/aaai.v27i1.8696.

Abstract:
One of the most difficult tasks in value function approximation for Markov Decision Processes is finding an approximation architecture that is expressive enough to capture the important structure in the value function, while at the same time not overfitting the training samples. Recent results in non-parametric approximate linear programming (NP-ALP) have demonstrated that this can be done effectively using nothing more than a smoothness assumption on the value function. In this paper we extend these results to the case where samples come from real world transitions instead of the full Bellman equation, adding robustness to noise. In addition, we provide the first max-norm, finite sample performance guarantees for any form of ALP. NP-ALP is amenable to problems with large (multidimensional) or even infinite (continuous) action spaces, and does not require a model to select actions using the resulting approximate solution.
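The smoothness assumption mentioned here is typically a Lipschitz condition of the form

\[
|V(s) - V(s')| \;\le\; L_V\, d(s, s') ,
\]

so that Bellman constraints sampled at finitely many states also constrain the value at nearby unsampled states; the paper's exact assumption and constants may differ.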
7

Abid, Amira, Fathi Abid, and Bilel Kaffel. "CDS-based implied probability of default estimation." Journal of Risk Finance 21, no. 4 (July 21, 2020): 399–422. http://dx.doi.org/10.1108/jrf-05-2019-0079.

Abstract:
Purpose: This study aims to shed more light on the relationship between probability of default, investment horizons and rating classes to make decision-making processes more efficient. Design/methodology/approach: Based on credit default swap (CDS) spreads, a methodology is implemented to determine the implied default probability and the implied rating, and then to estimate the term structure of the market-implied default probability and the transition matrix of implied ratings. The term structure estimation is conducted in discrete time with the Nelson and Siegel model and in continuous time with the Vasicek model. The assessment of the transition matrix is performed using the homogeneous Markov model. Findings: The results show that the CDS-based implied ratings are lower than those based on the Thomson Reuters approach, which can partially be explained by the fact that real-world probabilities are smaller than those obtained in a risk-neutral framework. Moreover, investment-grade and sub-investment-grade companies exhibit different risk profiles with respect to the investment horizon. Originality/value: The originality of this study consists in determining the implied rating based on CDS spreads and detecting the difference between the implied market rating and the Thomson Reuters StarMine rating. The results can be used to analyze credit risk assessments and examine issues related to the Thomson Reuters StarMine credit risk model.
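As background, and as a textbook simplification rather than the authors' exact methodology, the simplest link between a CDS spread and an implied default probability is the so-called credit triangle: with a flat spread s and recovery rate R, the implied default intensity is approximately

\[
\lambda \;\approx\; \frac{s}{1 - R},
\qquad
\Pr(\text{no default by } t) \;\approx\; e^{-\lambda t},
\]

and the paper refines this picture with Nelson-Siegel and Vasicek term-structure fits plus a homogeneous Markov transition matrix over implied rating classes.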
8

Puterman, Martin L., and F. A. Van der Duyn Schouten. "Markov Decision Processes With Continuous Time Parameter." Journal of the American Statistical Association 80, no. 390 (June 1985): 491. http://dx.doi.org/10.2307/2287942.

9

Fu, Yaqing. "Variance Optimization for Continuous-Time Markov Decision Processes." Open Journal of Statistics 09, no. 02 (2019): 181–95. http://dx.doi.org/10.4236/ojs.2019.92014.

10

Guo, Xianping, and Yi Zhang. "Constrained total undiscounted continuous-time Markov decision processes." Bernoulli 23, no. 3 (August 2017): 1694–736. http://dx.doi.org/10.3150/15-bej793.

11

Zhang, Yi. "Continuous-Time Markov Decision Processes with Exponential Utility." SIAM Journal on Control and Optimization 55, no. 4 (January 2017): 2636–60. http://dx.doi.org/10.1137/16m1086261.

12

Dufour, François, and Alexei B. Piunovskiy. "Impulsive Control for Continuous-Time Markov Decision Processes." Advances in Applied Probability 47, no. 1 (March 2015): 106–27. http://dx.doi.org/10.1239/aap/1427814583.

Abstract:
In this paper our objective is to study continuous-time Markov decision processes on a general Borel state space with both impulsive and continuous controls for the infinite time horizon discounted cost. The continuous-time controlled process is shown to be nonexplosive under appropriate hypotheses. The so-called Bellman equation associated to this control problem is studied. Sufficient conditions ensuring the existence and the uniqueness of a bounded measurable solution to this optimality equation are provided. Moreover, it is shown that the value function of the optimization problem under consideration satisfies this optimality equation. Sufficient conditions are also presented to ensure on the one hand the existence of an optimal control strategy, and on the other hand the existence of an ε-optimal control strategy. The decomposition of the state space into two disjoint subsets is exhibited where, roughly speaking, one should apply a gradual action or an impulsive action correspondingly to obtain an optimal or ε-optimal strategy. An interesting consequence of our previous results is as follows: the set of strategies that allow interventions at time t = 0 and only immediately after natural jumps is a sufficient set for the control problem under consideration.
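For orientation, the gradual-control part of such a Bellman equation usually takes the standard discounted-CTMDP form (the paper's version additionally couples this with an intervention operator for the impulsive actions): with discount rate α > 0, running cost c, and transition rate kernel q(dy | x, a) satisfying q(X | x, a) = 0,

\[
\alpha V(x) \;=\; \inf_{a \in A(x)} \Big[\, c(x, a) + \int_X V(y)\, q(\mathrm{d}y \mid x, a) \Big].
\]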
13

Dufour, François, and Alexei B. Piunovskiy. "Impulsive Control for Continuous-Time Markov Decision Processes." Advances in Applied Probability 47, no. 01 (March 2015): 106–27. http://dx.doi.org/10.1017/s0001867800007722.

Abstract:
In this paper our objective is to study continuous-time Markov decision processes on a general Borel state space with both impulsive and continuous controls for the infinite time horizon discounted cost. The continuous-time controlled process is shown to be nonexplosive under appropriate hypotheses. The so-called Bellman equation associated to this control problem is studied. Sufficient conditions ensuring the existence and the uniqueness of a bounded measurable solution to this optimality equation are provided. Moreover, it is shown that the value function of the optimization problem under consideration satisfies this optimality equation. Sufficient conditions are also presented to ensure on the one hand the existence of an optimal control strategy, and on the other hand the existence of an ε-optimal control strategy. The decomposition of the state space into two disjoint subsets is exhibited where, roughly speaking, one should apply a gradual action or an impulsive action correspondingly to obtain an optimal or ε-optimal strategy. An interesting consequence of our previous results is as follows: the set of strategies that allow interventions at time t = 0 and only immediately after natural jumps is a sufficient set for the control problem under consideration.
14

Piunovskiy, Alexey. "Realizable Strategies in Continuous-Time Markov Decision Processes." SIAM Journal on Control and Optimization 56, no. 1 (January 2018): 473–95. http://dx.doi.org/10.1137/17m1138959.

15

Hu, Qiying. "Continuous time shock Markov decision processes with discounted criterion." Optimization 25, no. 2-3 (January 1992): 271–83. http://dx.doi.org/10.1080/02331939208843824.

16

Wei, Qingda. "Mean–semivariance optimality for continuous-time Markov decision processes." Systems & Control Letters 125 (March 2019): 67–74. http://dx.doi.org/10.1016/j.sysconle.2019.02.001.

17

Guo, Xianping, XinYuan Song, and Junyu Zhang. "Bias optimality for multichain continuous-time Markov decision processes." Operations Research Letters 37, no. 5 (September 2009): 317–21. http://dx.doi.org/10.1016/j.orl.2009.04.005.

18

Zhang, Lanlan, and Xianping Guo. "Constrained continuous-time Markov decision processes with average criteria." Mathematical Methods of Operations Research 67, no. 2 (March 23, 2007): 323–40. http://dx.doi.org/10.1007/s00186-007-0154-0.

19

Hu, Q. Y. "Nonstationary Continuous Time Markov Decision Processes with Discounted Criterion." Journal of Mathematical Analysis and Applications 180, no. 1 (November 1993): 60–70. http://dx.doi.org/10.1006/jmaa.1993.1382.

20

Hu, Qiying. "Continuous Time Markov Decision Processes with Discounted Moment Criterion." Journal of Mathematical Analysis and Applications 203, no. 1 (October 1996): 1–12. http://dx.doi.org/10.1006/jmaa.1996.9999.

21

Piunovskiy, Alexey, and Yi Zhang. "The Transformation Method for Continuous-Time Markov Decision Processes." Journal of Optimization Theory and Applications 154, no. 2 (March 2, 2012): 691–712. http://dx.doi.org/10.1007/s10957-012-0015-8.

22

Bartocci, Ezio, Luca Bortolussi, Tomáš Brázdil, Dimitrios Milios, and Guido Sanguinetti. "Policy learning in continuous-time Markov decision processes using Gaussian Processes." Performance Evaluation 116 (November 2017): 84–100. http://dx.doi.org/10.1016/j.peva.2017.08.007.

23

Guo, Xianping. "Constrained Optimization for Average Cost Continuous-Time Markov Decision Processes." IEEE Transactions on Automatic Control 52, no. 6 (June 2007): 1139–43. http://dx.doi.org/10.1109/tac.2007.899040.

24

Guo, Xianping, and Xinyuan Song. "Mean-Variance Criteria for Finite Continuous-Time Markov Decision Processes." IEEE Transactions on Automatic Control 54, no. 9 (September 2009): 2151–57. http://dx.doi.org/10.1109/tac.2009.2023833.

25

Piunovskiy, Alexey. "Randomized and Relaxed Strategies in Continuous-Time Markov Decision Processes." SIAM Journal on Control and Optimization 53, no. 6 (January 2015): 3503–33. http://dx.doi.org/10.1137/15m1014012.

26

Piunovskiy, A. B. "Discounted Continuous Time Markov Decision Processes: The Convex Analytic Approach." IFAC Proceedings Volumes 38, no. 1 (2005): 31–36. http://dx.doi.org/10.3182/20050703-6-cz-1902.00357.

27

Guo, Xianping, Mantas Vykertas, and Yi Zhang. "Absorbing Continuous-Time Markov Decision Processes with Total Cost Criteria." Advances in Applied Probability 45, no. 2 (June 2013): 490–519. http://dx.doi.org/10.1239/aap/1370870127.

Abstract:
In this paper we study absorbing continuous-time Markov decision processes in Polish state spaces with unbounded transition and cost rates, and history-dependent policies. The performance measure is the expected total undiscounted costs. For the unconstrained problem, we show the existence of a deterministic stationary optimal policy, whereas, for the constrained problems with N constraints, we show the existence of a mixed stationary optimal policy, where the mixture is over no more than N+1 deterministic stationary policies. Furthermore, the strong duality result is obtained for the associated linear programs.
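In the convex-analytic approach that underlies results of this kind, the constrained problem is typically rewritten as a linear program over occupation measures μ of the controlled process (a sketch under standard assumptions, not the paper's exact statement):

\[
\text{minimize } \int c_0 \,\mathrm{d}\mu
\quad\text{subject to}\quad
\int c_n \,\mathrm{d}\mu \le d_n, \;\; n = 1, \dots, N,
\]

with μ ranging over the occupation measures generated by admissible policies; a mixture of at most N+1 deterministic stationary policies then plays the role of an extreme-point (basic) solution of this program.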
28

Guo, Xianping, Mantas Vykertas, and Yi Zhang. "Absorbing Continuous-Time Markov Decision Processes with Total Cost Criteria." Advances in Applied Probability 45, no. 02 (June 2013): 490–519. http://dx.doi.org/10.1017/s0001867800006418.

Abstract:
In this paper we study absorbing continuous-time Markov decision processes in Polish state spaces with unbounded transition and cost rates, and history-dependent policies. The performance measure is the expected total undiscounted costs. For the unconstrained problem, we show the existence of a deterministic stationary optimal policy, whereas, for the constrained problems with N constraints, we show the existence of a mixed stationary optimal policy, where the mixture is over no more than N+1 deterministic stationary policies. Furthermore, the strong duality result is obtained for the associated linear programs.
29

Anselmi, Jonatha, François Dufour, and Tomás Prieto-Rumeau. "Computable approximations for average Markov decision processes in continuous time." Journal of Applied Probability 55, no. 2 (June 2018): 571–92. http://dx.doi.org/10.1017/jpr.2018.36.

Abstract:
In this paper we study the numerical approximation of the optimal long-run average cost of a continuous-time Markov decision process, with Borel state and action spaces, and with bounded transition and reward rates. Our approach uses a suitable discretization of the state and action spaces to approximate the original control model. The approximation error for the optimal average reward is then bounded by a linear combination of coefficients related to the discretization of the state and action spaces, namely, the Wasserstein distance between an underlying probability measure μ and a measure with finite support, and the Hausdorff distance between the original and the discretized action sets. When approximating μ with its empirical probability measure we obtain convergence in probability at an exponential rate. An application to a queueing system is presented.
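For reference, the two distances controlling the error bound are the standard ones:

\[
W_1(\mu, \nu) \;=\; \inf_{\gamma \in \Gamma(\mu, \nu)} \int d(x, y)\, \gamma(\mathrm{d}x, \mathrm{d}y),
\qquad
d_H(A, B) \;=\; \max\Big\{ \sup_{a \in A} \inf_{b \in B} d(a, b),\; \sup_{b \in B} \inf_{a \in A} d(a, b) \Big\},
\]

where Γ(μ, ν) is the set of couplings of μ and ν; replacing μ by the empirical measure of i.i.d. samples makes W_1 small with high probability, which is the source of the exponential convergence rate mentioned above.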
30

Guo, Xianping, Yonghui Huang, and Yi Zhang. "Constrained Continuous-Time Markov Decision Processes on the Finite Horizon." Applied Mathematics & Optimization 75, no. 2 (April 15, 2016): 317–41. http://dx.doi.org/10.1007/s00245-016-9352-6.

31

Ye, Liuer, and Xianping Guo. "Continuous-Time Markov Decision Processes with State-Dependent Discount Factors." Acta Applicandae Mathematicae 121, no. 1 (February 24, 2012): 5–27. http://dx.doi.org/10.1007/s10440-012-9669-3.

32

Guo, Xianping, and Xinyuan Song. "Discounted continuous-time constrained Markov decision processes in Polish spaces." Annals of Applied Probability 21, no. 5 (October 2011): 2016–49. http://dx.doi.org/10.1214/10-aap749.

33

Wei, Qingda. "Finite approximation for finite-horizon continuous-time Markov decision processes." 4OR 15, no. 1 (June 11, 2016): 67–84. http://dx.doi.org/10.1007/s10288-016-0321-3.

34

Zhu, Quan-xin. "Variance minimization for continuous-time Markov decision processes: two approaches." Applied Mathematics-A Journal of Chinese Universities 25, no. 4 (December 2010): 400–410. http://dx.doi.org/10.1007/s11766-010-2428-1.

35

Zhang, Junyu, and Xi-Ren Cao. "Continuous-time Markov decision processes with nth-bias optimality criteria." Automatica 45, no. 7 (July 2009): 1628–38. http://dx.doi.org/10.1016/j.automatica.2009.03.009.

36

Guo, Xianping, and Liuer Ye. "New discount and average optimality conditions for continuous-time Markov decision processes." Advances in Applied Probability 42, no. 4 (December 2010): 953–85. http://dx.doi.org/10.1239/aap/1293113146.

Abstract:
This paper deals with continuous-time Markov decision processes in Polish spaces, under the discounted and average cost criteria. All underlying Markov processes are determined by given transition rates which are allowed to be unbounded, and the costs are assumed to be bounded below. By introducing an occupation measure of a randomized Markov policy and analyzing properties of occupation measures, we first show that the family of all randomized stationary policies is ‘sufficient’ within the class of all randomized Markov policies. Then, under the semicontinuity and compactness conditions, we prove the existence of a discounted cost optimal stationary policy by providing a value iteration technique. Moreover, by developing a new average cost, minimum nonnegative solution method, we prove the existence of an average cost optimal stationary policy under some reasonably mild conditions. Finally, we use some examples to illustrate applications of our results. Except that the costs are assumed to be bounded below, the conditions for the existence of discounted cost (or average cost) optimal policies are much weaker than those in the previous literature, and the minimum nonnegative solution approach is new.
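For readers unfamiliar with the occupation-measure technique, the discounted occupation measure of a policy π from initial state x is commonly defined (up to a normalizing constant, conventions vary) as

\[
\mu_x^{\pi}(\mathrm{d}y, \mathrm{d}a) \;=\; \int_0^{\infty} e^{-\alpha t}\, \mathbb{P}_x^{\pi}\big(x_t \in \mathrm{d}y,\; a_t \in \mathrm{d}a\big)\, \mathrm{d}t,
\]

so that the discounted cost of π is just the integral of the cost rate c against this measure; the 'sufficiency' of randomized stationary policies means that every such measure is reproduced by some stationary policy.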
37

Guo, Xianping, and Liuer Ye. "New discount and average optimality conditions for continuous-time Markov decision processes." Advances in Applied Probability 42, no. 04 (December 2010): 953–85. http://dx.doi.org/10.1017/s000186780000447x.

Abstract:
This paper deals with continuous-time Markov decision processes in Polish spaces, under the discounted and average cost criteria. All underlying Markov processes are determined by given transition rates which are allowed to be unbounded, and the costs are assumed to be bounded below. By introducing an occupation measure of a randomized Markov policy and analyzing properties of occupation measures, we first show that the family of all randomized stationary policies is ‘sufficient’ within the class of all randomized Markov policies. Then, under the semicontinuity and compactness conditions, we prove the existence of a discounted cost optimal stationary policy by providing a value iteration technique. Moreover, by developing a new average cost, minimum nonnegative solution method, we prove the existence of an average cost optimal stationary policy under some reasonably mild conditions. Finally, we use some examples to illustrate applications of our results. Except that the costs are assumed to be bounded below, the conditions for the existence of discounted cost (or average cost) optimal policies are much weaker than those in the previous literature, and the minimum nonnegative solution approach is new.
38

Hu, Q. Y. "Nonstationary Continuous Time Markov Decision Processes in a Semi-Markov Environment with Discounted Criterion." Journal of Mathematical Analysis and Applications 194, no. 3 (September 1995): 640–59. http://dx.doi.org/10.1006/jmaa.1995.1322.

39

Huang, Xiangxiang, and Liuer Ye. "A mean-variance optimization problem for continuous-time Markov decision processes." SCIENTIA SINICA Mathematica 44, no. 8 (August 1, 2014): 883–98. http://dx.doi.org/10.1360/n012013-00117.

40

Wei, Qingda, and Xian Chen. "Risk-sensitive average continuous-time Markov decision processes with unbounded rates." Optimization 68, no. 4 (November 15, 2018): 773–800. http://dx.doi.org/10.1080/02331934.2018.1547382.

41

Guo, Xianping, and Lanlan Zhang. "Total reward criteria for unconstrained/constrained continuous-time Markov decision processes." Journal of Systems Science and Complexity 24, no. 3 (June 2011): 491–505. http://dx.doi.org/10.1007/s11424-011-8004-9.

42

Zou, Xiaolong, and Yonghui Huang. "Verifiable conditions for average optimality of continuous-time Markov decision processes." Operations Research Letters 44, no. 6 (November 2016): 742–46. http://dx.doi.org/10.1016/j.orl.2016.09.007.

43

Huo, Haifeng, Xiaolong Zou, and Xianping Guo. "The risk probability criterion for discounted continuous-time Markov decision processes." Discrete Event Dynamic Systems 27, no. 4 (August 10, 2017): 675–99. http://dx.doi.org/10.1007/s10626-017-0257-6.

44

Feinberg, Eugene A. "Continuous Time Discounted Jump Markov Decision Processes: A Discrete-Event Approach." Mathematics of Operations Research 29, no. 3 (August 2004): 492–524. http://dx.doi.org/10.1287/moor.1040.0089.

45

Guo, Xianping, and Zhong-Wei Liao. "Risk-Sensitive Discounted Continuous-Time Markov Decision Processes with Unbounded Rates." SIAM Journal on Control and Optimization 57, no. 6 (January 2019): 3857–83. http://dx.doi.org/10.1137/18m1222016.

46

Liu, Qiuli, Hangsheng Tan, and Xianping Guo. "Denumerable continuous-time Markov decision processes with multiconstraints on average costs." International Journal of Systems Science 43, no. 3 (March 2012): 576–85. http://dx.doi.org/10.1080/00207721.2010.517868.

47

Guo, Xianping, and Ulrich Rieder. "Average optimality for continuous-time Markov decision processes in Polish spaces." Annals of Applied Probability 16, no. 2 (May 2006): 730–56. http://dx.doi.org/10.1214/105051606000000105.

48

Guo, Xianping, Onésimo Hernández-Lerma, Tomás Prieto-Rumeau, Xi-Ren Cao, Junyu Zhang, Qiying Hu, Mark E. Lewis, and Ricardo Vélez. "A survey of recent results on continuous-time Markov decision processes." TOP 14, no. 2 (December 2006): 177–261. http://dx.doi.org/10.1007/bf02837562.

49

Buchholz, Peter, and Ingo Schulz. "Numerical analysis of continuous time Markov decision processes over finite horizons." Computers & Operations Research 38, no. 3 (March 2011): 651–59. http://dx.doi.org/10.1016/j.cor.2010.08.011.

50

Gopalan, Nakul, Marie DesJardins, Michael Littman, James MacGlashan, Shawn Squire, Stefanie Tellex, John Winder, and Lawson Wong. "Planning with Abstract Markov Decision Processes." Proceedings of the International Conference on Automated Planning and Scheduling 27 (June 5, 2017): 480–88. http://dx.doi.org/10.1609/icaps.v27i1.13867.

Abstract:
Robots acting in human-scale environments must plan under uncertainty in large state–action spaces and face constantly changing reward functions as requirements and goals change. Planning under uncertainty in large state–action spaces requires hierarchical abstraction for efficient computation. We introduce a new hierarchical planning framework called Abstract Markov Decision Processes (AMDPs) that can plan in a fraction of the time needed for complex decision making in ordinary MDPs. AMDPs provide abstract states, actions, and transition dynamics in multiple layers above a base-level “flat” MDP. AMDPs decompose problems into a series of subtasks with both local reward and local transition functions used to create policies for subtasks. The resulting hierarchical planning method is independently optimal at each level of abstraction, and is recursively optimal when the local reward and transition functions are correct. We present empirical results showing significantly improved planning speed, while maintaining solution quality, in the Taxi domain and in a mobile-manipulation robotics problem. Furthermore, our approach allows specification of a decision-making model for a mobile-manipulation problem on a Turtlebot, spanning from low-level control actions operating on continuous variables all the way up through high-level object manipulation tasks.
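As a purely illustrative sketch (not the authors' implementation; every name and model below is invented), each level of such a hierarchy can be viewed as a small local MDP over abstract states and abstract actions, solved independently, for example by value iteration:

class Subtask:
    """One level of a hypothetical AMDP-style hierarchy (illustrative only)."""
    def __init__(self, states, actions, P, R, gamma=0.95):
        self.states = states      # abstract state labels
        self.actions = actions    # abstract actions (child subtasks or primitives)
        self.P = P                # local transitions: (s, a) -> {s': prob}
        self.R = R                # local rewards:     (s, a) -> float
        self.gamma = gamma

    def q_value(self, V, s, a):
        return self.R[(s, a)] + self.gamma * sum(
            p * V[s2] for s2, p in self.P[(s, a)].items())

    def solve(self, iters=200):
        """Value iteration over abstract states; returns a local policy s -> a."""
        V = {s: 0.0 for s in self.states}
        for _ in range(iters):
            V = {s: max(self.q_value(V, s, a) for a in self.actions)
                 for s in self.states}
        return {s: max(self.actions, key=lambda a: self.q_value(V, s, a))
                for s in self.states}

Planning then proceeds top-down: the root's policy selects an abstract action, which is either executed directly or expanded by invoking the corresponding child subtask's policy, mirroring the recursive optimality property described in the abstract.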