Journal articles on the topic 'Markov Decision Process Planning'

To see the other types of publications on this topic, follow the link: Markov Decision Process Planning.

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Consult the top 50 journal articles for your research on the topic 'Markov Decision Process Planning.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse journal articles on a wide variety of disciplines and organise your bibliography correctly.

1

Pinder, Jonathan P. "An Approximation of a Markov Decision Process for Resource Planning." Journal of the Operational Research Society 46, no. 7 (July 1995): 819. http://dx.doi.org/10.2307/2583966.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Pinder, Jonathan P. "An Approximation of a Markov Decision Process for Resource Planning." Journal of the Operational Research Society 46, no. 7 (July 1995): 819–30. http://dx.doi.org/10.1057/jors.1995.115.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Mouaddib, Abdel-Illah. "Vector-Value Markov Decision Process for multi-objective stochastic path planning." International Journal of Hybrid Intelligent Systems 9, no. 1 (March 13, 2012): 45–60. http://dx.doi.org/10.3233/his-2012-0146.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Naguleswaran, Sanjeev, and Langford B. White. "Planning without state space explosion: Petri net to Markov decision process." International Transactions in Operational Research 16, no. 2 (March 2009): 243–55. http://dx.doi.org/10.1111/j.1475-3995.2009.00674.x.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Schell, Greggory J., Wesley J. Marrero, Mariel S. Lavieri, Jeremy B. Sussman, and Rodney A. Hayward. "Data-Driven Markov Decision Process Approximations for Personalized Hypertension Treatment Planning." MDM Policy & Practice 1, no. 1 (July 2016): 238146831667421. http://dx.doi.org/10.1177/2381468316674214.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Nguyen, Truong-Huy, David Hsu, Wee-Sun Lee, Tze-Yun Leong, Leslie Kaelbling, Tomas Lozano-Perez, and Andrew Grant. "CAPIR: Collaborative Action Planning with Intention Recognition." Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment 7, no. 1 (October 9, 2011): 61–66. http://dx.doi.org/10.1609/aiide.v7i1.12425.

Full text
Abstract:
We apply decision theoretic techniques to construct non-player characters that are able to assist a human player in collaborative games. The method is based on solving Markov decision processes, which can be difficult when the game state is described by many variables. To scale to more complex games, the method allows decomposition of a game task into subtasks, each of which can be modelled by a Markov decision process. Intention recognition is used to infer the subtask that the human is currently performing, allowing the helper to assist the human in performing the correct task. Experiments show that the method can be effective, giving near-human level performance in helping a human in a collaborative game.
APA, Harvard, Vancouver, ISO, and other styles
7

Hamasha, Mohammad M., and George Rumbe. "Determining optimal policy for emergency department using Markov decision process." World Journal of Engineering 14, no. 5 (October 2, 2017): 467–72. http://dx.doi.org/10.1108/wje-12-2016-0148.

Full text
Abstract:
Purpose: Emergency departments (EDs) face the challenge of capacity planning caused by high patient demand and limited resources. Inadequate resources lead to increased delays, reduced quality of care and higher health-care costs. Such circumstances call for operational research tools, such as the Markov decision process (MDP), to enable better decision-making. The purpose of this paper is to demonstrate the applicability and use of the MDP in the ED. Design/methodology/approach: The MDP provides invaluable insights into system operations across the different system states (from very busy to unoccupied) to ensure optimal assignment of resources and reduced costs. In this paper, a descriptive health system model based on the MDP is presented, and a numerical example illustrates its suitability for determining optimal policy decisions. Findings: Faced with numerous decisions, hospital managers must ensure that the appropriate technique is used to minimize undesired outcomes. The MDP is shown to be a robust approach that supports critical decision-making processes. It also provides insight into the associated costs, enabling hospital managers to allocate resources efficiently, ensuring quality health care and increased throughput while minimizing costs. Originality/value: Applying the MDP in the ED is a novel and promising starting point; the MDP is a powerful tool for decision-making in critical situations, and the ED needs such a tool.
APA, Harvard, Vancouver, ISO, and other styles
8

Yordanova, Veronika, Hugh Griffiths, and Stephen Hailes. "Rendezvous planning for multiple autonomous underwater vehicles using a Markov decision process." IET Radar, Sonar & Navigation 11, no. 12 (December 2017): 1762–69. http://dx.doi.org/10.1049/iet-rsn.2017.0098.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Hai-Feng, Jiu, Chen Yu, Deng Wei, and Pang Shuo. "Underwater chemical plume tracing based on partially observable Markov decision process." International Journal of Advanced Robotic Systems 16, no. 2 (March 1, 2019): 172988141983187. http://dx.doi.org/10.1177/1729881419831874.

Full text
Abstract:
Chemical plume tracing based on an autonomous underwater vehicle uses chemical signals as guidance to navigate and search in unknown environments. To address the key issue of tracing the plume and locating its source, this article proposes a path-planning strategy based on a partially observable Markov decision process (POMDP) algorithm and an artificial potential field algorithm. The POMDP algorithm is used to construct a source likelihood map and update it in real time with environmental information from the sensors on the autonomous underwater vehicle in the search area. The artificial potential field algorithm uses the source likelihood map to accurately plan the tracing path and guide the autonomous underwater vehicle along it until the source is detected. Simulation experiments on the proposed algorithm show that it performs well and is suitable for chemical plume tracing with an autonomous underwater vehicle. Compared with a bionic method, the simulation results show that the proposed method has a higher success rate and better stability.
APA, Harvard, Vancouver, ISO, and other styles
10

Lin, Yong, Xingjia Lu, and Fillia Makedon. "Approximate Planning in POMDPs with Weighted Graph Models." International Journal on Artificial Intelligence Tools 24, no. 04 (August 2015): 1550014. http://dx.doi.org/10.1142/s0218213015500141.

Full text
Abstract:
Markov decision process (MDP) based heuristic algorithms have been considered simple, fast, but imprecise solutions for partially observable Markov decision processes (POMDPs). The main source of imprecision is how belief points are approximated. We use weighted graphs to model the state space and the belief space in order to analyze the MDP heuristic algorithm in detail. As a result, we provide the prerequisite conditions for building a robust belief graph. We further introduce a dynamic mechanism to manage the belief space in the belief graph, so as to improve efficiency and decrease the space complexity. Experimental results indicate that our approach is fast and yields high-quality solutions for POMDPs.
APA, Harvard, Vancouver, ISO, and other styles
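The MDP-based heuristics discussed in the abstract above are commonly exemplified by the QMDP approximation, which scores each action at a belief by averaging the fully observable MDP action values. The sketch below is a generic illustration of that baseline under assumed array shapes and variable names; it is not the belief-graph method of the cited paper.

```python
import numpy as np

def qmdp_action(belief, q_mdp):
    """Select an action for a POMDP belief with the QMDP heuristic.

    belief : (num_states,) probability vector over hidden states.
    q_mdp  : (num_states, num_actions) action values of the underlying
             fully observable MDP (e.g. obtained by value iteration).
    The heuristic ranks actions by their expected MDP value under the
    belief; it is fast but ignores the value of gathering information,
    which is the imprecision the abstract refers to.
    """
    scores = belief @ q_mdp          # expected value of each action
    return int(np.argmax(scores))
```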
11

Ragi, Shankarachary, and Edwin K. P. Chong. "UAV Path Planning in a Dynamic Environment via Partially Observable Markov Decision Process." IEEE Transactions on Aerospace and Electronic Systems 49, no. 4 (October 2014): 2397–412. http://dx.doi.org/10.1109/taes.2014.6619936.

Full text
APA, Harvard, Vancouver, ISO, and other styles
12

Ragi, Shankarachary, and Edwin K. P. Chong. "UAV Path Planning in a Dynamic Environment via Partially Observable Markov Decision Process." IEEE Transactions on Aerospace and Electronic Systems 49, no. 4 (October 2013): 2397–412. http://dx.doi.org/10.1109/taes.2013.6621824.

Full text
APA, Harvard, Vancouver, ISO, and other styles
13

Gedik, Ridvan, Shengfan Zhang, and Chase Rainwater. "Strategic level proton therapy patient admission planning: a Markov decision process modeling approach." Health Care Management Science 20, no. 2 (January 25, 2016): 286–302. http://dx.doi.org/10.1007/s10729-016-9354-6.

Full text
APA, Harvard, Vancouver, ISO, and other styles
14

Bai, Yun, Saeed Babanajad, and Zheyong Bian. "Transportation infrastructure asset management modeling using Markov decision process under epistemic uncertainties." Smart and Resilient Transport 3, no. 3 (October 25, 2021): 249–65. http://dx.doi.org/10.1108/srt-11-2020-0026.

Full text
Abstract:
Purpose: Transportation infrastructure asset management has long been an active but challenging problem for agencies, which must keep their assets in good condition while facing budgetary limitations. Managing a network of transportation infrastructure assets, especially when the number is large, is a multifaceted challenge. This paper aims to develop a life-cycle cost analysis (LCCA) based transportation infrastructure asset management analytical framework to study the impacts of a few key parameters/factors on deterioration and life-cycle cost. Using the bridge as an example infrastructure type, the framework incorporates an optimization model for optimizing maintenance, repair, rehabilitation (MR&R) and replacement decisions over a finite planning horizon. Design/methodology/approach: The analytical framework is further developed through a series of model variations, scenario and sensitivity analyses, simulation processes and numerical experiments to show the impacts of various parameters/factors and draw managerial insights. One notable analysis explicitly models the epistemic uncertainties of infrastructure deterioration models, which have been overlooked in previous research. The proposed methodology can be adapted to different types of assets for solving general asset management and capital planning problems. Findings: The experiments and case studies revealed several findings. First, the authors showed the importance of the deterioration model parameter (i.e. the Markov transition probability); inaccurate information about it leads to suboptimal solutions and excessive total cost. Second, both the agency cost and the user cost of a single facility have significant impacts on the system cost, and the correlation between them also influences the system cost. Third, an optimal budget can be found, and the system cost is tolerant to budget variations within a certain range. Fourth, the model minimizes the total cost by optimizing the allocation of funds to bridges, weighing the trade-off between user and agency costs. Originality/value: On the path toward the next generation of bridge management system methodologies, the authors explore incorporating the epistemic uncertainties of stochastic deterioration models into bridge MR&R capital planning and decision-making. They propose an optimization approach that not only incorporates the inherent stochasticity of bridge deterioration but also considers the epistemic uncertainties and variances of the Markovian transition probabilities due to data errors or modeling processes.
APA, Harvard, Vancouver, ISO, and other styles
15

Rigter, Marc, Bruno Lacerda, and Nick Hawes. "Minimax Regret Optimisation for Robust Planning in Uncertain Markov Decision Processes." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 13 (May 18, 2021): 11930–38. http://dx.doi.org/10.1609/aaai.v35i13.17417.

Full text
Abstract:
The parameters for a Markov Decision Process (MDP) often cannot be specified exactly. Uncertain MDPs (UMDPs) capture this model ambiguity by defining sets to which the parameters belong. Minimax regret has been proposed as an objective for planning in UMDPs to find robust policies which are not overly conservative. In this work, we focus on planning for Stochastic Shortest Path (SSP) UMDPs with uncertain cost and transition functions. We introduce a Bellman equation to compute the regret for a policy. We propose a dynamic programming algorithm that utilises the regret Bellman equation, and show that it optimises minimax regret exactly for UMDPs with independent uncertainties. For coupled uncertainties, we extend our approach to use options to enable a trade-off between computation and solution quality. We evaluate our approach on both synthetic and real-world domains, showing that it significantly outperforms existing baselines.
APA, Harvard, Vancouver, ISO, and other styles
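For orientation, the minimax regret objective described in the abstract above is usually written as follows; this is a generic textbook form, not necessarily the notation or the exact Bellman formulation of the cited paper:

$$\pi^{*} \in \arg\min_{\pi}\ \max_{\theta \in \Theta}\ \big( V^{*}_{\theta} - V^{\pi}_{\theta} \big)$$

where $\Theta$ is the set of admissible MDP parameters, $V^{\pi}_{\theta}$ is the value of policy $\pi$ in the MDP instance $\theta$, and $V^{*}_{\theta}$ is the optimal value attainable in that instance, so the regret measures how much is lost by committing to $\pi$ before the uncertainty is resolved.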
16

Cheng, Minghui, and Dan M. Frangopol. "Optimal load rating-based inspection planning of corroded steel girders using Markov decision process." Probabilistic Engineering Mechanics 66 (October 2021): 103160. http://dx.doi.org/10.1016/j.probengmech.2021.103160.

Full text
APA, Harvard, Vancouver, ISO, and other styles
17

Mikhalov, Oleksandr Illich, Oleksandr Afrykanovych Stenin, Viktor Petrovych Pasko, Oleksandr Serhiiovych Stenin, and Yurii Opanasovych Tymoshyn. "Situational planning and operational adjustment of the route of the Autonomous robotic underwater vehicle." System technologies 3, no. 122 (October 10, 2019): 3–11. http://dx.doi.org/10.34185/1562-9945-3-122-2019-01.

Full text
Abstract:
Currently, missions (tasks) for an underwater robot are formed using imperative programming methods (both textual and graphical) that describe in detail the sequence of actions the robot must perform to achieve the desired goal. At the same time, only the operator of the underwater robot who composes the mission, for example the delivery of cargo to a target point, has an idea of the goal itself. Such technology is effective if the robot's mission is carried out within an a priori scenario. In other cases, the mission may not be executed at all, or it may be executed with serious violations and a threat to the safety of the vehicle. When assessing the effectiveness of an underwater robot, the degree of its information autonomy, i.e. its ability to act independently in an unknown or insufficiently defined environment, is of fundamental importance. Therefore, the "intellectualization" of the autonomous control system of the underwater robot is extremely important for missions under unforeseen circumstances. For this purpose, the use of an intelligent decision support system is proposed. Two ways to implement optimal decision-making strategies are proposed, based on the mathematical apparatus of the theory of Markov and semi-Markov processes and the Bellman optimality principle. The considered implementations of optimal decision-making strategies concern the strategy for a short, finite cargo-delivery time, which is the most common case in practice, and for a long delivery interval relative to the entire task. In addition, the article discusses ways to find optimal strategies when the time for making single decisions is fixed or when the transition time is random. Hence, the situational approach to decision-making in planning the route of the autonomous robotic underwater vehicle (ARPA) is highly relevant and allows not only assessing the possible situation on the route, but also determining control solutions for operational adjustment of the route using the intelligent decision support system (ISPR). Models of the routing process are developed based on representing the situational model as nodes of a graph whose transitions correspond to control solutions. The paper proposes two ways to implement optimal decision-making strategies based on the mathematical apparatus of the theory of Markov and semi-Markov processes using the Bellman principle of optimality.
APA, Harvard, Vancouver, ISO, and other styles
18

OGRYCZAK, WLODZIMIERZ, PATRICE PERNY, and PAUL WENG. "A COMPROMISE PROGRAMMING APPROACH TO MULTIOBJECTIVE MARKOV DECISION PROCESSES." International Journal of Information Technology & Decision Making 12, no. 05 (September 2013): 1021–53. http://dx.doi.org/10.1142/s0219622013400075.

Full text
Abstract:
A Markov decision process (MDP) is a general model for solving planning problems under uncertainty. It has been extended to multiobjective MDP to address multicriteria or multiagent problems in which the value of a decision must be evaluated according to several viewpoints, sometimes conflicting. Although most of the studies concentrate on the determination of the set of Pareto-optimal policies, we focus here on a more specialized problem that concerns the direct determination of policies achieving well-balanced tradeoffs. To this end, we introduce a reference point method based on the optimization of a weighted ordered weighted average (WOWA) of individual disachievements. We show that the resulting notion of optimal policy does not satisfy the Bellman principle and depends on the initial state. To overcome these difficulties, we propose a solution method based on a linear programming (LP) reformulation of the problem. Finally, we illustrate the feasibility of the proposed method on two types of planning problems under uncertainty arising in navigation of an autonomous agent and in inventory management.
APA, Harvard, Vancouver, ISO, and other styles
19

Bubnov, Yakov. "DNS Data Exfiltration Detection Using Online Planning for POMDP." European Journal of Engineering Research and Science 4, no. 9 (September 10, 2019): 22–25. http://dx.doi.org/10.24018/ejers.2019.4.9.1500.

Full text
Abstract:
This paper addresses the problem of blocking Domain Name System (DNS) exfiltration in a computer network. DNS exfiltration is the unauthorized transfer of sensitive data from the organization's network to a remote adversary. Given a detector of data exfiltration in DNS lookup queries, this paper proposes an approach to automate query-blocking decisions. More precisely, it defines an L-parametric Partially Observable Markov Decision Process (POMDP) formulation to enforce a query-blocking strategy at each network egress point, where L is a hyper-parameter that defines the required level of network security. The efficiency of the approach rests on (i) the absence of interactions between distributed detectors, as blocking decisions are taken individually by each detector; and (ii) the blocking strategy being applied to each individual query, thereby minimizing potentially incorrect blocking decisions.
APA, Harvard, Vancouver, ISO, and other styles
20

Bubnov, Yakov. "DNS Data Exfiltration Detection Using Online Planning for POMDP." European Journal of Engineering and Technology Research 4, no. 9 (September 10, 2019): 22–25. http://dx.doi.org/10.24018/ejeng.2019.4.9.1500.

Full text
Abstract:
This paper addresses the problem of blocking Domain Name System (DNS) exfiltration in a computer network. DNS exfiltration is the unauthorized transfer of sensitive data from the organization's network to a remote adversary. Given a detector of data exfiltration in DNS lookup queries, this paper proposes an approach to automate query-blocking decisions. More precisely, it defines an L-parametric Partially Observable Markov Decision Process (POMDP) formulation to enforce a query-blocking strategy at each network egress point, where L is a hyper-parameter that defines the required level of network security. The efficiency of the approach rests on (i) the absence of interactions between distributed detectors, as blocking decisions are taken individually by each detector; and (ii) the blocking strategy being applied to each individual query, thereby minimizing potentially incorrect blocking decisions.
APA, Harvard, Vancouver, ISO, and other styles
21

Monteiro, Neemias Silva, Vinicius Mariano Goncalves, and Carlos Andrey Maia. "Motion Planning of Mobile Robots in Indoor Topological Environments using Partially Observable Markov Decision Process." IEEE Latin America Transactions 19, no. 8 (August 2021): 1315–24. http://dx.doi.org/10.1109/tla.2021.9475862.

Full text
APA, Harvard, Vancouver, ISO, and other styles
22

AlDurgam, Mohammad M. "An Integrated Inventory and Workforce Planning Markov Decision Process Model with a Variable Production Rate." IFAC-PapersOnLine 52, no. 13 (2019): 2792–97. http://dx.doi.org/10.1016/j.ifacol.2019.11.631.

Full text
APA, Harvard, Vancouver, ISO, and other styles
23

Kim, M., A. Ghate, and M. H. Phillips. "A Markov decision process approach to temporal modulation of dose fractions in radiation therapy planning." Physics in Medicine and Biology 54, no. 14 (June 26, 2009): 4455–76. http://dx.doi.org/10.1088/0031-9155/54/14/007.

Full text
APA, Harvard, Vancouver, ISO, and other styles
24

Xu, Jiuyun, Kun Chen, and Stephan Reiff-Marganiec. "Using Markov Decision Process Model with Logic Scoring of Preference Model to Optimize HTN Web Services Composition." International Journal of Web Services Research 8, no. 2 (April 2011): 53–73. http://dx.doi.org/10.4018/jwsr.2011040103.

Full text
Abstract:
Automatic Web services composition can be achieved using AI planning techniques. HTN planning has been adopted to handle the OWL-S Web service composition problem. However, existing composition methods based on HTN planning have not considered the choice of decompositions available to a problem, which can lead to a variety of valid solutions. In this paper, the authors propose a model of combining a Markov decision process model and HTN planning to address Web services composition. In the model, HTN planning is enhanced to decompose a task in multiple ways and find more than one plan, taking into account both functional and non-functional properties. Furthermore, an evaluation method to choose the optimal plan and experimental results illustrate that the proposed approach works effectively. The paper extends previous work by refining a number of aspects of the approach and applying it to a realistic case study.
APA, Harvard, Vancouver, ISO, and other styles
25

Shu, Mingrui, Xiuyu Zheng, Fengguo Li, Kaiyong Wang, and Qiang Li. "Numerical Simulation of Time-Optimal Path Planning for Autonomous Underwater Vehicles Using a Markov Decision Process Method." Applied Sciences 12, no. 6 (March 17, 2022): 3064. http://dx.doi.org/10.3390/app12063064.

Full text
Abstract:
Many path-planning algorithms developed for land- or air-based autonomous vehicles no longer apply under water. A time-optimal path-planning method for autonomous underwater vehicles (AUVs), based on a Markov decision process (MDP) algorithm, is proposed for the marine environment. Its performance is examined for different oceanic conditions, including complex coastal bathymetry and time-varying ocean currents, revealing advantages over the A* algorithm, a traditional path-planning method. The ocean current is predicted using a regional ocean model and then provided to the MDP algorithm as prior information. A spatial resolution that is computationally efficient yet resolves the relevant features is determined through a series of sensitivity experiments. The simulations demonstrate the importance of incorporating ocean currents in AUV path planning in the real ocean. The MDP algorithm remains robust even when the ocean current is complex.
APA, Harvard, Vancouver, ISO, and other styles
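As a rough illustration of the kind of grid MDP solver such a planner builds on, the following is a minimal value-iteration sketch; the array layout, discount factor, and the idea of encoding currents in the transition model are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def value_iteration(transitions, rewards, gamma=0.95, tol=1e-6):
    """Solve a finite MDP by value iteration.

    transitions : (A, S, S) array with transitions[a, s, s2] = P(s2 | s, a);
                  for AUV planning these probabilities would encode drift
                  caused by the predicted ocean currents.
    rewards     : (A, S) array of immediate rewards, e.g. negative travel time.
    Returns the optimal state values and a greedy policy over grid cells.
    """
    values = np.zeros(transitions.shape[1])
    while True:
        q = rewards + gamma * (transitions @ values)   # (A, S) action values
        new_values = q.max(axis=0)
        if np.max(np.abs(new_values - values)) < tol:
            return new_values, q.argmax(axis=0)
        values = new_values
```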
26

Zhang, Zhen, Jianfeng Wu, Yan Zhao, and Ruining Luo. "Research on Distributed Multi-Sensor Cooperative Scheduling Model Based on Partially Observable Markov Decision Process." Sensors 22, no. 8 (April 14, 2022): 3001. http://dx.doi.org/10.3390/s22083001.

Full text
Abstract:
In the context of distributed defense, multi-sensor networks are required to be able to carry out reasonable planning and scheduling to achieve the purpose of continuous, accurate and rapid target detection. In this paper, a multi-sensor cooperative scheduling model based on the partially observable Markov decision process is proposed. By studying the partially observable Markov decision process and the posterior Cramer–Rao lower bound, a multi-sensor cooperative scheduling model and optimization objective function were established. The improvement of the particle filter algorithm by the beetle swarm optimization algorithm was studied to improve the tracking accuracy of the particle filter. Finally, the improved elephant herding optimization algorithm was used as the solution algorithm of the scheduling scheme, which further improved the algorithm performance of the solution model. The simulation results showed that the model could solve the distributed multi-sensor cooperative scheduling problem well, had higher solution performance than other algorithms, and met the real-time requirements.
APA, Harvard, Vancouver, ISO, and other styles
27

Morley, C. D., and J. B. Thornes. "A Markov Decision Model for Network Flows*." Geographical Analysis 4, no. 2 (September 3, 2010): 180–93. http://dx.doi.org/10.1111/j.1538-4632.1972.tb00468.x.

Full text
APA, Harvard, Vancouver, ISO, and other styles
28

Yuan, Minsen, Zhenshan Shi, and Zhouyi Shen. "Mission Planning in Time-Invariant Domains with MDPs and Gaussian Distribution." Journal of Physics: Conference Series 2386, no. 1 (December 1, 2022): 012022. http://dx.doi.org/10.1088/1742-6596/2386/1/012022.

Full text
Abstract:
This thesis uses the Markov decision process (MDP) to investigate how an underwater robot decides on the optimal path in the presence of different currents. Using MDP decisions allows the machine to reduce unnecessary travel energy and travel time in different environments, improving the cost-effectiveness of operating the machine. The scope of the study is a specific problem, and the paper describes how to use a Python planning framework to solve path selection in the case where the direction and strength of the currents are determined, while incorporating a Gaussian probability distribution in a finite space. The dynamic ocean current problem is used as the basis for applying the MDP decision process. The final results show that the study meets the hypotheses set out in the expectations. The results simulate that the robot can find the optimal path and reach the target point when facing different underwater currents. By expanding the spatial extent of the observation and changing the parameter settings, it is possible to simulate the robot's path in other situations. The research method can be extended from the specific to the general, and the same method can be used to obtain optimal path decisions for other situations.
APA, Harvard, Vancouver, ISO, and other styles
29

Bäuerle, Nicole, and Alexander Glauner. "Minimizing spectral risk measures applied to Markov decision processes." Mathematical Methods of Operations Research 94, no. 1 (July 27, 2021): 35–69. http://dx.doi.org/10.1007/s00186-021-00746-w.

Full text
Abstract:
We study the minimization of a spectral risk measure of the total discounted cost generated by a Markov Decision Process (MDP) over a finite or infinite planning horizon. The MDP is assumed to have Borel state and action spaces and the cost function may be unbounded above. The optimization problem is split into two minimization problems using an infimum representation for spectral risk measures. We show that the inner minimization problem can be solved as an ordinary MDP on an extended state space and give sufficient conditions under which an optimal policy exists. Regarding the infinite dimensional outer minimization problem, we prove the existence of a solution and derive an algorithm for its numerical approximation. Our results include the findings in Bäuerle and Ott (Math Methods Oper Res 74(3):361–379, 2011) in the special case that the risk measure is Expected Shortfall. As an application, we present a dynamic extension of the classical static optimal reinsurance problem, where an insurance company minimizes its cost of capital.
APA, Harvard, Vancouver, ISO, and other styles
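For reference, a spectral risk measure of a cost variable $X$ is commonly defined as below; this is the standard textbook form (conventions vary), not a restatement of the paper's specific assumptions:

$$\rho_{\phi}(X) \;=\; \int_{0}^{1} \phi(u)\, F_X^{-1}(u)\, \mathrm{d}u$$

where $F_X^{-1}$ is the quantile function of $X$ and $\phi$ is a non-negative, non-decreasing spectrum integrating to one; Expected Shortfall at level $\alpha$ is the special case $\phi(u) = \tfrac{1}{1-\alpha}\,\mathbf{1}\{u \ge \alpha\}$, which is how the cited result of Bäuerle and Ott fits in.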
30

Zhang, Jian, Mahjoub Dridi, and Abdellah El Moudni. "A Markov decision model with dead ends for operating room planning considering dynamic patient priority." RAIRO - Operations Research 53, no. 5 (October 15, 2019): 1819–41. http://dx.doi.org/10.1051/ro/2018110.

Full text
Abstract:
This paper addresses an operating room planning problem with surgical demands from both the elective patients and the non-elective ones. A dynamic waiting list is established to prioritize and manage the patients according to their urgency levels and waiting times. In every decision period, sequential decisions are taken by selecting high-priority patients from the waiting list to be scheduled. With consideration of random arrivals of new patients and uncertain surgery durations, the studied problem is formulated as a novel Markov decision process model with dead ends. The objective is to optimize a combinatorial cost function involving patient waiting times and operating room over-utilizations. Considering that the conventional dynamic programming algorithms have difficulties in coping with large-scale problems, we apply several adapted real-time dynamic programming algorithms to solve the proposed model. In numerical experiments, we firstly apply different algorithms to solve the same instance and compare the computational efficiencies. Then, to evaluate the effects of dead ends on the policy and the computation, we conduct simulations for multiple instances with the same problem scale but different dead ends. Experimental results indicate that incorporating dead ends into the model helps to significantly shorten the patient waiting times and improve the computational efficiency.
APA, Harvard, Vancouver, ISO, and other styles
31

Larach, Abdelhadi, Cherki Daoui, and Mohamed Baslam. "A Markov Decision Model for Area Coverage in Autonomous Demining Robot." International Journal of Informatics and Communication Technology (IJ-ICT) 6, no. 2 (August 1, 2017): 105. http://dx.doi.org/10.11591/ijict.v6i2.pp105-116.

Full text
Abstract:
A review of the literature shows a variety of works studying coverage path planning in several autonomous robotic applications. In this work, we propose a new approach using a Markov decision process to plan an optimal path for the general goal of exploring an unknown environment containing buried mines. This approach, called the Goals-to-Goals Area Coverage on-line Algorithm, is based on a decomposition of the state space into smaller regions whose states are treated as goals with the same reward value; the reward value is decremented from one region to another according to the desired search mode. Numerical simulations show that our approach is promising for minimizing the energy cost needed to cover the entire area.
APA, Harvard, Vancouver, ISO, and other styles
32

Mubiru, Kizito Paul. "Joint Replenishment Problem in Drug Inventory Management of Pharmacies under Stochastic Demand." Brazilian Journal of Operations & Production Management 15, no. 2 (June 1, 2018): 302–10. http://dx.doi.org/10.14488/bjopm.2018.v15.n2.a12.

Full text
Abstract:
In today's fast-paced and competitive marketplace, organizations need every available edge to ensure success in planning and managing the inventory of items with demand uncertainty. In such an effort, cost-effective methods for determining optimal replenishment policies are paramount. In this paper, a mathematical model is proposed that optimizes the inventory replenishment policies of a periodic-review inventory system under stochastic demand, with a particular focus on malaria drugs in Ugandan pharmacies. Adopting a Markov decision process approach, the states of a Markov chain represent possible states of demand for drugs that treat malaria. Using equal weekly intervals, decisions on whether or not to replenish additional units of drugs were made using discrete-time Markov chains and dynamic programming over a finite planning horizon. Empirical data were collected from two pharmacies in Uganda. Customer transactions of drugs were recorded on a weekly basis, and the collected data were analyzed and tested to establish the optimal replenishment policy and inventory costs of drugs. Results from the study indicated the existence of an optimal state-dependent replenishment policy and the corresponding inventory costs of drugs at the respective pharmacies.
APA, Harvard, Vancouver, ISO, and other styles
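The finite-horizon dynamic programming described in the abstract above can be sketched generically as backward induction over weekly demand states; the array names, the two-action structure, and the cost convention below are illustrative assumptions, not the paper's actual model.

```python
import numpy as np

def backward_induction(P, cost, horizon):
    """Finite-horizon dynamic programming for a replenishment MDP.

    P    : (A, S, S) array, P[a, s, s2] = probability that weekly demand
           moves from state s to s2 under decision a
           (e.g. a = 0: do not replenish, a = 1: replenish).
    cost : (A, S) array of expected one-week inventory costs.
    Returns cost-to-go tables and the optimal decision per week and state.
    """
    num_states = P.shape[1]
    values = np.zeros((horizon + 1, num_states))
    policy = np.zeros((horizon, num_states), dtype=int)
    for t in range(horizon - 1, -1, -1):
        q = cost + P @ values[t + 1]      # (A, S) expected total cost-to-go
        values[t] = q.min(axis=0)         # choose the cheaper decision
        policy[t] = q.argmin(axis=0)
    return values, policy
```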
33

Zhang, Hanrui, Yu Cheng, and Vincent Conitzer. "Planning with Participation Constraints." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 5 (June 28, 2022): 5260–67. http://dx.doi.org/10.1609/aaai.v36i5.20462.

Full text
Abstract:
We pose and study the problem of planning in Markov decision processes (MDPs), subject to participation constraints as studied in mechanism design. In this problem, a planner must work with a self-interested agent on a given MDP. Each action in the MDP provides an immediate reward to the planner and a (possibly different) reward to the agent. The agent has no control in choosing the actions, but has the option to end the entire process at any time. The goal of the planner is to find a policy that maximizes her cumulative reward, taking into consideration the agent's ability to terminate. We give a fully polynomial-time approximation scheme for this problem. En route, we present polynomial-time algorithms for computing (exact) optimal policies for important special cases of this problem, including when the time horizon is constant, or when the MDP exhibits a "definitive decisions" property. We illustrate our algorithms with two different game-theoretic applications: the problem of assigning rides in ride-sharing and the problem of designing screening policies. Our results imply efficient algorithms for computing (approximately) optimal policies in both applications.
APA, Harvard, Vancouver, ISO, and other styles
34

Bouton, Maxime, Jana Tumova, and Mykel J. Kochenderfer. "Point-Based Methods for Model Checking in Partially Observable Markov Decision Processes." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 06 (April 3, 2020): 10061–68. http://dx.doi.org/10.1609/aaai.v34i06.6563.

Full text
Abstract:
Autonomous systems are often required to operate in partially observable environments. They must reliably execute a specified objective even with incomplete information about the state of the environment. We propose a methodology to synthesize policies that satisfy a linear temporal logic formula in a partially observable Markov decision process (POMDP). By formulating a planning problem, we show how to use point-based value iteration methods to efficiently approximate the maximum probability of satisfying a desired logical formula and compute the associated belief state policy. We demonstrate that our method scales to large POMDP domains and provides strong bounds on the performance of the resulting policy.
APA, Harvard, Vancouver, ISO, and other styles
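The belief state policies mentioned in the abstract above act on beliefs maintained by the standard POMDP Bayes filter; in the usual notation (not specific to the cited paper) the update after taking action $a$ and observing $o$ is

$$b'(s') \;=\; \frac{O(o \mid s', a)\,\sum_{s} T(s' \mid s, a)\, b(s)}{\Pr(o \mid b, a)}$$

where $T$ is the transition model, $O$ the observation model, and the denominator normalizes $b'$ to a probability distribution; point-based value iteration methods back up value vectors only at a sampled set of such beliefs.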
35

Yousefi, Shamim, Farnaz Derakhshan, Hadis Karimipour, and Hadi S. Aghdasi. "An efficient route planning model for mobile agents on the internet of things using Markov decision process." Ad Hoc Networks 98 (March 2020): 102053. http://dx.doi.org/10.1016/j.adhoc.2019.102053.

Full text
APA, Harvard, Vancouver, ISO, and other styles
36

Kareem, B., and HA Owolabi. "Optimizing Maintenance Planning in the Production Industry Using the Markovian Approach." Journal of Engineering Research [TJER] 9, no. 2 (December 1, 2012): 46. http://dx.doi.org/10.24200/tjer.vol9iss2pp46-63.

Full text
Abstract:
Maintenance is an essential activity in every manufacturing establishment, as manufacturing effectiveness counts on the functionality of production equipment and machinery in terms of their productivity and operational life. Maintenance cost minimization can be achieved by adopting an appropriate maintenance planning policy. This paper applies the Markovian approach to maintenance planning decision, thereby generating optimal maintenance policy from the identified alternatives over a specified period of time. Markov chains, transition matrices, decision processes, and dynamic programming models were formulated for the decision problem related to maintenance operations of a cable production company. Preventive and corrective maintenance data based on workloads and costs, were collected from the company and utilized in this study. The result showed variability in the choice of optimal maintenance policy that was adopted in the case study. Post optimality analysis of the process buttressed the claim. The proposed approach is promising for solving the maintenance scheduling decision problems of the company.
APA, Harvard, Vancouver, ISO, and other styles
37

Yang, Qiming, Jiancheng Xu, Haibao Tian, and Yong Wu. "Decision Modeling of UAV On-Line Path Planning Based on IMM." Xibei Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University 36, no. 2 (April 2018): 323–31. http://dx.doi.org/10.1051/jnwpu/20183620323.

Full text
Abstract:
In order to enhance a UAV's capability to track targets autonomously, a model for UAV on-line path planning is established within the theoretical framework of the partially observable Markov decision process (POMDP). The elements of the POMDP model are analyzed and described. To account for the diversity of target motion in the real world, the state-transition law in the POMDP model is described with the Interactive Multiple Model (IMM) method so as to adapt to changes in target maneuvering. The UAV's action strategy is calculated with the nominal belief-state optimization (NBO) algorithm, which searches for the optimal action policy that minimizes the cumulative action cost, and the generated action strategy controls the UAV flight. The simulation results show that the established POMDP model achieves autonomous planning of the UAV route and can control the UAV to track the target effectively. The planned path is more reasonable and efficient than that obtained with a single state-transition law.
APA, Harvard, Vancouver, ISO, and other styles
38

Winder, John, Stephanie Milani, Matthew Landen, Erebus Oh, Shane Parr, Shawn Squire, Marie DesJardins, and Cynthia Matuszek. "Planning with Abstract Learned Models While Learning Transferable Subtasks." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 06 (April 3, 2020): 9992–10000. http://dx.doi.org/10.1609/aaai.v34i06.6555.

Full text
Abstract:
We introduce an algorithm for model-based hierarchical reinforcement learning to acquire self-contained transition and reward models suitable for probabilistic planning at multiple levels of abstraction. We call this framework Planning with Abstract Learned Models (PALM). By representing subtasks symbolically using a new formal structure, the lifted abstract Markov decision process (L-AMDP), PALM learns models that are independent and modular. Through our experiments, we show how PALM integrates planning and execution, facilitating a rapid and efficient learning of abstract, hierarchical models. We also demonstrate the increased potential for learned models to be transferred to new and related tasks.
APA, Harvard, Vancouver, ISO, and other styles
39

Bazrafshan, Nazila, and M. M. Lotfi. "A finite-horizon Markov decision process model for cancer chemotherapy treatment planning: an application to sequential treatment decision making in clinical trials." Annals of Operations Research 295, no. 1 (July 17, 2020): 483–502. http://dx.doi.org/10.1007/s10479-020-03706-5.

Full text
APA, Harvard, Vancouver, ISO, and other styles
40

Saini, Gurtej Singh, Oney Erge, Pradeepkumar Ashok, and Eric van Oort. "Well Construction Action Planning and Automation through Finite-Horizon Sequential Decision-Making." Energies 15, no. 16 (August 9, 2022): 5776. http://dx.doi.org/10.3390/en15165776.

Full text
Abstract:
Well construction operations require continuous complex decision-making and multi-step action planning. Action selection at every step demands a careful evaluation of the vast action space, while guided by long-term objectives and desired outcomes. Current human-centric decision-making introduces a degree of bias, which can result in reactive rather than proactive decisions. This can lead from minor operational inefficiencies all the way to catastrophic health and safety issues. This paper details the steps in structuring unbiased purpose-built sequential decision-making systems. Setting up such systems entails representing the operation as a Markov decision process (MDP). This requires explicitly defining states and action values, defining goal states, building a digital twin to model the process, and appropriately shaping reward functions to measure feedback. The digital twin, in conjunction with the reward function, is utilized for simulating and quantifying the different action sequences. A finite-horizon sequential decision-making system, with discrete state and action space, was set up to advise on hole cleaning during well construction. The state was quantified by the cuttings bed height and the equivalent circulation density values, and the action set was defined using a combination of controllable drilling parameters (including mud density and rheology, drillstring rotation speed, etc.). A non-sparse normalized reward structure was formulated as a function of the state and action values. Hydraulics, cuttings transport, and rig state detection models were integrated to build the hole cleaning digital twin. This system was then used for performance tracking and scenario simulations (with each scenario defined as a finite-horizon action sequence) on real-world oil wells. The different scenarios were compared by monitoring state–action transitions and the evolution of the reward with actions. This paper presents a novel method for setting up well construction operations as long-term finite-horizon sequential decision-making systems, and defines a way to quantify and compare different scenarios. The proper construction of such systems is a crucial step towards automating intelligent decision-making.
APA, Harvard, Vancouver, ISO, and other styles
41

Liu, Xiangguo, Neda Masoud, Qi Zhu, and Anahita Khojandi. "A Markov Decision Process framework to incorporate network-level data in motion planning for connected and automated vehicles." Transportation Research Part C: Emerging Technologies 136 (March 2022): 103550. http://dx.doi.org/10.1016/j.trc.2021.103550.

Full text
APA, Harvard, Vancouver, ISO, and other styles
42

Lee, Chanyoung, Sang Min Han, Young Ho Chae, and Poong Hyun Seong. "Development of a cyberattack response planning method for nuclear power plants by using the Markov decision process model." Annals of Nuclear Energy 166 (February 2022): 108725. http://dx.doi.org/10.1016/j.anucene.2021.108725.

Full text
APA, Harvard, Vancouver, ISO, and other styles
43

Kim, M., A. Ghate, and MH Phillips. "SU-FF-T-170: A Markov Decision Process Approach to Temporal Modulation of Dose Fractions in Radiotherapy Planning." Medical Physics 36, no. 6Part11 (June 2009): 2559. http://dx.doi.org/10.1118/1.3181645.

Full text
APA, Harvard, Vancouver, ISO, and other styles
44

Shao, Jie, Hai Xia Lin, and Bin Song. "A Hierarchical Conflict Resolution Method for Multi-Robot Path Planning." Applied Mechanics and Materials 380-384 (August 2013): 1482–87. http://dx.doi.org/10.4028/www.scientific.net/amm.380-384.1482.

Full text
Abstract:
Multi-robot path planning with shared resources easily leads to conflicts, and prioritisation is an important technique for resolving shared-resource conflicts. This paper presents a learning-classifier-based dynamic priority allocation method to improve the performance of the robot team. Individual robots learn to optimize their behaviors first, and then a high-level planner robot is introduced and trained to resolve conflicts by assigning priorities. The novel approach is designed for Partially Observable Markov Decision Process environments. Simulation results show that the method effectively resolves conflicts in multi-robot path planning and improves multi-robot path-planning capability.
APA, Harvard, Vancouver, ISO, and other styles
45

Mern, John, Anil Yildiz, Zachary Sunberg, Tapan Mukerji, and Mykel J. Kochenderfer. "Bayesian Optimized Monte Carlo Planning." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 13 (May 18, 2021): 11880–87. http://dx.doi.org/10.1609/aaai.v35i13.17411.

Full text
Abstract:
Online solvers for partially observable Markov decision processes have difficulty scaling to problems with large action spaces. Monte Carlo tree search with progressive widening attempts to improve scaling by sampling from the action space to construct a policy search tree. The performance of progressive widening search is dependent upon the action sampling policy, often requiring problem-specific samplers. In this work, we present a general method for efficient action sampling based on Bayesian optimization. The proposed method uses a Gaussian process to model a belief over the action-value function and selects the action that will maximize the expected improvement in the optimal action value. We implement the proposed approach in a new online tree search algorithm called Bayesian Optimized Monte Carlo Planning (BOMCP). Several experiments show that BOMCP is better able to scale to large action space POMDPs than existing state-of-the-art tree search solvers.
APA, Harvard, Vancouver, ISO, and other styles
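Progressive widening, the mechanism the abstract above builds on, caps how many child actions a search node may expand as a function of its visit count; in the usual formulation (generic, with $k$ and $\alpha$ as tuning constants) a node $v$ with visit count $N(v)$ and expanded action set $C(v)$ may add a newly sampled action only while

$$|C(v)| \;\le\; k\, N(v)^{\alpha}, \qquad k > 0,\ \ 0 < \alpha < 1,$$

so the branching factor grows slowly with the number of simulations; BOMCP's contribution is to replace the naive action sampler invoked at that step with a Bayesian-optimization step over a Gaussian-process model of the action value.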
46

Yang, Qing, Xu Sun, Xingxing Liu, and Jinmei Wang. "Multi-Agent Simulation of Individuals’ Escape in the Urban Rainstorm Context Based on Dynamic Recognition-Primed Decision Model." Water 12, no. 4 (April 22, 2020): 1190. http://dx.doi.org/10.3390/w12041190.

Full text
Abstract:
An urban rainstorm can evolve into a serious emergency, generally characterized by high complexity, uncertainty, and time pressure. It is often difficult for individuals to find the optimal response strategy due to limited information and time constraints. Therefore, the classical decision-making method based on the "infinite rationality" assumption sometimes fails to reflect reality. Based on the recognition-primed decision (RPD) model, a dynamic RPD (D-RPD) model is proposed in this paper. The D-RPD model assumes that decision-makers can gain experience during the escape process and that the risk perception of rainstorm disasters can be regarded as a Markov process, with the experience of recent attempts contributing more to decision-making. We design the agent according to the D-RPD model and employ a multi-agent system (MAS) to simulate individuals' decisions in the context of a rainstorm. Our results show that experience helps individuals perform better when escaping a rainstorm, and recency acts as one of the key elements in escape decision-making. We also find that filling the information gap between individuals and the real-time disaster situation would help individuals perform well, especially when individuals tend to avoid extreme decisions.
APA, Harvard, Vancouver, ISO, and other styles
47

Luo, Yuanfu, Haoyu Bai, David Hsu, and Wee Sun Lee. "Importance sampling for online planning under uncertainty." International Journal of Robotics Research 38, no. 2-3 (June 19, 2018): 162–81. http://dx.doi.org/10.1177/0278364918780322.

Full text
Abstract:
The partially observable Markov decision process (POMDP) provides a principled general framework for robot planning under uncertainty. Leveraging the idea of Monte Carlo sampling, recent POMDP planning algorithms have scaled up to various challenging robotic tasks, including real-time online planning for autonomous vehicles. To further improve online planning performance, this paper presents IS-DESPOT, which introduces importance sampling to DESPOT, a state-of-the-art sampling-based POMDP algorithm for planning under uncertainty. Importance sampling improves DESPOT's performance when there are critical but rare events that are difficult to sample. We prove that IS-DESPOT retains the theoretical guarantee of DESPOT. We demonstrate empirically that importance sampling significantly improves the performance of online POMDP planning for suitable tasks. We also present a general method for learning the importance sampling distribution.
APA, Harvard, Vancouver, ISO, and other styles
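The importance sampling underlying IS-DESPOT rests on the standard change-of-measure identity (generic form, independent of the paper's particular distributions):

$$\mathbb{E}_{x \sim p}\,[f(x)] \;=\; \mathbb{E}_{x \sim q}\!\left[ f(x)\,\frac{p(x)}{q(x)} \right],$$

valid whenever the proposal $q$ covers the support of $f \cdot p$; simulation trajectories can therefore be drawn from a $q$ that over-samples critical but rare events and re-weighted by $p/q$, keeping the value estimate unbiased while reducing its variance for suitable choices of $q$.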
48

Witwicki, Stefan, Francisco Melo, Jesús Capitán, and Matthijs Spaan. "A Flexible Approach to Modeling Unpredictable Events in MDPs." Proceedings of the International Conference on Automated Planning and Scheduling 23 (June 2, 2013): 260–68. http://dx.doi.org/10.1609/icaps.v23i1.13566.

Full text
Abstract:
In planning with a Markov decision process (MDP) framework, there is the implicit assumption that the world is predictable. Practitioners must simply take it on good faith that the MDP they have constructed is comprehensive and accurate enough to model the exact probabilities with which all events may occur under all circumstances. Here, we challenge the conventional assumption of complete predictability, arguing that some events are inherently unpredictable. Towards more effectively modeling problems with unpredictable events, we develop a hybrid framework that explicitly distinguishes decision factors whose probabilities are not assigned precisely while still representing known probability components using conventional principled MDP transitions. Our approach is also flexible, resulting in a factored model of variable abstraction whose usage for planning results in different levels of approximation. We illustrate the application of our framework to an intelligent surveillance planning domain.
APA, Harvard, Vancouver, ISO, and other styles
49

Mausam and D. S. Weld. "Planning with Durative Actions in Stochastic Domains." Journal of Artificial Intelligence Research 31 (January 19, 2008): 33–82. http://dx.doi.org/10.1613/jair.2269.

Full text
Abstract:
Probabilistic planning problems are typically modeled as a Markov Decision Process (MDP). MDPs, while an otherwise expressive model, allow only for sequential, non-durative actions. This poses severe restrictions in modeling and solving a real world planning problem. We extend the MDP model to incorporate 1) simultaneous action execution, 2) durative actions, and 3) stochastic durations. We develop several algorithms to combat the computational explosion introduced by these features. The key theoretical ideas used in building these algorithms are -- modeling a complex problem as an MDP in extended state/action space, pruning of irrelevant actions, sampling of relevant actions, using informed heuristics to guide the search, hybridizing different planners to achieve benefits of both, approximating the problem and replanning. Our empirical evaluation illuminates the different merits in using various algorithms, viz., optimality, empirical closeness to optimality, theoretical error bounds, and speed.
APA, Harvard, Vancouver, ISO, and other styles
50

Matignon, Laetitia, Laurent Jeanpierre, and Abdel-Illah Mouaddib. "DECENTRALIZED MULTI-ROBOT PLANNING TO EXPLORE AND PERCEIVE." Acta Polytechnica 55, no. 3 (June 30, 2015): 169–76. http://dx.doi.org/10.14311/ap.2015.55.0169.

Full text
Abstract:
In a recent French robotic contest, the objective was to develop a multi-robot system able to autonomously map and explore an unknown area while also detecting and localizing objects. As a participant in this challenge, we proposed a new decentralized Markov decision process (Dec-MDP) resolution based on distributed value functions (DVF) to compute multi-robot exploration strategies. The idea is to take advantage of sparse interactions by allowing each robot to calculate locally a strategy that maximizes the explored space while minimizing robots interactions. In this paper, we propose an adaptation of this method to improve also object recognition by integrating into the DVF the interest in covering explored areas with photos. The robots will then act to maximize the explored space and the photo coverage, ensuring better perception and object recognition.
APA, Harvard, Vancouver, ISO, and other styles