Dissertations / Theses on the topic 'Approximate dynamic programming'
Consult the top 50 dissertations / theses for your research on the topic 'Approximate dynamic programming.'
Browse dissertations / theses on a wide variety of disciplines and organise your bibliography correctly.
Sadiq, Mohammad. "Approximate Dynamic Programming Methods in HEVs." Thesis, KTH, Maskinkonstruktion (Inst.), 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-182762.
Hybrid electric vehicles (HEVs) have grown in popularity worldwide because of their low fuel consumption, which has reduced the demand for oil. This benefits the environment considerably, since it leads to lower emissions and consequently a weaker greenhouse effect. Research in this area is therefore active, with increasing demand for new and better fuel-consumption strategies. Many energy-management methods for HEVs rely on one particular technique: dynamic programming. Dynamic programming yields a globally optimal result, but at the cost of long computation times. The most widely used way to counter this problem in high-dimensional systems is Approximate Dynamic Programming (ADP). This thesis surveys the literature on the various ADP methods applied to HEVs and presents an implementation phase that demonstrates a reduction in computation time for an HEV energy-management problem.
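The dynamic-programming formulation mentioned in the abstract can be illustrated with a deliberately tiny power-split model (a sketch of our own; the demand profile, SOC grid, battery limits and unit fuel cost are invented for illustration, not taken from the thesis):

```python
# Toy backward DP for an HEV power split (hypothetical numbers throughout):
# at each step the demanded power is met by engine + battery; fuel cost is
# proportional to engine power; battery state of charge (SOC) must stay
# within bounds. V[t][soc] is the minimal fuel from step t onward.
demand = [2, 3, 1, 4, 2]          # demanded power per step (arbitrary units)
SOC_MAX = 5                       # SOC grid 0..SOC_MAX
INF = float("inf")

T = len(demand)
V = [[INF] * (SOC_MAX + 1) for _ in range(T)] + [[0.0] * (SOC_MAX + 1)]

for t in reversed(range(T)):
    for soc in range(SOC_MAX + 1):
        for batt in (-1, 0, 1):           # battery power: charge/idle/discharge
            engine = demand[t] - batt     # engine covers the rest
            nxt = soc - batt              # discharging lowers SOC
            if engine < 0 or not 0 <= nxt <= SOC_MAX:
                continue
            V[t][soc] = min(V[t][soc], engine + V[t + 1][nxt])

print(V[0][3])   # minimal total fuel starting from a half-full battery: 9.0
```

The exponential growth of such tables in the number of state variables is exactly the computational burden that the ADP methods surveyed in the thesis aim to reduce.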
Vyzas, Elias. "Approximate dynamic programming for some queueing problems." Thesis, Massachusetts Institute of Technology, 1997. http://hdl.handle.net/1721.1/10282.
Includes bibliographical references (p. 81-82).
by Elias Vyzas.
M.S.
Sauré, Antoine. "Approximate dynamic programming methods for advance patient scheduling." Thesis, University of British Columbia, 2012. http://hdl.handle.net/2429/43448.
Child, Christopher H. T. "Approximate dynamic programming with parallel stochastic planning operators." Thesis, City University London, 2011. http://openaccess.city.ac.uk/1109/.
Liu, Ning. "Approximate dynamic programming algorithms for production-planning problems." Thesis, Wichita State University, 2013. http://hdl.handle.net/10057/10636.
Thesis (M.S.)--Wichita State University, College of Engineering, Dept. of Industrial and Manufacturing Engineering
Demir, Ramazan. "An approximate dynamic programming approach to discrete optimization." Thesis, Massachusetts Institute of Technology, 2000. http://hdl.handle.net/1721.1/9137.
Includes bibliographical references (leaves 181-189).
We develop Approximate Dynamic Programming (ADP) methods for integer programming problems. We describe and investigate parametric, nonparametric and base-heuristic learning approaches to approximating the value function in order to break the curse of dimensionality. Through an extensive computational study we illustrate that our ADP approach to integer programming competes successfully with existing methodologies, including state-of-the-art commercial packages such as CPLEX. Our benchmarks for comparison are solution quality, running time and robustness (i.e., small deviations in the computational resources, such as running time, across varying instances of the same size). In this thesis, we focus in particular on knapsack problems and the binary integer programming problem. We explore an integrated approach to solving discrete optimization problems by unifying optimization techniques with statistical learning. Overall, this research illustrates that ADP is a promising technique, providing near-optimal solutions within a reasonable amount of computation time, especially for large-scale problems with thousands of variables and constraints. Thus, Approximate Dynamic Programming can be considered a new alternative to existing approximate methods for discrete optimization problems.
by Ramazan Demir.
Ph.D.
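The value-function recursion that ADP methods of this kind approximate can be seen in miniature in an exact knapsack DP (our own illustrative sketch; the instance is the classic textbook example, not taken from the thesis):

```python
def knapsack_value(values, weights, capacity):
    """Exact DP for the 0/1 knapsack: V[c] is the best value with capacity c."""
    V = [0] * (capacity + 1)
    for v, w in zip(values, weights):
        for c in range(capacity, w - 1, -1):   # reverse scan: each item used once
            V[c] = max(V[c], V[c - w] + v)
    return V[capacity]

print(knapsack_value([60, 100, 120], [10, 20, 30], 50))   # -> 220
```

ADP replaces the exact table V with a learned approximation when the state space makes tables like this infeasible.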
Cai, C. "Adaptive traffic signal control using approximate dynamic programming." Thesis, University College London (University of London), 2010. http://discovery.ucl.ac.uk/20164/.
Nadarajah, Selvaprabu. "Approximate Dynamic Programming for Commodity and Energy Merchant Operations." Research Showcase @ CMU, 2014. http://repository.cmu.edu/dissertations/350.
Bethke, Brett (Brett M. ). "Kernel-based approximate dynamic programming using Bellman residual elimination." Thesis, Massachusetts Institute of Technology, 2010. http://hdl.handle.net/1721.1/57544.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student submitted PDF version of thesis.
Includes bibliographical references (p. 207-221).
Many sequential decision-making problems related to multi-agent robotic systems can be naturally posed as Markov Decision Processes (MDPs). An important advantage of the MDP framework is the ability to utilize stochastic system models, thereby allowing the system to make sound decisions even if there is randomness in the system evolution over time. Unfortunately, the curse of dimensionality prevents most MDPs of practical size from being solved exactly. One main focus of the thesis is on the development of a new family of algorithms for computing approximate solutions to large-scale MDPs. Our algorithms are similar in spirit to Bellman residual methods, which attempt to minimize the error incurred in solving Bellman's equation at a set of sample states. However, by exploiting kernel-based regression techniques (such as support vector regression and Gaussian process regression) with nondegenerate kernel functions as the underlying cost-to-go function approximation architecture, our algorithms are able to construct cost-to-go solutions for which the Bellman residuals are explicitly forced to zero at the sample states. For this reason, we have named our approach Bellman residual elimination (BRE). In addition to developing the basic ideas behind BRE, we present multi-stage and model-free extensions to the approach. The multistage extension allows for automatic selection of an appropriate kernel for the MDP at hand, while the model-free extension can use simulated or real state trajectory data to learn an approximate policy when a system model is unavailable.
(cont.) We present theoretical analysis of all BRE algorithms, proving convergence to the optimal policy in the limit of sampling the entire state space, and show computational results on several benchmark problems. Another challenge in implementing control policies based on MDPs is that there may be parameters of the system model that are poorly known and/or vary with time as the system operates. System performance can suffer if the model used to compute the policy differs from the true model. To address this challenge, we develop an adaptive architecture that allows for online MDP model learning and simultaneous re-computation of the policy. As a result, the adaptive architecture allows the system to continuously re-tune its control policy to account for better model information obtained through observations of the actual system in operation, and to react to changes in the model as they occur. Planning in complex, large-scale multi-agent robotic systems is another focus of the thesis. In particular, we investigate the persistent surveillance problem, in which one or more unmanned aerial vehicles (UAVs) and/or unmanned ground vehicles (UGVs) must provide sensor coverage over a designated location on a continuous basis. This continuous coverage must be maintained even in the event that agents suffer failures over the course of the mission. The persistent surveillance problem is pertinent to a number of applications, including search and rescue, natural disaster relief operations, urban traffic monitoring, etc.
(cont.) Using both simulations and actual flight experiments conducted in the MIT RAVEN indoor flight facility, we demonstrate the successful application of the BRE algorithms and the adaptive MDP architecture in achieving high mission performance despite the random occurrence of failures. Furthermore, we demonstrate performance benefits of our approach over a deterministic planning approach that does not account for these failures.
by Brett M. Bethke.
Ph.D.
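The core idea of forcing Bellman residuals to zero at sample states via a nondegenerate kernel can be sketched on a toy problem (our own construction on a 1-D shortest-path chain; the Gaussian kernel width and iteration count are arbitrary choices, and this omits the multi-stage and model-free machinery of the thesis):

```python
import numpy as np

N = 10                                  # chain states 0..N, goal at N
samples = np.arange(N + 1.0)            # sample every state in this toy

def kernel(a, b, h=1.0):                # Gaussian (nondegenerate) kernel
    return np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * h * h))

K = kernel(samples, samples)
w = np.zeros(N + 1)

def J(s):                               # kernel-based cost-to-go approximation
    return kernel(np.atleast_1d(s).astype(float), samples) @ w

for _ in range(50):
    # Bellman targets at the sample states: unit step cost, zero at the goal
    T = np.array([0.0 if s == N else
                  1.0 + min(J(max(s - 1, 0))[0], J(min(s + 1, N))[0])
                  for s in range(N + 1)])
    w = np.linalg.solve(K, T)           # interpolate: residual zero at samples

print(np.round(J(samples)))             # ~ [10. 9. ... 1. 0.]
```

Because the Gaussian kernel matrix is positive definite on distinct points, the solve step interpolates the Bellman targets exactly, which is the "residual elimination" property at the sample states.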
Valenti, Mario J. (Mario James) 1976. "Approximate dynamic programming with applications in multi-agent systems." Thesis, Massachusetts Institute of Technology, 2007. http://hdl.handle.net/1721.1/40330.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
MIT Institute Archives copy: contains CDROM of thesis in .pdf format.
Includes bibliographical references (p. 151-161).
This thesis presents the development and implementation of approximate dynamic programming methods used to manage multi-agent systems. The purpose of this thesis is to develop an architectural framework and theoretical methods that enable an autonomous mission system to manage real-time multi-agent operations. To meet this goal, we begin by discussing aspects of the real-time multi-agent mission problem. Next, we formulate this problem as a Markov Decision Process (MDP) and present a system architecture designed to improve mission-level functional reliability through system self-awareness and adaptive mission planning. Since most multi-agent mission problems are computationally difficult to solve in real-time, approximation techniques are needed to find policies for these large-scale problems. Thus, we have developed theoretical methods used to find feasible solutions to large-scale optimization problems. More specifically, we investigate methods designed to automatically generate an approximation to the cost-to-go function using basis functions for a given MDP. Next, these techniques are used by an autonomous mission system to manage multi-agent mission scenarios. Simulation results using these methods are provided for a large-scale mission problem. In addition, this thesis presents the implementation of techniques used to manage autonomous unmanned aerial vehicles (UAVs) performing persistent surveillance operations. We present an indoor multi-vehicle testbed called RAVEN (Real-time indoor Autonomous Vehicle test ENvironment) that was developed to study long-duration missions in a controlled environment.
(cont.) The RAVEN's design allows researchers to focus on high-level tasks by autonomously managing the platform's realistic air and ground vehicles during multi-vehicle operations, thus promoting the rapid prototyping of UAV technologies by flight testing new vehicle configurations and algorithms without redesigning vehicle hardware. Finally, using the RAVEN, we present flight test results from autonomous, extended mission tests using the technologies developed in this thesis. Flight results from a 24 hr, fully-autonomous air vehicle flight-recharge test and an autonomous, multi-vehicle extended mission test using small, electric-powered air vehicles are provided.
by Mario J. Valenti.
Ph.D.
Keng, Leng Hui. "Approximate String Matching With Dynamic Programming and Suffix Trees." UNF Digital Commons, 2006. http://digitalcommons.unf.edu/etd/196.
Boussios, Constantinos I. "An approach for nonlinear control design via approximate dynamic programming." Thesis, Massachusetts Institute of Technology, 1998. http://hdl.handle.net/1721.1/9792.
Includes bibliographical references (p. 173-181).
This thesis proposes and studies a methodology for designing controllers for nonlinear dynamic systems. We are interested in state feedback controllers (policies) that stabilize the state in a given region around an equilibrium point while minimizing a cost functional that captures the performance of the closed loop system. The optimal control problem can be solved in principle using dynamic programming algorithms such as policy iteration. Exact policy iteration is computationally infeasible for systems of even moderate dimension, which leads us to consider methods based on Approximate Policy Iteration. In such methods, we first select an approximation architecture (i.e., a parametrized class of functions) that is used to approximate the cost-to-go function under a given policy, on the basis of cost samples that are obtained through simulation. The resulting approximate cost function is used to derive another, hopefully better policy, and the procedure is repeated iteratively. There are several case studies exploring the use of this methodology, but they are of limited generality and without much of a theoretical foundation. This thesis explores the soundness of Approximate Policy Iteration. We address the problem of improving the performance of a given stabilizing controller, as well as the problem of designing stabilizing controllers for unstable systems. For the first problem, we develop bounds on the approximation error that can be tolerated if we wish to guarantee that the resulting controllers are stabilizing and/or offer improved performance. We give bounds on the suboptimality of the resulting controllers, in terms of the assumed approximation errors. We also extend the methodology to the unstable case by introducing an appropriate modification of the optimal control problem. The computational burden of cost function approximation can often be reduced, thereby enhancing the practicality of the method, by exploiting special structure.
We illustrate this for a special class of nonlinear systems with fast linear and slow nonlinear dynamics. We also present an approximation based on state space gridding, whose performance can be evaluated via a systematic test. Finally, the analysis is supported by applying Approximate Policy Iteration to two specific problems, one involving a missile model and the other a beam-and-ball model.
by Constantinos I. Boussios.
Ph.D.
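The approximate policy iteration loop studied here can be sketched end to end on a toy chain (entirely our own example: unit move costs, a quadratic feature architecture and rollout-based cost samples; none of it is from the thesis):

```python
import numpy as np

# Approximate policy iteration on a chain 0..N with absorbing goal at N:
# evaluate the current policy from simulated rollout costs, fit a
# linear-in-features cost-to-go, then improve greedily against the fit.
N, gamma = 10, 0.95
features = lambda s: np.array([1.0, s, s * s])   # quadratic architecture

def rollout_cost(policy, s, steps=200):
    total, disc = 0.0, 1.0
    for _ in range(steps):
        if s == N:                 # absorbing goal, zero cost
            break
        s = max(s - 1, 0) if policy[s] == 0 else min(s + 1, N)
        total += disc              # unit cost per move, discounted
        disc *= gamma
    return total

policy = [0] * (N + 1)             # start from the useless all-left policy
for _ in range(5):                 # a few policy iterations
    # policy evaluation: least-squares fit of sampled rollout costs
    Phi = np.array([features(s) for s in range(N + 1)], dtype=float)
    y = np.array([rollout_cost(policy, s) for s in range(N + 1)])
    r, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    Jhat = Phi @ r
    # policy improvement: greedy one-step lookahead against the fit
    policy = [0 if s == N else
              (0 if 1 + gamma * Jhat[max(s - 1, 0)] <
                    1 + gamma * Jhat[min(s + 1, N)] else 1)
              for s in range(N + 1)]

print(policy)   # in this toy, the greedy step recovers the move-right policy
```

In this small instance the iteration recovers the optimal move-right policy; the thesis's contribution is bounding how much approximation error such a loop can tolerate while still guaranteeing stability and improvement.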
Keller, Philipp Wilhelm. "Automatic basis function construction for reinforcement learning and approximate dynamic programming." Thesis, McGill University, 2008. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=19284.
We address the automatic construction of basis functions for the linear approximation of the value function of a Markov decision process. This thesis builds on the results of Bertsekas and Castañon (1989), who proposed a method for automatically grouping states in order to accelerate dynamic programming. We propose using a recent dimensionality-reduction technique to project high-dimensional states into a low-dimensional space, and we then place radial basis functions in this new space. The technique is applied to several standard benchmark problems for reinforcement learning algorithms, as well as to a high-dimensional inventory control problem.
Hwang, Daw-sen. "Projected equation and aggregation-based approximate dynamic programming methods for Tetris." Thesis, Massachusetts Institute of Technology, 2011. http://hdl.handle.net/1721.1/66033.
Cataloged from PDF version of thesis.
Includes bibliographical references (p. 65-67).
In this thesis, we survey approximate dynamic programming (ADP) methods and test the methods with the game of Tetris. We focus on ADP methods where the cost-to-go function J is approximated with Φr, where Φ is some matrix and r is a vector with relatively low dimension. There are two major categories of methods: projected equation methods and aggregation methods. In projected equation methods, the cost-to-go function approximation Φr is updated by simulation using one of several policy evaluation algorithms such as LSTD(λ) [BB96] and LSPE(λ) [BI96]. Projected equation methods generally may not converge. We define a pseudometric of policies and view the oscillations of policies in Tetris. Aggregation methods are based on a model approximation approach. The original problem is reduced to an aggregate problem with significantly fewer states. The weight vector r is the cost-to-go function of the aggregate problem and Φ is the matrix of aggregation probabilities. In aggregation methods, the vector r converges to the optimal cost-to-go function of the aggregate problem. In this thesis, we implement aggregation methods for Tetris, and compare the performance of projected equation methods and aggregation methods.
by Daw-sen Hwang.
S.M.
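Hard aggregation, the second family surveyed above, fits in a few lines of linear algebra (our own toy on a 10-state chain; the pairing of states and the uniform disaggregation weights are arbitrary choices, not the Tetris setup):

```python
import numpy as np

# Hard aggregation: a 10-state forward chain with unit step cost is reduced
# to 5 aggregate states; the cost-to-go is represented as J ~ Phi @ r,
# where Phi maps each original state to its aggregate group.
n, n_agg, gamma = 10, 5, 0.9
groups = np.repeat(np.arange(n_agg), 2)          # states 2k, 2k+1 -> group k
Phi = np.zeros((n, n_agg)); Phi[np.arange(n), groups] = 1.0

# Original dynamics: from state s move to s+1 (stay at the end); cost 1,
# except the terminal state n-1, which costs 0.
cost = np.ones(n); cost[-1] = 0.0
nxt = np.minimum(np.arange(n) + 1, n - 1)
P = np.zeros((n, n)); P[np.arange(n), nxt] = 1.0

# Aggregate model: disaggregate uniformly within a group, then re-aggregate.
D = Phi.T / Phi.sum(axis=0)[:, None]             # disaggregation probabilities
P_agg = D @ P @ Phi                              # aggregate transition matrix
g_agg = D @ cost                                 # aggregate one-step cost

r = np.zeros(n_agg)
for _ in range(500):                             # value iteration on aggregate
    r = g_agg + gamma * P_agg @ r

J_approx = Phi @ r
print(np.round(J_approx, 2))                     # piecewise-constant cost-to-go
```

As the abstract notes, r here converges to the optimal cost-to-go of the aggregate problem; the price is that Φr is piecewise constant over the groups.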
Löhndorf, Nils, David Wozabal, and Stefan Minner. "Optimizing Trading Decisions for Hydro Storage Systems using Approximate Dual Dynamic Programming." INFORMS, 2013. http://dx.doi.org/10.1287/opre.2013.1182.
Jeria, David (David O. Jeria López). "An approximate dynamic programming approach to risk sensitive control of execution costs." Thesis, Massachusetts Institute of Technology, 2008. http://hdl.handle.net/1721.1/55112.
Cataloged from PDF version of thesis.
Includes bibliographical references (p. 43-44).
We study the problem of optimal execution within a dynamic programming framework. Given an exponential objective function, system variables which are normally distributed, and linear market dynamics, we derive a closed form solution for optimal trading trajectories. We show that a trader lacking private information has trajectories which are static in nature, whilst a trader with private information requires real time observations to execute optimally. We further show that Bellman's equations become increasingly complex to solve if either the market dynamics are nonlinear, or if additional constraints are added to the problem. As such, we propose an approximate dynamic program using linear programming which achieves near-optimality. The algorithm approximates the exponential objective function within a class of linear architectures, and takes advantage of a probabilistic constraint sampling scheme in order to terminate. The performance of the algorithm relies on the quality of the approximation, and as such we propose a set of heuristics for its efficient implementation.
by David Jeria.
M.Eng.
Pratikakis, Nikolaos. "Multistage decisions and risk in Markov decision processes towards effective approximate dynamic programming architectures /." Diss., Atlanta, Ga. : Georgia Institute of Technology, 2008. http://hdl.handle.net/1853/31654.
Committee Chair: Jay H. Lee; Committee Member: Martha Grover; Committee Member: Matthew J. Realff; Committee Member: Shabbir Ahmed; Committee Member: Stylianos Kavadias. Part of the SMARTech Electronic Thesis and Dissertation Collection.
Astaraky, Davood. "A Simulation Based Approximate Dynamic Programming Approach to Multi-class, Multi-resource Surgical Scheduling." Thèse, Université d'Ottawa / University of Ottawa, 2013. http://hdl.handle.net/10393/23622.
Nachmani, Gil. "Minimum-energy flight paths for UAVs using mesoscale wind forecasts and approximate dynamic programming." Thesis, Monterey, Calif. : Naval Postgraduate School, 2007. http://bosun.nps.edu/uhtbin/hyperion-image.exe/07Dec%5FNachmani.pdf.
Thesis Advisor(s): Royset, Johannes O. Description based on title screen as viewed on January 22, 2007. Includes bibliographical references (p. 57-60). Also available in print.
Xu, Jinbiao. "An approximate dynamic programming approach for coordinated charging control at vehicle-to-grid aggregator." Thesis, University of British Columbia, 2011. http://hdl.handle.net/2429/36970.
Hearnes, Warren E. II. "Near-optimal intelligent control for continuous set-point regulator problems via approximate dynamic programming." Diss., Georgia Institute of Technology, 1999. http://hdl.handle.net/1853/24882.
Chen, Xiaoting. "Optimal Control of Non-Conventional Queueing Networks: A Simulation-Based Approximate Dynamic Programming Approach." University of Cincinnati / OhioLINK, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1427799942.
Yang, Xinan. "Top-percentile traffic routing problem." Thesis, University of Edinburgh, 2012. http://hdl.handle.net/1842/5883.
Li, Dong. "An approximate dynamic programming approach to the scheduling of impatient jobs in a clearing system." Thesis, Lancaster University, 2010. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.557156.
Lee, Jong Min. "A Study on Architecture, Algorithms, and Applications of Approximate Dynamic Programming Based Approach to Optimal Control." Diss., Georgia Institute of Technology, 2004. http://hdl.handle.net/1853/5048.
Seelhof, Michael. "Long term infrastructure investments under uncertainty in the electric power sector using approximate dynamic programming techniques." Thesis, Massachusetts Institute of Technology, 2014. http://hdl.handle.net/1721.1/90724.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 179-183).
A computer model was developed to find optimal long-term investment strategies for the electric power sector under uncertainty with respect to future regulatory regimes and market conditions. The model is based on a multi-stage problem formulation and uses approximate dynamic programming techniques to find an optimal solution. The model was tested under various scenarios. The model results were analyzed with regards to the optimal first-stage investment decision, the final technology mix, total costs, the cost of ignoring uncertainty and the cost of regulatory uncertainty.
by Michael Seelhof.
S.M. in Engineering and Management
Hanselmann, Thomas. "Approximate dynamic programming with adaptive critics and the algebraic perceptron as a fast neural network related to support vector machines." University of Western Australia. School of Electrical, Electronic and Computer Engineering, 2003. http://theses.library.uwa.edu.au/adt-WU2004.0005.
Stellato, Bartolomeo. "Mixed-integer optimal control of fast dynamical systems." Thesis, University of Oxford, 2017. https://ora.ox.ac.uk/objects/uuid:b8a7323c-e36e-45ec-ae8d-6c9eb4350629.
Ramirez, Jose A. "Optimal and Simulation-Based Approximate Dynamic Programming Approaches for the Control of Re-Entrant Line Manufacturing Models." University of Cincinnati / OhioLINK, 2010. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1282329260.
Tosukhowong, Thidarat. "Dynamic Real-time Optimization and Control of an Integrated Plant." Diss., Georgia Institute of Technology, 2006. http://hdl.handle.net/1853/14087.
Keerthisinghe, Chanaka. "Fast Solution Techniques for Energy Management in Smart Homes." Thesis, The University of Sydney, 2016. http://hdl.handle.net/2123/16033.
Wong, Wee Chin. "Estimation and control of jump stochastic systems." Diss., Atlanta, Ga. : Georgia Institute of Technology, 2009. http://hdl.handle.net/1853/31775.
Committee Chair: Jay H. Lee; Committee Member: Alexander Gray; Committee Member: Erik Verriest; Committee Member: Magnus Egerstedt; Committee Member: Martha Grover; Committee Member: Matthew Realff. Part of the SMARTech Electronic Thesis and Dissertation Collection.
Papageorgiou, Dimitri Jason. "Optimization in maritime inventory routing." Diss., Georgia Institute of Technology, 2012. http://hdl.handle.net/1853/50117.
Löhndorf, Nils, and Stefan Minner. "Simulation Optimization for the Stochastic Economic Lot Scheduling Problem." Taylor and Francis, 2013. http://dx.doi.org/10.1080/0740817X.2012.662310.
Regatti, Jayanth Reddy. "Dynamic Routing for Fuel Optimization in Autonomous Vehicles." The Ohio State University, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=osu1524145002064074.
Cakir, Fahrettin. "Data-centric solution methodologies for vehicle routing problems." Diss., University of Iowa, 2016. https://ir.uiowa.edu/etd/2052.
Bountourelis, Theologos. "Efficient pac-learning for episodic tasks with acyclic state spaces and the optimal node visitation problem in acyclic stochastic digraphs." Diss., Atlanta, Ga. : Georgia Institute of Technology, 2008. http://hdl.handle.net/1853/28144.
Committee Chair: Reveliotis, Spyros; Committee Member: Ayhan, Hayriye; Committee Member: Goldsman, Dave; Committee Member: Shamma, Jeff; Committee Member: Zwart, Bert.
Jeddi, Babak. "A coordinated energy management scheme in a residential neighborhood under given market framework." Thesis, Queensland University of Technology, 2020. https://eprints.qut.edu.au/200710/1/Babak_Jeddi_Thesis.pdf.
Fiocchi, Leonardo. "A Reinforcement Learning strategy for Satellite Attitude Control." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2021.
Goodson, Justin Christopher. "Solution methodologies for vehicle routing problems with stochastic demand." Diss., University of Iowa, 2010. https://ir.uiowa.edu/etd/675.
Yin, Biao. "Contrôle adaptatif des feux de signalisation dans les carrefours : modélisation du système de trafic dynamique et approches de résolution." Thesis, Belfort-Montbéliard, 2015. http://www.theses.fr/2015BELF0279/document.
Adaptive traffic signal control is a decision-making optimization problem that must be addressed continually in order to relieve traffic congestion at urban intersections. Intelligent algorithms are widely used to improve control performance measures such as traffic delay. In this thesis, we study the problem comprehensively with a microscopic, discrete-time dynamic model, and investigate the related algorithms both for isolated intersections and for distributed network control. First, we focus on dynamic modeling for adaptive traffic signal control and network loading problems. The proposed adaptive phase sequence (APS) mode is highlighted as one of the signal phase control mechanisms. The signal control problem at intersections is formulated as a Markov decision process (MDP), and the concept of a tunable system state is proposed for traffic network coordination. Moreover, a new vehicle-following model supports the network loading environment. Based on this model, signal control methods are studied with optimal and near-optimal algorithms in turn. Two exact DP algorithms are investigated, and the results show some limitations of the DP solution when a large state space arises in complex cases. Because of the computational burden and unknown model information in dynamic programming (DP), an approximate dynamic programming (ADP) approach is suggested. Finally, an online near-optimal algorithm using ADP with RLS-TD(λ) is confirmed. In simulation experiments, especially with the integration of APS, the proposed algorithm shows a clear advantage in performance measures and computational efficiency.
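A simplified, tabular TD(λ) — the non-recursive ancestor of the RLS-TD(λ) evaluator mentioned in the abstract — can be sketched on a classic 5-state random walk (our own toy; the step size, λ and episode count are arbitrary choices):

```python
import random

# Tabular TD(lambda) policy evaluation with accumulating eligibility traces.
# States 1..5 form a random walk; absorbing at 0 (reward 0) and 6 (reward 1).
# True values are i/6 for state i.
random.seed(1)
n, lam, alpha, gamma = 5, 0.8, 0.05, 1.0
V = [0.0] * (n + 2)                   # terminals 0 and n+1 stay at 0

for _ in range(10000):
    s = 3                             # every episode starts in the middle
    z = [0.0] * (n + 2)               # eligibility traces
    while s not in (0, n + 1):
        s2 = s + random.choice((-1, 1))
        reward = 1.0 if s2 == n + 1 else 0.0
        delta = reward + gamma * V[s2] - V[s]   # TD error
        z[s] += 1.0
        for i in range(1, n + 1):     # trace-weighted value updates
            V[i] += alpha * delta * z[i]
            z[i] *= gamma * lam
        s = s2

print([round(v, 2) for v in V[1:6]])  # approaches [1/6, 2/6, ..., 5/6]
```

The RLS variant used in the thesis replaces the constant step size with a recursive least-squares update over feature weights, which is what makes it practical online with function approximation.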
Holguin, Mijail Gamarra. "Planejamento probabilístico usando programação dinâmica assíncrona e fatorada." Universidade de São Paulo, 2013. http://www.teses.usp.br/teses/disponiveis/45/45134/tde-14042013-131306/.
Markov decision processes (MDPs) model problems of sequential decision making in which the available actions have probabilistic effects on the successor states (defined by state transition matrices). Real-time dynamic programming (RTDP) is a technique for solving MDPs when information about the initial state is available. Traditional approaches show better performance in problems with sparse state transition matrices, because they can achieve convergence to the optimal policy efficiently, without visiting all states. This advantage can be lost, however, in problems with dense state transition matrices, in which several states can be reached in one step (for example, control problems with exogenous events). An approach to overcoming this limitation is to exploit regularities in the domain dynamics through a factored representation, i.e., a representation based on state variables. In this master's thesis, we propose a new algorithm called FactRTDP (Factored RTDP), and its approximate version aFactRTDP (Approximate and Factored RTDP), which are the first efficient factored versions of the classical RTDP algorithm. We also propose two other extensions, FactLRTDP and aFactLRTDP, that label states for which the value function has converged to the optimal. The experimental results show that these new algorithms converge faster when executed in domains with dense transition matrices, and they show good online performance in domains with dense transition matrices and few dependencies among state variables.
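The flat RTDP procedure that FactRTDP factorizes can be sketched on a deterministic 4×4 grid (our own toy with unit step costs and a zero, hence admissible, initial heuristic; real RTDP handles stochastic transitions the same way, with an expectation inside the backup):

```python
# Trial-based RTDP: simulate greedily from the known start state and apply
# Bellman backups only at the states actually visited.
goal = (3, 3)
actions = [(0, 1), (1, 0), (0, -1), (-1, 0)]
V = {}                                     # unvisited states default to 0

def succ(s, a):                            # clamped grid moves
    return (min(max(s[0] + a[0], 0), 3), min(max(s[1] + a[1], 0), 3))

def q(s, a):                               # unit step cost plus cost-to-go
    return 1.0 + V.get(succ(s, a), 0.0)

for _ in range(50):                        # repeated trials from the start
    s = (0, 0)
    for _ in range(100):
        if s == goal:
            break
        V[s] = min(q(s, a) for a in actions)              # Bellman backup at s
        s = succ(s, min(actions, key=lambda a: q(s, a)))  # greedy move

print(V[(0, 0)])   # converges to the true distance to the goal, 6.0
```

Because the heuristic never overestimates, repeated trials raise the visited values monotonically until the greedy path is optimal, without ever touching states irrelevant to the start state.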
Filho, Antonio Martins Lima. "Alocação dinâmica de recursos: aplicação ao transporte rodoviário de cargas em longa distância." Universidade de São Paulo, 2011. http://www.teses.usp.br/teses/disponiveis/3/3138/tde-25082011-135158/.
Operational planning of a long-haul transportation system entails solving a dynamic network optimization problem, aiming to perform the freight movements in an efficient and effective way while utilizing the available transportation capacity. The proposed solution methodology utilizes the Logistic Queueing Network approach, replacing the global network optimization process through Integer Linear Programming by a model of stochastic, approximate and adaptive dynamic programming, which allows the resolution of a sequence of sub-problems delimited in time, strongly reducing the number of variables involved. This method allows the utilization of more realistic mathematical models over a broader planning horizon. The research extends models found in the literature to solve more complex problems, including a heterogeneous fleet of vehicles, time windows, third-party vehicles and penalties for unmet demands. Experimental problems solved successfully with the developed technique are presented. The work also presents the outline of a decision support system incorporating the proposed methodology.
Heng, Jeremy. "On the use of transport and optimal control methods for Monte Carlo simulation." Thesis, University of Oxford, 2016. https://ora.ox.ac.uk/objects/uuid:6cbc7690-ac54-4a6a-b235-57fa62e5b2fc.
Full textAndrade, Gustavo Araújo de. "PROGRAMAÇÃO DINÂMICA HEURÍSTICA DUAL E REDES DE FUNÇÕES DE BASE RADIAL PARA SOLUÇÃO DA EQUAÇÃO DE HAMILTON-JACOBI-BELLMAN EM PROBLEMAS DE CONTROLE ÓTIMO." Universidade Federal do Maranhão, 2014. http://tedebc.ufma.br:8080/jspui/handle/tede/517.
The main objective of this work is to present the development of learning algorithms for online application to the solution of the algebraic Hamilton-Jacobi-Bellman equation. The concepts covered focus on developing the methodology for control systems through techniques that aim to design online adaptive controllers that reject sensor noise, parametric variations and modeling errors. Concepts of neurodynamic programming and reinforcement learning are discussed to design algorithms in which the context of a given operating point causes the control system to adapt and thus deliver the performance required by the design specifications. Methods are developed for online estimation of the adaptive critic, concentrating on techniques for estimating the gradient of the environment's value function.
Romão, Oberlan Christo. "O problema de corte não-guilhotinado multiperíodo com sobras aproveitáveis." Universidade de São Paulo, 2017. http://www.teses.usp.br/teses/disponiveis/45/45134/tde-01022018-180800/.
Full textIn this research, we study the multi-period two-dimensional cutting problem with usable leftovers, which consists of cutting objects to produce a set of items. We assume a finite planning horizon with a finite number of periods between the initial and final times. First, we consider a deterministic version in which we know, a priori, the set of ordered items and the cost of the objects in each period. Some of the leftovers generated while cutting the ordered items in one period may be used as objects in the future; these are called usable leftovers. In general, a leftover is considered usable if its dimensions are equal to or greater than those of some item from a predefined list for the period. The goal is to minimize the total cost of the objects used to cut the set of ordered items over the entire horizon. Among solutions with the same cost, we seek one that, at the end of the time horizon, maximizes the value of the remaining usable leftovers. We present a mathematical model of the problem using a bilevel formulation, which, owing to the characteristics of the problem, is transformed into a mixed integer linear programming model. Given the difficulty of solving the developed model, we propose a heuristic approach based on approximate dynamic programming (ADP). Other options based on the rolling-horizon and relax-and-fix strategies are also considered. We also consider the scenario in which we do not know the set of ordered items and the cost of the objects in advance but have information about the probability distributions of both. In this case, we present an approach based on approximate dynamic programming to estimate the best strategy to follow in each period. We compared the results obtained by the ADP with those found by a greedy method. In suitable scenarios, the results show that the ADP achieves solutions superior to those of the greedy method.
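The leftover trade-off at the heart of this abstract can be seen in a deliberately tiny one-dimensional sketch: a greedy rule that minimizes the immediate object cost, versus a one-step lookahead (a crude ADP-style approximation) that also credits a usable leftover with the purchase it saves next period. All lengths, costs, and the usability threshold below are invented; the thesis treats the far harder two-dimensional, multi-period case.

```python
def plan(costs, items, use_lookahead, min_leftover=2):
    """Cut one item per period from a new object or a stored usable leftover."""
    total, leftovers = 0.0, []
    for t, item in enumerate(items):
        options = []
        # option A: cut from a stored leftover (no purchase cost)
        for i, l in enumerate(leftovers):
            if l >= item:
                options.append((0.0, ("leftover", i)))
        # option B: buy a new object of some length
        for length, cost in costs:
            if length >= item:
                rest = length - item
                credit = 0.0
                if use_lookahead and rest >= min_leftover and t + 1 < len(items):
                    # value the leftover by the cheapest object it could replace
                    nxt = items[t + 1]
                    if rest >= nxt:
                        credit = min(c for L, c in costs if L >= nxt)
                options.append((cost - credit, ("new", length, cost)))
        options.sort(key=lambda o: o[0])    # pick the cheapest adjusted option
        _, choice = options[0]
        if choice[0] == "leftover":
            l = leftovers.pop(choice[1])
            if l - item >= min_leftover:
                leftovers.append(l - item)  # the remainder stays usable
        else:
            _, length, cost = choice
            total += cost
            if length - item >= min_leftover:
                leftovers.append(length - item)
    return total

costs = [(5, 4.0), (9, 5.0)]   # (object length, cost), invented numbers
items = [4, 4]                 # one ordered item per period
greedy_cost = plan(costs, items, use_lookahead=False)
adp_cost = plan(costs, items, use_lookahead=True)
```

Here the greedy rule buys the cheap short object twice (total 8.0), while the lookahead pays more once for the long object, reuses its leftover, and finishes at 5.0.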
Zhang, Jian. "Advance Surgery Scheduling with Consideration of Downstream Capacity Constraints and Multiple Sources of Uncertainty." Thesis, Bourgogne Franche-Comté, 2019. http://www.theses.fr/2019UBFCA023.
Full textThis thesis deals with the advance scheduling of elective surgeries in an operating theatre composed of operating rooms and downstream recovery units. The arrivals of new patients each week, the duration of each surgery, and the length of stay of each patient in the downstream recovery unit are subject to uncertainty. Each week, the surgery planner should determine the surgical blocks to open and assign some of the surgeries on the waiting list to the open blocks. The objective is to minimize the patient-related costs incurred by performing and postponing surgeries as well as the hospital-related costs caused by the utilization of surgical resources. Considering that the pure mathematical programming models commonly used in the literature do not optimize the long-term performance of the surgery schedules, we propose a novel two-phase optimization model combining a Markov decision process (MDP) with stochastic programming to overcome this drawback. The MDP model in the first phase determines the surgeries to be performed each week and minimizes the expected total costs over an infinite horizon; the stochastic programming model in the second phase then optimizes the assignment of the selected surgeries to surgical blocks. To cope with the huge complexity of realistically sized problems, we develop a reinforcement-learning-based approximate dynamic programming algorithm and several column-generation-based heuristic algorithms as solution approaches. We conduct numerical experiments to evaluate the proposed model and algorithms. The experimental results indicate that the proposed algorithms are considerably more efficient than the traditional ones, and that the resulting schedules of the two-phase optimization model significantly outperform those of a conventional stochastic programming model in terms of the patients' waiting times and the total costs in the long run.
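A stylized sketch of the first-phase MDP described in this abstract: the state is the waiting-list length, the action is how many surgeries to perform this week, and value iteration minimizes the discounted expected cost of postponement plus block utilization. All costs, capacities, and arrival probabilities here are invented for illustration, and the thesis solves the realistic version with reinforcement-learning-based ADP rather than exact value iteration.

```python
MAX_WAIT, MAX_PERFORM = 10, 3
ARRIVALS = {0: 0.3, 1: 0.4, 2: 0.3}   # weekly arrival distribution (assumed)
GAMMA = 0.9                           # weekly discount factor
WAIT_COST, BLOCK_COST = 1.0, 1.5      # postponement cost / cost per surgery

def value_iteration(tol=1e-8):
    """Exact value iteration on the waiting-list MDP (states 0..MAX_WAIT)."""
    V = [0.0] * (MAX_WAIT + 1)
    while True:
        V_new = []
        for s in range(MAX_WAIT + 1):
            best = float("inf")
            for a in range(min(s, MAX_PERFORM) + 1):   # surgeries this week
                stage = BLOCK_COST * a + WAIT_COST * (s - a)
                exp_next = sum(p * V[min(s - a + k, MAX_WAIT)]
                               for k, p in ARRIVALS.items())
                best = min(best, stage + GAMMA * exp_next)
            V_new.append(best)
        if max(abs(x - y) for x, y in zip(V, V_new)) < tol:
            return V_new
        V = V_new

V = value_iteration()   # V[s] = optimal discounted cost with s patients waiting
```

As expected, the optimal cost-to-go is nondecreasing in the waiting-list length; ADP replaces the exact table V with a learned approximation when the state space is too large to enumerate.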
Qu, Zheng. "Nonlinear Perron-Frobenius theory and max-plus numerical methods for Hamilton-Jacobi equations." Palaiseau, Ecole polytechnique, 2013. http://pastel.archives-ouvertes.fr/docs/00/92/71/22/PDF/thesis.pdf.
Full textDynamic programming is one of the main approaches to solving optimal control problems. It reduces these problems to Hamilton-Jacobi partial differential equations (PDEs). Several techniques have been proposed in the literature to solve these PDEs: for example, finite difference schemes, the so-called discrete dynamic programming or semi-Lagrangian method, and antidiffusive schemes. All these methods are grid-based, i.e., they require a discretization of the state space, and thus suffer from the so-called curse of dimensionality. The present thesis focuses on max-plus numerical solutions and convergence analysis for medium- to high-dimensional deterministic optimal control problems. We develop max-plus-based numerical algorithms for which we establish theoretical complexity estimates. The proof of these estimates is based on results from nonlinear Perron-Frobenius theory. In particular, we study the contraction properties of monotone or non-expansive nonlinear operators with respect to several classical metrics on cones (Thompson's metric, Hilbert's projective metric), and obtain nonlinear or non-commutative generalizations of the "ergodicity coefficients" arising in the theory of Markov chains. These results have applications in consensus theory and to the generalized Riccati equations arising in stochastic optimal control.
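A flavor of the max-plus (dually, min-plus) linearity that such methods exploit: the Bellman backward operator of a deterministic shortest-path problem is exactly a matrix-vector product over the (min, +) semiring, just as ordinary linear maps are over (+, x). The small four-node graph below is an invented example, not taken from the thesis.

```python
INF = float("inf")

def min_plus_matvec(C, v):
    """One Bellman step: (C (*) v)_i = min_j (C[i][j] + v[j]) over (min, +)."""
    n = len(v)
    return [min(C[i][j] + v[j] for j in range(n)) for i in range(n)]

# edge costs C[i][j] for a small acyclic graph; node 3 is the target
C = [[INF, 1.0, 4.0, INF],
     [INF, INF, 2.0, 6.0],
     [INF, INF, INF, 1.0],
     [INF, INF, INF, 0.0]]

v = [INF, INF, INF, 0.0]   # terminal cost-to-go: 0 at the target
for _ in range(3):         # n - 1 Bellman iterations reach the fixed point
    v = min_plus_matvec(C, v)
# v now holds the shortest-path cost from each node to node 3
```

The fixed point v = [4.0, 3.0, 1.0, 0.0] corresponds to the cheapest routes (e.g. 0 -> 1 -> 2 -> 3 with cost 1 + 2 + 1); max-plus basis methods approximate value functions by finitely many such semiring-linear combinations.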
Molina, Diogenes. "Intelligent control and system aggregation techniques for improving rotor-angle stability of large-scale power systems." Diss., Georgia Institute of Technology, 2013. http://hdl.handle.net/1853/50291.
Full textDivila, Jaroslav. "Vyhledávání přibližných palindromů v DNA sekvencích." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2012. http://www.nusl.cz/ntk/nusl-236524.
Full text