Doctoral dissertations: "Dynamic optimal learning rate"

1

Cheng, Martin Chun-Sheng. "Dynamical Near Optimal Training for Interval Type-2 Fuzzy Neural Network (T2FNN) with Genetic Algorithm". Griffith University. School of Microelectronic Engineering, 2003. http://www4.gu.edu.au:8080/adt-root/public/adt-QGU20030722.172812.

Abstract:

A type-2 fuzzy logic system (FLS) cascaded with a neural network, called a type-2 fuzzy neural network (T2FNN), is presented in this paper to handle uncertainty with dynamical optimal learning. A T2FNN consists of a type-2 fuzzy linguistic process as the antecedent part and a two-layer interval neural network as the consequent part. A general T2FNN is computationally intensive due to the complexity of type-2 to type-1 reduction. Therefore the interval T2FNN is adopted in this paper to simplify the computational process. The dynamical optimal training algorithm for the two-layer consequent part of the interval T2FNN is developed first. The stable and optimal left and right learning rates for the interval neural network, in the sense of maximum error reduction, can be derived for each iteration of the training process (back propagation). It can also be shown that the two learning rates cannot both be negative. Further, variation of the initial MF parameters, i.e. the spread level of the uncertain means or deviations of the interval Gaussian MFs, may affect the performance of the back propagation training process. To achieve better overall performance, a genetic algorithm (GA) is designed to search for a better-fit spread rate for the uncertain means and near-optimal learning for the antecedent part. Several examples are fully illustrated. Excellent results are obtained for truck backing-up control and the identification of a nonlinear system, with better performance than that obtained using a type-1 FNN.
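
The idea of choosing the learning rate at each iteration so that the error reduction is maximal can be illustrated on a much simpler model than the interval T2FNN. The sketch below is not the derivation from the thesis: it assumes a plain linear model with squared error, for which the step size that minimizes the loss along the negative gradient has a closed form. The function name `fit_with_optimal_step` and the toy data are hypothetical.

```python
def fit_with_optimal_step(xs, ys, iterations=50):
    """Fit y ~ w*x + b by gradient descent, recomputing the step size
    each iteration so that the squared error drops as much as possible
    along the negative gradient (exact line search for a quadratic loss).
    Illustrative assumption: a plain linear model, not a T2FNN."""
    w, b = 0.0, 0.0
    n = len(xs)
    history = []  # loss value before each update
    for _ in range(iterations):
        residuals = [w * x + b - y for x, y in zip(xs, ys)]
        history.append(sum(r * r for r in residuals) / (2 * n))
        # Gradient of the loss with respect to w and b.
        gw = sum(r * x for r, x in zip(residuals, xs)) / n
        gb = sum(residuals) / n
        # Change in each prediction per unit step along the gradient.
        shifts = [gw * x + gb for x in xs]
        curvature = sum(s * s for s in shifts) / n
        if curvature == 0.0:
            break  # zero gradient: already at the minimum
        # Minimizer of loss(w - eta*gw, b - eta*gb) over eta.
        eta = (gw * gw + gb * gb) / curvature
        w -= eta * gw
        b -= eta * gb
    return w, b, history


w, b, history = fit_with_optimal_step([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0], iterations=200)
print(round(w, 4), round(b, 4), history[0], history[-1])
```

Because the step is recomputed from the current gradient, no fixed learning rate has to be tuned, and the loss cannot increase from one iteration to the next in this quadratic setting.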

2

Cheng, Martin Chun-Sheng. "Dynamical Near Optimal Training for Interval Type-2 Fuzzy Neural Network (T2FNN) with Genetic Algorithm". Thesis, Griffith University, 2003. http://hdl.handle.net/10072/366350.

Thesis (Masters)
Master of Philosophy (MPhil)
School of Microelectronic Engineering

3

Chang, Yusun. "Dynamic Optimal Fragmentation with Rate Adaptation in Wireless Mobile Networks". Diss., Georgia Institute of Technology, 2007. http://hdl.handle.net/1853/19824.

Abstract:

Dynamic optimal fragmentation with rate adaptation (DORA) is an algorithm to achieve maximum goodput in wireless mobile networks. With an analytical model that incorporates the number of users, contentions, packet lengths, and bit error rates in the network, DORA computes a fragmentation threshold and transmits optimally sized packets at maximum rates. To estimate the SNR in the model, an adaptive on-demand UDP estimator is designed to reduce overheads. Test-beds for running experiments on channel estimation, WLANs, Ad Hoc networks, and Vehicle-to-Vehicle networks are developed to evaluate the performance of DORA. DORA is an energy-efficient generic CSMA/CA MAC protocol for wireless mobile computing applications, and enhances system goodput in WLANs, Ad Hoc networks, and Vehicle-to-Vehicle networks without modification of the protocols.

4

Moncur, Tyler. "Optimal Learning Rates for Neural Networks". BYU ScholarsArchive, 2020. https://scholarsarchive.byu.edu/etd/8662.

Abstract:

Neural networks have long been known as universal function approximators and have more recently been shown to be powerful and versatile in practice. But it can be extremely challenging to find the right set of parameters and hyperparameters. Model training is both expensive and difficult due to the large number of parameters and sensitivity to hyperparameters such as learning rate and architecture. Hyperparameter searches are notorious for requiring tremendous amounts of processing power and human resources. This thesis provides an analytic approach to estimating the optimal value of one of the key hyperparameters in neural networks, the learning rate. Where possible, the analysis is computed exactly, and where necessary, approximations and assumptions are used and justified. The result is a method that estimates the optimal learning rate for a certain type of network, a fully connected CReLU network.

5

Shu, Weihuan. "Optimal sampling rate assignment with dynamic route selection for real-time wireless sensor networks". Thesis, McGill University, 2009. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=32351.

Abstract:

The allocation of computation and communication resources in a manner that optimizes aggregate system performance is a crucial aspect of system management. Wireless sensor networks pose new challenges due to their resource constraints and real-time requirements. Existing work has dealt with the real-time sampling rate assignment problem in the single-processor case and in the network case with static routing. For wireless sensor networks, in order to achieve better overall network performance, routing should be considered together with the rate assignments of individual flows. In this thesis, we address the problem of optimizing sampling rates with dynamic route selection for wireless sensor networks. We model the problem as a constrained optimization problem and solve it under the Network Utility Maximization framework. Based on the primal-dual method and the dual decomposition technique, we design a distributed algorithm that achieves the optimal global network utility considering both dynamic route decision and rate assignment. Extensive simulations have been conducted to demonstrate the efficiency and efficacy of our proposed solutions.
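
The dual decomposition idea behind Network Utility Maximization can be sketched on a toy case. The example below is a simplification and not the algorithm from the thesis: it assumes a single shared link with fixed routing, logarithmic utilities, and a hand-picked step size, and it leaves out dynamic route selection entirely. The name `allocate_rates` and all parameter values are hypothetical.

```python
def allocate_rates(weights, capacity, step=0.05, iterations=500, price=1.0):
    """Maximize sum_i weights[i] * log(rate_i) subject to sum_i rate_i <= capacity.

    Dual decomposition: the link announces a price; each flow picks the
    rate that maximizes its own utility minus the price it pays, and the
    link raises or lowers the price depending on the excess demand."""
    for _ in range(iterations):
        # Each flow solves max_r w*log(r) - price*r, which gives r = w / price.
        rates = [w / price for w in weights]
        # Subgradient step on the dual variable, kept strictly positive.
        excess = sum(rates) - capacity
        price = max(1e-9, price + step * excess)
    rates = [w / price for w in weights]
    return rates, price


rates, price = allocate_rates([1.0, 1.0, 1.0], capacity=6.0)
print([round(r, 4) for r in rates], round(price, 4))
```

With three equally weighted flows and a capacity of 6, the rates settle at 2 each and the price at 0.5. Each flow only needs the current price, which is what makes the scheme distributed.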

6

Aroh, Kosisochukwu C. "Determination of optimal conditions and kinetic rate parameters in continuous flow systems with dynamic inputs". Thesis, Massachusetts Institute of Technology, 2018. https://hdl.handle.net/1721.1/121815.

Abstract:

Thesis: Ph. D., Massachusetts Institute of Technology, Department of Chemical Engineering, 2019
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 171-185).
The fourth industrial revolution is said to be brought about by digitization in the manufacturing sector. According to this understanding, the third industrial revolution, which involved computers and automation, will be further enhanced with smart and autonomous systems fueled by data and machine learning. At the research stage, an analogous story is being told of how automation and new technologies could revolutionize a chemistry laboratory. Flow chemistry is a technique that contrasts with traditional batch chemistry in that it facilitates process automation and, at small scales, delivers process improvements such as high heat and mass transfer rates. In addition to flow chemistry, analytical tools have also greatly improved and have become fully automated, with potential for remote control. Over the past decade, work utilizing optimization techniques to find optimal conditions in flow chemistry has become more prevalent.
In addition, the scope of reactions performed in these systems has also increased. In the first part of this thesis, the construction of a platform capable of performing a wide range of these reactions on the lab scale is discussed. This platform was built with the capability of performing global optimizations using steady-state experiments. The rest of the thesis concerns generating dynamic experiments in flow systems and using these conditions to gain more information about a reaction. The ability to use dynamic experiments to accurately determine reaction kinetics is detailed first. Through these experiments we found that only two orthogonal experiments were needed to sample the experimental space. After this, an algorithm that utilizes dynamic experiments for kinetic parameter estimation problems is described. The approach here was to use dynamic experiments to first quickly sample the design space and get a reasonable estimate of the kinetic parameters.
Then steady-state optimal design of experiments was used to fine-tune these estimates. We observed that, after the initial orthogonal experiments, only three more conditions were needed for accurate estimates in the multi-step reaction example. In a similar fashion, an algorithm for reaction optimization that relies on dynamic experiments is also described. The approach here extended that of adaptive response surface methodology, with dynamic orthogonal experiments performed in place of steady-state experiments. When compared to steady-state optimizations of multi-step reactions, the time needed to locate the optimum is reduced by half. Finally, the potential issues that arise when using transient experiments in automated systems for reaction analysis are addressed. These issues include dispersion, sampling rate, reactor sizes and the rate of change of transients.
These results demonstrate a way in which technological innovation could further revolutionize the chemistry laboratory. By combining machine learning, cloud computing and efficient, high-information experiments, reaction data could be quickly collected, and the information gained could be maximized for future predictions or optimizations.

7

Ouyang, Hua. "Optimal stochastic and distributed algorithms for machine learning". Diss., Georgia Institute of Technology, 2013. http://hdl.handle.net/1853/49091.

Abstract:

Stochastic and data-distributed optimization algorithms have received considerable attention from the machine learning community due to the tremendous demand from large-scale learning and big-data-related optimization. Many stochastic and deterministic learning algorithms have recently been proposed under various application scenarios. Nevertheless, many of these algorithms are based on heuristics, and their optimality in terms of the generalization error is not sufficiently justified. In this talk, I will explain the concept of an optimal learning algorithm, and show that, given a time budget and a proper hypothesis space, only those achieving the lower bounds of the estimation error and the optimization error are optimal. Guided by this concept, we investigated the stochastic minimization of nonsmooth convex loss functions, a central problem in machine learning. We proposed a novel algorithm named Accelerated Nonsmooth Stochastic Gradient Descent, which exploits the structure of common nonsmooth loss functions to achieve optimal convergence rates for a class of problems including SVMs. It is the first stochastic algorithm that can achieve the optimal O(1/t) rate for minimizing nonsmooth loss functions. The fast rates are confirmed by empirical comparisons with state-of-the-art algorithms including the averaged SGD. The Alternating Direction Method of Multipliers (ADMM) is another flexible method for exploiting function structure. In the second part we proposed a stochastic ADMM that can be applied to a general class of convex and nonsmooth functions, beyond the smooth and separable least squares loss used in lasso. We also demonstrate the rates of convergence for our algorithm under various structural assumptions on the stochastic function: O(1/√t) for convex functions and O(log t/t) for strongly convex functions. A novel application named Graph-Guided SVM is proposed to demonstrate the usefulness of our algorithm.
We also extend the scalability of stochastic algorithms to nonlinear kernel machines, where the problem is formulated as a constrained dual quadratic optimization. The simplex constraint can be handled by the classic Frank-Wolfe method. The proposed stochastic Frank-Wolfe methods achieve comparable or even better accuracies than state-of-the-art batch and online kernel SVM solvers, and are significantly faster. The last part investigates the problem of data-distributed learning. We formulate it as a consensus-constrained optimization problem and solve it with ADMM. It turns out that the underlying communication topology is a key factor in achieving a balance between a fast learning rate and computation resource consumption. We analyze the linear convergence behavior of consensus ADMM so as to characterize the interplay between the communication topology and the penalty parameters used in ADMM. We observe that given optimal parameters, the complete bipartite and the master-slave graphs exhibit the fastest convergence, followed by bi-regular graphs.
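
The way the Frank-Wolfe method handles a simplex constraint can be shown in a few lines. The sketch below is a deterministic toy version, not the stochastic variant proposed in the dissertation: it assumes the objective is the squared distance to a hypothetical `target` point instead of a kernel SVM dual, and it uses an exact line search that is only valid for that quadratic objective.

```python
def frank_wolfe_simplex(target, iterations=2000):
    """Minimize 0.5 * ||x - target||^2 over the probability simplex.

    Each iteration moves toward the simplex vertex with the smallest
    gradient coordinate, so the iterate stays feasible without any
    projection step."""
    n = len(target)
    x = [1.0 / n] * n  # start at the center of the simplex
    for _ in range(iterations):
        grad = [xi - ti for xi, ti in zip(x, target)]
        # Linear minimization over the simplex is attained at a vertex.
        j = min(range(n), key=lambda i: grad[i])
        direction = [(1.0 if i == j else 0.0) - x[i] for i in range(n)]
        slope = sum(g * d for g, d in zip(grad, direction))
        curvature = sum(d * d for d in direction)
        if curvature == 0.0 or slope >= 0.0:
            break  # no descent direction left: x is optimal
        # Exact line search for the quadratic, clipped to stay feasible.
        gamma = min(1.0, -slope / curvature)
        x = [xi + gamma * di for xi, di in zip(x, direction)]
    return x


solution = frank_wolfe_simplex([0.2, 0.3, 0.5])
print([round(v, 3) for v in solution])
```

Since every update is a convex combination of the current point and a vertex, the coordinates remain nonnegative and sum to one throughout, which is the property that makes the method attractive for simplex-constrained duals.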

8

Lee, Jong Min. "A Study on Architecture, Algorithms, and Applications of Approximate Dynamic Programming Based Approach to Optimal Control". Diss., Georgia Institute of Technology, 2004. http://hdl.handle.net/1853/5048.

Abstract:

This thesis develops approximate dynamic programming (ADP) strategies suitable for process control problems, aimed at overcoming the limitations of MPC, namely the potentially exorbitant on-line computational requirement and the inability to consider the future interplay between uncertainty and estimation in the optimal control calculation. The suggested approach solves the DP only for the state points visited by closed-loop simulations with judiciously chosen control policies. The approach helps to combat a well-known problem of traditional DP, the 'curse of dimensionality', while allowing the user to derive an improved control policy from the initial ones. The critical issue in the suggested method is the proper choice and design of the function approximator. A local averager with a penalty term is proposed to guarantee a stably learned control policy as well as acceptable on-line performance. The thesis also demonstrates the versatility of the proposed ADP strategy on difficult process control problems. First, a stochastic adaptive control problem is presented. In this application an ADP-based control policy shows an "active" probing property to reduce uncertainties, leading to better control performance. The second example is a dual-mode controller, which is a supervisory scheme that actively prevents the progression of abnormal situations under a local controller at their onset. Finally, two ADP strategies for controlling nonlinear processes based on input-output data are suggested. They are model-based and model-free approaches, and have the advantage of conveniently incorporating knowledge of the identification data distribution into the control calculation, with improved performance.

9

Bountourelis, Theologos. "Efficient pac-learning for episodic tasks with acyclic state spaces and the optimal node visitation problem in acyclic stochastic digraphs". Diss., Atlanta, Ga. : Georgia Institute of Technology, 2008. http://hdl.handle.net/1853/28144.

Abstract:

Thesis (M. S.)--Industrial and Systems Engineering, Georgia Institute of Technology, 2009.
Committee Chair: Reveliotis, Spyros; Committee Member: Ayhan, Hayriye; Committee Member: Goldsman, Dave; Committee Member: Shamma, Jeff; Committee Member: Zwart, Bert.

10

Singh, Manish Kumar. "Optimization, Learning, and Control for Energy Networks". Diss., Virginia Tech, 2021. http://hdl.handle.net/10919/104064.

Abstract:

Massive infrastructure networks such as electric power, natural gas, or water systems play a pivotal role in everyday human lives. Development and operation of these networks is extremely capital-intensive. Moreover, security and reliability of these networks is critical. This work identifies and addresses a diverse class of computationally challenging and time-critical problems pertaining to these networks. This dissertation extends the state of the art on three fronts. First, general proofs of uniqueness for network flow problems are presented, thus addressing open problems. Efficient network flow solvers based on energy function minimizations, convex relaxations, and mixed-integer programming are proposed with performance guarantees. Second, a novel approach is developed for sample-efficient training of deep neural networks (DNN) aimed at solving optimal network dispatch problems. The novel feature here is that the DNNs are trained to match not only the minimizers, but also their sensitivities with respect to the optimization problem parameters. Third, control mechanisms are designed that ensure resilient and stable network operation. These novel solutions are bolstered by mathematical guarantees and extensive simulations on benchmark power, water, and natural gas networks.
Doctor of Philosophy
Massive infrastructure networks play a pivotal role in everyday human lives. A minor service disruption occurring locally in electric power, natural gas, or water networks is considered a significant loss. Uncertain demands, equipment failures, regulatory stipulations, and most importantly complicated physical laws render managing these networks an arduous task. Oftentimes, the first principle mathematical models for these networks are well known. Nevertheless, the computations needed in real-time to make spontaneous decisions frequently surpass the available resources. Explicitly identifying such problems, this dissertation extends the state of the art on three fronts: First, efficient models enabling the operators to tractably solve some routinely encountered problems are developed using fundamental and diverse mathematical tools; Second, quickly trainable machine learning based solutions are developed that enable spontaneous decision making while learning offline from sophisticated mathematical programs; and Third, control mechanisms are designed that ensure a safe and autonomous network operation without human intervention. These novel solutions are bolstered by mathematical guarantees and extensive simulations on benchmark power, water, and natural gas networks.

11

Fiocchi, Leonardo. "A Reinforcement Learning strategy for Satellite Attitude Control". Master's thesis, Alma Mater Studiorum - Università di Bologna, 2021.

Abstract:

In recent years space missions for both scientific and commercial purposes have substantially increased. More and more spacecraft have flexible multibody structures and are subject to liquid volume changes, fuel utilization, and other behaviours that alter the parameters of the spacecraft's model. Moreover, varying disturbances such as the gravity angle torque due to Earth's gravitational field, aerodynamic torque, and others may lead to unwanted effects on the satellite's dynamics. These uncertainties in the model and environment descriptions make it difficult to set up an exact mathematical model, which makes attitude control tasks even more difficult. For these reasons, data-driven approaches are introduced to alleviate the shortcomings of model-based control methodologies, or even to resolve them completely. This thesis focuses on these issues, introducing and implementing a data-driven approach that bridges optimal control and reinforcement learning building blocks. The developments are carried out on discrete-time time-varying linear systems. The particular feature of this methodology is that no prior knowledge of the system parameters is necessary. Although no a priori information is used, the results show how the algorithm converges to the optimal control of a controller with full and precise knowledge of the system. While the algorithm and learning process use a quaternion representation, a custom animation of the algorithm and system results in Euler angles is provided to better evaluate the performance of the solution.

12

Mbiti, John N. "Deep learning for portfolio optimization". Thesis, Linnéuniversitetet, Institutionen för matematik (MA), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-104567.

Abstract:

In this thesis, an optimal investment problem is studied for an investor who can only invest in a financial market modelled by an Itô-Lévy process, with one risk-free (bond) and one risky (stock) investment possibility. We present the dynamic programming method and the associated Hamilton-Jacobi-Bellman (HJB) equation to solve this problem explicitly. It is shown that with purification and simplification to the standard jump diffusion process, closed-form solutions for the optimal investment strategy and for the value function are attainable. It is also shown that, for a specific case, an explicit solution can be obtained via finite training of a neural network using stochastic gradient descent (SGD).

13

Heng, Jeremy. "On the use of transport and optimal control methods for Monte Carlo simulation". Thesis, University of Oxford, 2016. https://ora.ox.ac.uk/objects/uuid:6cbc7690-ac54-4a6a-b235-57fa62e5b2fc.

Abstract:

This thesis explores ideas from transport theory and optimal control to develop novel Monte Carlo methods for efficient statistical computation. The first project considers the problem of constructing a transport map between two given probability measures. In the Bayesian formalism, this approach is natural when one introduces a curve of probability measures connecting the prior to the posterior by tempering the likelihood function. The main idea is to move samples from the prior using an ordinary differential equation (ODE), constructed by solving the Liouville partial differential equation (PDE) which governs the time evolution of measures along the curve. In this work, we first study the regularity that solutions of the Liouville equation must satisfy to guarantee the validity of this construction. We place an emphasis on understanding these issues as it explains the difficulties associated with solutions that have been previously reported. After ensuring that the flow transport problem is well-defined, we give a constructive solution. However, this result is only formal, as the representation is given in terms of integrals which are intractable. For computational tractability, we propose a novel approximation of the PDE which yields an ODE whose drift depends on the full conditional distributions of the intermediate distributions. Even when the ODE is time-discretized and the full conditional distributions are approximated numerically, the resulting distribution of mapped samples can be evaluated and used as a proposal within Markov chain Monte Carlo and sequential Monte Carlo (SMC) schemes. We then illustrate experimentally that the resulting algorithm can outperform state-of-the-art SMC methods at a fixed computational complexity. The second project aims to exploit ideas from optimal control to design more efficient SMC methods.
The key idea is to control the proposal distribution induced by a time-discretized Langevin dynamics so as to minimize the Kullback-Leibler divergence of the extended target distribution from the proposal. The optimal value functions of the resulting optimal control problem can then be approximated using algorithms developed in the approximate dynamic programming (ADP) literature. We introduce a novel iterative scheme to perform ADP, provide a theoretical analysis of the proposed algorithm and demonstrate that the latter can provide significant gains over state-of-the-art methods at a fixed computational complexity.

14

Pop, Ionel. "Détection des événements rares dans des vidéos". Thesis, Lyon 2, 2010. http://www.theses.fr/2010LYO22023.

Abstract:

The work presented in this study falls within the context of automatic video analysis. The growing volume of video data often makes it difficult, or even impossible, for one or more operators to watch all of it. A recurring request is to identify the moments in a video when something unusual happens, that is, to detect abnormal events. We propose several algorithms to identify unusual events, under the hypothesis that these events have a low probability. We address several types of events, from the analysis of moving areas to the analysis of the trajectories of tracked objects. After devoting the first part of the thesis to building a simple tracking system, we propose several measures of similarity between trajectories. These measures, based on DTW (Dynamic Time Warping), estimate the similarity of trajectories by taking into account both spatial and temporal aspects, which makes it possible, for example, to differentiate between objects moving along the same path at different speeds. Based on these measures, we build trajectory models that represent the usual behavior of objects, so that we can detect those that deviate from the norm. Because tracking yields poor results in practice, especially in crowded scenes, we analyze the optical flow vectors and build a movement map. This map models, in the form of a codebook, the preferred movement directions that occur at each pixel, which makes it possible to identify abnormal movement at pixel level without any notion of a tracked object. By using temporal coherence, we can further improve the detection rate, which is affected by errors in optical flow estimation. In a second step, we change the way this movement map is constructed so that we can extract higher-level features, the equivalent of trajectories, but still without requiring object tracking. We can thus partially reuse the trajectory analysis to detect rare events. All aspects presented in this thesis have been implemented. In addition, we have built several applications, such as predicting the movements of objects or storing and retrieving tracked objects in a database.
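
As a rough illustration of the DTW (Dynamic Time Warping) building block that the trajectory similarity measures rely on, the sketch below computes the classic DTW cost between two 2D trajectories. It is the textbook recurrence, not the measures proposed in the thesis: plain DTW absorbs differences in speed, whereas the thesis also accounts for temporal aspects. The function name `dtw_distance` and the sample trajectories are hypothetical.

```python
import math


def dtw_distance(path_a, path_b):
    """Classic dynamic time warping cost between two point sequences.

    cost[i][j] holds the cheapest alignment of the first i points of
    path_a with the first j points of path_b."""
    if not path_a or not path_b:
        raise ValueError("both trajectories must contain at least one point")
    n, m = len(path_a), len(path_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = math.dist(path_a[i - 1], path_b[j - 1])
            # Extend the best of: advance in a, advance in b, advance in both.
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


steady = [(0, 0), (1, 0), (2, 0), (3, 0)]
# Same route, but the object lingers at the first two positions.
lingering = [(0, 0), (0, 0), (1, 0), (1, 0), (2, 0), (3, 0)]
# A parallel route one unit away.
shifted = [(0, 1), (1, 1), (2, 1), (3, 1)]

print(dtw_distance(steady, lingering), dtw_distance(steady, shifted))
```

The first pair has zero cost because the warping hides the difference in speed; this is why a purely spatial DTW measure has to be complemented with temporal information when speed itself is the anomaly of interest.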

15

Monteiro, Rocha Lima Bruno. "Object Surface Exploration Using a Tactile-Enabled Robotic Fingertip". Thesis, Université d'Ottawa / University of Ottawa, 2019. http://hdl.handle.net/10393/39956.

Abstract:

Exploring surfaces is an essential ability for humans, allowing them to interact with a large variety of objects within their environment. This ability to explore surfaces is also of major interest in the development of a new generation of humanoid robots, which requires the development of more efficient artificial tactile sensing techniques. The details perceived by statically touching different surfaces of objects not only improve robotic hand performance in force-controlled grasping tasks but also enable the feeling of vibrations on touched surfaces. This thesis presents an extensive experimental study of object surface exploration using biologically inspired tactile-enabled robotic fingers. A new multi-modal tactile sensor, embedded in both versions of the robotic fingertips (similar to the human distal phalanx), is capable of measuring the heart rate with a mean absolute error of 1.47 bpm through static explorations of the human skin. A two-phalanx articulated robotic finger with a new miniaturized tactile sensor embedded into the fingertip was developed in order to detect and classify surface textures. This classification is performed by the dynamic exploration of touched object surfaces. Two types of movements were studied: one-dimensional (1D) and two-dimensional (2D) movements. The machine learning techniques Support Vector Machine (SVM), Multilayer Perceptron (MLP), Random Forest, Extra Trees, and k-Nearest Neighbors (kNN) were tested in order to find the most efficient one for the classification of the recovered textured surfaces. A 95% precision was achieved when using the Extra Trees technique for the classification of the 1D recovered texture patterns. Experimental results confirmed that the 2D textured surface exploration using a hemispheric tactile-enabled finger was superior to the 1D exploration. Three exploratory velocities were used for the 2D exploration: 30 mm/s, 35 mm/s, and 40 mm/s.
The best classification accuracies for the 2D recovered texture patterns at the two lower exploratory velocities (30 mm/s and 35 mm/s) were 99.1% and 99.3%, respectively, using the SVM classifier. For the 40 mm/s velocity, the Extra Trees classifier provided a classification accuracy of 99.4%. The results of the experimental research presented in this thesis could be suitable candidates for future development.

16

Gonzalez, Karen. "Contribution à l’étude des processus markoviens déterministes par morceaux : étude d’un cas-test de la sûreté de fonctionnement et problème d’arrêt optimal à horizon aléatoire". Thesis, Bordeaux 1, 2010. http://www.theses.fr/2010BOR14139/document.

Abstract:

Piecewise deterministic Markov processes (PDMPs) were introduced in the literature by M.H.A. Davis as a general class of stochastic models. PDMPs are a family of Markov processes involving deterministic motion punctuated by random jumps. In the first part, PDMPs are used to compute probabilities of top events for a case study of dynamic reliability (the heated tank system) with two different numerical methods: the first is based on the resolution of the differential system giving the physical evolution of the tank, and the second uses the computation of the expectation of a functional of a PDMP by a system of integro-differential equations. In the second part, we propose a numerical method to approximate the value function of the optimal stopping problem for a PDMP. Our approach is based on quantization of the post-jump location and inter-arrival time of the Markov chain naturally embedded in the PDMP, and on path-adapted time discretization grids. It allows us to derive bounds for the convergence rate of the algorithm and to provide a computable ε-optimal stopping time.

17

Pahkasalo, Carolina, and André Sollander. "Adaptive Energy Management Strategies for Series Hybrid Electric Wheel Loaders". Thesis, Linköpings universitet, Fordonssystem, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-166284.

Abstract:

An emerging technology is the hybridization of wheel loaders. Since wheel loaders commonly operate in repetitive cycles, it should be possible to use this information to develop an efficient energy management strategy that decreases fuel consumption. The purpose of this thesis is to evaluate if and how this can be done in a real-time online application. The strategy that is developed is based on pattern recognition and the Equivalent Consumption Minimization Strategy (ECMS), which together are called Adaptive ECMS (A-ECMS). Pattern recognition uses information about the repetitive cycles to predict the operating cycle, which can be done with neural network or rule-based methods. The prediction is then used in ECMS to compute the optimal distribution between fuel and battery power. For a robust system, it is important to include stability measures in ECMS that protect the machine, which can be done by adjusting the cost function that is minimized. In a quasistatic simulation environment, these implementations improve fuel consumption by 7.59% compared to not utilizing the battery at all.
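The core ECMS step described in this abstract, choosing the power split that minimizes fuel power plus weighted battery power, can be sketched as follows. This is a minimal illustration and not the thesis's implementation: the quadratic fuel model, the power limits, the state-of-charge correction, and the names `fuel_power_kw`, `ecms_split`, and `equivalence_factor` are all assumptions made for the example.

```python
def fuel_power_kw(p_gen_kw):
    """Hypothetical fuel power (kW) needed to generate p_gen_kw of electric power."""
    if p_gen_kw <= 0.0:
        return 0.0  # engine-generator switched off
    return 5.0 + 2.2 * p_gen_kw + 0.01 * p_gen_kw ** 2


def equivalence_factor(s0, soc, soc_ref=0.6, gain=2.0):
    """Assumed cost-function adjustment: battery energy gets more expensive
    when the state of charge (soc) drops below the reference."""
    return s0 * (1.0 + gain * (soc_ref - soc))


def ecms_split(p_demand_kw, s, p_batt_max_kw=40.0, p_gen_max_kw=120.0, step_kw=1.0):
    """Grid search over generator power; returns (p_gen_kw, p_batt_kw).

    p_batt_kw > 0 means the battery discharges, < 0 means it is charged.
    The equivalent consumption is fuel power plus s times battery power.
    """
    best = None
    for i in range(int(p_gen_max_kw / step_kw) + 1):
        p_gen = i * step_kw
        p_batt = p_demand_kw - p_gen  # battery covers the remainder
        if abs(p_batt) > p_batt_max_kw:
            continue  # outside the assumed battery power limits
        cost = fuel_power_kw(p_gen) + s * p_batt
        if best is None or cost < best[0]:
            best = (cost, p_gen, p_batt)
    if best is None:
        raise ValueError("demand cannot be met within the power limits")
    return best[1], best[2]
```

With these made-up numbers, a larger equivalence factor shifts load from the battery to the generator, which is the lever an adaptive scheme would tune from the predicted operating cycle.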

18

Qu, Zheng. "Nonlinear Perron-Frobenius theory and max-plus numerical methods for Hamilton-Jacobi equations". Palaiseau, Ecole polytechnique, 2013. http://pastel.archives-ouvertes.fr/docs/00/92/71/22/PDF/thesis.pdf.

Abstract:

Dynamic programming is one of the main approaches to solving optimal control problems. It reduces these problems to Hamilton-Jacobi partial differential equations (PDEs). Several techniques have been proposed in the literature to solve these PDEs, for example finite difference schemes, the so-called discrete dynamic programming or semi-Lagrangian method, and antidiffusive schemes. All these methods are grid-based, i.e., they require a discretization of the state space, and thus suffer from the so-called curse of dimensionality: the dimension of the control problems they can address is often limited to 3 or 4. The present thesis focuses on max-plus numerical methods and their convergence analysis for medium- to high-dimensional deterministic optimal control problems. We study and develop max-plus based numerical algorithms intended to attenuate the curse of dimensionality, for which we establish theoretical complexity estimates. The proofs of these estimates rely on results of nonlinear Perron-Frobenius theory. In particular, we study the contraction properties of monotone or non-expansive nonlinear operators with respect to several classical Finsler metrics on cones (Thompson's metric, Hilbert's projective metric), and obtain nonlinear or non-commutative generalizations of the "ergodicity coefficients" arising in the theory of Markov chains, including a generalization of Dobrushin's ergodicity coefficient to Markov operators on a general cone. These results have applications in consensus theory and to the generalized Riccati equations arising in stochastic optimal control.
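For reference, the two metrics named in this abstract have a simple closed form on the standard positive cone. The block below states the usual definitions for vectors x and y with strictly positive entries; restricting to the positive orthant of R^n is an assumption made here for illustration, since the thesis works with general cones.

```latex
\[
d_T(x,y) = \log \max\Bigl( \max_{1\le i\le n} \frac{x_i}{y_i},\ \max_{1\le i\le n} \frac{y_i}{x_i} \Bigr),
\qquad
d_H(x,y) = \log \max_{1\le i,j\le n} \frac{x_i\, y_j}{y_i\, x_j}.
\]
```

Here d_T is Thompson's metric and d_H is Hilbert's projective metric. The latter vanishes exactly when x is a positive multiple of y, which is why it is called projective.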

19

Ferreira, Ernesto Franklin Marçal. "Melhorias de estabilidade numérica e custo computacional de aproximadores de funções valor de estado baseados em estimadores RLS para projeto online de sistemas de controle HDP-DLQR". Universidade Federal do Maranhão, 2016. http://tedebc.ufma.br:8080/jspui/handle/tede/1687.

Abstract:

Previous issue date: 2016-03-08
This work presents the development and numerical stability analysis of a new adaptive critic algorithm that approximates the state-value function for the online design of discrete linear quadratic regulator (DLQR) optimal control systems based on heuristic dynamic programming (HDP). The proposed algorithm uses unitary transformations and QR decomposition methods to improve the efficiency of online learning in the critic network through the recursive least-squares (RLS) approach. The developed learning strategy improves computational performance in terms of numerical stability and computational cost, with the aim of making possible real-time implementations of optimal control design methodologies based on actor-critic reinforcement learning paradigms. The convergence behavior and numerical stability of the proposed online algorithm, called RLSµ-QR-HDP-DLQR, are evaluated by computational simulations on three multiple-input multiple-output (MIMO) models: a third-order model of the autopilot of an F-16 aircraft, a fourth-order RLC circuit with two input voltages and two controllable voltage levels, and a doubly fed induction generator with six inputs and six outputs for wind energy conversion systems.
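As background for the RLS approach mentioned in this abstract, a plain recursive least-squares update can be sketched as follows. This is the textbook covariance-form recursion, not the QR-based variant developed in the thesis; the function name `rls_update`, the forgetting factor default, and the toy data in the usage note are assumptions for illustration.

```python
def rls_update(theta, P, phi, y, lam=1.0):
    """One recursive least-squares step (textbook covariance form).

    theta: current parameter estimate, list of n floats
    P:     symmetric n x n matrix (list of lists), inverse-correlation estimate
    phi:   regressor vector, list of n floats
    y:     observed scalar target
    lam:   forgetting factor, 0 < lam <= 1
    """
    n = len(theta)
    # P @ phi; because P is symmetric this is also (phi^T P)^T.
    P_phi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
    denom = lam + sum(phi[i] * P_phi[i] for i in range(n))
    gain = [v / denom for v in P_phi]
    # Prediction error with the old parameters.
    err = y - sum(theta[i] * phi[i] for i in range(n))
    theta_new = [theta[i] + gain[i] * err for i in range(n)]
    P_new = [[(P[i][j] - gain[i] * P_phi[j]) / lam for j in range(n)]
             for i in range(n)]
    return theta_new, P_new
```

A usage sketch: starting from `theta = [0.0, 0.0]` and `P = [[1e6, 0.0], [0.0, 1e6]]`, feeding samples of `y = 2*x1 - 3*x2` drives `theta` toward `[2, -3]`. The direct update of `P` shown here is the step that can become ill-conditioned in finite precision, which is the kind of problem that factorized variants aim to avoid.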

20

Yu, Yi. "Radio Resource Planning in Low Power Wide Area IoT Networks". Electronic Thesis or Diss., Paris, CNAM, 2021. http://www.theses.fr/2021CNAM1287.

Abstract:

This thesis focuses on radio resource planning for low power wide area IoT networks based on NB-IoT and LoRa technologies. In both cases, the average behavior of the network is considered by assuming that the sensors and the collectors are distributed according to independent spatial Poisson point processes marked by the channel randomness. For NB-IoT, we elaborate a statistical dimensioning model that estimates the number of radio resources required in the network as a function of the tolerated access delay, the density of active nodes and collectors, and the antenna configuration with single- and multi-user transmission. For the LoRa network, we propose a multi-sub-band allocation technique to mitigate the high level of interference induced by nodes that transmit with the same spreading factor. To dynamically allocate the spreading factor and the power, we present a multi-agent Q-learning approach that improves energy efficiency.

21

RÊGO, Patrícia Helena Moraes. "Aprendizagem por Reforço e Programação Dinâmica Aproximada para Controle Ótimo: Uma Abordagem para o Projeto Online do Regulador Linear Quadrático Discreto com Programação Dinâmica Heurística Dependente de Estado e Ação". Universidade Federal do Maranhão, 2014. http://tedebc.ufma.br:8080/jspui/handle/tede/1879.

Abstract:

Previous issue date: 2014-07-24
This thesis proposes a unified approach combining dynamic programming, reinforcement learning, and function approximation theories, aimed at developing methods and algorithms for the online design of optimal control systems. The approach is presented in the context of approximate dynamic programming, which allows the optimal feedback solution to be approximated so as to reduce the computational complexity associated with conventional dynamic programming methods for the optimal control of multivariable systems. Specifically, within the framework of state- and action-dependent heuristic dynamic programming, the proposal is oriented toward the development of online, numerically stable approximate solutions of the Riccati-type Hamilton-Jacobi-Bellman equation associated with the discrete linear quadratic regulator problem, based on a formulation that combines value function estimates obtained through a recursive least-squares (RLS) structure, temporal differences, and policy improvements. The development of the proposed methodologies focuses mainly on the UDU^T factorization, which is inserted into this framework to improve the RLS estimation of the optimal decision policies of the discrete linear quadratic regulator by circumventing convergence and numerical stability problems related to the ill-conditioning of the covariance matrix in the RLS approach.

22

Kumar, Tushar. "Characterizing and controlling program behavior using execution-time variance". Diss., Georgia Institute of Technology, 2016. http://hdl.handle.net/1853/55000.

Abstract:

Immersive applications, such as computer gaming, computer vision and video codecs, are an important emerging class of applications with QoS requirements that are difficult to characterize and control using traditional methods. This thesis proposes new techniques reliant on execution-time variance to both characterize and control program behavior. The proposed techniques are intended to be broadly applicable to a wide variety of immersive applications and are intended to be easy for programmers to apply without needing to gain specialized expertise. First, we create new QoS controllers that programmers can easily apply to their applications to achieve desired application-specific QoS objectives on any platform or application data-set, provided the programmers verify that their applications satisfy some simple domain requirements specific to immersive applications. The controllers adjust programmer-identified knobs every application frame to effect desired values for programmer-identified QoS metrics. The control techniques are novel in that they do not require the user to provide any kind of application behavior models, and are effective for immersive applications that defy the traditional requirements for feedback controller construction. Second, we create new profiling techniques that provide visibility into the behavior of a large complex application, inferring behavior relationships across application components based on the execution-time variance observed at all levels of granularity of the application functionality. Additionally for immersive applications, some of the most important QoS requirements relate to managing the execution-time variance of key application components, for example, the frame-rate. The profiling techniques not only identify and summarize behavior directly relevant to the QoS aspects related to timing, but also indirectly reveal non-timing related properties of behavior, such as the identification of components that are sensitive to data, or those whose behavior changes based on the call-context.

23

Andrade, Gustavo Araújo de. "PROGRAMAÇÃO DINÂMICA HEURÍSTICA DUAL E REDES DE FUNÇÕES DE BASE RADIAL PARA SOLUÇÃO DA EQUAÇÃO DE HAMILTON-JACOBI-BELLMAN EM PROBLEMAS DE CONTROLE ÓTIMO". Universidade Federal do Maranhão, 2014. http://tedebc.ufma.br:8080/jspui/handle/tede/517.

Abstract:

Previous issue date: 2014-04-28
The main objective of this work is to present the development of learning algorithms for online execution that solve the algebraic Hamilton-Jacobi-Bellman equation. The concepts covered focus on developing the methodology for control systems, through techniques aimed at the online design of adaptive controllers that reject sensor noise, parametric variations, and modeling errors. Concepts of neurodynamic programming and reinforcement learning are discussed in order to design algorithms in which the context of a given operating point causes the control system to adapt and thus deliver performance according to the design specifications. Methods are developed for the online estimation of the adaptive critic, concentrating efforts on techniques for estimating the gradient of the environment value function.

24

Alizamir, Saed. "Essays on Optimal Control of Dynamic Systems with Learning". Diss., 2013. http://hdl.handle.net/10161/8066.

Abstract:

This dissertation studies the optimal control of two different dynamic systems with learning: (i) diagnostic service systems, and (ii) green incentive policy design. In both cases, analytical models have been developed to improve our understanding of the system, and managerial insights are gained on its optimal management.

We first consider a diagnostic service system in a queueing framework, where the service is in the form of sequential hypothesis testing. The agent should dynamically weigh the benefit of performing an additional test on the current task to improve the accuracy of her judgment against the incurred delay cost for the accumulated workload. We analyze the accuracy/congestion tradeoff in this setting and fully characterize the structure of the optimal policy. Further, we allow for admission control (dismissing tasks from the queue without processing) in the system, and derive its implications on the structure of the optimal policy and system's performance.

We then study Feed-in-Tariff (FIT) policies, which are incentive mechanisms used by governments to promote renewable energy technologies. We focus on two key network externalities that govern the evolution of a new technology in the market over time: (i) technological learning, and (ii) social learning. By developing an intertemporal model that captures these dynamics, we investigate how lawmakers should leverage such effects to make FIT policies more efficient. We contrast our findings against the current practice of FIT-implementing jurisdictions, and also determine how the FIT regimes should depend on specific technology and market characteristics.

Dissertation

25

LI, JHIH-CHENG (黎致呈). "Dynamic Intraday Exchange Rate Forecasting using Machine Learning Methods". Thesis, 2017. http://ndltd.ncl.edu.tw/handle/9d3wcx.

Abstract:

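Master's
Fu Jen Catholic University (輔仁大學)
Department of Statistics and Information Science, Master's Program in Applied Statistics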
105
This study applies a random forest model to forecast the exchange rates of USD, JPY, EUR, and CNY. The research variables are the mean, standard deviation, maximum, minimum, starting value, and end value of the spot buying rate over every 30 minutes. The empirical interval is from May 12, 2016 to September 25, 2016. A neural network and support vector regression are used as benchmarks. The empirical results show that using intraday data as the training sample can reduce the prediction error, and that the random forest performs better than the neural network and the support vector regression.
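The six per-window variables listed in this abstract can be computed with a few lines of standard-library Python. The sketch below is only an illustration: the input layout (pairs of minute-of-day and rate, already in time order), the use of the population standard deviation, and the names `window_features` and `features_by_window` are assumptions, not details taken from the thesis.

```python
from statistics import mean, pstdev


def window_features(rates):
    """Summary variables for one window of spot buying rates (time-ordered)."""
    return {
        "mean": mean(rates),
        "std": pstdev(rates),   # population standard deviation (assumed choice)
        "max": max(rates),
        "min": min(rates),
        "start": rates[0],      # first quote in the window
        "end": rates[-1],       # last quote in the window
    }


def features_by_window(ticks, window_minutes=30):
    """Group (minute_of_day, rate) pairs into fixed windows and summarize each.

    Returns a dict mapping window index (minute // window_minutes) to features.
    """
    buckets = {}
    for minute, rate in ticks:
        buckets.setdefault(minute // window_minutes, []).append(rate)
    return {k: window_features(v) for k, v in sorted(buckets.items())}
```

Each resulting feature dictionary would form one row of the training table for a regression model such as a random forest.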

26

Walton, Zachary. "Optimal Control of Perimeter Patrol Using Reinforcement Learning". Thesis, 2011. http://hdl.handle.net/1969.1/ETD-TAMU-2011-05-9520.

Abstract:

Unmanned Aerial Vehicles (UAVs) are being used more frequently in surveillance scenarios for both civilian and military applications. One such application involves a UAV patrolling a perimeter, where certain stations can receive alerts at random intervals. Once the UAV arrives at an alert site it can take two actions: 1. Loiter and gain information about the site. 2. Move on around the perimeter. The information gained is transmitted to an operator to allow the operator to classify the alert. The information is a function of the amount of time the UAV spends at the alert site, also called the dwell time, and the maximum delay. The goal of the optimization is to classify the alert so as to maximize the expected discounted information gained by the UAV's actions at a station about an alert. This optimization problem can be readily solved using Dynamic Programming. Even though this approach generates feasible solutions, there are reasons to experiment with different approaches. A complication for Dynamic Programming arises when the perimeter patrol problem is expanded: the number of states increases rapidly when stations, nodes, or UAVs are added to the perimeter. This greatly increases the computation time, making the determination of the solution intractable. This thesis attempts to alleviate this problem by implementing a Reinforcement Learning technique, specifically Q-Learning, to obtain the optimal solution. Reinforcement Learning is a simulation-based version of Dynamic Programming and requires less information to compute sub-optimal solutions. The effectiveness of the policies generated using Reinforcement Learning for the perimeter patrol problem has been corroborated numerically in this thesis.
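The loiter-or-move decision described in this abstract lends itself to a very small tabular Q-learning illustration. The sketch below is a toy under stated assumptions, not the thesis's patrol model: the state is just the dwell step at a single alert site, the net rewards in `LOITER_REWARD` are invented, and transitions are swept deterministically instead of being sampled from a simulation with exploration.

```python
from collections import defaultdict

ACTIONS = ("loiter", "move")
# Hypothetical net reward (information gain minus delay cost) for loitering
# at dwell step 0, 1, 2; these numbers are invented for illustration.
LOITER_REWARD = (2.0, 0.5, -0.5)
TERMINAL = "done"


def step(dwell, action):
    """Toy deterministic model: return (reward, next_state)."""
    if action == "move":
        return 0.0, TERMINAL  # leaving the site ends the episode
    next_dwell = dwell + 1
    next_state = next_dwell if next_dwell < len(LOITER_REWARD) else TERMINAL
    return LOITER_REWARD[dwell], next_state


def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Standard Q-learning update toward reward + gamma * max_a Q(next, a)."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])


def train(sweeps=200):
    """Apply the update to every (state, action) pair repeatedly."""
    q = defaultdict(float)  # unseen entries, including terminal ones, are 0.0
    for _ in range(sweeps):
        for dwell in range(len(LOITER_REWARD)):
            for action in ACTIONS:
                reward, next_state = step(dwell, action)
                q_update(q, dwell, action, reward, next_state)
    return q


def greedy_policy(q):
    """Best action per dwell step according to the learned Q-values."""
    return {d: max(ACTIONS, key=lambda a: q[(d, a)])
            for d in range(len(LOITER_REWARD))}
```

In this toy the learned policy loiters while the net gain is positive and moves on once it turns negative. The table has one entry per state-action pair, so it grows with the state space just as the Dynamic Programming formulation does.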

27

Buckland, Kenneth M. "Optimal control of dynamic systems through the reinforcement learning of transition points". Thesis, 1994. http://hdl.handle.net/2429/6896.

Abstract:

This work describes the theoretical development and practical application of transition point dynamic programming (TPDP). TPDP is a memory-based, reinforcement learning, direct dynamic programming approach to adaptive optimal control that can reduce the learning time and memory usage required for the control of continuous stochastic dynamic systems. TPDP does so by determining an ideal set of transition points (TPs) which specify, at various system states, only the control action changes necessary for optimal control. TPDP converges to an ideal TP set by using a variation of Q-learning to assess the merits of adding, swapping and removing TPs from states throughout the state space. This work first presents how optimal control is achieved using dynamic programming, in particular Q-learning. It then presents the basic TPDP concept and proof that TPDP converges to an ideal set of TPs. After the formal presentation of TPDP, a Practical TPDP Algorithm will be described which facilitates the application of TPDP to practical problems. The compromises made to achieve good performance with the Practical TPDP Algorithm invalidate the TPDP convergence proofs, but near optimal control policies were nevertheless learned in the practical problems considered. These policies were learned very quickly compared to conventional Q-learning, and less memory was required during the learning process. A neural network implementation of TPDP is also described, and the possibility of this neural network being a plausible model of biological movement control is speculated upon. Finally, the incorporation of TPDP into a complete hierarchical controller is discussed, and potential enhancements of TPDP are presented.

28

Tsai, Ding-Yu (蔡定宇). "A Study of the Dynamic Programming on Optimal Contract Capacity for a Time-of-Use Rate User". Thesis, 2011. http://ndltd.ncl.edu.tw/handle/f68j6m.

Abstract:

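Master's
National Taipei University of Technology (國立臺北科技大學)
Department of Electrical Engineering, Quality Power Industry R&D Program (電機工程系優質電力產業研發專班)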
99
Currently, an electricity bill from the Taipower Company consists of four parts: demand charge, energy charge, power factor charge, and penalty charge. To save on electricity spending, this research reduces the demand charge and the penalty charge without requiring the purchase of any equipment. The demand charge and penalty charge are determined by the contract capacity agreed with Taipower. Therefore, how to set an optimal contract capacity is a common problem for consumers. Because general consumers lack the judgment needed to revise the contract capacity optimally, they often sign up for an unsuitable contract capacity. When asked, Taipower will offer advice about contract capacity, but it will not remind consumers whether that contract capacity will remain suitable in the future. Therefore, the best way to save cost is to check the appropriateness of the contract capacity frequently. This thesis studies a dynamic optimal contract capacity. First, the optimal contract capacity is obtained by sequential search; dynamic programming is then used to adjust the monthly contract capacity and calculate the optimal contract capacity. The results show that adjusting the contract capacity by dynamic programming is effective in reducing cost. The research can serve as a reference for users with seasonal or widely varying power demand, helping them confirm that their contract capacity is appropriate.
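The sequential-search step mentioned in this abstract can be illustrated with a short sketch. Everything numeric here is an assumption: the demand rate, and a two-tier penalty in which excess demand up to 10% of the contract capacity is billed at twice the rate and anything beyond that at three times the rate. The actual tariff rules and the dynamic-programming stage of the thesis are not reproduced, and the names `monthly_cost` and `best_fixed_capacity` are hypothetical.

```python
def monthly_cost(capacity_kw, peak_kw, rate=200.0):
    """Demand charge plus penalty charge for one month (assumed tariff)."""
    base = rate * capacity_kw
    excess = max(0.0, peak_kw - capacity_kw)
    tier1 = min(excess, 0.1 * capacity_kw)  # excess within 10% of the contract
    tier2 = excess - tier1                  # excess beyond 10% of the contract
    return base + 2.0 * rate * tier1 + 3.0 * rate * tier2


def best_fixed_capacity(peaks_kw, candidates, rate=200.0):
    """Sequential search: try each candidate capacity, keep the cheapest."""
    return min(
        candidates,
        key=lambda c: sum(monthly_cost(c, p, rate) for p in peaks_kw),
    )
```

For example, `best_fixed_capacity([100.0] * 12, range(50, 151, 10))` selects 100 under this assumed tariff: a lower contract pays penalties every month, and a higher one pays for unused capacity.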

29

Chou, Hsiang-Chai (周祥在). "Dynamic Recursive-Based Optimal Learning Fuzzy Systems Design－Constructing Financial Time-series Prediction System". Thesis, 2009. http://ndltd.ncl.edu.tw/handle/19935391766580764909.

Abstract:

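Master's
National Kinmen Institute of Technology (國立金門技術學院)
Graduate Institute of Electrical Engineering and Information (電資研究所)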
97
The sudden, large variations and complex nonlinear dimensionality of stock prices make prediction and trading in stock markets a challenging task. In this thesis, a dynamic recursive-based optimal learning algorithm is designed to build fuzzy prediction systems for financial time series. A fuzzy system with suitable human decision rules is determined to identify the behavior of the time-series stock data under study. Thanks to the adaptability of the fuzzy inference system, the proposed fuzzy prediction systems can reduce the effect of non-quantifiable political and social noise and interference. The objective of this stock prediction system is to reduce investment risk and enable investors to reap the maximum profit. A dynamic clustering method is first applied to configure the initial architecture of the fuzzy prediction system. The number of fuzzy rules is set equal to the number of cluster centers, and the cluster centers are used as the initial center positions of the membership functions. Particle swarm optimization (PSO) and recursive least squares (RLS) methods are integrated to tune the fuzzy system parameters so that the system approximates the trading features of the collected Taiwan weighted index data. Experimental results, compared with other machine learning schemes for predicting stock prices and tracking trends, demonstrate that the proposed system yields a better buy-and-sell strategy.

30

Chen, Kuan-Zung (陳冠榮). "Using credit assignment and GBF with dynamic learning rate to enhance the ability of low-dimensional CMAC". Thesis, 2004. http://ndltd.ncl.edu.tw/handle/14666150716669599215.

Abstract:

Master's
Tatung University (大同大學)
Department (Graduate Institute) of Electrical Engineering
92
Conventional CMAC suffers from a memory size problem in high-dimensional input spaces. A neural network structure composed of small CMACs can solve this problem efficiently. We use credit assignment and Gaussian basis functions to improve the performance of this structure. The simulation results show that it still has weaknesses. Dynamic learning rate and repeated training are concepts from conventional CMAC; we use them to overcome these weaknesses. The simulations show that a dynamic learning rate indeed gives better performance, and that a dynamic learning rate combined with Gaussian basis functions achieves the best result.

31

Wang, Ko-Jie, and 王科傑. "Adaptive PD Fuzzy Control with Dynamic Learning Rate for Two-Wheeled Balancing Six Degrees of Freedom Robotic Arm". Thesis, 2014. http://ndltd.ncl.edu.tw/handle/6grm9v.

Abstract:

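Master's degree
National Taiwan University of Science and Technology (國立臺灣科技大學)
Department of Electrical Engineering (電機工程系)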
102
This thesis proposes a new robot system design: a two-wheeled balancing robot with a six-degrees-of-freedom robotic arm, controlled by an adaptive PD fuzzy controller with a dynamic learning rate. The system combines a six-degrees-of-freedom robotic arm with a two-wheeled balancing robot to equip the arm with mobility. Because of the motion of the arm, however, stability becomes a challenging issue for the system. This study employs adaptive fuzzy control to provide a suitable controller for the system. The proposed controller uses PD controllers for the fuzzy system outputs. Our experiments show that, for good performance, the learning rate should be positively correlated with the extra speed. We therefore propose a learning rate that is a function of speed for the adaptive system. To show the superiority of the proposed controller, PD, fuzzy, adaptive fuzzy, PD fuzzy and adaptive PD fuzzy controllers are also tested in our experiments, which include conditions with extra speed and with extra moment of inertia plus extra torque. These experiments show that the proposed approach has better overall performance. In addition, the controller is applied to the real system and compared with the commonly used PD controller; again, the proposed controller performs better. In other words, the robot system works well with the proposed controller, which improves balance control of the robot.
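
The idea of a learning rate that grows with speed can be sketched as follows. The abstract does not give the functional form used in the thesis, so the linear-with-saturation rule below, the function name `speed_scheduled_learning_rate`, and all default constants are assumptions made purely for illustration.

```python
def speed_scheduled_learning_rate(speed, base_rate=0.01, gain=0.05, max_rate=0.5):
    """Return a learning rate that increases with the magnitude of the speed.

    Hypothetical rule: rate = base_rate + gain * |speed|, capped at max_rate
    so that a very high speed cannot destabilize the adaptation law.
    """
    rate = base_rate + gain * abs(speed)
    return min(rate, max_rate)
```

An adaptive controller would call such a function once per control cycle and use the returned value in its parameter update; the cap is a common safeguard rather than something the abstract reports.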

32

Sekgwelea, Sello Molefe. "Dynamic approach in the application of information communication technologies models in the provision of flexible learning for distance education". Thesis, 2007. http://hdl.handle.net/10500/2535.

Abstract:

The main purpose of this research is to establish whether ICT models as implemented in distance education help to deliver the desired results (increased throughput, meeting clientele expectations, and a reduction in learner drop-outs) and, if not, what could be done to overcome the obstacles identified. The researcher employed programme evaluation (PE), which integrates both the positivistic and phenomenological aspects of research. The samples were drawn from the population group through probability and non-probability techniques. Different research strategies within PE, such as discovery, inspection and auditing, were first employed to gauge the physical presence of what is being achieved by Unisa through the use of myUnisa & DVC, followed by surveys (personal interviews, administered questionnaires, focus group interviews). The outcomes of these research activities are audiovisual recordings, statistically analysed transcripts and questionnaire data. The researcher employed the following key questions in grappling with issues in this area; their findings are also given:
i. Does the application of ICTs facilitate and enhance flexible learning at Unisa? With reference to flexible delivery as it relates to aspects of teaching and learning in Engineering, it has been established that minimal use is made of ICTs.
ii. Are the technologies correctly applied for teaching and learning? Based on the evidence of the research findings, it has been established that technology is mainly used for administrative support rather than for teaching and learning.
iii. Do the instructional design and technological applications meet the needs of their users? As matters stand, the study suggests that users' expectations, gauged by rating the perceptions and attitudes of academics, tutors, instructional designers, multimedia developers and learners, are far from being met (as not all the critical parts of the models are yet in place in the Engineering and other departments).
According to the main finding, while there is some evidence of efforts aimed at proper implementation, underutilisation of the ICTs appears to be the main problem, as established at Unisa and elsewhere. The research concludes with a number of recommendations based on the established findings.
Educational Studies
(D. Ed. (Curriculum Studies))

33

Sindhu, P. R. "Algorithms for Product Pricing and Energy Allocation in Energy Harvesting Sensor Networks". Thesis, 2014. http://etd.iisc.ernet.in/2005/3505.

Abstract:

In this thesis, we consider stochastic systems which arise in different real-world application contexts. The first problem we consider is based on product adoption and pricing. A monopolist selling a product has to price the product appropriately over time in order to maximize the aggregated profit. The demand for a product is uncertain and is influenced by a number of factors, some of which are price, advertising, and product technology. We study the influence of price on the demand for a product and also how demand affects future prices. Our approach involves mathematically modelling the variation in demand as a function of price and current sales. We present a simulation-based algorithm for computing the optimal price path of a product for a given period of time. The algorithm we propose uses a smoothed-functional based performance gradient descent method to find a price sequence which maximizes the total profit over a planning horizon. The second system we consider is in the domain of sensor networks. A sensor network is a collection of autonomous nodes, each of which senses the environment. Sensor nodes use energy for sensing and communication related tasks. We consider the problem of finding optimal energy sharing policies that maximize the network performance of a system comprising multiple sensor nodes and a single energy harvesting (EH) source. Nodes periodically sense a random field and generate data, which is stored in their respective data queues. The EH source harnesses energy from ambient energy sources and the generated energy is stored in a buffer. The nodes require energy for transmission of data, and they receive the energy for this purpose from the EH source. There is a need for efficiently sharing the stored energy in the EH source among the nodes in the system, in order to minimize the average delay of data transmission over the long run.
We formulate this problem in the framework of average cost infinite-horizon Markov Decision Processes [3], [7] and provide algorithms for the same.
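
The smoothed-functional gradient idea named in this abstract can be sketched with a generic two-sided Gaussian estimator: perturb the parameters with random Gaussian noise, evaluate the objective on both sides, and average. This is a textbook-style sketch and not the thesis's algorithm. The function names, the perturbation size `beta`, the step size, and the toy objective `toy_profit` are all hypothetical, and the objective here is deterministic, whereas the thesis works with simulated, noisy profit.

```python
import random


def sf_gradient(J, theta, beta=0.1, samples=20, rng=random):
    """Two-sided Gaussian smoothed-functional estimate of the gradient of J."""
    n = len(theta)
    grad = [0.0] * n
    for _ in range(samples):
        eta = [rng.gauss(0.0, 1.0) for _ in range(n)]      # random direction
        plus = J([t + beta * e for t, e in zip(theta, eta)])
        minus = J([t - beta * e for t, e in zip(theta, eta)])
        scale = (plus - minus) / (2.0 * beta * samples)
        for i in range(n):
            grad[i] += scale * eta[i]
    return grad


def sf_ascent(J, theta, step=0.05, iters=300, **kwargs):
    """Maximize J by following the smoothed-functional gradient estimate."""
    theta = list(theta)
    for _ in range(iters):
        g = sf_gradient(J, theta, **kwargs)
        theta = [t + step * gi for t, gi in zip(theta, g)]
    return theta


def toy_profit(prices):
    """Hypothetical concave profit with its maximum at a price of 3.0."""
    return -sum((p - 3.0) ** 2 for p in prices)
```

Calling `sf_ascent(toy_profit, [0.0])` moves the single price toward 3.0. The estimator needs only function evaluations, which is why methods of this kind suit simulation-based settings where an analytic gradient of the profit is unavailable.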

34

Frantz, Braiden M. "Active Shooter Mitigation for Open-Air Venues". Thesis, 2021.

Abstract:

This dissertation examines the impact of active shooters upon patrons attending large outdoor events. There has been a spike in shooters targeting densely populated spaces in recent years, including open-air venues. The 2019 Gilroy Garlic Festival shooting was selected for replication in a model built with AnyLogic software, which was then used to run experiments designed to reduce casualties in the event of an active shooter situation. By validating the model so that it produced outcomes identical to those of the real-world Gilroy Garlic Festival shooting, the researcher established a reliable foundational model for experimental purposes. This active shooter research project identifies the need for rapid response efforts to neutralize the shooter(s) as quickly as possible to minimize casualties. Key findings include the importance of armed officers patrolling event grounds to reduce response time, the need for adequate exits during emergency evacuations, incorporation of modern technology to identify the shooter’s location, and applicability of a 1:548 police-to-patron ratio.
