Follow this link to see other types of publications on the topic: Learning for planning.

Theses / dissertations on the topic "Learning for planning"

Create an accurate reference in APA, MLA, Chicago, Harvard, and other styles

Select a source type:

Consult the top 50 theses / dissertations for your research on the topic "Learning for planning".

Next to every source in the list of references there is an "Add to bibliography" button. Click on it, and we will automatically generate the bibliographic citation of the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the scholarly publication as a .pdf and read its abstract online, when it is available in the metadata.

Browse theses / dissertations from a wide variety of disciplines and compile an accurate bibliography.

1

Goodspeed, Robert (Robert Charles). "Planning support systems for spatial planning through social learning". Thesis, Massachusetts Institute of Technology, 2013. http://hdl.handle.net/1721.1/81739.

Full text of the source
Abstract:
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Urban Studies and Planning, 2013.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (p. 240-271).
This dissertation examines new professional practices in urban planning that utilize new types of spatial planning support systems (PSS) based on geographic information systems (GIS) software. Through a mixed-methods research design, the dissertation investigates the role of these new technologies in planning workshops, processes, and as metropolitan infrastructures. In particular, PSS are viewed as supporting social learning in spatial planning processes. The study includes cases in Boston, Kansas City, and Austin. The findings indicate high levels of social learning, broadly confirming the collaborative planning theory literature. Participants at planning workshops that incorporated embodied computing interaction designs reported higher levels of two forms of learning drawn from Argyris and Schön's theory of organizational learning: single and double loop learning. Single loop learning is measured as reported learning. Double loop learning, characterized by deliberation about goals and values, is measured with a novel summative scale. These workshops utilized PSS to contribute indicators to the discussion through the use of paper maps for input and human operators for output. A regression analysis reveals that the PSS contributed to learning by encouraging imagination, engagement, and alignment. Participants' perceived identities as planners, personality characteristics, and frequency of meeting attendance were also related to the learning outcomes. However, less learning was observed at workshops with many detailed maps and limited time for discussion, and exercises lacking PSS feedback. The development of PSS infrastructure is investigated by conducting a qualitative analysis of focus groups of professional planners, and a case where a PSS was planned but not implemented. The dissertation draws on the research literatures on learning, PSS and urban computer models, and planning theory.
The research design is influenced by a sociotechnical perspective and design research paradigms from several fields. The dissertation argues social learning is required to achieve many normative goals in planning, such as institutional change and urban sustainability. The relationship between planning processes and outcomes, and implications of information technology trends for PSS and spatial planning are discussed.
by Robert Goodspeed.
Ph.D.
ABNT, Harvard, Vancouver, APA, etc. styles
2

Zettlemoyer, Luke S. (Luke Sean) 1978. "Learning probabilistic relational planning rules". Thesis, Massachusetts Institute of Technology, 2003. http://hdl.handle.net/1721.1/87896.

3

Park, Sooho S. M. Massachusetts Institute of Technology. "Learning for informative path planning". Thesis, Massachusetts Institute of Technology, 2008. http://hdl.handle.net/1721.1/45887.

Abstract:
Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008.
Includes bibliographical references (p. 104-108).
Through the combined use of regression techniques, we will learn models of the uncertainty propagation efficiently and accurately to replace computationally intensive Monte Carlo simulations in informative path planning. This will enable us to decrease the uncertainty of the weather estimates more than current methods by enabling the evaluation of many more candidate paths given the same amount of resources. The learning method and the path planning method will be validated by numerical experiments using the Lorenz-2003 model [32], an idealized weather model.
by Sooho Park.
S.M.
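The surrogate idea in the abstract above, learning a regression model to replace expensive Monte Carlo evaluation of candidate paths, can be sketched as follows. The objective, feature map, and model below are purely illustrative stand-ins, not Park's actual setup:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for an expensive Monte Carlo evaluation of a path:
# the "uncertainty reduction" is a noisy function of two path features.
def monte_carlo_eval(features):
    return 2.0 * features[0] - 1.0 * features[1] ** 2 + rng.normal(0, 0.01)

# Train a regression surrogate on a handful of MC-evaluated paths.
train_X = rng.uniform(-1, 1, size=(50, 2))
train_y = np.array([monte_carlo_eval(x) for x in train_X])

# Simple polynomial feature map + least squares (a very basic learned model).
def phi(X):
    return np.column_stack([X[:, 0], X[:, 1] ** 2, np.ones(len(X))])

w, *_ = np.linalg.lstsq(phi(train_X), train_y, rcond=None)

# Score many candidate paths cheaply with the surrogate instead of MC.
candidates = rng.uniform(-1, 1, size=(1000, 2))
scores = phi(candidates) @ w
best = candidates[np.argmax(scores)]
print(best)
```

The point of the sketch is the resource trade-off: one cheap matrix product ranks a thousand candidates, where Monte Carlo would have required a simulation run per candidate.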
4

Junyent, Barbany Miquel. "Width-Based Planning and Learning". Doctoral thesis, Universitat Pompeu Fabra, 2021. http://hdl.handle.net/10803/672779.

Abstract:
Optimal sequential decision making is a fundamental problem to many diverse fields. In recent years, Reinforcement Learning (RL) methods have experienced unprecedented success, largely enabled by the use of deep learning models, reaching human-level performance in several domains, such as the Atari video games or the ancient game of Go. In contrast to the RL approach in which the agent learns a policy from environment interaction samples, ignoring the structure of the problem, the planning approach for decision making assumes known models for the agent's goals and domain dynamics, and focuses on determining how the agent should behave to achieve its objectives. Current planners are able to solve problem instances involving huge state spaces by precisely exploiting the problem structure that is defined in the state-action model. In this work we combine the two approaches, leveraging fast and compact policies from learning methods and the capacity to perform lookaheads in combinatorial problems from planning methods. In particular, we focus on a family of planners called width-based planners, that has demonstrated great success in recent years due to its ability to scale independently of the size of the state space. The basic algorithm, Iterated Width (IW), was originally proposed for classical planning problems, where a model for state transitions and goals, represented by sets of atoms, is fully determined. Nevertheless, width-based planners do not require a fully defined model of the environment, and can be used with simulators. For instance, they have been recently applied in pixel domains such as the Atari games. Despite its success, IW is purely exploratory, and does not leverage past reward information. Furthermore, it requires the state to be factored into features that need to be pre-defined for the particular task. 
Moreover, running the algorithm with a width larger than 1 in practice is usually computationally intractable, prohibiting IW from solving higher width problems. We begin this dissertation by studying the complexity of width-based methods when the state space is defined by multivalued features, as in the RL setting, instead of Boolean atoms. We provide a tight upper bound on the number of nodes expanded by IW, as well as overall algorithmic complexity results. In order to deal with more challenging problems (i.e., those with a width higher than 1), we present a hierarchical algorithm that plans at two levels of abstraction. A high-level planner uses abstract features that are incrementally discovered from low-level pruning decisions. We illustrate this algorithm in classical planning PDDL domains as well as in pixel-based simulator domains. In classical planning, we show how IW(1) at two levels of abstraction can solve problems of width 2. To leverage past reward information, we extend width-based planning by incorporating an explicit policy in the action selection mechanism. Our method, called π-IW, interleaves width-based planning and policy learning using the state-actions visited by the planner. The policy estimate takes the form of a neural network and is in turn used to guide the planning step, thus reinforcing promising paths. Notably, the representation learned by the neural network can be used as a feature space for the width-based planner without degrading its performance, thus removing the requirement of pre-defined features for the planner. We compare π-IW with previous width-based methods and with AlphaZero, a method that also interleaves planning and learning, in simple environments, and show that π-IW has superior performance. We also show that the π-IW algorithm outperforms previous width-based methods in the pixel setting of the Atari games suite.
Finally, we show that the proposed hierarchical IW can be seamlessly integrated with our policy learning scheme, resulting in an algorithm that outperforms flat IW-based planners in Atari games with sparse rewards.
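The IW(1) pruning rule summarised in the abstract above (expand a state only if it makes some atom true for the first time) can be sketched in a few lines. The toy one-dimensional domain and atom encoding below are illustrative, not from the thesis:

```python
from collections import deque

# A minimal sketch of IW(1) novelty-based pruning on a toy domain.
# State: a position on an integer line; atoms: ("at", position).
def atoms(state):
    return {("at", state)}

def successors(state):
    return [state - 1, state + 1]

def iw1(start, goal):
    """Breadth-first search that prunes any state making no new atom true."""
    seen_atoms = set(atoms(start))
    frontier = deque([(start, [start])])
    while frontier:
        state, path = frontier.popleft()
        if state == goal:
            return path
        for nxt in successors(state):
            new = atoms(nxt) - seen_atoms
            if not new:          # novelty > 1: prune this successor
                continue
            seen_atoms |= new    # mark atoms as reached
            frontier.append((nxt, path + [nxt]))
    return None

print(iw1(0, 3))
```

Because each atom is admitted at most once, the search expands a number of nodes bounded by the number of atoms, which is the scaling-independent-of-state-space property the abstract refers to.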
5

Dearden, Richard W. "Learning and planning in structured worlds". Thesis, National Library of Canada = Bibliothèque nationale du Canada, 2000. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape3/PQDD_0020/NQ56531.pdf.

6

Madigan-Concannon, Liam. "Planning for life : involving adults with learning disabilities in service planning". Thesis, London School of Economics and Political Science (University of London), 2003. http://etheses.lse.ac.uk/2664/.

Abstract:
Policies for people with learning disabilities, as is the case with other groups of service users, have increasingly emphasised the importance of their involvement in the planning of their own services, and at a more general level in the development of their local authority community care plan and commissioning strategies. This thesis seeks to begin to explore some of the difficulties that may arise in attempting to implement such a policy through a case study of practices in one inner London Borough. The study includes a number of important interrelated themes including: the complexities of communication, normalisation, the nature of choice, citizenship and free will, and asks whether social policy reforms provision or creates unrealistic expectations and burdens for social service professionals and service users. It is essentially a study about communication and its impact on choice and social inclusion. Focusing on communication between professionals and service users, their carers and advocates, the field study investigates the Council's strategic planning procedures in order to explore the relationship between service development and the preferences expressed by users. The findings are presented within a legislative framework, with particular interest paid to the government's White Papers 'Modernising Social Services,' 'Valuing People,' and the Best Value initiative. The study combines an historical account of policy development, and investigates social policies that have attempted to bring about change, while also exposing the contradictions within and between them. Because of this there are many challenges attached to this enterprise, and as a consequence the study is inevitably on a small scale and the answers it produces are tentative.
Nevertheless it provides an indication of the nature and scale of the difficulties which social services will have to overcome if they are to make a reality of government policy in this area by engaging effectively with the personal experiences and lives of adults with learning disabilities and their carers.
7

Mäntysalo, R. (Raine). "Land-use planning as inter-organizational learning". Doctoral thesis, University of Oulu, 2000. http://urn.fi/urn:isbn:9514258444.

Abstract:
The aim of the study is to reveal the nature of learning in local land-use planning activity and to examine the possibilities for the development of planning as a form of learning activity. The theoretical approach draws on the pragmatist and dialectical reorientation of systems theory and the related theory of learning organizations. The traditional, positivist systems approach to land-use planning is considered both to depoliticize planning and to make it unreflective. Critical theory as a basis of planning theory is also shown to be inadequate. Communicative planning theories that draw on critical theory are rather theories of emancipation in the context of planning than theories of planning per se. An alternative systems-theoretical view to land-use planning activity is presented, where critical and constructive aspects as well as ethical and pragmatic aspects are interlinked in the dialectical dynamics of planning as organizational and inter-organizational learning activity. Three subsystems within the system of local land-use planning are identified: expertise, politics and economics. The subsystems of land-use planning build upon the basic distinction between legitimate and illegitimate conduct. For each subsystem, the context of its existence is formed by the interaction of all subsystems. By acting, each subsystem inevitably changes its dialectical relationship to this context. Harmful changes are felt within the subsystem as inner contradictions that interfere with its decision-making activity. If the subsystem is unable to face these contradictions but instead resorts to the use of pathological power, they may develop into paralyzing double bind situations. The resolution of a double bind situation requires expansive learning by the subsystem. However, there are also contradictions in land-use planning that the subsystems are unable to resolve by expansive learning.
Such inter-systemic contradictions stem from the dialectical relationship between the overriding requirement of legitimacy on one hand and the basic goals of expert knowledge and economic profit on the other. In the study a hypothesis is formulated, according to which these basic - and, in the conditions of modern society, permanent - contradictions in local land-use planning require such inter-organizational learning, which enables the creation of planning solutions that provide means for their task-related harmonization, and, in the longer term, contributes to the emergence of a participative planning culture where the contradictions can be handled legitimately, if not resolved.
8

Grant, Timothy John. "Inductive learning of knowledge-based planning operators". [Maastricht : Maastricht : Rijksuniversiteit Limburg] ; University Library, Maastricht University [Host], 1996. http://arno.unimaas.nl/show.cgi?fid=6686.

9

Baldassarre, Gianluca. "Planning with neural networks and reinforcement learning". Thesis, University of Essex, 2001. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.252285.

10

Newton, Muhammad Abdul Hakim. "Wizard : learning macro-actions comprehensively for planning". Thesis, University of Strathclyde, 2009. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.501841.

11

Tynong, Anton. "Machine learning for planning in warehouse management". Thesis, Linköpings universitet, Kommunikations- och transportsystem, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-178108.

12

Morere, Philippe. "Bayesian Optimisation for Planning And Reinforcement Learning". Thesis, The University of Sydney, 2019. https://hdl.handle.net/2123/21230.

Abstract:
This thesis addresses the problem of achieving efficient non-myopic decision making by explicitly balancing exploration and exploitation. Decision making, both in planning and reinforcement learning (RL), enables agents or robots to complete tasks by acting on their environments. Complexity arises when completing objectives requires sacrificing short-term performance in order to achieve better long-term performance. Decision making algorithms with this characteristic are known as non-myopic, and require long sequences of actions to be evaluated, thereby greatly increasing the search space size. Optimal behaviours need to balance two key quantities: exploration and exploitation. Exploitation takes advantage of previously acquired information or high performing solutions, whereas exploration focuses on acquiring more informative data. The balance between these quantities is crucial in both RL and planning. This thesis brings the following contributions: Firstly, a reward function trading off exploration and exploitation of gradients for sequential planning is proposed. It is based on Bayesian optimisation (BO) and is combined with a non-myopic planner to achieve efficient spatial monitoring. Secondly, the algorithm is extended to continuous action spaces in a method called continuous belief tree search (CBTS), which uses BO to dynamically sample actions within a tree search, balancing high-performing actions and novelty. Finally, the framework is extended to RL, for which a multi-objective methodology for explicit exploration and exploitation balance is proposed. The two objectives are modelled explicitly and balanced at a policy level, as in BO. This allows for online exploration strategies, as well as a data-efficient model-free RL algorithm achieving exploration by minimising the uncertainty of Q-values (EMU-Q).
The proposed algorithms are evaluated on different simulated and real-world robotics problems, displaying superior performance in terms of sample efficiency and exploration.
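A minimal illustration of the exploration/exploitation balance via Bayesian optimisation described above: a Gaussian-process surrogate with an upper-confidence-bound (UCB) acquisition on a toy 1-D objective. Everything here, the objective, kernel, length-scale, and UCB weight, is an illustrative assumption, not the thesis's planner:

```python
import numpy as np

# GP surrogate + UCB acquisition: the mean term exploits known high values,
# the standard-deviation term rewards exploring uncertain regions.
def f(x):                        # "unknown" objective to maximise
    return np.sin(3 * x) - x ** 2

def rbf(a, b, ls=0.3):           # squared-exponential kernel
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ls) ** 2)

X = np.array([-1.0, 1.0])        # initial observations
y = f(X)
grid = np.linspace(-2, 2, 401)

for _ in range(25):
    K = rbf(X, X) + 1e-6 * np.eye(len(X))
    Ks = rbf(grid, X)
    mu = Ks @ np.linalg.solve(K, y)                        # posterior mean
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
    ucb = mu + 2.0 * np.sqrt(np.maximum(var, 0.0))         # exploration bonus
    x_next = grid[np.argmax(ucb)]                          # next query point
    X = np.append(X, x_next)
    y = np.append(y, f(x_next))

print(X[np.argmax(y)])           # best sampled point
```

Early iterations are dominated by the uncertainty bonus (exploration); once the surrogate is confident, the mean term concentrates queries near the optimum (exploitation).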
13

Weber, Christopher H. "Planning, Acting, and Learning in Incomplete Domains". DigitalCommons@USU, 2012. https://digitalcommons.usu.edu/etd/1168.

Abstract:
The engineering of complete planning domain descriptions is often very costly because of human error or lack of domain knowledge. Learning complete domain descriptions is also very challenging because many features are irrelevant to achieving the goals and data may be scarce. Given incomplete knowledge of their actions, agents can ignore the incompleteness, plan around it, ask questions of a domain expert, or learn through trial and error. Our agent Goalie learns about the preconditions and effects of its incompletely-specified actions by monitoring the environment state. In conjunction with the plan failure explanations generated by its planner DeFault, Goalie diagnoses past and future action failures. DeFault computes failure explanations for each action and state in the plan and counts the number of incomplete domain interpretations wherein failure will occur. The question-asking strategies employed by our extended Goalie agent using these conjunctive normal form-based plan failure explanations are goal-directed and attempt to approach always-successful execution while asking the fewest questions possible. In sum, Goalie: i) interleaves acting, planning, and question-asking; ii) synthesizes plans that avoid execution failure due to ignorance of the domain model; iii) uses these plans to identify relevant (goal-directed) questions; iv) passively learns about the domain model during execution to improve later replanning attempts; v) and employs various targeted (goal-directed) strategies to ask questions (actively learn). Our planner DeFault is the first to reason about a domain's incompleteness to avoid potential plan failure. We show that DeFault performs best by counting prime implicants (failure diagnoses) rather than propositional models. Further, we show that by reasoning about incompleteness in planning (as opposed to ignoring it), Goalie fails and replans less often, and executes fewer actions.
Finally, we show that goal-directed knowledge acquisition - prioritizing questions based on plan failure diagnoses - leads to fewer questions, lower overall planning and replanning time, and higher success rates than approaches that naively ask many questions or learn by trial and error.
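The idea of counting incomplete-domain interpretations in which a plan step fails, as described above, can be illustrated with a brute-force sketch. The action, candidate preconditions, and state below are hypothetical, and this is not DeFault's actual prime-implicant machinery:

```python
from itertools import product

# Each "possible precondition" may or may not be a real precondition of the
# action, giving 2^k candidate interpretations of the incomplete domain.
# The step fails in an interpretation iff some real precondition is unmet.
possible_preconds = ["door_open", "has_key", "battery_ok"]
state = {"door_open": True, "has_key": False, "battery_ok": True}

def failure_count(possible_preconds, state):
    count = 0
    for choice in product([False, True], repeat=len(possible_preconds)):
        real = [p for p, c in zip(possible_preconds, choice) if c]
        if any(not state[p] for p in real):
            count += 1
    return count

# Here the action fails exactly in the interpretations that include
# "has_key" as a real precondition: 4 of the 8 interpretations.
print(failure_count(possible_preconds, state))
```

Ranking plan steps by such counts is one way to prioritise which question to ask a domain expert first; the enumeration here is exponential, which is why the thesis counts prime implicants instead.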
14

Dos Santos De Oliveira, Rafael. "Bayesian Optimisation for Planning under Uncertainty". Thesis, The University of Sydney, 2018. http://hdl.handle.net/2123/20762.

Abstract:
Under an increasing demand for data to understand critical processes in our world, robots have become powerful tools to automatically gather data and interact with their environments. In this context, this thesis addresses planning problems where limited prior information leads to uncertainty about the outcomes of a robot's decisions. The methods are based on Bayesian optimisation (BO), which provides a framework to solve planning problems under uncertainty by means of probabilistic modelling. As a first contribution, the thesis provides a method to find energy-efficient paths over unknown terrains. The method applies a Gaussian process (GP) model to learn online how a robot's power consumption varies as a function of its configuration while moving over the terrain. BO is applied to optimise trajectories over the GP model being learnt so that they are informative and energetically efficient. The method was tested in experiments on simulated and physical environments. A second contribution addresses the problem of policy search in high-dimensional parameter spaces. To deal with high dimensionality the method combines BO with a coordinate-descent scheme that greatly improves BO's performance when compared to conventional approaches. The method was applied to optimise a control policy for a race car in a simulated environment and shown to outperform other optimisation approaches. Finally, the thesis provides two methods to address planning problems involving uncertainty in the input space. The first method is applied to actively learn terrain roughness models via proprioceptive sensing with a mobile robot under localisation uncertainty. Experiments demonstrate the method's performance in both simulations and a physical environment. The second method is derived for more general optimisation problems. In particular, this method is provided with theoretical guarantees and empirical performance comparisons against other approaches in simulated environments.
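The coordinate-descent scheme for high-dimensional search mentioned above can be sketched as follows, with a plain per-coordinate grid search standing in for the thesis's BO inner loop; the objective and dimensionality are illustrative assumptions:

```python
import numpy as np

# Coordinate descent: optimise one parameter at a time while holding the
# rest fixed, sweeping over coordinates. Each 1-D subproblem is cheap,
# which is what makes the scheme attractive in high dimensions.
def objective(theta):
    return -np.sum((theta - 0.3) ** 2)   # toy policy "return", max at 0.3

dim = 10
theta = np.zeros(dim)
grid = np.linspace(-1, 1, 41)            # candidate values per coordinate

for sweep in range(3):
    for i in range(dim):
        # Best value for coordinate i, all other coordinates held fixed.
        best = max(grid, key=lambda v: objective(np.r_[theta[:i], v, theta[i + 1:]]))
        theta[i] = best

print(theta)
```

On a separable objective like this one, a single sweep already solves each coordinate exactly; on coupled objectives, repeated sweeps are needed, and each 1-D search could itself be a BO run as in the thesis.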
15

Kao, Hai Feng. "Optimal planning with approximate model-based reinforcement learning". Thesis, University of British Columbia, 2011. http://hdl.handle.net/2429/39889.

Abstract:
Model-based reinforcement learning methods make efficient use of samples by building a model of the environment and planning with it. Compared to model-free methods, they usually take fewer samples to converge to the optimal policy. Despite that efficiency, model-based methods may not learn the optimal policy due to structural modeling assumptions. In this thesis, we show that by combining model-based methods with hierarchically optimal recursive Q-learning (HORDQ) under a hierarchical reinforcement learning framework, the proposed approach learns the optimal policy even when the assumptions of the model are not all satisfied. The effectiveness of our approach is demonstrated with the Bus domain and Infinite Mario – a Java implementation of Nintendo's Super Mario Brothers.
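For readers unfamiliar with the model-free side being combined above, here is a minimal tabular Q-learning sketch on a toy chain MDP; the domain and hyperparameters are illustrative, not from the thesis:

```python
import random

# Tabular Q-learning on a 5-state chain: start at state 0, reward 1.0
# only on reaching state 4. Epsilon-greedy exploration.
random.seed(0)

N = 5
actions = [-1, +1]
Q = {(s, a): 0.0 for s in range(N) for a in actions}
alpha, gamma, eps = 0.5, 0.9, 0.2

for _ in range(500):                     # episodes
    s = 0
    while s != N - 1:
        if random.random() < eps:
            a = random.choice(actions)   # explore
        else:
            a = max(actions, key=lambda a: Q[(s, a)])  # exploit
        s2 = min(max(s + a, 0), N - 1)   # deterministic chain transition
        r = 1.0 if s2 == N - 1 else 0.0
        # One-step Q-learning update toward r + gamma * max_a' Q(s', a').
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in actions) - Q[(s, a)])
        s = s2

# The greedy policy should move right in every non-terminal state.
policy = {s: max(actions, key=lambda a: Q[(s, a)]) for s in range(N - 1)}
print(policy)
```

This is exactly the sample-hungry model-free baseline: every Q-value must be learned from experienced transitions, which is the inefficiency that model-based planning aims to remove.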
16

Kochenderfer, Mykel J. "Adaptive modelling and planning for learning intelligent behaviour". Thesis, University of Edinburgh, 2006. http://hdl.handle.net/1842/1408.

Full text of the source
Abstract:
An intelligent agent must be capable of using its past experience to develop an understanding of how its actions affect the world in which it is situated. Given some objective, the agent must be able to effectively use its understanding of the world to produce a plan that is robust to the uncertainty present in the world. This thesis presents a novel computational framework called the Adaptive Modelling and Planning System (AMPS) that aims to meet these requirements for intelligence. The challenge of the agent is to use its experience in the world to generate a model. In problems with large state and action spaces, the agent can generalise from limited experience by grouping together similar states and actions, effectively partitioning the state and action spaces into finite sets of regions. This process is called abstraction. Several different abstraction approaches have been proposed in the literature, but the existing algorithms have many limitations. They generally only increase resolution, require a large amount of data before changing the abstraction, do not generalise over actions, and are computationally expensive. AMPS aims to solve these problems using a new kind of approach. AMPS splits and merges existing regions in its abstraction according to a set of heuristics. The system introduces splits using a mechanism related to supervised learning and is defined in a general way, allowing AMPS to leverage a wide variety of representations. The system merges existing regions when an analysis of the current plan indicates that doing so could be useful. Because several different regions may require revision at any given time, AMPS prioritises revision to best utilise whatever computational resources are available. Changes in the abstraction lead to changes in the model, requiring changes to the plan. AMPS prioritises the planning process, and when the agent has time, it replans in high-priority regions. 
This thesis demonstrates the flexibility and strength of this approach in learning intelligent behaviour from limited experience.
17

Holst, Gustav. "Route Planning of Transfer Buses Using Reinforcement Learning". Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-281286.

Full text of the source
Abstract:
In route planning the goal is to obtain the best route between a set of locations, which becomes a very complex task as the number of locations increases. This study considers the problem of transfer bus route planning and examines the feasibility of applying a reinforcement learning method in this specific real-world context. In recent research, reinforcement learning methods have emerged as a promising alternative to classical optimization algorithms when solving similar problems. This is due to their positive properties in terms of scalability and generalization. However, the majority of said research has been performed on strictly theoretical problems, not using real-world data. This study implements an existing reinforcement learning model and adapts it to fit the realms of transfer bus route planning. The model is trained to generate optimized routes in terms of time and cost consumption. Then, routes generated by the trained model are evaluated by comparing them to corresponding manually planned routes. The reinforcement learning model produces routes that outperform manually planned routes with regard to both examined metrics. However, due to delimitations and assumptions made during the implementation, the explicit differences in consumption are considered promising but cannot be taken as definite results. The main finding is the overarching behavior of the model, implying a proof of concept; reinforcement learning models are usable tools in the context of real-world transfer bus route planning.
18

Wickman, Axel. "Exploring feasibility of reinforcement learning flight route planning". Thesis, Linköpings universitet, Institutionen för datavetenskap, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-178314.

Full text of the source
Abstract:
This thesis explores and compares traditional and reinforcement learning (RL) methods of performing 2D flight path planning in 3D space. A wide overview of natural, classic, and learning approaches to planning is done in conjunction with a review of some general recurring problems and tradeoffs that appear within planning. This general background then serves as a basis for motivating different possible solutions for this specific problem. These solutions are implemented, together with a testbed in the form of a parallelizable simulation environment. This environment makes use of random world generation and physics combined with an aerodynamical model. An A* planner, a local RL planner, and a global RL planner are developed and compared against each other in terms of performance, speed, and general behavior. An autopilot model is also trained and used both to measure flight feasibility and to constrain the planners to followable paths. All planners were partially successful, with the global planner exhibiting the highest overall performance. The RL planners were also found to be more reliable in terms of both speed and followability because of their ability to leave difficult decisions to the autopilot. From this it is concluded that machine learning in general, and reinforcement learning in particular, is a promising future avenue for solving the problem of flight route planning in dangerous environments.
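The classical baseline named in this abstract, A*, can be sketched compactly on a grid. The world below is a hypothetical stand-in for the thesis's simulation environment, with unit-cost 4-connected moves and a Manhattan-distance heuristic.

```python
import heapq

def astar(grid, start, goal):
    """grid[r][c] == 1 marks an obstacle; returns a list of cells or None."""
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # admissible
    open_heap = [(h(start), 0, start, None)]
    came_from, g_best = {}, {start: 0}
    while open_heap:
        f, g, node, parent = heapq.heappop(open_heap)
        if node in came_from:          # already expanded with a better cost
            continue
        came_from[node] = parent
        if node == goal:               # reconstruct the path backwards
            path = []
            while node is not None:
                path.append(node)
                node = came_from[node]
            return path[::-1]
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < g_best.get((nr, nc), float("inf")):
                    g_best[(nr, nc)] = ng
                    heapq.heappush(open_heap, (ng + h((nr, nc)), ng, (nr, nc), node))
    return None

grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
path = astar(grid, (0, 0), (2, 0))     # must route around the obstacle row
```

Against this kind of baseline, the RL planners in the thesis trade optimality guarantees for speed and the ability to defer hard local decisions to the autopilot.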
19

FABBIANI, EMANUELE. "Machine Learning Approaches for Energy Distribution and Planning". Doctoral thesis, Università degli studi di Pavia, 2021. http://hdl.handle.net/11571/1436275.

Full text of the source
Abstract:
The shift towards more sustainable energy generation, transportation, and storage will be a major challenge in the next decades. Following the global trend, both academic and industrial communities are exploiting all the available tools to facilitate the transition. Machine learning is undoubtedly one such tool: substantial advancements in the last years enabled its application to several aspects of energy production and management. We selected two problems that can be addressed with machine learning. In collaboration with A2A, the third largest Italian utility, we studied the prediction of natural gas demand; with the Ecole Polytechnique Fédérale de Lausanne, we tackled the identification of the topology and the electrical parameters of distribution power networks. Both topics have deep practical implications. As nations are decommissioning coal and oil plants, natural gas becomes the ideal candidate to complement renewable yet intermittent power sources. Moreover, natural gas covers a relevant portion of the energy consumption of residential and industrial buildings. The accurate prediction of the demand can make both transportation and storage more efficient, reducing environmental and financial costs. As the electrification of transportation and domestic heating gains traction, power networks are put under heavy stress. Moreover, the bidirectional power flows created by distributed generation must be carefully managed. New paradigms, such as microgrids and smart grids, are set to replace the current infrastructure. Yet, such designs rely on complex control algorithms that require complete knowledge of the network structure. We deal with the prediction of residential, industrial and thermoelectric gas demand at country level. We present a comprehensive explorative study, which lays the foundation for feature selection and engineering. We then cast a regression problem and compare several base models, highlighting the strengths and weaknesses of each one.
For the first time, we propose to apply ensembling, showing how it yields more accurate predictors. Finally, we design a novel model for the influence of weather forecasting errors on the accuracy of residential gas demand predictors, and we demonstrate its effectiveness with experimental evidence. We propose to solve the identification of distribution networks by means of a novel procedure, complementing an online estimation algorithm with a sequential design of experiment. The approach has two main advantages with respect to traditional methods: it exploits controllable generators to maximize the information content of the samples, and it can seamlessly adapt to changes in topology, which are especially frequent in microgrids. The effectiveness of the proposed approach is substantiated by simulations on standard testbeds. With respect to both topics, throughout the thesis we highlight the concrete industrial applications of our work and provide directions for future developments.
20

Wolfaardt, Dolores. "Facilitating learning: An investigation of the language policy of Namibian schools". University of the Western Cape, 2001. http://hdl.handle.net/11394/8452.

Full text of the source
Abstract:
Doctor Educationis
This research has sought to investigate the language policy of Namibian schools against the background of international literature on the advantages of mother tongue as medium of instruction during the initial years of school. The historical background of the formulation and implementation of the current policy is dealt with in Chapter 2. The theoretical aspects of language planning as explained in the literature will focus on aspects like the underlying principles for language planning. This chapter will furthermore discuss information regarding the status and the use of the mother tongue as medium of instruction in Namibia during the first three years of school. In Chapter 4 a literature review of Cummins's linguistic interdependence principle, as well as the different options or models for a bilingual language approach in education, is discussed in detail and compared to the Namibian situation to find the best possible model which could be adapted for Namibia. Chapter 5 investigates the results of a survey that has been conducted in Namibia to determine the level of English language proficiency of teachers. These findings are compared to find a relation between repetition rates of learners, Grade 10 examination results per region, as well as the teacher qualifications per region. Chapter 6 proposes a gradual bilingual language model for Namibia. First the rationale will be dealt with, followed by a detailed description of the model and how it is to be implemented. Chapters 7 and 8 deal with the research methodology that was undertaken in the form of a questionnaire and interviews with educationists regarding the use of the real medium of instruction, the perceptions of educationists on the language policy, and their proposals to change the language policy. Their perceptions of the proposed language model are discussed in order to identify ideas on how to streamline it. 
In Chapter 9 questions concerning the implications of implementing a bilingual language policy with regard to what is possible, practicable, and affordable will be dealt with. The last chapter, Chapter 10, will compare the current language policy, a policy proposed by NIED, and the model proposed here, before a number of recommendations are made.
21

Latief, Shahnaz. "Time and school learning". Master's thesis, University of Cape Town, 2002. http://hdl.handle.net/11427/7948.

Full text of the source
Abstract:
Bibliography: leaves 67-71.
This study, conducted at Poor Man's Friend Secondary School (a fictitious name), describes the use of timetabled school time. In particular, it quantifies the time spent on instruction and relates it to learner engagement rates. Cumulatively, these variables impact on learner outcomes.
22

Whale, Alyssa Morgan. "An e-learning environment for enterprise resource planning systems". Thesis, Nelson Mandela Metropolitan University, 2016. http://hdl.handle.net/10948/13182.

Full text of the source
Abstract:
Enterprise Resource Planning (ERP) education can positively impact the success of an ERP implementation. Incorporating new tools and technologies into the learning process can potentially alleviate the evident problems with ERP education. Blended learning and e-learning environments both offer opportunities for improvement in education. However, there are various factors and components that need to be in place for such an environment to be successful. The aim of this research is to provide an ERP e-Learning Environment (ERPeL) that can assist with ERP education in terms of creating an integrated and comprehensive learning environment for novice ERP users. In order to achieve this aim, this study followed the Design-Based Research (DBR) methodology which is specific to educational technology research and was applied in iterative cycles where various components of the environment were evaluated by different participants. Quantitative and qualitative data was collected by means of field studies (interviews, focus groups and questionnaires). The proposed ERPeL underwent several iterations of feedback and improvement. In order to determine the success of e-learning, various critical success factors and evaluation criteria were investigated. Field studies were conducted in order to validate the theory in a real-world context. An initial field study was conducted with third year Nelson Mandela Metropolitan University (NMMU) students who were enrolled in the 2014 ERP systems’ module in the Department of Computing Sciences. Many of the problems identified in theory were found to be prevalent in the real-world context. One of the DBR process cycles involved the implementation of specific components of the ERPeL at the Developing and Strengthening Industry-driven Knowledge-transfer between developing Countries (DASIK) introduction to ERP systems course. Participants were either NMMU students, academic staff or industry delegates. 
The components evaluated included videos, learning content, badges, assessment and the SYSPRO Latte m-learning application. Additional components of a leader board, live chats, peer reviewing, expert reviews, user generated content, consultancy with experts and SYSPRO ERP certification were implemented in the subsequent cycle where participants were 2015 third year NMMU ERP systems students. The criteria used to evaluate the success of the ERPeL and its e-learning components were adapted from the literature and a new set of evaluation criteria for e-learning was proposed. The ERPeL is made up of Moodle, the SYSPRO ERP System, the SYSPRO e-Learning System, the SYSPRO Latte m-learning application, learning content and components. Overall the ERPeL was positively received by the various sample groups. The research results indicate that the use of an e-learning environment for ERP systems was positively received. The most positive aspects reported were the implementation of e-learning components such as the interactive videos, simulations and m-learning. In support of this Master's dissertation, the following three papers have been published and presented at two local conferences and one international conference: 1. SACLA 2014, Port Elizabeth (South Africa); 2. SAICSIT 2015, Stellenbosch (South Africa); and 3. IDIA 2015, Zanzibar (Tanzania).
23

Mott, Bradford Wayne. "Decision-Theoretic Narrative Planning for Guided Exploratory Learning Environments". NCSU, 2006. http://www.lib.ncsu.edu/theses/available/etd-03292006-110906/.

Full text of the source
Abstract:
Interactive narrative environments have been the focus of increasing attention in recent years. A key challenge posed by these environments is narrative planning, in which a director agent orchestrates all of the events in an interactive virtual world. To create effective interactions, the director agent must cope with the task's inherent uncertainty, including uncertainty about the user's intentions. Moreover, director agents must be efficient so they can operate in real time. To address these issues, we present U-DIRECTOR, a decision-theoretic narrative planning architecture that dynamically models narrative objectives (e.g., plot progress, narrative flow), storyworld state (e.g., physical state, plot focus), and user state (e.g., goals, beliefs) with a dynamic decision network (DDN) that continually selects storyworld actions to maximize narrative utility on an ongoing basis. DDNs extend decision networks by introducing the ability to model attributes whose values change over time; decision networks extend Bayesian networks by supporting utility-based rational decision making. The U-DIRECTOR architecture also employs an n-gram goal recognition model that exploits knowledge of narrative structure to recognize users' goals and an HTN planner that operates in two coordinated planning spaces to integrate narrative and tutorial planning. U-DIRECTOR has been implemented in a narrative planner for an interactive narrative learning environment in the domain of microbiology in which a user plays the role of a medical detective solving a science mystery. Formal evaluations suggest that the U-DIRECTOR architecture satisfies the real-time constraints of interactive narrative environments and creates engaging experiences.
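The n-gram goal-recognition idea this abstract mentions can be illustrated with a toy bigram model: score each candidate goal by the smoothed likelihood of the observed action sequence under that goal's past traces. The corpus, goals, and actions below are invented for illustration and are not the thesis's microbiology domain.

```python
from collections import Counter

corpus = {  # past action traces, grouped by the goal the user pursued
    "solve_mystery": [["look", "ask", "test", "ask", "test"],
                      ["ask", "test", "test"]],
    "explore_lab":   [["look", "move", "look", "move"],
                      ["move", "look", "move"]],
}

def bigram_counts(traces):
    """Count consecutive action pairs across all traces for one goal."""
    pairs = Counter()
    for t in traces:
        pairs.update(zip(t, t[1:]))
    return pairs

models = {g: bigram_counts(ts) for g, ts in corpus.items()}

def score(goal, observed, smoothing=1.0):
    """Additively smoothed likelihood of the observed sequence under a goal."""
    pairs = models[goal]
    total = sum(pairs.values())
    p = 1.0
    for bg in zip(observed, observed[1:]):
        p *= (pairs[bg] + smoothing) / (total + smoothing * len(pairs))
    return p

observed = ["ask", "test", "ask"]
best_goal = max(corpus, key=lambda g: score(g, observed))
```

In the U-DIRECTOR architecture, a recognised goal like this would feed the user-state nodes of the dynamic decision network, which then selects the storyworld action maximising narrative utility.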
24

Furmston, T. J. "Applications of probabilistic inference to planning & reinforcement learning". Thesis, University College London (University of London), 2013. http://discovery.ucl.ac.uk/1389368/.

Full text of the source
Abstract:
Optimal control is a profound and fascinating subject that regularly attracts interest from numerous scientific disciplines, including both pure and applied Mathematics, Computer Science, Artificial Intelligence, Psychology, Neuroscience and Economics. In 1960 Rudolf Kalman discovered that there exists a duality between the problems of filtering and optimal control in linear systems [84]. This is now regarded as a seminal piece of work and it has since motivated a large amount of research into the discovery of similar dualities between optimal control and statistical inference. This is especially true of recent years where there has been much research into recasting problems of optimal control into problems of statistical/approximate inference. Broadly speaking this is the perspective that we take in this work and in particular we present various applications of methods from the fields of statistical/approximate inference to optimal control, planning and Reinforcement Learning. Some of the methods would be more accurately described to originate from other fields of research, such as the dual decomposition techniques used in Chapter 5 which originate from convex optimisation. However, the original motivation for the application of these techniques was from the field of approximate inference. The study of dualities between optimal control and statistical inference has been a subject of research for over 50 years and we do not claim to encompass the entire subject. Instead, we present what we consider to be a range of interesting and novel applications from this field of research.
25

Grounds, Matthew Jon. "Scaling-up reinforcement learning using parallelization and symbolic planning". Thesis, University of York, 2007. http://etheses.whiterose.ac.uk/11009/.

Full text of the source
26

Gay, Juliana Siqueira. "Learning spatial inequalities: an approach to support transportation planning". Universidade de São Paulo, 2018. http://www.teses.usp.br/teses/disponiveis/3/3138/tde-03052018-103817/.

Full text of the source
Abstract:
Part of the transportation planning literature understands transportation infrastructure as a means of distributing people and opportunities across the territory. Therefore, spatial inequalities become a relevant issue in transportation and land use planning. To meet the challenge of evaluating the heterogeneity of transportation provision and land use in the urban environment, this work aims at identifying and describing patterns hidden in the distribution of accessibility to different urban facilities and socioeconomic information, using Machine Learning (ML) techniques to inform the decision making of transportation plans. To characterize the current consideration of spatial inequality measures in the practice of transportation planning in Brazil, nine mobility plans were reviewed. To investigate the potentialities and restrictions of ML application, unsupervised and supervised analyses of income and accessibility indicators to health, education and leisure were performed. Data from the São Paulo municipality for the years 2000 and 2010 were explored. The analyzed plans do not present measures for evaluating spatial inequalities. It is possible to identify that the low-income population has low accessibility to all facilities, especially hospitals and cultural centers. The east zone of the city presents a low-income group with an intermediate level of accessibility to public schools and sports centers, revealing the heterogeneity of regions outside the city center. Finally, a framework is proposed to incorporate spatial inequalities by using ML techniques in transportation plans.
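The kind of unsupervised analysis this abstract applies to income and accessibility indicators can be sketched with plain k-means clustering over synthetic zone data. All values, group structure, and variable meanings below are hypothetical, not the São Paulo dataset.

```python
import numpy as np

rng = np.random.default_rng(42)
# synthetic zones: columns are [income, accessibility-to-health], with two
# loose groups standing in for unequal parts of a city
low = rng.normal([1.0, 0.2], 0.1, size=(30, 2))   # low income, low access
high = rng.normal([3.0, 0.8], 0.1, size=(30, 2))  # high income, high access
X = np.vstack([low, high])

def kmeans(X, k, iters=50):
    # Lloyd's algorithm: assign points to nearest centre, then recompute
    # each centre as the mean of its assigned points.
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(X[:, None] - centers[None], axis=2)
        labels = d.argmin(axis=1)
        centers = np.array([X[labels == j].mean(axis=0)
                            if np.any(labels == j) else centers[j]
                            for j in range(k)])
    return labels, centers

labels, centers = kmeans(X, 2)
```

Cluster profiles like `centers` are what a planner would then inspect, e.g. to flag zones that combine low income with low accessibility.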
27

Zhou, Tianyu. "Deep Learning Models for Route Planning in Road Networks". Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-235216.

Full text of the source
Abstract:
Traditional shortest path algorithms can efficiently find the optimal paths in graphs using simple heuristics. However, formulating a simple heuristic is challenging in the road network setting since there are multiple factors to consider, such as road segment length, edge centrality, and speed limit. This study investigates how a neural network can learn to take these factors as inputs and yield a path given a pair of origin and destination. The research question is formulated as: Are neural networks applicable to real-time route planning tasks in a road network? The proposed metric to evaluate the effectiveness of the neural network is arrival rate. The quality of generated paths is evaluated by time efficiency. The real-time performance of the model is also compared between pathfinding in dynamic and static graphs, using the above metrics. A staggered approach is applied in progressing this investigation. The first step is to generate random graphs, which allows us to monitor the size and properties of the training graph without worrying about too many details of a real road network. The next step is to determine, as a proof of concept, whether a neural network can learn to traverse simple graphs with multiple strategies, given that road networks are in effect complex graphs. Finally, we scale up by including factors that might affect pathfinding in real road networks. Overall, the training data is optimal paths in a graph generated by a shortest path algorithm. The model is then applied to new graphs to generate a path given a pair of origin and destination. The arrival rate and time efficiency are calculated and compared with those of the corresponding optimal path. Experimental results show that the effectiveness, i.e., arrival rate of the model is 90% and the path quality, i.e., time efficiency has a median of 0.88 and a large variance. The experiment shows that the model has better performance in dynamic graphs than in static graphs.
Overall, the answer to the research question is positive. However, there is still room to improve the effectiveness of the model and the paths generated by the model. This work shows that a neural network trained to make locally optimal choices can hardly give a globally optimal solution. We also show that our method, only making locally optimal choices, can adapt to dynamic graphs with little performance overhead.
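The locally greedy decoding this abstract describes can be sketched as follows. The scorer below is a hypothetical stand-in for the thesis's trained network, which in the actual work ranks neighbours from road features such as segment length, edge centrality, and speed limits:

```python
# Sketch of greedy path decoding with a learned next-hop scorer.
# `score` stands in for the trained network described in the abstract.

def greedy_path(graph, score, start, goal, max_steps=100):
    """Follow the locally best-scoring edge at every node. The walk can
    dead-end or run out of steps, which is why only an arrival rate,
    not guaranteed arrival, can be reported."""
    path, node = [start], start
    for _ in range(max_steps):
        if node == goal:
            return path
        neighbours = [n for n in graph[node] if n not in path]  # no revisits
        if not neighbours:
            return None  # a locally optimal choice led into a dead end
        node = max(neighbours, key=lambda n: score(node, n, goal))
        path.append(node)
    return None  # step budget exhausted

# Toy graph; the toy scorer just prefers nodes alphabetically close to
# the goal, standing in for the learned model.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
path = greedy_path(graph, lambda u, v, g: -abs(ord(v) - ord(g)), "a", "d")
```

Because each step is chosen locally, the walk can fail to arrive, which is consistent with the thesis reporting a 90% arrival rate rather than guaranteed arrival.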
28

Zhang, Tianfang. "Machine learning multicriteria optimization in radiation therapy treatment planning". Thesis, KTH, Matematisk statistik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-257509.

Full text of the source
Abstract:
In radiation therapy treatment planning, recent works have used machine learning based on historically delivered plans to automate the process of producing clinically acceptable plans. Compared to traditional approaches such as repeated weighted-sum optimization or multicriteria optimization (MCO), automated planning methods have, in general, the benefits of low computational times and minimal user interaction, but on the other hand lack the flexibility associated with general-purpose frameworks such as MCO. Machine learning approaches can be especially sensitive to deviations in their dose prediction due to certain properties of the optimization functions usually used for dose mimicking and, moreover, suffer from the fact that there exists no general causality between prediction accuracy and optimized plan quality. In this thesis, we present a means of unifying ideas from machine learning planning methods with the well-established MCO framework. More precisely, given prior knowledge in the form of either a previously optimized plan or a set of historically delivered clinical plans, we are able to automatically generate Pareto optimal plans spanning a dose region corresponding to plans which are achievable as well as clinically acceptable.
For the former case, this is achieved by introducing dose-volume constraints; for the latter case, by fitting a weighted-data Gaussian mixture model on pre-defined dose statistics using the expectation-maximization algorithm, modifying it with exponential tilting, and using specially developed optimization functions to take prediction uncertainties into account. Numerical results for conceptual demonstration are obtained for a prostate cancer case with treatment delivered by a volumetric-modulated arc therapy technique, where it is shown that the methods developed in the thesis are successful in automatically generating Pareto optimal plans of satisfactory quality and diversity, while excluding clinically irrelevant dose regions. For the case of using historical plans as prior knowledge, the computational times are significantly shorter than those typical of conventional MCO.
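The weighted-data mixture fit mentioned above can be illustrated with a minimal expectation-maximization loop. This is a one-dimensional sketch with invented dose-statistic values, not the thesis's implementation; it fixes two components and omits the exponential-tilting step:

```python
import math

def weighted_em_gmm(x, w, iters=50):
    """EM for a two-component 1-D Gaussian mixture with per-sample data
    weights w: a minimal stand-in for the weighted-data GMM fitted to
    dose statistics of historical plans."""
    mu = [min(x), max(x)]          # crude but deterministic initialisation
    var = [1.0, 1.0]
    pi = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibilities, scaled by the data weights.
        r = []
        for xi, wi in zip(x, w):
            dens = [pi[j] * math.exp(-(xi - mu[j]) ** 2 / (2 * var[j]))
                    / math.sqrt(2 * math.pi * var[j]) for j in range(2)]
            s = sum(dens)
            r.append([wi * d / s for d in dens])
        # M-step: weighted parameter updates.
        for j in range(2):
            wj = sum(ri[j] for ri in r)
            mu[j] = sum(ri[j] * xi for ri, xi in zip(r, x)) / wj
            var[j] = max(sum(ri[j] * (xi - mu[j]) ** 2
                             for ri, xi in zip(r, x)) / wj, 1e-6)
            pi[j] = wj / sum(w)
    return mu, var, pi

# Invented "dose statistics": two clusters of plan values, equal weights.
data = [0.1, 0.2, 0.15, 5.0, 5.2, 4.9]
mu, var, pi = weighted_em_gmm(data, [1.0] * len(data))
```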
29

Varnai, Peter. "Reinforcement Learning Endowed Robot Planning under Spatiotemporal Logic Specifications". Licentiate thesis, KTH, Reglerteknik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-263611.

Full text of the source
Abstract:
Recent advances in artificial intelligence are producing fascinating results in the field of computer science. Motivated by these successes, the desire to transfer and implement learning methods on real-life systems is growing as well. The increased level of autonomy and intelligence of the resulting systems in carrying out complex tasks can be expected to revolutionize both industry and our everyday lives. This thesis takes a step towards this goal by studying reinforcement learning methods for solving optimal control problems with task satisfaction constraints. More specifically, spatiotemporal tasks given in the expressive language of signal temporal logic are considered. We begin by introducing our proposed solution to the task-constrained optimal control problem, which is based on blending traditional control methods with more recent, data-driven approaches. We adopt the paradigm that the two approaches should be considered as endpoints of a continuous spectrum, and incorporate partial knowledge of system dynamics into the learning process in the form of guidance controllers. These guidance controllers aid in enforcing the task satisfaction constraint, allowing the system to explore towards finding optimal trajectories in a more sample-efficient manner. The proposed solution algorithm is termed guided policy improvement with path integrals (G-PI2). We also propose a framework for deriving effective guidance controllers, and the importance of this guidance is illustrated through a simulation case study. The thesis also considers a diverse range of enhancements to the developed G-PI2 algorithm. First, the effectiveness of the guidance laws is increased by continuously updating their parameters throughout the learning process using so-called funnel adaptation.
Second, we explore a learning framework for gathering and storing experiences gained from previously solved problems in order to efficiently tackle changes in initial conditions or task specifications in future missions. Finally, we look at how so-called robustness metrics, which quantify the extent of task satisfaction for signal temporal logic, can be explicitly defined in order to aid the learning process towards finding task-satisfying trajectories. The multidisciplinary nature of the examined task-constrained optimal control problem offers a broad range of additional research directions to consider in future work, which are discussed in detail as well.
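The path-integral flavour of the G-PI2 algorithm can be hinted at with the standard exponentiated-cost weighting from PI2. This sketch is generic PI2 machinery rather than the thesis's guided variant, and all numbers are invented:

```python
import math

def pi2_weights(costs, temperature=1.0):
    """Exponentiated-cost weights over rollouts: the path-integral rule
    that lets low-cost trajectories dominate the update."""
    lo = min(costs)                       # shift for numerical stability
    w = [math.exp(-(c - lo) / temperature) for c in costs]
    s = sum(w)
    return [wi / s for wi in w]

def pi2_update(theta, perturbations, costs):
    """One parameter update: the weighted average of the sampled
    perturbations added to the current policy parameters."""
    w = pi2_weights(costs)
    return [t + sum(wk * eps[i] for wk, eps in zip(w, perturbations))
            for i, t in enumerate(theta)]

# Three invented rollouts of a 2-parameter policy and their costs; the
# cheapest rollout (cost 1.0) pulls the parameters hardest.
theta = pi2_update([0.0, 0.0],
                   [[0.5, -0.2], [-0.1, 0.3], [0.2, 0.2]],
                   [1.0, 4.0, 2.0])
```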


30

Gritsenko, Artem. "Learning From Demonstrations in Changing Environments: Learning Cost Functions and Constraints for Motion Planning". Digital WPI, 2015. https://digitalcommons.wpi.edu/etd-theses/1246.

Full text of the source
Abstract:
We address the problem of performing complex tasks for a robot operating in changing environments. We propose two approaches to this problem: 1) define task-specific cost functions for motion planning that represent path quality by learning from an expert's preferences, and 2) use a constraint-based representation of the task within the learning from demonstration paradigm. In the first approach, we generate a set of paths for a given task using a motion planner and collect data about their features (path length, distance from obstacles, etc.). We provide these paths to an expert as a set of pairwise comparisons. We then form a ranking of the paths from the expert's comparisons. This ranking is used as training data for learning algorithms, which attempt to produce a cost function that maps path feature values to a cost that is consistent with the expert's ranking. We test our method on two simulated car-maintenance tasks with the PR2 robot: removing a tire and extracting an oil filter. We found that learning methods which produce non-linear combinations of the features are better able to capture expert preferences for the tasks than methods which produce linear combinations. This result suggests that the linear combinations used in previous work on this topic may be too simple to capture the preferences of experts for complex tasks. In the second approach, we propose to introduce a constraint-based description of the task that can be used together with the motion planner to produce the trajectories. The description is automatically created from the demonstration by performing segmentation and extracting constraints from the motion. The constraints are represented with Task Space Regions (TSRs) that are extracted from the demonstration and used to produce a desired motion. To account for the parts of the motion where constraints differ, a segmentation of the demonstrated motion is performed using TSRs.
The proposed approach allows a robot to perform tasks from human demonstration in changing environments, where the obstacle distribution or the poses of objects can change between demonstration and execution. An experimental evaluation on two example motions was performed to assess the ability of our approach to produce the desired motion and recover a demonstrated trajectory.
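Learning a cost function from pairwise expert comparisons, as in the first approach above, can be sketched with a perceptron-style ranking update. The features, preferences, and linear form below are illustrative only; the thesis in fact found non-linear feature combinations necessary for complex tasks:

```python
def learn_cost_weights(features, preferences, lr=0.1, epochs=100):
    """Perceptron-style learning of a linear path-cost function from
    pairwise expert comparisons: whenever a preferred path is not
    strictly cheaper, shift the weights in its favour.
    preferences: (i, j) pairs meaning path i is preferred over path j."""
    w = [0.0] * len(features[0])
    for _ in range(epochs):
        for i, j in preferences:
            ci = sum(wk * fk for wk, fk in zip(w, features[i]))
            cj = sum(wk * fk for wk, fk in zip(w, features[j]))
            if ci >= cj:  # ranking violated (or tied): update
                for k in range(len(w)):
                    w[k] += lr * (features[j][k] - features[i][k])
    return w

# Invented features per path: (length, obstacle clearance). The "expert"
# prefers short paths with high clearance.
feats = [(1.0, 0.9), (2.0, 0.1), (1.5, 0.5)]
prefs = [(0, 1), (0, 2), (2, 1)]
w = learn_cost_weights(feats, prefs)
cost = lambda f: sum(wk * fk for wk, fk in zip(w, f))
```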
31

Lee, Yang Won. "Institutional learning--the public housing process". Thesis, Massachusetts Institute of Technology, 1988. http://hdl.handle.net/1721.1/75997.

Full text of the source
32

Forbes, Charles L. (Charles Lockwood). "Organizational learning--from information to knowledge". Thesis, Massachusetts Institute of Technology, 1996. http://hdl.handle.net/1721.1/10605.

Full text of the source
33

DESHPANDE, AMIT A. "Virtual Enterprise Resource Planning for Production Planning and Control Education". University of Cincinnati / OhioLINK, 2008. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1211238271.

Full text of the source
34

Barnes, Sheri K. "Evidence of heterarchial planning within higher education institutions : learning garden planning and development at Rowan University /". Full text available online, 2007. http://www.lib.rowan.edu/find/theses.

Full text of the source
35

Cervera, Mateu Enric. "Perception-Based Learning for Fine Motion Planning in Robot Manipulation". Doctoral thesis, Universitat Jaume I, 1997. http://hdl.handle.net/10803/10377.

Full text of the source
Abstract:
Robots must successfully execute tasks in the presence of uncertainty, whose main sources are modeling, sensing, and control. Fine motion problems involve a small-scale space and contact between objects. Though modern manipulators are very precise and repeatable, complex tasks may be difficult, or even impossible, to model at the desired degree of exactitude; moreover, in real-world situations the environment is not known a priori, and visual sensing does not provide enough accuracy. In order to develop successful strategies, it is necessary to understand what can be perceived, what action can be learned and associated with each perception, and how the robot can optimize its actions with regard to defined criteria. The thesis describes a robot programming architecture for learning fine motion tasks. Learning is an autonomous process of repeated experience, and the target is to achieve the goal in the minimum number of steps. Uncertainty in the location is assumed, and the robot is guided mainly by the sensory information acquired by a force sensor. The sensor space is analyzed by an unsupervised process which extracts features related to the probability distribution of the input samples. Such features are used to build a discrete state of the task, to which an optimal action is associated according to past experience. The thesis also includes simulations of different sensory-based tasks to illustrate some aspects of the learning processes.
The learning architecture is implemented on a real robot arm with force sensing capabilities; the task is a peg-in-hole insertion with both cylindrical and non-cylindrical workpieces.
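The experience-driven association of discrete states with optimal actions described above is in the spirit of tabular reinforcement learning. The sketch below uses a toy three-state task and plain Q-learning; the thesis instead derives its states from unsupervised clustering of force-sensor readings:

```python
import random

def q_learning(step, n_states, n_actions, episodes=500,
               alpha=0.5, gamma=0.9, eps=0.2, seed=1):
    """Tabular Q-learning over discrete task states. `step` is a toy
    task simulator returning (next_state_or_None, reward); None marks
    task completion."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        while s is not None:
            if rng.random() < eps:                       # explore
                a = rng.randrange(n_actions)
            else:                                        # exploit
                a = max(range(n_actions), key=lambda a: q[s][a])
            s2, r = step(s, a)
            target = r + (gamma * max(q[s2]) if s2 is not None else 0.0)
            q[s][a] += alpha * (target - q[s][a])
            s = s2
    return q

# Toy 3-state "insertion" task: action 1 advances towards the goal,
# action 0 stalls; every step costs a small penalty, so the learned
# policy reaches the goal in the minimum number of steps.
def step(s, a):
    if a == 1:
        return (None, 1.0) if s == 2 else (s + 1, -0.1)
    return s, -0.1

q = q_learning(step, n_states=3, n_actions=2)
```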
36

Martínez, Martínez David. "Learning relational models with human interaction for planning in robotics". Doctoral thesis, Universitat Politècnica de Catalunya, 2017. http://hdl.handle.net/10803/458884.

Full text of the source
Abstract:
Automated planning has proven to be useful to solve problems where an agent has to maximize a reward function by executing actions. As planners have been improved to solve more expressive and difficult problems, there is an increasing interest in using planning to improve efficiency in robotic tasks. However, planners rely on a domain model, which has to be either handcrafted or learned. Although learning domain models can be very costly, recent approaches provide generalization capabilities and integrate human feedback to reduce the amount of experience required to learn. In this thesis we propose new methods that allow an agent with no previous knowledge to solve certain problems more efficiently by using task planning. First, we show how to apply probabilistic planning to improve robot performance in manipulation tasks (such as cleaning dirt or clearing the tableware on a table). Planners obtain sequences of actions that get the best result in the long term, beating reactive strategies. Second, we introduce new reinforcement learning algorithms where the agent can actively request demonstrations from a teacher to learn new actions and speed up the learning process. In particular, we propose an algorithm that allows the user to set the minimum quality to be achieved, where a better quality also implies that a larger number of demonstrations will be requested. Moreover, the learned model is analyzed to extract the unlearned or problematic parts of the model. This information allows the agent to provide guidance to the teacher when a demonstration is requested, and to avoid irrecoverable errors. Finally, a new domain model learner is introduced that, in addition to relational probabilistic action models, can also learn exogenous effects. This learner can be integrated with existing planners and reinforcement learning algorithms to solve a wide range of problems. In summary, we improve the use of learning and task planning to solve unknown tasks.
The improvements allow an agent to obtain a larger benefit from planners, learn faster, balance the number of action executions and teacher demonstrations, avoid irrecoverable errors, interact with a teacher to solve difficult problems, and adapt to the behavior of other agents by learning their dynamics. All the proposed methods were compared with state-of-the-art approaches, and were also demonstrated in different scenarios, including challenging robotic tasks.
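The demonstration-request loop, in which a user-set quality threshold trades autonomy against teacher effort, can be caricatured as follows. The state and action strings and the lookup policy are invented; the thesis learns relational probabilistic action models, not a table:

```python
def act_or_ask(state, policy, confidence, min_quality, teacher):
    """Act autonomously when the learned policy's confidence in this
    state clears the user-set quality threshold; otherwise request a
    teacher demonstration and adopt it.
    Returns (action, demonstration_was_requested)."""
    if confidence.get(state, 0.0) >= min_quality:
        return policy[state], False          # act on own knowledge
    action = teacher(state)                  # active demonstration request
    policy[state] = action                   # learn from the teacher
    confidence[state] = 1.0
    return action, True

# Invented states/actions: the first visit asks the teacher, the second
# acts autonomously.
policy, conf = {}, {}
teacher = lambda s: "open-gripper" if s == "at-cup" else "move-arm"
first = act_or_ask("at-cup", policy, conf, min_quality=0.8, teacher=teacher)
second = act_or_ask("at-cup", policy, conf, min_quality=0.8, teacher=teacher)
```

Raising `min_quality` makes the agent request more demonstrations before acting on its own, mirroring the quality/demonstration trade-off described above.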
37

Yik, Tak Fai, Computer Science & Engineering, Faculty of Engineering, UNSW. "Locomotion of bipedal humanoid robots: planning and learning to walk". Awarded by: University of New South Wales. Computer Science & Engineering, 2007. http://handle.unsw.edu.au/1959.4/40446.

Full text of the source
Abstract:
Pure reinforcement learning does not scale well to domains with many degrees of freedom, and particularly to continuous domains. In this thesis, we introduce a hybrid method in which a symbolic planner constructs an approximate solution to a control problem. Subsequently, a numerical optimisation algorithm is used to refine the qualitative plan into an operational policy. The method is demonstrated on the problem of learning a stable walking gait for a bipedal robot. The contributions of this thesis are as follows. Firstly, the thesis proposes a novel way to generate gait patterns by using a genetic algorithm to generate walking gaits for a humanoid robot, using the zero moment point as the stability criterion. This is validated on a physical robot. Second, we propose an innovative generic learning method that utilises the trainer's domain knowledge about the task to accelerate learning and extend the capabilities of the learning algorithm. The proposed method, which takes advantage of domain knowledge and combines symbolic planning and learning to accelerate learning and reduce the search space of the learning problem, is tested on a bipedal humanoid robot learning to walk. Finally, it is shown that the extended capability of the learning algorithm handles high-complexity learning tasks in the physical world, with experimental verification on a physical robot.
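The gait-pattern search can be sketched as a plain genetic algorithm. The fitness function below is an arbitrary stand-in; in the thesis, fitness scores gait stability via the zero moment point:

```python
import random

def genetic_search(fitness, dim, pop_size=30, generations=40, seed=2):
    """Minimal genetic algorithm over real-valued parameter vectors, of
    the kind usable to search gait parameters."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]                  # elitist selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, dim)
            child = a[:cut] + b[cut:]                   # one-point crossover
            k = rng.randrange(dim)
            child[k] += rng.gauss(0, 0.1)               # gaussian mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

# Stand-in fitness: the "best gait" has every parameter at 0.5.
fit = lambda p: -sum((x - 0.5) ** 2 for x in p)
best = genetic_search(fit, dim=4)
```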
38

Liemhetcharat, Somchaya. "Representation, Planning, and Learning of Dynamic Ad Hoc Robot Teams". Research Showcase @ CMU, 2013. http://repository.cmu.edu/dissertations/304.

Full text of the source
Abstract:
Forming an effective multi-robot team to perform a task is a key problem in many domains. The performance of a multi-robot team depends on the robots the team is composed of, where each robot has different capabilities. Team performance has previously been modeled as the sum of single-robot capabilities, and these capabilities are assumed to be known. Is team performance just the sum of single-robot capabilities? This thesis is motivated by instances where agents perform differently depending on their teammates, i.e., there is synergy in the team. For example, in human sports, a well-trained team performs better than an all-stars team composed of top players from around the world. This thesis introduces a novel model of team synergy, the Synergy Graph model, in which the performance of a team depends on each robot's individual capabilities and a task-based relationship among them. Robots are capable of learning to collaborate and improving team performance over time, and this thesis explores how such robots are represented in the Synergy Graph model. This thesis contributes a novel algorithm that allocates training instances for the robots to improve, so as to form an effective multi-robot team. The goal of team formation is the optimal selection of a subset of robots to perform the task, and this thesis contributes team formation algorithms that use a Synergy Graph to form an effective multi-robot team with high performance. In particular, the performance of a team is modeled with a Normal distribution to represent the nondeterminism of the robots' actions in a dynamic world, and this thesis introduces the concept of a δ-optimal team that trades off risk versus reward. Further, robots may fail from time to time, and this thesis considers the formation of a robust multi-robot team that attains high performance even if failures occur.
This thesis considers ad hoc teams, where the robots of the team have not collaborated together, and so their capabilities and synergy are initially unknown. This thesis contributes a novel learning algorithm that uses observations of team performance to learn a Synergy Graph that models the capabilities and synergy of the team. Further, new robots may become available, and this thesis introduces an algorithm that iteratively updates a Synergy Graph with new robots.
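The δ-optimal trade-off between risk and reward can be made concrete with Normally distributed team performance. The performance table below is invented, standing in for evaluation of a learned Synergy Graph:

```python
from itertools import combinations
from statistics import NormalDist

def delta_optimal_team(robots, perf, size, delta):
    """Select the team of `size` robots maximising the performance level
    it reaches with probability at least delta. High delta favours
    reliable low-variance teams; low delta favours risky high-mean
    teams. `perf` maps a team to a (mean, std) pair."""
    z = NormalDist().inv_cdf(1.0 - delta)
    def score(team):
        mean, std = perf(team)
        return mean + z * std          # the delta-quantile of performance
    return max(combinations(robots, size), key=score)

# Invented performance table: team {a, b} has a high mean but is erratic.
table = {
    frozenset("ab"): (10.0, 4.0),
    frozenset("ac"): (8.0, 0.5),
    frozenset("bc"): (7.0, 0.5),
}
perf = lambda team: table[frozenset(team)]
risky = delta_optimal_team("abc", perf, 2, delta=0.5)   # chases the mean
safe = delta_optimal_team("abc", perf, 2, delta=0.95)   # wants reliability
```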
39

Üre, Nazim Kemal. "Multiagent planning and learning using random decompositions and adaptive representations". Thesis, Massachusetts Institute of Technology, 2015. http://hdl.handle.net/1721.1/97359.

Full text of the source
Abstract:
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Aeronautics and Astronautics, 2015.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 129-139).
Multiagent planning problems are ubiquitous in engineering. Applications range from control of robotic missions and manufacturing processes to resource allocation and traffic monitoring problems. A common theme in all of these missions is the existence of stochastic dynamics that stem from the uncertainty in the environment and agent dynamics. The combinatorial nature of the problem and the exponential dependency of the planning space on the number of agents render many of the existing algorithms practically infeasible for real-life applications. A standard approach to improve the scalability of planning algorithms is to take advantage of the domain knowledge, such as decomposing the problem to a group of sub-problems and exploiting decouplings among the agents, but such domain knowledge is not always available. In addition, many existing multiagent planning algorithms rely on the existence of a model, but in many real-life situations models are often approximated, wrong, or just unavailable. The convergence rate of the multiagent learning process can be improved by sharing the learned models across the agents. However, many realistic applications involve heterogeneous teams, where the agents have dissimilar transition dynamics. Developing multiagent learning algorithms for such heterogeneous teams is significantly harder, since the learned models cannot be naively transferred across agents. This thesis develops scalable multiagent planning and learning algorithms for heterogeneous teams by using embedded optimization processes to automate the search for decouplings among agents, thus decreasing the dependency on the domain knowledge. 
Motivated by the low computational complexity and theoretical guarantees of the Bayesian Optimization Algorithm (BOA) as a meta-optimization method for tuning machine learning applications, the developed multiagent planning algorithm, Randomized Coordination Discovery (RCD), extends the BOA to automate the search for coordination structures among the agents in Multiagent Markov Decision Processes. The resulting planning algorithm infers how the problem can be decomposed among agents based on trajectories sampled from the model, without needing any prior domain knowledge or heuristics. In addition, the algorithm is guaranteed to converge under mild assumptions and outperforms the compared multiagent planning methods across different large-scale multiagent planning problems. The multiagent learning algorithms developed in this thesis use adaptive representations and collaborative filtering methods to develop strategies for learning heterogeneous models. The goal of the multiagent learning algorithm is to accelerate the learning process by discovering the similar parts of agents' transition models and enable the sharing of these learned models across the team. The proposed multiagent learning algorithms, Decentralized Incremental Feature Dependency Discovery (Dec-iFDD) and its extension Collaborative Filtering Dec-iFDD (CF-Dec-iFDD), provide improved scalability and rapid learning for heterogeneous teams without having to rely on domain knowledge and extensive parameter tuning. Each agent learns a linear function approximation of the actual model, and the number of features is increased incrementally to automatically adjust the model complexity based on the observed data. These features are compact representations of the key characteristics of the environment dynamics, so it is these features that are shared between agents, rather than the models themselves.
The agents obtain feedback from other agents on the model error reduction associated with the communicated features. Although this process increases the communication cost of exchanging features, it greatly improves the quality/utility of what is being exchanged, leading to improved convergence rate. Finally, the developed planning and learning algorithms are implemented on a variety of hardware flight missions, such as persistent multi-UAV health monitoring and forest fire management scenarios. The experimental results demonstrate the applicability of the proposed algorithms on complex multiagent planning and learning problems.
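The feature-exchange feedback loop can be sketched as ranking candidate features by the model-error reduction a teammate reports. The feature names and error numbers are invented, and `evaluate_error` stands in for the teammate refitting its approximation with one extra feature:

```python
def share_features(candidate_features, evaluate_error, base_error, k=1):
    """Rank candidate features by how much each would reduce a
    teammate's model error, and share the top k that actually help --
    the gist of the feedback loop described above, without any of the
    iFDD machinery."""
    gains = {f: base_error - evaluate_error(f) for f in candidate_features}
    ranked = sorted(gains, key=gains.get, reverse=True)
    return [f for f in ranked[:k] if gains[f] > 0]

# Invented teammate errors after adding each feature: "wind" helps most,
# "dust" would make the model worse and is never shared.
errors = {"wind": 0.4, "fuel": 0.7, "dust": 1.0}
picked = share_features(errors, lambda f: errors[f], base_error=0.9, k=2)
```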
by Nazim Kemal Ure.
Ph. D.
40

Robertson, Laura, Eric Dunlap, Ryan A. Nivens e Kelli Barnett. "Sailing into Integration: Planning and Implementing Integrated 5E Learning Cycles". Digital Commons @ East Tennessee State University, 2019. https://dc.etsu.edu/etsu-works/5924.

Full text of the source
Abstract:
Robertson et al. detail a 5E learning cycle integrating science and mathematics that challenges and engages second-grade students in designing sail cars to solve an engineering problem. Students create and use line plots to organize their data and evaluate the strengths and weaknesses of their design solutions. Three strategies were used to plan for successful integration in a 5E learning cycle. Integrating the STEM fields can increase student interest and learning, and our experiences with this integrated project support this finding.
41

Nowak, Hans II(Hans Antoon). "Strategic capacity planning using data science, optimization, and machine learning". Thesis, Massachusetts Institute of Technology, 2020. https://hdl.handle.net/1721.1/126914.

Full text of the source
Abstract:
Thesis: M.B.A., Massachusetts Institute of Technology, Sloan School of Management, in conjunction with the Leaders for Global Operations Program at MIT, May, 2020
Thesis: S.M., Massachusetts Institute of Technology, Department of Mechanical Engineering, in conjunction with the Leaders for Global Operations Program at MIT, May, 2020
Cataloged from the official PDF of thesis.
Includes bibliographical references (pages 101-104).
Raytheon's Circuit Card Assembly (CCA) factory in Andover, MA, is Raytheon's largest factory and the largest Department of Defense (DOD) CCA manufacturer in the world. With over 500 operations, it manufactures over 7000 unique parts with a high degree of complexity and varying levels of demand. Recently, the factory has seen an increase in demand, making the ability to continuously analyze factory capacity and strategically plan for future operations much needed. This study seeks to develop a sustainable strategic capacity optimization model and capacity visualization tool that integrates demand data with historical manufacturing data. Through automated data mining of factory data sources, capacity utilization and overall equipment effectiveness (OEE) for factory operations are evaluated. Machine learning methods are then assessed to gain an accurate estimate of cycle time (CT) throughout the factory. Finally, a mixed-integer nonlinear program (MINLP) integrates the capacity utilization framework and the machine learning predictions to compute optimal strategic capacity planning decisions. Capacity utilization and OEE models are shown to be generated automatically through data mining algorithms. Machine learning models are shown to have a mean absolute error (MAE) of 1.55 on predictions for new data, which is 76.3% lower than the current CT prediction error. Finally, the MINLP is solved to optimality within a tolerance of 1.00e-04 and generates resource and production decisions that can be acted upon.
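The overall equipment effectiveness figure mined from the factory data follows the standard three-factor decomposition (availability × performance × quality); the shift numbers below are invented:

```python
def oee(planned_time, downtime, ideal_cycle_time, total_count, good_count):
    """Overall equipment effectiveness as the product of availability,
    performance, and quality -- the standard decomposition."""
    run_time = planned_time - downtime
    availability = run_time / planned_time            # uptime share
    performance = ideal_cycle_time * total_count / run_time  # speed share
    quality = good_count / total_count                # good-part share
    return availability * performance * quality

# A shift of 480 planned minutes with 60 down, 360 units made at an
# ideal cycle time of 1 minute, 342 of them good.
score = oee(480, 60, 1.0, 360, 342)
```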
by Hans Nowak II.
M.B.A.
S.M.
M.B.A. Massachusetts Institute of Technology, Sloan School of Management
S.M. Massachusetts Institute of Technology, Department of Mechanical Engineering
42

Fowler, Michael C. "Intelligent Knowledge Distribution for Multi-Agent Communication, Planning, and Learning". Diss., Virginia Tech, 2020. http://hdl.handle.net/10919/97996.

Full text of the source
Abstract:
This dissertation addresses a fundamental question of multi-agent coordination: what information should be sent to whom, and when, with the limited resources available to each agent? Communication requirements for multi-agent systems can be rather high when an accurate picture of the environment and the state of other agents must be maintained. To reduce the impact of multi-agent coordination on networked systems, e.g., power and bandwidth, this dissertation introduces new concepts to enable Intelligent Knowledge Distribution (IKD), including Constrained-action POMDPs (CA-POMDP) and concurrent decentralized (CoDec) POMDPs for an agnostic plug-and-play capability for fully autonomous systems. Each agent runs a CoDec POMDP where all the decision making (motion planning, task allocation, asset monitoring, and communication) is separated into concurrent individual MDPs to reduce the combinatorial explosion of the action and state space while maintaining dependencies between the models. We also introduce the CA-POMDP with action-based constraints on partially observable Markov decision processes, rewards driven by the value of information, and probabilistic constraint satisfaction through discrete optimization and Markov chain Monte Carlo analysis. IKD is adapted in real time through machine learning of the actual environmental impacts on the behavior of the system, including collaboration strategies between autonomous agents, the true value of information between heterogeneous systems, observation probabilities, and resource utilization.
Doctor of Philosophy
This dissertation addresses a fundamental question that arises when multiple autonomous systems, like drone swarms, need to coordinate and share data in the field: what information should be sent to whom, and when, with the limited resources available to each agent? Intelligent Knowledge Distribution is a framework that answers this question. Communication requirements for multi-agent systems can be rather high when an accurate picture of the environment and the state of other agents must be maintained. To reduce the impact of multi-agent coordination on networked systems, e.g., power and bandwidth, this dissertation introduces new concepts to enable Intelligent Knowledge Distribution (IKD), including Constrained-action POMDPs and concurrent decentralized (CoDec) POMDPs for an agnostic plug-and-play capability for fully autonomous systems. The IKD model demonstrated its validity as a "plug-and-play" library that manages communications between agents, transmitting the right information at the right time to the right agent to ensure mission success.
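The POMDP machinery this entry builds on rests on belief-state tracking. A minimal Bayes-filter belief update (a textbook sketch with made-up transition and observation tables, not the dissertation's CA-POMDP or CoDec models) looks like:

```python
def belief_update(belief, action, obs, T, O):
    """Bayes filter: b'(s') is proportional to O[s'][obs] * sum_s T[s][action][s'] * b(s)."""
    n = len(belief)
    unnormalized = [
        O[s2][obs] * sum(T[s][action][s2] * belief[s] for s in range(n))
        for s2 in range(n)
    ]
    z = sum(unnormalized)
    return [b / z for b in unnormalized] if z > 0 else belief

# Two hidden states, one action, two observations (illustrative numbers):
T = [[[0.9, 0.1]], [[0.2, 0.8]]]   # T[s][a][s']: transition probabilities
O = [[0.8, 0.2], [0.3, 0.7]]       # O[s'][obs]: observation probabilities
b = belief_update([0.5, 0.5], 0, 0, T, O)  # belief shifts toward state 0
```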
ABNT, Harvard, Vancouver, APA, etc. styles
43

Voshell, Martin G. "Planning Support for Running Large Scale Exercises as Learning Laboratories". The Ohio State University, 2009. http://rave.ohiolink.edu/etdc/view?acc_num=osu1238162734.

Full text of the source
ABNT, Harvard, Vancouver, APA, etc. styles
44

Farnan, Emma. "Community planning in Northern Ireland : learning from Scotland and Wales". Thesis, Ulster University, 2016. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.686635.

Full text of the source
Abstract:
The research undertaken in this thesis presents an inquiry into community planning, considered a new governance approach for improving public service arrangements. Northern Ireland is the last of the devolved administrations to introduce community planning, its introduction having taken place in April 2015. Community planning is a comprehensive approach that functions on the premise that enduring social problems are best addressed through collaborative cross-sector partnership working arrangements. As such, it is subject to all the challenges of partnership working, with the added challenges associated with participatory democracy: representation, inclusion and empowerment (Cowell, 2004). When this is taken in tandem with the actuality that governance arrangements in Northern Ireland are traditionally centralised and silo-like in nature, the scale of change required for engendering effective community planning is significant. Given its embryonic state, the research takes the view that community planning in Northern Ireland can be enhanced by drawing lessons from the experiences of its devolved counterparts. The scope of the research is twofold. Firstly, it theorises the emergence of community planning and conceptualises the approach. Secondly, it employs new institutionalism as an organisational frame and applies the concepts of lesson-drawing (Rose, 1993) and policy mobility to draw holistic and practical lessons from Scotland and Wales. A multiple case study strategy is employed to investigate the governance, policy and practice of community planning, with case studies from Northern Ireland utilised to ascertain the receptivity of the proposed lessons. The thesis reports on the transferability of the lessons and asserts that contextual differences in ideology, the institutional environment, knowledge and experience, and resources suggest that the scale of change required to import lessons is considerable.
ABNT, Harvard, Vancouver, APA, etc. styles
45

Bailey, Shelley Henthorne Dunn Caroline. "Parent involvement in transition planning for students with learning disabilities". Auburn, Ala., 2009. http://hdl.handle.net/10415/1985.

Full text of the source
ABNT, Harvard, Vancouver, APA, etc. styles
46

NICOLA, GIORGIO. "Human-Aware Task e Motion Planning attraverso Deep Reinforcement Learning". Doctoral thesis, Università degli studi di Padova, 2022. http://hdl.handle.net/11577/3445088.

Full text of the source
Abstract:
This thesis investigates the application of Deep Reinforcement Learning to develop human-aware task and motion planners. Human-robot applications introduce a set of criticalities to the already complex problem of Task and Motion Planning. Indeed, human-robot scenarios are non-deterministic and highly dynamic; thus, it is necessary to compute plans quickly and adapt to an ever-changing environment. Therefore, this thesis studied the planning problem as a sequential decision-making problem modeled as a Markov Decision Process and solved via Reinforcement Learning. Markov Decision Processes are a possible answer to the problem of non-deterministic and dynamic environments: on the one hand, they are stochastic models; on the other hand, rather than computing a complete plan at the beginning of each activity, the optimal action to perform is computed step by step based on the current state of the environment. The task planning and motion planning problems are first investigated separately; subsequently, the combined problem is studied. The proposed solutions proved able to compute quick and effective task plans, motion plans, and combined task and motion plans in dynamic and non-deterministic applications such as human-robot cooperation. In all the applications, the agent was able to identify hazardous situations and minimize risk, for example, in task planning by choosing the task with the lowest failure probability, or in motion planning by avoiding regions of space with a high probability of collision. Furthermore, it was possible to ensure safety by combining human-aware Task and Motion Planning with current industry safety standards.
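The abstract's central point, computing the optimal action step by step from the current state rather than a full plan up front, can be sketched with tabular Q-learning (an illustrative toy with invented tasks and rewards, not the thesis's deep RL planner):

```python
def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """One Q-learning step: move Q[s][a] toward r + gamma * max_a' Q[s_next][a']."""
    best_next = max(Q[s_next].values())
    Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])

Q = {0: {"safe": 0.0, "risky": 0.0}, 1: {"safe": 0.0, "risky": 0.0}}
q_update(Q, 0, "risky", -1.0, 1)  # risky task failed: negative reward
q_update(Q, 0, "safe", 0.5, 1)    # safe task succeeded: positive reward
best = max(Q[0], key=Q[0].get)    # the lower-risk task now scores higher
```

After these two updates the agent in state 0 prefers the "safe" task, mirroring the abstract's observation that the learned policy steers away from high-failure-probability options.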
ABNT, Harvard, Vancouver, APA, etc. styles
47

Wight, John Bradford. "The territory/function dialectic : a social learning paradigm of regional development planning". Thesis, University of Aberdeen, 1985. http://digitool.abdn.ac.uk/R?func=search-advanced-go&find_code1=WSN&request1=AAIU361633.

Full text of the source
Abstract:
A personal social learning experience in itself, the thesis articulates the territory/function dialectic as an alternative, social learning paradigm of regional development planning. The current crisis affecting this activity is firstly diagnosed, the underlying problem is then traced to the prevailing orthodoxy, and, in its place, a new paradigm is offered. The story behind the thesis is told via a characterisation of the overall study process as a transition from objective empiricism to empirical subjectivism. The story features highlights of the main case study experiences as well as those insights gained during the actual creation, that is, in the writing, of the ultimate thesis. After identifying the desirable qualities in a contending paradigm, and elaborating the basic elements of the territory/function dialectic, particular attention is given to the significance of territory. This is complemented by a discussion of the fundamental change in the thinking of John Friedmann, who must be credited with originating the subject dialectic. A literature review is presented featuring a consideration of competing paradigms. A detailed contrast of the centre-periphery and territory/function conceptualisations is also presented before concluding with some critical revelations and key insights. The territory/function dialectic is seen to possess the attributes of both a substantive and methodological paradigm. The special paradigm status is bolstered by a consideration of geography's role in relation to the key concept of territory. The paradigm as a whole is seen to underpin an alternative epistemology combining critical science and social learning. The lessons from a social learning experience are elaborated in a revisitation of the original objectives-cum-working hypotheses. 
These lessons feature: the pursuit of more real theory; the social value of underdevelopment theory; the explicit role of the state as manifest in official practice; and the significance of learning through collective action. The territory/function dialectic is seen to provide the necessary link between theory and practice in an all-encompassing manner. The thesis concludes with a review of certain basic dialectical dualities. There is also specific consideration of planning and social learning, entailing further distinctions not only between theory and practice, but also between scientific practice and social practice.
ABNT, Harvard, Vancouver, APA, etc. styles
48

Sills, Elizabeth Schave. "Classroom negotiations : implementing new strategies for learning". Thesis, Massachusetts Institute of Technology, 1997. http://hdl.handle.net/1721.1/70290.

Full text of the source
ABNT, Harvard, Vancouver, APA, etc. styles
49

McKinney, Shaune LaSheane. "Implementing Assistive Technology through Program Planning". ScholarWorks, 2015. https://scholarworks.waldenu.edu/dissertations/1448.

Full text of the source
Abstract:
Special education (SPED) service providers in the military are often underprepared to use the needed assistive technology (AT) in the classroom. This concurrent mixed-method study sought to explore the attitudes, skills, and quality indicators of assistive technology (QIAT) among 19 currently employed military SPED certified multidisciplinary team members. The conceptual framework of this study was based on the professional learning community model, which holds that team members work collaboratively to educate the families they serve. All team members completed a quantitative QIAT survey and open-ended questionnaire, and individual qualitative interviews were conducted with a subsample of 8 volunteer staff. QIAT survey data were descriptively analyzed, while questionnaire data were transcribed, open coded, and thematically analyzed. All data were triangulated, and member checking and peer debriefing were used to strengthen the validity and credibility of the findings. Survey data revealed teachers' willingness to utilize AT in the classroom, although qualitative data suggested that the multidisciplinary team lacked the knowledge to consistently and confidently utilize AT within their classes daily. Additional emergent themes included collaboration, viable resources, unifying guidelines, AT support, training, and guidance. Administrators at the local site can use these findings as guidance in the development of in-service and district AT trainings and support. Through consistent usage of these interventions, the military community can effect positive change in the lived experiences of SPED service providers and the families that it serves.
ABNT, Harvard, Vancouver, APA, etc. styles
50

Morales, Aguirre Marco Antonio. "Metrics for sampling-based motion planning". [College Station, Tex.]: Texas A&M University, 2007. http://hdl.handle.net/1969.1/ETD-TAMU-2462.

Full text of the source
ABNT, Harvard, Vancouver, APA, etc. styles